
Welcome to Kaldi Pybind's documentation!¶
About Kaldi Pybind¶

Kaldi Pybind is a Python wrapper for Kaldi using Pybind11. It is still under active development.
Everything related to Kaldi Pybind is put in the pybind11 branch.
Getting Started¶
Compiling Kaldi Pybind¶
First, you have to install Kaldi. You can find detailed information for Kaldi installation from http://kaldi-asr.org/doc/install.html.
Note
Kaldi Pybind is still under active development and has not
yet been merged into the master branch. You should checkout
the pybind11
branch before compilation.
Note
We support ONLY Python3. If you are still using Python2, please upgrade to Python3. Python3.5 is known to work.
The following is a quick start:
git clone https://github.com/kaldi-asr/kaldi.git
cd kaldi
git checkout pybind11
cd tools
extras/check_dependencies.sh
make -j4
cd ../src
./configure --shared
make -j4
cd pybind
pip install pybind11
make
make test
After a successful compilation, you have to modify the environment
variable PYTHONPATH
:
export KALDI_ROOT=/path/to/your/kaldi
export PYTHONPATH=$KALDI_ROOT/src/pybind:$PYTHONPATH
Hint
There is no make install
. Once compiled, you are ready to
use Kaldi Pybind.
Working with Kaldi's Matrices¶
This tutorial demonstrates how to use Kaldi's matrices in Python.
The following table summarizes the matrix types in Kaldi that have been wrapped to Python.
Kaldi Types | Python Types |
---|---|
Vector<float> |
FloatVector |
SubVector<float> |
FloatSubVector |
Matrix<float> |
FloatMatrix |
SubMatrix<float> |
FloatSubMatrix |
All of the Python types above can be converted
to Numpy arrays without copying the underlying memory
buffers. In addition, FloatSubVector
and FloatSubMatrix
can be constructed directly from Numpy arrays without data copying.
Note
Only the single precision floating point type has been wrapped to Python.
FloatVector¶
The following code shows how to use FloatVector
in Python.
FloatVector
¶1 2 3 4 5 6 7 8 9 10 11 | #!/usr/bin/env python3
import kaldi
f = kaldi.FloatVector(3)
f[0] = 10
print(f)
g = f.numpy()
g[1] = 20
print(f)
|
Its output is
[ 10 0 0 ]
[ 10 20 0 ]
1 | #!/usr/bin/env python3
|
This is a hint that it needs python3
. At present, we support
only python3
in Kaldi Pybind.
3 | import kaldi
|
This imports the Kaldi Pbyind package. If you encounter an import error, please
make sure that PYTHONPATH
has been set to point to KALDI_ROOT/src/pybind
.
5 | f = kaldi.FloatVector(3)
|
This creates an object of FloatVector
containing 3 elements
which are by default initialized to zero.
7 | print(f)
|
This prints the value of the FloatVector
object to the console.
Note that you use operator ()
in C++ to access the elements of
a Vector<float>
object; Python code uses []
.
9 | g = f.numpy()
|
This creates a Numpy array object g
from f
. No memory is copied here.
g
shares the underlying memory with f
.
10 | g[1] = 20
|
This also changes f
since it shares the same memory with g
.
You can verify that f
is changed from the output.
Hint
We recommend that you invoke the numpy()
method of a FloatVector
object to get a Numpy ndarray object and to manipulate this Numpy object.
Since it shares the underlying memory with the FloatVector
object,
every operation you perform to this Numpy ndarray object is visible to
the FloatVector
object.
FloatSubVector¶
The following code shows how to use FloatSubVector
in Python.
FloatSubVector
¶1 2 3 4 5 6 7 8 9 10 11 12 13 14 | #!/usr/bin/env python3
import kaldi
import numpy as np
v = np.array([10, 20, 30], dtype=np.float32)
f = kaldi.FloatSubVector(v)
f[0] = 0
print(v)
g = f.numpy()
g[1] = 100
print(v)
|
Its output is
[ 0. 20. 30.]
[ 0. 100. 30. ]
6 7 | v = np.array([10, 20, 30], dtype=np.float32)
f = kaldi.FloatSubVector(v)
|
This creates a FloatSubVector
object f
from a Numpy ndarray object v
.
No memory is copied here. f
shares the underlying memory with v
.
Note that the dtype
of v
has to be np.float32
; otherwise, you will
get a runtime error when creating f
.
9 | f[0] = 0
|
This uses []
to access the elements of f
. It also changes v
since f
shares the same memory with v
.
12 | g = f.numpy()
|
This create a Numpy ndarray object g
from f
. No memory is copied here.
g
shares the same memory with f
.
13 | g[1] = 100
|
This also changes v
because of memory sharing.
FloatMatrix¶
The following code shows how to use FloatMatrix
in Python.
FloatMatrix
¶1 2 3 4 5 6 7 8 9 10 11 | #!/usr/bin/env python3
import kaldi
f = kaldi.FloatMatrix(2, 3)
f[1, 2] = 100
print(f)
g = f.numpy()
g[0, 0] = 200
print(f)
|
Its output is
[
0 0 0
0 0 100 ]
[
200 0 0
0 0 100 ]
5 | f = kaldi.FloatMatrix(2, 3)
|
This creates an object f
of FloatMatrix
with
2 rows and 3 columns.
6 | f[1, 2] = 100
|
This uses []
to access the elements of f
.
7 | print(f)
|
This prints the value of f
to the console.
9 | g = f.numpy()
|
This creates a Numpy ndarray object g
from f
.
No memory is copied here. g
shares the underlying memory
with f
.
10 | g[0, 0] = 200
|
This also changes f
due to memory sharing.
FloatSubMatrix¶
The following code shows how to use FloatSubMatrix
in Python.
FloatSubMatrix
¶1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | #!/usr/bin/env python3
import kaldi
import numpy as np
m = np.array([[1, 2, 3], [10, 20, 30]], dtype=np.float32)
f = kaldi.FloatSubMatrix(m)
f[1, 2] = 100
print(m)
print()
g = f.numpy()
g[0, 0] = 200
print(m)
|
Its output is
[[ 1. 2. 3.]
[ 10. 20. 100.]]
[[200. 2. 3.]
[ 10. 20. 100.]]
6 7 | m = np.array([[1, 2, 3], [10, 20, 30]], dtype=np.float32)
f = kaldi.FloatSubMatrix(m)
|
This creates an object f
of FloatSubMatrix
from
a Numpy ndarray object m
. f
shares the underlying
memory with m
. Note that the dtype
of m
has
to be np.float32
. Otherwise you will get a runtime error.
9 10 | f[1, 2] = 100
print(m)
|
This uses []
to access the elements of f
. Note that
m
is also changed due to memory sharing.
13 | g = f.numpy()
|
This creates a Numpy ndarray object from f
. No memory
is copied here. g
shares the underlying memory with f
.
14 | g[0, 0] = 200
|
This changes f
and m
.
Working with Kaldi's IO¶
This tutorial shows how to read and write ark/scp files in Python.
Reading and Writing Alignment Information¶
The following class can be used to write alignment information to files:
IntVectorWriter
And the following classes can be used to read alignment information from files:
SequentialIntVectorReader
RandomAccessIntVectorReader
The following code shows how to write and read alignment information.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | #!/usr/bin/env python3
import kaldi
wspecifier = 'ark,scp:/tmp/ali.ark,/tmp/ali.scp'
writer = kaldi.IntVectorWriter(wspecifier)
writer.Write(key='foo', value=[1, 2, 3])
writer.Write('bar', [10, 20])
writer.Close()
rspecifier = 'scp:/tmp/ali.scp'
reader = kaldi.SequentialIntVectorReader(rspecifier)
for key, value in reader:
print(key, value)
reader.Close()
reader = kaldi.RandomAccessIntVectorReader(rspecifier)
value1 = reader['foo']
print(value1)
value2 = reader['bar']
print(value2)
reader.Close()
|
Its output is
foo [1, 2, 3]
bar [10, 20]
[1, 2, 3]
[10, 20]
The output of the following command
$ copy-int-vector scp:/tmp/ali.scp ark,t:-
is
copy-int-vector scp:/tmp/ali.scp ark,t:-
foo 1 2 3
bar 10 20
LOG (copy-int-vector[5.5.792~1-f5875b]:main():copy-int-vector.cc:83) Copied 2 vectors of int32.
5 | wspecifier = 'ark,scp:/tmp/ali.ark,/tmp/ali.scp'
|
It creates a write specifier wspecifier
indicating that
the alignment information is going to be written into files
/tmp/ali.ark
and /tmp/ali.scp
.
8 | writer.Write(key='foo', value=[1, 2, 3])
|
It writes a list [1, 2, 3]
to file with key == foo
.
Note that you can use keyword arguments while writing.
9 | writer.Write('bar', [10, 20])
|
It writes a list [10, 20]
to file with key == bar
.
10 | writer.Close()
|
It closes the writer.
Note
It is a best practice to close the file when it is no longer needed.
12 13 | rspecifier = 'scp:/tmp/ali.scp'
reader = kaldi.SequentialIntVectorReader(rspecifier)
|
It creates a sequential reader.
15 16 | for key, value in reader:
print(key, value)
|
It uses a for
loop to iterate the reader.
18 | reader.Close()
|
It closes the reader.
20 | reader = kaldi.RandomAccessIntVectorReader(rspecifier)
|
It creates a random access reader.
21 22 | value1 = reader['foo']
print(value1)
|
It reads the value of foo
and prints it out.
24 25 | value2 = reader['bar']
print(value2)
|
It reads the value of bar
and prints it out.
26 | reader.Close()
|
Finally, it closes the reader.
The following code example achieves the same effect as the above one except that you do not need to close the file manually.
with
¶1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | #!/usr/bin/env python3
import kaldi
wspecifier = 'ark,scp:/tmp/ali.ark,/tmp/ali.scp'
with kaldi.IntVectorWriter(wspecifier) as writer:
writer.Write(key='foo', value=[1, 2, 3])
writer.Write('bar', [10, 20])
# Note that you do NOT need to close the file.
rspecifier = 'scp:/tmp/ali.scp'
with kaldi.SequentialIntVectorReader(rspecifier) as reader:
for key, value in reader:
print(key, value)
rspecifier = 'scp:/tmp/ali.scp'
with kaldi.RandomAccessIntVectorReader(rspecifier) as reader:
value1 = reader['foo']
print(value1)
value2 = reader['bar']
print(value2)
|
Reading and Writing Matrices¶
Using xfilename¶
The following code demonstrates how to read and write
FloatMatrix
using xfilename
.
1 2 3 4 5 6 7 8 9 10 11 12 13 | #!/usr/bin/env python3
import kaldi
m = kaldi.FloatMatrix(2, 2)
m[0, 0] = 10
m[1, 1] = 20
xfilename = '/tmp/lda.mat'
kaldi.write_mat(m, xfilename, binary=True)
g = kaldi.read_mat(xfilename)
print(g)
|
The output of the above program is
[
10 0
0 20 ]
5 6 7 | m = kaldi.FloatMatrix(2, 2)
m[0, 0] = 10
m[1, 1] = 20
|
It creates a FloatMatrix
and sets its diagonal to [10, 20]
.
9 10 | xfilename = '/tmp/lda.mat'
kaldi.write_mat(m, xfilename, binary=True)
|
It writes the matrix to /tmp/lda.mat
in binary format.
kaldi.write_mat
is used to write the matrix
to the specified file. You can specify whether it is
written in binary format or text format.
12 13 | g = kaldi.read_mat(xfilename)
print(g)
|
It reads the matrix back and prints it to the console.
Note that you do not need to specify whether the file to
read is in binary or not. kaldi.read_mat
will figure
out the format automatically.
Using specifier¶
The following code demonstrates how to read and write
FloatMatrix
using specifier
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | #!/usr/bin/env python3
import numpy as np
import kaldi
wspecifier = 'ark,scp:/tmp/feats.ark,/tmp/feats.scp'
writer = kaldi.MatrixWriter(wspecifier)
m = np.arange(6).reshape(2, 3).astype(np.float32)
writer.Write(key='foo', value=m)
g = kaldi.FloatMatrix(2, 2)
g[0, 0] = 10
g[1, 1] = 20
writer.Write('bar', g)
writer.Close()
rspecifier = 'scp:/tmp/feats.scp'
reader = kaldi.SequentialMatrixReader(rspecifier)
for key, value in reader:
assert key in ['foo', 'bar']
if key == 'foo':
np.testing.assert_array_equal(value.numpy(), m)
else:
np.testing.assert_array_equal(value.numpy(), g.numpy())
reader.Close()
reader = kaldi.RandomAccessMatrixReader(rspecifier)
assert 'foo' in reader
assert 'bar' in reader
np.testing.assert_array_equal(reader['foo'].numpy(), m)
np.testing.assert_array_equal(reader['bar'].numpy(), g.numpy())
reader.Close()
|
6 7 8 | wspecifier = 'ark,scp:/tmp/feats.ark,/tmp/feats.scp'
writer = kaldi.MatrixWriter(wspecifier)
|
This creates a matrix writer.
10 11 | m = np.arange(6).reshape(2, 3).astype(np.float32)
writer.Write(key='foo', value=m)
|
It creates a Numpy array object of type np.float32
and
writes it to file with the key foo
. Note that the type
of the Numpy array has to be of type np.float32
.
The program throws if the type is not np.float32
.
13 14 15 16 | g = kaldi.FloatMatrix(2, 2)
g[0, 0] = 10
g[1, 1] = 20
writer.Write('bar', g)
|
It creates a FloatMatrix
and writes it to file
with the key bar
.
Hint
kaldi.MatrixWriter
accepts Numpy array objects of
type np.float32
as well as kaldi.FloatMatrix
objects.
18 | writer.Close()
|
It closes the writer.
20 21 | rspecifier = 'scp:/tmp/feats.scp'
reader = kaldi.SequentialMatrixReader(rspecifier)
|
It creates a sequential matrix reader.
21 22 23 24 25 26 27 | reader = kaldi.SequentialMatrixReader(rspecifier)
for key, value in reader:
assert key in ['foo', 'bar']
if key == 'foo':
np.testing.assert_array_equal(value.numpy(), m)
else:
np.testing.assert_array_equal(value.numpy(), g.numpy())
|
It uses a for
loop to iterate the sequential reader.
29 | reader.Close()
|
It closes the sequential reader.
31 | reader = kaldi.RandomAccessMatrixReader(rspecifier)
|
It creates a random access matrix reader.
32 33 | assert 'foo' in reader
assert 'bar' in reader
|
It uses in
to test whether the reader contains a given key.
34 35 | np.testing.assert_array_equal(reader['foo'].numpy(), m)
np.testing.assert_array_equal(reader['bar'].numpy(), g.numpy())
|
It uses []
to read the value of a specified key.
36 | reader.Close()
|
It closes the random access reader.
The following code example achieves the same effect as the above one except that you do not need to close the file manually.
with
¶1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | #!/usr/bin/env python3
import numpy as np
import kaldi
wspecifier = 'ark,scp:/tmp/feats.ark,/tmp/feats.scp'
with kaldi.MatrixWriter(wspecifier) as writer:
m = np.arange(6).reshape(2, 3).astype(np.float32)
writer.Write(key='foo', value=m)
g = kaldi.FloatMatrix(2, 2)
g[0, 0] = 10
g[1, 1] = 20
writer.Write('bar', g)
rspecifier = 'scp:/tmp/feats.scp'
with kaldi.SequentialMatrixReader(rspecifier) as reader:
for key, value in reader:
assert key in ['foo', 'bar']
if key == 'foo':
np.testing.assert_array_equal(value.numpy(), m)
else:
np.testing.assert_array_equal(value.numpy(), g.numpy())
with kaldi.RandomAccessMatrixReader(rspecifier) as reader:
assert 'foo' in reader
assert 'bar' in reader
np.testing.assert_array_equal(reader['foo'].numpy(), m)
np.testing.assert_array_equal(reader['bar'].numpy(), g.numpy())
|