Kaldi logo

Welcome to Kaldi Pybind's documentation!

About Kaldi Pybind

Kaldi Pybind is a Python wrapper for Kaldi using Pybind11. It is still under active development.

All code related to Kaldi Pybind lives in the pybind11 branch of the Kaldi repository.

Getting Started

Compiling Kaldi Pybind

First, you have to install Kaldi. Detailed installation instructions are available at http://kaldi-asr.org/doc/install.html.

Note

Kaldi Pybind is still under active development and has not yet been merged into the master branch. You should check out the pybind11 branch before compiling.

Note

We support ONLY Python3. If you are still using Python2, please upgrade to Python3. Python3.5 is known to work.
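A script can guard against being started with the wrong interpreter. The following is a minimal sketch and not part of Kaldi Pybind: the helper name python_is_supported is made up for illustration, and it checks only the major version because the note above does not state a minimum minor version.

```python
import sys


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter's major version is 3."""
    return version_info[0] == 3


if not python_is_supported():
    # Fail early with a clear message instead of an obscure import error later.
    raise SystemExit("Kaldi Pybind requires Python 3")
```

Placing such a check at the top of a script turns a confusing failure under Python 2 into a one-line message.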

The following is a quick start:

git clone https://github.com/kaldi-asr/kaldi.git
cd kaldi
git checkout pybind11
cd tools
extras/check_dependencies.sh
make -j4
cd ../src
./configure --shared
make -j4
cd pybind
pip install pybind11
make
make test

After a successful compilation, you have to modify the environment variable PYTHONPATH:

export KALDI_ROOT=/path/to/your/kaldi
export PYTHONPATH=$KALDI_ROOT/src/pybind:$PYTHONPATH
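To confirm from within Python that the setting took effect, a check along the following lines may help. It is a sketch using only the standard library; the helper names pybind_dir and kaldi_is_importable are hypothetical, and the fallback path is the placeholder from the export lines above.

```python
import importlib.util
import os
import sys


def pybind_dir(kaldi_root):
    """Directory that the export line above adds to PYTHONPATH."""
    return os.path.join(kaldi_root, "src", "pybind")


def kaldi_is_importable():
    """True if Python can locate a top-level module named kaldi (nothing is imported)."""
    return importlib.util.find_spec("kaldi") is not None


# Fall back to the placeholder path when KALDI_ROOT is not set.
kaldi_root = os.environ.get("KALDI_ROOT", "/path/to/your/kaldi")
print("expected directory:", pybind_dir(kaldi_root))
print("on sys.path:", pybind_dir(kaldi_root) in sys.path)
print("kaldi importable:", kaldi_is_importable())
```

If the last line prints False, the most likely causes are an unset PYTHONPATH or a compilation that did not finish.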

Hint

There is no make install. Once compiled, you are ready to use Kaldi Pybind.

Working with Kaldi's Matrices

This tutorial demonstrates how to use Kaldi's matrices in Python.

The following table summarizes the matrix types in Kaldi that have been wrapped to Python.

Kaldi Types         Python Types
Vector<float>       FloatVector
SubVector<float>    FloatSubVector
Matrix<float>       FloatMatrix
SubMatrix<float>    FloatSubMatrix

All of the Python types above can be converted to Numpy arrays without copying the underlying memory buffers. In addition, FloatSubVector and FloatSubMatrix can be constructed directly from Numpy arrays without data copying.

Note

Only the single precision floating point type has been wrapped to Python.
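Conversion without copying means that two objects look at the same block of memory. The sketch below illustrates the idea with the standard library only (array and memoryview); it is an analogy and does not call Kaldi Pybind. The typecode "f" stores single-precision floats, matching the only element type that has been wrapped.

```python
from array import array

# A block of single-precision floats, standing in for a vector's storage.
storage = array("f", [10.0, 0.0, 0.0])

# memoryview exposes the same bytes through the buffer protocol; nothing is copied.
view = memoryview(storage)
view[1] = 20.0

print(storage.tolist())  # the write made through the view is visible here
print(view.itemsize)     # size in bytes of one single-precision element
```

The same relationship holds between a Kaldi Pybind object and the Numpy array obtained from it: a write through either one is visible through the other.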

FloatVector

The following code shows how to use FloatVector in Python.

Example usage of FloatVector
#!/usr/bin/env python3

import kaldi

f = kaldi.FloatVector(3)
f[0] = 10
print(f)

g = f.numpy()
g[1] = 20
print(f)

Its output is

[ 10 0 0 ]

[ 10 20 0 ]

Line 1:
#!/usr/bin/env python3

The shebang line indicates that the script requires Python 3. At present, Kaldi Pybind supports only Python 3.

Line 3:
import kaldi

This imports the Kaldi Pybind package. If you encounter an import error, make sure that PYTHONPATH includes $KALDI_ROOT/src/pybind.

Line 5:
f = kaldi.FloatVector(3)

This creates an object of FloatVector containing 3 elements which are by default initialized to zero.

Line 7:
print(f)

This prints the value of the FloatVector object to the console. Note that you use operator () in C++ to access the elements of a Vector<float> object; Python code uses [].

Line 9:
g = f.numpy()

This creates a Numpy array object g from f. No memory is copied here. g shares the underlying memory with f.

Line 10:
g[1] = 20

This also changes f since it shares the same memory with g. You can verify that f is changed from the output.

Hint

We recommend that you call the numpy() method of a FloatVector object to get a Numpy ndarray object and then manipulate that ndarray. Since it shares the underlying memory with the FloatVector object, every operation you perform on the ndarray is visible to the FloatVector object.
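One caveat applies to this recommendation: only in-place operations write into the shared memory. The sketch below uses plain Numpy with a view standing in for the array returned by numpy(); the variable backing is an assumption made for illustration and no Kaldi object is involved.

```python
import numpy as np

# Stand-in for the memory owned by a FloatVector (assumption: plain Numpy, no Kaldi).
backing = np.zeros(3, dtype=np.float32)
g = backing[:]      # a view, playing the role of the array returned by numpy()

g[:] = [1, 2, 3]    # slice assignment writes into the shared memory
g += 1              # in-place arithmetic also writes into the shared memory
print(backing)      # [2. 3. 4.]

g = g * 10          # creates a NEW array and rebinds the name g
print(np.shares_memory(g, backing))  # False
print(backing)      # still [2. 3. 4.]
```

Prefer forms such as g[:] = ..., g += ... or the out= argument of Numpy functions when the result has to end up in the Kaldi object.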

FloatSubVector

The following code shows how to use FloatSubVector in Python.

Example usage of FloatSubVector
#!/usr/bin/env python3

import kaldi
import numpy as np

v = np.array([10, 20, 30], dtype=np.float32)
f = kaldi.FloatSubVector(v)

f[0] = 0
print(v)

g = f.numpy()
g[1] = 100
print(v)

Its output is

[ 0. 20. 30.]
[  0. 100.  30.]

Lines 6-7:
v = np.array([10, 20, 30], dtype=np.float32)
f = kaldi.FloatSubVector(v)

This creates a FloatSubVector object f from a Numpy ndarray object v. No memory is copied here. f shares the underlying memory with v. Note that the dtype of v has to be np.float32; otherwise, you will get a runtime error when creating f.

Line 9:
f[0] = 0

This uses [] to access the elements of f. It also changes v since f shares the same memory with v.

Line 12:
g = f.numpy()

This creates a Numpy ndarray object g from f. No memory is copied here. g shares the same memory with f.

Line 13:
g[1] = 100

This also changes v because of memory sharing.
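Because the dtype has to be np.float32, arrays that come from elsewhere may need a conversion before they are wrapped. The following sketch uses only Numpy; the helper name as_float32 is hypothetical. Requesting C-contiguous memory is an extra precaution and an assumption, since this tutorial does not say whether Kaldi Pybind needs it.

```python
import numpy as np


def as_float32(values):
    """Return a C-contiguous float32 array; data is copied only if a conversion is needed."""
    return np.ascontiguousarray(values, dtype=np.float32)


v_int = np.array([10, 20, 30])   # the default dtype here is an integer type
v32 = as_float32(v_int)          # converted copy with dtype float32
print(v32.dtype)                         # float32
print(np.shares_memory(v32, v_int))      # False: the copy has its own memory

already = np.array([1, 2, 3], dtype=np.float32)
print(np.shares_memory(as_float32(already), already))  # True: nothing was copied
```

Note the consequence of the first case: an object built on the converted copy shares memory with the copy, not with the original array, so later changes do not reach the original.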

FloatMatrix

The following code shows how to use FloatMatrix in Python.

Example usage of FloatMatrix
#!/usr/bin/env python3

import kaldi

f = kaldi.FloatMatrix(2, 3)
f[1, 2] = 100
print(f)

g = f.numpy()
g[0, 0] = 200
print(f)

Its output is

[
 0 0 0
 0 0 100 ]

[
 200 0 0
 0 0 100 ]

Line 5:
f = kaldi.FloatMatrix(2, 3)

This creates an object f of FloatMatrix with 2 rows and 3 columns.

Line 6:
f[1, 2] = 100

This uses [] to access the elements of f.

Line 7:
print(f)

This prints the value of f to the console.

Line 9:
g = f.numpy()

This creates a Numpy ndarray object g from f. No memory is copied here. g shares the underlying memory with f.

Line 10:
g[0, 0] = 200

This also changes f due to memory sharing.

FloatSubMatrix

The following code shows how to use FloatSubMatrix in Python.

Example usage of FloatSubMatrix
#!/usr/bin/env python3

import kaldi
import numpy as np

m = np.array([[1, 2, 3], [10, 20, 30]], dtype=np.float32)
f = kaldi.FloatSubMatrix(m)

f[1, 2] = 100
print(m)
print()

g = f.numpy()
g[0, 0] = 200
print(m)

Its output is

[[  1.   2.   3.]
 [ 10.  20. 100.]]

[[200.   2.   3.]
 [ 10.  20. 100.]]

Lines 6-7:
m = np.array([[1, 2, 3], [10, 20, 30]], dtype=np.float32)
f = kaldi.FloatSubMatrix(m)

This creates an object f of FloatSubMatrix from a Numpy ndarray object m. f shares the underlying memory with m. Note that the dtype of m has to be np.float32. Otherwise you will get a runtime error.

Lines 9-10:
f[1, 2] = 100
print(m)

This uses [] to access the elements of f. Note that m is also changed due to memory sharing.

Line 13:
g = f.numpy()

This creates a Numpy ndarray object g from f. No memory is copied here. g shares the underlying memory with f.

Line 14:
g[0, 0] = 200

This changes f and m.

Working with Kaldi's IO

This tutorial shows how to read and write ark/scp files in Python.

Reading and Writing Alignment Information

The following class can be used to write alignment information to files:

  • IntVectorWriter

And the following classes can be used to read alignment information from files:

  • SequentialIntVectorReader
  • RandomAccessIntVectorReader

The following code shows how to write and read alignment information.

Example of reading and writing alignment information
#!/usr/bin/env python3

import kaldi

wspecifier = 'ark,scp:/tmp/ali.ark,/tmp/ali.scp'

writer = kaldi.IntVectorWriter(wspecifier)
writer.Write(key='foo', value=[1, 2, 3])
writer.Write('bar', [10, 20])
writer.Close()

rspecifier = 'scp:/tmp/ali.scp'
reader = kaldi.SequentialIntVectorReader(rspecifier)

for key, value in reader:
    print(key, value)

reader.Close()

reader = kaldi.RandomAccessIntVectorReader(rspecifier)
value1 = reader['foo']
print(value1)

value2 = reader['bar']
print(value2)
reader.Close()

Its output is

foo [1, 2, 3]
bar [10, 20]
[1, 2, 3]
[10, 20]

The output of the following command

$ copy-int-vector scp:/tmp/ali.scp ark,t:-

is

copy-int-vector scp:/tmp/ali.scp ark,t:-
foo 1 2 3
bar 10 20
LOG (copy-int-vector[5.5.792~1-f5875b]:main():copy-int-vector.cc:83) Copied 2 vectors of int32.

Line 5:
wspecifier = 'ark,scp:/tmp/ali.ark,/tmp/ali.scp'

It creates a write specifier wspecifier indicating that the alignment information is going to be written into files /tmp/ali.ark and /tmp/ali.scp.

Line 8:
writer.Write(key='foo', value=[1, 2, 3])

It writes a list [1, 2, 3] to file with key == foo. Note that you can use keyword arguments while writing.

Line 9:
writer.Write('bar', [10, 20])

It writes a list [10, 20] to file with key == bar.

Line 10:
writer.Close()

It closes the writer.

Note

It is good practice to close a file as soon as it is no longer needed.

Lines 12-13:
rspecifier = 'scp:/tmp/ali.scp'
reader = kaldi.SequentialIntVectorReader(rspecifier)

It creates a sequential reader.

Lines 15-16:
for key, value in reader:
    print(key, value)

It uses a for loop to iterate over the reader.

Line 18:
reader.Close()

It closes the reader.

Line 20:
reader = kaldi.RandomAccessIntVectorReader(rspecifier)

It creates a random access reader.

Lines 21-22:
value1 = reader['foo']
print(value1)

It reads the value of foo and prints it out.

Lines 24-25:
value2 = reader['bar']
print(value2)

It reads the value of bar and prints it out.

Line 26:
reader.Close()

Finally, it closes the reader.
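The specifier strings used in this example follow a fixed pattern, so they can be assembled from file names instead of being typed by hand. The helpers below are a sketch, not part of Kaldi Pybind; their names are made up, they cover only the two forms that appear in this tutorial, and rejecting commas in file names is an assumption based on the comma separating the two paths.

```python
def ark_scp_wspecifier(ark_path, scp_path):
    """Build a write specifier that produces an archive and its scp index."""
    for path in (ark_path, scp_path):
        if "," in path:
            # Assumption: a comma separates the two file names, so reject it inside one.
            raise ValueError("path must not contain a comma: {!r}".format(path))
    return "ark,scp:{},{}".format(ark_path, scp_path)


def scp_rspecifier(scp_path):
    """Build a read specifier that reads entries through an scp index."""
    return "scp:{}".format(scp_path)


wspecifier = ark_scp_wspecifier("/tmp/ali.ark", "/tmp/ali.scp")
rspecifier = scp_rspecifier("/tmp/ali.scp")
print(wspecifier)  # ark,scp:/tmp/ali.ark,/tmp/ali.scp
print(rspecifier)  # scp:/tmp/ali.scp
```

str.format is used instead of f-strings so that the helpers also run on Python 3.5.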


The following code example achieves the same effect as the above one except that you do not need to close the file manually.

Example of reading and writing alignment information using with
#!/usr/bin/env python3

import kaldi

wspecifier = 'ark,scp:/tmp/ali.ark,/tmp/ali.scp'

with kaldi.IntVectorWriter(wspecifier) as writer:
    writer.Write(key='foo', value=[1, 2, 3])
    writer.Write('bar', [10, 20])

# Note that you do NOT need to close the file.

rspecifier = 'scp:/tmp/ali.scp'
with kaldi.SequentialIntVectorReader(rspecifier) as reader:
    for key, value in reader:
        print(key, value)

rspecifier = 'scp:/tmp/ali.scp'
with kaldi.RandomAccessIntVectorReader(rspecifier) as reader:
    value1 = reader['foo']
    print(value1)

    value2 = reader['bar']
    print(value2)
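The with statement can close the file for you because of Python's context-manager protocol. How Kaldi Pybind implements this internally is not shown in this tutorial; the class below is a hypothetical stand-in written in pure Python to illustrate the mechanism.

```python
class RecordingWriter:
    """Hypothetical stand-in for a writer; it is NOT the Kaldi Pybind class."""

    def __init__(self):
        self.closed = False
        self.records = []

    def Write(self, key, value):
        if self.closed:
            raise ValueError("writer is already closed")
        self.records.append((key, value))

    def Close(self):
        self.closed = True

    def __enter__(self):
        # The object returned here is bound to the name after "as".
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the block ends, whether normally or through an exception.
        self.Close()
        return False  # do not swallow exceptions


with RecordingWriter() as writer:
    writer.Write("foo", [1, 2, 3])
print(writer.closed)  # True

failing = RecordingWriter()
try:
    with failing:
        raise RuntimeError("something went wrong while writing")
except RuntimeError:
    pass
print(failing.closed)  # True
```

The second half shows the main benefit over calling Close() manually: the file is closed even when the block exits through an exception.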

Reading and Writing Matrices

Using xfilename

The following code demonstrates how to read and write FloatMatrix using xfilename.

Example of reading and writing matrices with xfilename
#!/usr/bin/env python3

import kaldi

m = kaldi.FloatMatrix(2, 2)
m[0, 0] = 10
m[1, 1] = 20

xfilename = '/tmp/lda.mat'
kaldi.write_mat(m, xfilename, binary=True)

g = kaldi.read_mat(xfilename)
print(g)

The output of the above program is

[
 10 0
 0 20 ]

Lines 5-7:
m = kaldi.FloatMatrix(2, 2)
m[0, 0] = 10
m[1, 1] = 20

It creates a FloatMatrix and sets its diagonal to [10, 20].

Lines 9-10:
xfilename = '/tmp/lda.mat'
kaldi.write_mat(m, xfilename, binary=True)

It writes the matrix to /tmp/lda.mat in binary format. kaldi.write_mat is used to write the matrix to the specified file. You can specify whether it is written in binary format or text format.

Lines 12-13:
g = kaldi.read_mat(xfilename)
print(g)

It reads the matrix back and prints it to the console. Note that you do not need to specify whether the file is in binary or text format; kaldi.read_mat detects the format automatically.
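Automatic detection is possible because Kaldi's C++ I/O code marks binary data with two leading bytes: a zero byte followed by the letter B. The sketch below mirrors that check with the standard library; the function name looks_like_kaldi_binary is hypothetical, the payload after the marker is a placeholder rather than a real matrix, and it is an assumption that kaldi.read_mat relies on the same marker through the C++ reader it wraps.

```python
import os
import tempfile


def looks_like_kaldi_binary(path):
    """Return True if the file starts with a zero byte followed by 'B'."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x00B"


with tempfile.TemporaryDirectory() as tmp:
    binary_path = os.path.join(tmp, "binary.mat")
    text_path = os.path.join(tmp, "text.mat")

    with open(binary_path, "wb") as f:
        f.write(b"\x00B" + b"placeholder payload")  # not a real matrix

    with open(text_path, "w") as f:
        f.write(" [\n  10 0\n  0 20 ]\n")           # text format, as printed above

    results = (looks_like_kaldi_binary(binary_path),
               looks_like_kaldi_binary(text_path))

print(results)  # (True, False)
```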

Using specifier

The following code demonstrates how to read and write FloatMatrix using specifier.

Example of reading and writing matrices with specifier
#!/usr/bin/env python3

import numpy as np
import kaldi

wspecifier = 'ark,scp:/tmp/feats.ark,/tmp/feats.scp'

writer = kaldi.MatrixWriter(wspecifier)

m = np.arange(6).reshape(2, 3).astype(np.float32)
writer.Write(key='foo', value=m)

g = kaldi.FloatMatrix(2, 2)
g[0, 0] = 10
g[1, 1] = 20
writer.Write('bar', g)

writer.Close()

rspecifier = 'scp:/tmp/feats.scp'
reader = kaldi.SequentialMatrixReader(rspecifier)
for key, value in reader:
    assert key in ['foo', 'bar']
    if key == 'foo':
        np.testing.assert_array_equal(value.numpy(), m)
    else:
        np.testing.assert_array_equal(value.numpy(), g.numpy())

reader.Close()

reader = kaldi.RandomAccessMatrixReader(rspecifier)
assert 'foo' in reader
assert 'bar' in reader
np.testing.assert_array_equal(reader['foo'].numpy(), m)
np.testing.assert_array_equal(reader['bar'].numpy(), g.numpy())
reader.Close()

Lines 6-8:
wspecifier = 'ark,scp:/tmp/feats.ark,/tmp/feats.scp'

writer = kaldi.MatrixWriter(wspecifier)

This creates a matrix writer.

Lines 10-11:
m = np.arange(6).reshape(2, 3).astype(np.float32)
writer.Write(key='foo', value=m)

It creates a Numpy array object with dtype np.float32 and writes it to file with the key foo. Note that the dtype of the Numpy array has to be np.float32; otherwise the program raises an error.

Lines 13-16:
g = kaldi.FloatMatrix(2, 2)
g[0, 0] = 10
g[1, 1] = 20
writer.Write('bar', g)

It creates a FloatMatrix and writes it to file with the key bar.

Hint

kaldi.MatrixWriter accepts Numpy array objects of type np.float32 as well as kaldi.FloatMatrix objects.

Line 18:
writer.Close()

It closes the writer.

Lines 20-21:
rspecifier = 'scp:/tmp/feats.scp'
reader = kaldi.SequentialMatrixReader(rspecifier)

It creates a sequential matrix reader.

Lines 21-27:
reader = kaldi.SequentialMatrixReader(rspecifier)
for key, value in reader:
    assert key in ['foo', 'bar']
    if key == 'foo':
        np.testing.assert_array_equal(value.numpy(), m)
    else:
        np.testing.assert_array_equal(value.numpy(), g.numpy())

It uses a for loop to iterate over the sequential reader.

Line 29:
reader.Close()

It closes the sequential reader.

Line 31:
reader = kaldi.RandomAccessMatrixReader(rspecifier)

It creates a random access matrix reader.

Lines 32-33:
assert 'foo' in reader
assert 'bar' in reader

It uses in to test whether the reader contains a given key.

Lines 34-35:
np.testing.assert_array_equal(reader['foo'].numpy(), m)
np.testing.assert_array_equal(reader['bar'].numpy(), g.numpy())

It uses [] to read the value of a specified key.

Line 36:
reader.Close()

It closes the random access reader.
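The for loop, the in test and the [] lookup used above correspond to three special methods in Python. The class below is a hypothetical in-memory stand-in that combines all three for illustration; the real readers are separate classes that read from disk, and nothing here is Kaldi Pybind code.

```python
class InMemoryTable:
    """Hypothetical stand-in that mimics how the readers are used from Python."""

    def __init__(self, pairs):
        self._pairs = list(pairs)         # keeps the original order for iteration
        self._index = dict(self._pairs)   # allows lookup by key

    def __iter__(self):
        # Enables: for key, value in table
        return iter(self._pairs)

    def __contains__(self, key):
        # Enables: key in table
        return key in self._index

    def __getitem__(self, key):
        # Enables: table[key]
        return self._index[key]


table = InMemoryTable([("foo", [1, 2, 3]), ("bar", [10, 20])])

for key, value in table:
    print(key, value)

print("foo" in table, "baz" in table)  # True False
print(table["bar"])                    # [10, 20]
```

Checking with in before indexing avoids an exception for a missing key, which is why the example above asserts membership first.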


The following code example achieves the same effect as the above one except that you do not need to close the file manually.

Example of reading and writing FloatMatrix using with
#!/usr/bin/env python3

import numpy as np
import kaldi

wspecifier = 'ark,scp:/tmp/feats.ark,/tmp/feats.scp'

with kaldi.MatrixWriter(wspecifier) as writer:
    m = np.arange(6).reshape(2, 3).astype(np.float32)
    writer.Write(key='foo', value=m)

    g = kaldi.FloatMatrix(2, 2)
    g[0, 0] = 10
    g[1, 1] = 20
    writer.Write('bar', g)

rspecifier = 'scp:/tmp/feats.scp'
with kaldi.SequentialMatrixReader(rspecifier) as reader:
    for key, value in reader:
        assert key in ['foo', 'bar']
        if key == 'foo':
            np.testing.assert_array_equal(value.numpy(), m)
        else:
            np.testing.assert_array_equal(value.numpy(), g.numpy())

with kaldi.RandomAccessMatrixReader(rspecifier) as reader:
    assert 'foo' in reader
    assert 'bar' in reader
    np.testing.assert_array_equal(reader['foo'].numpy(), m)
    np.testing.assert_array_equal(reader['bar'].numpy(), g.numpy())