Working with Kaldi's Matrices

This tutorial demonstrates how to use Kaldi's matrices in Python.

The following table summarizes the matrix types in Kaldi that have been wrapped to Python.

Kaldi Types          Python Types
Vector<float>        FloatVector
SubVector<float>     FloatSubVector
Matrix<float>        FloatMatrix
SubMatrix<float>     FloatSubMatrix

All of the Python types above can be converted to Numpy arrays without copying the underlying memory buffers. In addition, FloatSubVector and FloatSubMatrix can be constructed directly from Numpy arrays without data copying.

Note

Only the single precision floating point type has been wrapped to Python.

FloatVector

The following code shows how to use FloatVector in Python.

Example usage of FloatVector
#!/usr/bin/env python3

import kaldi

f = kaldi.FloatVector(3)
f[0] = 10
print(f)

g = f.numpy()
g[1] = 20
print(f)

Its output is

[ 10 0 0 ]

[ 10 20 0 ]

#!/usr/bin/env python3

The shebang line indicates that the script requires Python 3. At present, Kaldi Pybind supports only Python 3.

import kaldi

This imports the Kaldi Pybind package. If you encounter an import error, make sure that PYTHONPATH has been set to point to KALDI_ROOT/src/pybind.

f = kaldi.FloatVector(3)

This creates a FloatVector object with 3 elements, which are initialized to zero by default.

print(f)

This prints the contents of the FloatVector object to the console. Note that C++ code uses operator () to access the elements of a Vector<float> object, whereas Python code uses [], as in f[0] = 10.

g = f.numpy()

This creates a Numpy array object g from f. No memory is copied here. g shares the underlying memory with f.

g[1] = 20

This also changes f since it shares the same memory with g. You can verify that f is changed from the output.

Hint

We recommend calling the numpy() method of a FloatVector object to obtain a Numpy ndarray object and manipulating that ndarray. Because it shares the underlying memory with the FloatVector object, every in-place operation you perform on the ndarray is visible to the FloatVector object.
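The hint above only holds for operations that write into the existing buffer. The following sketch illustrates the difference between in-place updates and expressions that allocate a new array. It uses a plain Numpy array `buf` as a stand-in for the memory owned by a FloatVector, which is an assumption made so that the example runs without Kaldi installed.

```python
import numpy as np

# Stand-in for the buffer owned by a FloatVector (assumption for
# illustration); g plays the role of the array returned by f.numpy().
buf = np.zeros(3, dtype=np.float32)
g = buf[:]            # a view: shares memory with buf

g += 1                # in-place update: writes through to buf
g[:] = g * 2          # slice assignment: also writes through to buf
print(buf)            # [2. 2. 2.]

h = g * 10            # allocates a NEW array; buf is untouched
print(np.shares_memory(g, buf), np.shares_memory(h, buf))
```

If the same pattern holds for the array returned by numpy(), a statement such as `g = g * 10` would rebind the name g to a fresh array and leave the FloatVector unchanged, whereas `g *= 10` would modify it.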

FloatSubVector

The following code shows how to use FloatSubVector in Python.

Example usage of FloatSubVector
#!/usr/bin/env python3

import kaldi
import numpy as np

v = np.array([10, 20, 30], dtype=np.float32)
f = kaldi.FloatSubVector(v)

f[0] = 0
print(v)

g = f.numpy()
g[1] = 100
print(v)

Its output is

[ 0. 20. 30.]
[  0. 100.  30.]

v = np.array([10, 20, 30], dtype=np.float32)
f = kaldi.FloatSubVector(v)

This creates a FloatSubVector object f from a Numpy ndarray object v. No memory is copied here. f shares the underlying memory with v. Note that the dtype of v has to be np.float32; otherwise, you will get a runtime error when creating f.

f[0] = 0

This uses [] to access the elements of f. It also changes v since f shares the same memory with v.

g = f.numpy()

This creates a Numpy ndarray object g from f. No memory is copied here. g shares the same memory with f.

g[1] = 100

This also changes v because of memory sharing.
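Because FloatSubVector requires a float32 array, data that arrives in another dtype has to be converted first, and that conversion makes a copy. The sketch below shows one way to prepare such an array with Numpy alone. The helper name `as_float32_vector` is hypothetical and is not part of Kaldi Pybind.

```python
import numpy as np

def as_float32_vector(a):
    """Hypothetical helper: return a 1-D float32 array, copying only if needed."""
    a = np.asarray(a)
    if a.ndim != 1:
        raise ValueError("expected a 1-D array")
    # np.asarray returns its input unchanged when the dtype already matches.
    return np.asarray(a, dtype=np.float32)

v64 = np.array([10, 20, 30], dtype=np.float64)
v32 = as_float32_vector(v64)       # dtype differs, so this is a copy
print(v32.dtype, np.shares_memory(v32, v64))

w = np.array([1, 2, 3], dtype=np.float32)
print(as_float32_vector(w) is w)   # already float32: no copy is made
```

When a copy is made, a FloatSubVector built on the converted array would presumably share memory with the copy rather than with the original data, so writes through it would not reach the original array.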

FloatMatrix

The following code shows how to use FloatMatrix in Python.

Example usage of FloatMatrix
#!/usr/bin/env python3

import kaldi

f = kaldi.FloatMatrix(2, 3)
f[1, 2] = 100
print(f)

g = f.numpy()
g[0, 0] = 200
print(f)

Its output is

[
 0 0 0
 0 0 100 ]

[
 200 0 0
 0 0 100 ]

f = kaldi.FloatMatrix(2, 3)

This creates an object f of FloatMatrix with 2 rows and 3 columns.

f[1, 2] = 100

This uses [] to access the elements of f.

print(f)

This prints the value of f to the console.

g = f.numpy()

This creates a Numpy ndarray object g from f. No memory is copied here. g shares the underlying memory with f.

g[0, 0] = 200

This also changes f due to memory sharing.
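Memory sharing extends to slices of the 2-D array, since Numpy row and column slices are themselves views. The sketch below uses a plain Numpy array as a stand-in for the result of f.numpy() on a 2 x 3 FloatMatrix. It assumes that the returned object behaves like an ordinary 2-D float32 ndarray.

```python
import numpy as np

# Stand-in for g = f.numpy() on a 2 x 3 FloatMatrix (assumption: the
# returned array is an ordinary 2-D float32 ndarray).
g = np.zeros((2, 3), dtype=np.float32)

row = g[1]            # view of the second row
row[:] = 7            # fills that row in place
col = g[:, 0]         # view of the first column (strided)
col += 1              # in-place update of that column

print(g)
print(np.shares_memory(row, g), np.shares_memory(col, g))
```

Under that assumption, whole rows or columns of a FloatMatrix could be updated through such views without element-by-element loops.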

FloatSubMatrix

The following code shows how to use FloatSubMatrix in Python.

Example usage of FloatSubMatrix
#!/usr/bin/env python3

import kaldi
import numpy as np

m = np.array([[1, 2, 3], [10, 20, 30]], dtype=np.float32)
f = kaldi.FloatSubMatrix(m)

f[1, 2] = 100
print(m)
print()

g = f.numpy()
g[0, 0] = 200
print(m)

Its output is

[[  1.   2.   3.]
 [ 10.  20. 100.]]

[[200.   2.   3.]
 [ 10.  20. 100.]]

m = np.array([[1, 2, 3], [10, 20, 30]], dtype=np.float32)
f = kaldi.FloatSubMatrix(m)

This creates an object f of FloatSubMatrix from a Numpy ndarray object m. f shares the underlying memory with m. Note that the dtype of m has to be np.float32. Otherwise you will get a runtime error.

f[1, 2] = 100
print(m)

This uses [] to access the elements of f. Note that m is also changed due to memory sharing.

g = f.numpy()

This creates a Numpy ndarray object g from f. No memory is copied here. g shares the underlying memory with f.

g[0, 0] = 200

This changes both f and m, since g, f, and m all share the same memory.
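The tutorial states the float32 requirement but does not say whether FloatSubMatrix accepts arrays that are not C-contiguous, such as transposes or strided slices. The sketch below is a cautious way to normalize an array before wrapping it, on the assumption that a 2-D, float32, row-major layout is the safest input. The helper name `as_float32_matrix` is hypothetical and is not part of Kaldi Pybind.

```python
import numpy as np

def as_float32_matrix(a):
    """Hypothetical helper: 2-D, float32, C-contiguous; copies only if needed."""
    a = np.asarray(a)
    if a.ndim != 2:
        raise ValueError("expected a 2-D array")
    return np.ascontiguousarray(a, dtype=np.float32)

m = np.array([[1, 2, 3], [10, 20, 30]], dtype=np.float32)
same = as_float32_matrix(m)      # already suitable: memory is shared
t = as_float32_matrix(m.T)       # transpose is not C-contiguous: copied

print(np.shares_memory(same, m), np.shares_memory(t, m))
print(m.T.flags["C_CONTIGUOUS"], t.flags["C_CONTIGUOUS"])
```

Checking np.shares_memory after the conversion tells you whether later writes through a wrapper built on the result would still reach the original array.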