|
|
|
Mat<type>, fmat, mat | dense matrix class | |
Col<type>, fcolvec, fvec, colvec, vec | dense column vector class | |
Row<type>, frowvec, rowvec | dense row vector class | |
operators | + − * % / == != <= >= < > && || |
attributes | .n_rows, .n_cols, .n_elem, .n_slices, ... | |
element access | element/object access via (), [] and .at() | |
.zeros | set all elements to zero | |
.ones | set all elements to one | |
.eye | set elements along main diagonal to one and off-diagonal elements to zero | |
.randu / .randn | set all elements to random values | |
.fill | set all elements to specified value | |
.clamp | clamp values to lower and upper limits | |
.set_size | change size without keeping elements (fast) | |
.reshape | change size while keeping elements | |
.resize | change size while keeping elements and preserving layout | |
.reset | change size to empty | |
submatrix views | read/write access to contiguous and non-contiguous submatrices | |
.get_dev_mem() | get underlying raw GPU memory pointer | |
compat. container functions | compatibility container functions | |
.diag | read/write access to matrix diagonals | |
.t / .st | return matrix transpose | |
.min / .max | return extremum value | |
.index_min / .index_max | return index of extremum value | |
.eval | force evaluation of delayed expression | |
.is_empty | check whether object is empty | |
.is_vec | check whether matrix is a vector | |
.is_square | check whether matrix is square sized | |
print object to std::cout or user specified stream | ||
.raw_print | print object without formatting |
linspace | generate vector with linearly spaced elements | |
logspace | generate vector with logarithmically spaced elements | |
eye | generate identity matrix | |
ones | generate object filled with ones | |
zeros | generate object filled with zeros | |
randu | generate object with random values (uniform distribution) | |
randn | generate object with random values (normal distribution) | |
randi | generate object with random integer values in specified interval |
abs | obtain magnitude of each element | |
accu | accumulate (sum) all elements | |
all | check whether all elements are non-zero, or satisfy a relational condition | |
any | check whether any element is non-zero, or satisfies a relational condition | |
approx_equal | approximate equality | |
as_scalar | convert 1x1 matrix to pure scalar | |
clamp | obtain clamped elements according to given limits | |
conv_to | convert/cast between matrix types | |
cross | cross product | |
det | determinant | |
diagmat | generate diagonal matrix from given matrix or vector | |
diagvec | extract specified diagonal | |
dot | dot product | |
find | find indices of non-zero elements, or elements satisfying a relational condition | |
find_finite | find indices of finite elements | |
find_nonfinite | find indices of non-finite elements | |
find_nan | find indices of NaN elements | |
index_min / index_max | indices of extremum values | |
join_rows / join_cols | concatenation of matrices | |
min / max | return extremum values | |
norm | various norms of vectors and matrices | |
normalise | normalise vectors to unit p-norm | |
pow | element-wise power | |
repmat | replicate matrix in block-like fashion | |
reshape | change size while keeping elements | |
resize | change size while keeping elements and preserving layout | |
shuffle | randomly shuffle elements | |
size | obtain dimensions of given object | |
sort | sort elements | |
sort_index | vector describing sorted order of elements | |
sum | sum of elements | |
symmatu / symmatl | generate symmetric matrix from given matrix | |
trace | sum of diagonal elements | |
trans | transpose of matrix | |
vectorise | flatten matrix into vector | |
misc functions | miscellaneous element-wise functions: exp, log, sqrt, round, sign, ... | |
trig functions | trigonometric element-wise functions: cos, sin, tan, ... |
chol | Cholesky decomposition | |
eig_sym | eigen decomposition of dense symmetric/hermitian matrix | |
lu | lower-upper decomposition | |
pinv | pseudo-inverse / generalised inverse | |
solve | solve systems of linear equations | |
svd | singular value decomposition |
stats functions | mean, median, standard deviation, variance | |
cov | covariance | |
cor | correlation |
backend configuration | configuring the use of OpenCL or CUDA backends for Bandicoot | |
output streams | streams for printing warnings and errors | |
uword / sword | shorthand for unsigned and signed integers | |
Matlab/Bandicoot syntax differences | examples of Matlab syntax and conceptually corresponding Bandicoot syntax | |
Armadillo/Bandicoot differences | conceptual differences between Bandicoot and Armadillo | |
example program | short example program | |
config.hpp | configuration options | |
direct linking | guide to linking without using the wrapper library | |
kernel cache | infrastructure for caching compiled GPU kernel functions | |
API additions | API stability and list of API additions |
float
element type)
rather than 64-bit floats (ie. double
element type)
fmat() | ||
fmat(n_rows, n_cols) | ||
fmat(n_rows, n_cols, fill_form) | (elements are initialised according to fill_form) | |
fmat(size(X)) | ||
fmat(size(X), fill_form) | (elements are initialised according to fill_form) | |
fmat(fmat) | ||
fmat(arma::fmat) | (convert from CPU-based Armadillo matrix) | |
fmat(fvec) | ||
fmat(frowvec) |
fill::zeros | ↦ | set all elements to 0 |
fill::ones | ↦ | set all elements to 1 |
fill::eye | ↦ | set the elements on the main diagonal to 1 and off-diagonal elements to 0 |
fill::randu | ↦ | set all elements to random values from a uniform distribution in the [0,1] interval |
fill::randn | ↦ | set all elements to random values from a normal/Gaussian distribution with zero mean and unit variance |
fill::none | ↦ | do not initialise the elements |
fmat(ptr_aux_mem, n_rows, n_cols)
fmat A(5, 5, fill::randu); float x = A(1, 2); // note: try to avoid repeated individual element accesses! fmat B = A + A; fmat C = A * B; fmat D = A % B; B.zeros(); B.set_size(10, 10); B.ones(5, 6); B.print("B:"); // convert from Armadillo arma::fmat C(10, 10, arma::fill::randu); fmat D(C); // advanced constructors // when using the OpenCL backend cl_mem m_cl = clCreateBuffer(get_rt().cl_rt.get_context(), CL_MEM_READ_WRITE, sizeof(float) * 24, NULL, NULL); fmat H(wrap_mem_cl(m_cl), 4, 6); // use auxiliary memory // when using the CUDA backend float* m_cuda; cudaMalloc(&m_cuda, sizeof(float) * 24); fmat J(wrap_mem_cuda(m_cuda), 4, 6); // use auxiliary memory // make an alias of another matrix arma::fmat K(D.get_dev_mem(), D.n_rows, D.n_cols);
fvec
|
= |
fcolvec
|
= |
Col<float>
|
||
vec
|
= |
colvec
|
= |
Col<double>
|
note: not supported on all devices | |
dvec
|
= |
dcolvec
|
= |
Col<double>
|
note: not supported on all devices | |
uvec
|
= |
ucolvec
|
= |
Col<uword>
|
||
ivec
|
= |
icolvec
|
= |
Col<sword>
|
||
u32_vec
|
= |
u32_colvec
|
= |
Col<u32>
|
||
s32_vec
|
= |
s32_colvec
|
= |
Col<s32>
|
||
u64_vec
|
= |
u64_colvec
|
= |
Col<u64>
|
||
s64_vec
|
= |
s64_colvec
|
= |
Col<s64>
|
float
element type)
rather than 64-bit floats (ie. double
element type)
fvec() | ||
fvec(n_elem) | ||
fvec(n_elem, fill_form) | (elements are initialised according to fill_form) | |
fvec(size(X)) | ||
fvec(size(X), fill_form) | (elements are initialised according to fill_form) | |
fvec(fvec) | ||
fvec(arma::fvec) | (convert from CPU-based Armadillo vector) | |
fvec(fmat) | (std::logic_error exception is thrown if the given matrix has more than one column) |
fvec(ptr_aux_mem, number_of_elements)
fvec x(10); fvec y(10, fill::ones); fmat A(10, 10, fill::randu); fvec z = A.col(5); // extract a column vector // convert from Armadillo arma::fvec d(100, arma::fill::randu); fvec e(d);
float
element type)
rather than 64-bit floats (ie. double
element type)
frowvec() | ||
frowvec(n_elem) | ||
frowvec(n_elem, fill_form) | (elements are initialised according to fill_form) | |
frowvec(size(X)) | ||
frowvec(size(X), fill_form) | (elements are initialised according to fill_form) | |
frowvec(frowvec) | ||
frowvec(arma::frowvec) | (convert from CPU-based Armadillo row vector) | |
frowvec(fmat) | (std::logic_error exception is thrown if the given matrix has more than one row) |
frowvec(ptr_aux_mem, number_of_elements)
frowvec x(10); frowvec y(10, fill::ones); fmat A(10, 10, fill::randu); frowvec z = A.row(5); // extract a row vector // convert from Armadillo arma::frowvec d(100, arma::fill::randu); fvec e(d);
+ − * % / == != <= >= < > && ||
==
, !=
, >=
, <=
, >
, <
, &&
, ||
)
each element in the generated object is either 0 or 1, depending on the result of the operation
==
, !=
, >=
, <=
)
are not recommended for matrices of type mat or fmat,
due to the necessarily limited precision of floating-point element types
fmat A(5, 10, fill::randu); fmat B(5, 10, fill::randu); fmat C(10, 5, fill::randu); fmat P = A + B; fmat Q = A - B; fmat R = -B; fmat S = A / 123.0; fmat T = A % B; fmat U = A * C; fmat V = A + B + A + B; imat AA = linspace<imat>(1, 9, 9); imat BB = linspace<imat>(9, 1, 9); // compare elements umat ZZ = (AA >= BB);
.n_rows
|
number of rows; present in Mat, Col, and Row | |
.n_cols
|
number of columns; present in Mat, Col, and Row | |
.n_elem
|
total number of elements; present in Mat, Col, and Row |
fmat X(4,5); cout << "X has " << X.n_cols << " columns" << endl;
(i)
|
|
For fvec and frowvec, access the element stored at index i. For fmat, access the element/object stored at index i under the assumption of a flat layout, with column-major ordering of data (i.e. column by column). An exception is thrown if the requested element is out of bounds. | |
.at(i) or [i]
|
As for (i) , but without a bounds check; not recommended; see the caveats below
|
||
(r,c)
|
For fmat, access the element/object stored at row r and column c. An exception is thrown if the requested element is out of bounds. | ||
.at(r,c)
|
As for (r,c) , but without a bounds check; not recommended; see the caveats below
|
[r,c]
does not work correctly in C++;
instead use (r,c)
// remember that individual element accesses are slow and should be avoided; // when possible, operate in batch instead of on individual elements! fmat M(10, 10, fill::randu); M(9, 9) = 123.0; float x = M(1, 2); fvec v(10, fill::randu); v(9) = 123.0; float y = v(0);
.zeros() | (member function of Mat, Col, and Row) | ||
.zeros( n_elem ) | (member function of Col and Row) | ||
.zeros( n_rows, n_cols ) | (member function of Mat) |
fmat A; A.zeros(5, 10); // or: fmat A(5, 10, fill::zeros); fvec B; B.zeros(100); fmat C(5, 10, fill::randu); C.zeros();
.ones() | (member function of Mat, Col, and Row) | ||
.ones( n_elem ) | (member function of Col and Row) | ||
.ones( n_rows, n_cols ) | (member function of Mat) |
fmat A; A.ones(5, 10); // or: fmat A(5, 10, fill::ones); fvec B; B.ones(100); fmat C(5, 10, fill::randu); C.ones();
fmat A; A.eye(5, 5); // or: fmat A(5, 5, fill::eye); fmat B(5, 5, fill::randu); B.eye();
.randu() | (member function of Mat, Col, and Row) | ||
.randu( n_elem ) | (member function of Col and Row) | ||
.randu( n_rows, n_cols ) | (member function of Mat) |
.randn() | (member function of Mat, Col, and Row) | ||
.randn( n_elem ) | (member function of Col and Row) | ||
.randn( n_rows, n_cols ) | (member function of Mat) |
fmat A; A.randu(5, 10); // or: fmat A(5, 10, fill::randu); fvec B; B.randu(100); fmat C(5, 10, fill::zeros); C.randu(); coot_rng::set_seed_random(); // set the seed to a random value coot_rng::set_seed(42); // set the seed to a specific value
fmat A(5, 6); A.fill(123.0);
fmat A(5, 6); A.randu(); A.clamp(0.2, 0.8);
.set_size( n_elem ) | (member function of Col and Row) | ||
.set_size( n_rows, n_cols ) | (member function of Mat) | ||
.set_size( size(X) ) | (member function of Mat, Col, and Row) |
fmat A; A.set_size(5, 10); // or: mat A(5, 10); fmat B; B.set_size( size(A) ); // or: mat B(size(A)); fvec v; v.set_size(100); // or: vec v(100);
.reshape( n_rows, n_cols ) | (member function of Mat) | ||
.reshape( size(X) ) | (member function of Mat) |
fmat A(4, 5); A.randu(); A.reshape(5, 4);
.resize( n_elem ) | (member function of Col and Row) | ||
.resize( n_rows, n_cols ) | (member function of Mat) | ||
.resize( size(X) ) | (member function of Mat, Col, and Row) |
fmat A(4, 5); A.randu(); A.resize(7, 6);
fmat A(5, 5); A.randu(); A.reset();
fmat A(5, 10, fill::zeros); A.submat( 0,1, 2,3 ) = randu<fmat>(3, 3); A( span(0,2), span(1,3) ) = randu<fmat>(3, 3); A( 0,1, size(3,3) ) = randu<fmat>(3, 3); fmat B = A.submat( 0,1, 2,3 ); fmat C = A( span(0,2), span(1,3) ); fmat D = A( 0,1, size(3,3) ); A.col(1) = randu<fmat>(5,1); A(span::all, 1) = randu<fmat>(5,1); // add 123 to the last 5 elements of vector a vec a(10); a.randu(); a.subvec(a.n_elem - 5, a.n_elem - 1) += 123.0; // add 123 to the first 3 elements of column 2 of X X.col(2).subvec(0, 2) += 123;
dev_mem_t
object that holds raw GPU memory handles
.get_dev_mem().cl_mem_ptr
for the OpenCL backend; this has type coot_cl_mem
dev_mem_t
starts at.get_dev_mem().cuda_mem_ptr
for the CUDA backend; for a matrix type Mat<eT>, this has type eT* (e.g. for fmat the type will be float*)+
, +=
, -
, -=
) and logical operations (==
, !=
) are defined for dev_mem_t
and can be used to make aliases with the advanced constructors
dev_mem_t
pointing outside the bounds of the allocated memory will result in undefined behavior and crashes!
// when using the OpenCL backend fmat A(3, 4, fill::randu); cl_mem A_mem = A.get_dev_mem().cl_mem_ptr.ptr; // when using the CUDA backend fmat B(3, 4, fill::randu); float* B_mem = B.get_dev_mem().cuda_mem_ptr; // when using any backend fmat C(3, 4, fill::randu); dev_mem_t C_mem = C.get_dev_mem(); fmat D(C_mem + 4, 3, 3); // D is an alias starting at the second column of C
.front() | |
access the first element in a vector (cannot be modified) |
.back() | |
access the last element in a vector (cannot be modified) |
.clear() | |
causes an object to have no elements |
.empty() | |
returns true if the object has no elements; returns false if the object has one or more elements |
.size() | |
returns the total number of elements |
.front()
or .back()
involves a transfer from GPU memory to CPU memory;
therefore, for efficiency, avoid repeated calls to either function when possible;
see the Armadillo adaptation guide for more details and suggestions.
mat A(5, 5, fill::randu); cout << A.size() << endl; A.clear(); cout << A.empty() << endl;
fmat X(5, 5); X.randu(); fvec a = X.diag(); fvec b = X.diag(1); fvec c = X.diag(-2); X.diag() = randu<fvec>(5); X.diag() += 6; X.diag().ones();
fmat A(4, 5); A.randu(); fmat B = A.t();
fmat A(5, 5, fill::randu); float max_val = A.max();
fmat A(5, 5, fill::randu); uword i = A.index_max(); float max_val = A(i);
fmat A(4, 4, fill::randu); A.t().eval().print("A.t()");
fmat A(5, 5, fill::randu); cout << A.is_empty() << endl; A.reset(); cout << A.is_empty() << endl;
fmat A(1, 5, fill::randu); fmat B(5, 1, fill::randu); fmat C(5, 5, fill::randu); cout << A.is_vec() << endl; cout << B.is_vec() << endl; cout << C.is_vec() << endl;
fmat A(5, 5, fill::randu); fmat B(6, 7, fill::randu); cout << A.is_square() << endl; cout << B.is_square() << endl;
fmat A(5, 5, fill::randu); fmat B(6, 6, fill::randu); A.print(); // print a transposed version of A A.t().print(); // "B:" is the optional header line B.print("B:"); cout << A << endl; cout << "B:" << endl; cout << B << endl;
fmat A(5, 5, fill::randu); cout.precision(11); cout.setf(ios::fixed); A.raw_print(cout, "A:");
fvec a = linspace(0, 5, 6); frowvec b = linspace<frowvec>(5, 0, 6);
fvec a = logspace(0, 5, 6); frowvec b = logspace<frowvec>(5, 0, 6);
fmat A = eye(5,5); fmat B = 123.0 * eye<fmat>(5,5); imat C = eye<imat>( size(B) );
fvec v = ones(10); uvec u = ones<uvec>(10); frowvec r = ones<frowvec>(10); fmat A = ones(5,6); imat B = ones<imat>(5,6); umat C = ones<umat>(5,6);
fvec v = zeros(10); uvec u = zeros<uvec>(10); frowvec r = zeros<rowvec>(10); fmat A = zeros(5,6); imat B = zeros<imat>(5,6); umat C = zeros<umat>(5,6);
fvec v1 = randu(5); frowvec r1 = randu<rowvec>(5); fmat A1 = randu(5, 6); mat B1 = randu<mat>(5, 6); mat B2 = randu<mat>(5, 6, distr_param(10,20)); coot_rng::set_seed_random(); // set the seed to a random value coot_rng::set_seed(42); // set the seed to a specific value
fvec v1 = randn(5); fvec v2 = randn(5, distr_param(10,5)); frowvec r1 = randn<rowvec>(5); frowvec r2 = randn<rowvec>(5, distr_param(10,5)); fmat A1 = randn(5, 6); fmat A2 = randn(5, 6, distr_param(10,5)); mat B1 = randn<mat>(5, 6); mat B2 = randn<mat>(5, 6, distr_param(10,5)); coot_rng::set_seed_random(); // set the seed to a random value coot_rng::set_seed(42); // set the seed to a specific value
imat A1 = randi(5, 6); imat A2 = randi(5, 6, distr_param(-10, +20)); fmat B1 = randi<fmat>(5, 6); fmat B2 = randi<fmat>(5, 6, distr_param(-10, +20)); coot_rng::set_seed_random(); // set the seed to a random value coot_rng::set_seed(42); // set the seed to a specific value
fmat A = randu<fmat>(5, 5); fmat B = abs(A); fvec X = linspace<fvec>(-5, 5, 11); fvec Y = abs(X);
fmat A(5, 6, fill::randu); fmat B(5, 6, fill::randu); float x = accu(A); float y = accu(A % B);
fvec V(10, fill::randu); fmat X(5, 5, fill::randu); // status1 will be set to true if vector V has all non-zero elements bool status1 = all(V); // status2 will be set to true if vector V has all elements greater than 0.5 bool status2 = all(V > 0.5); // status3 will be set to true if matrix X has all elements greater than 0.6; // note the use of vectorise() bool status3 = all(vectorise(X) > 0.6); // generate a row vector indicating which columns of X have all elements greater than 0.7 umat A = all(X > 0.7);
fvec V(10, fill::randu); fmat X(5, 5, fill::randu); // status1 will be set to true if vector V has any non-zero elements bool status1 = any(V); // status2 will be set to true if vector V has any elements greater than 0.5 bool status2 = any(V > 0.5); // status3 will be set to true if matrix X has any elements greater than 0.6; // note the use of vectorise() bool status3 = any(vectorise(X) > 0.6); // generate a row vector indicating which columns of X have elements greater than 0.7 umat A = any(X > 0.7);
"absdiff" | ↦ | scalars x and y are considered equal if |x − y| ≤ tol |
"reldiff" | ↦ | scalars x and y are considered equal if |x − y| / max( |x|, |y| ) ≤ tol |
"both" | ↦ | scalars x and y are considered equal if |x − y| ≤ abs_tol or |x − y| / max( |x|, |y| ) ≤ rel_tol |
mat A(5, 5, fill::randu); mat B = A + 0.001; bool same1 = approx_equal(A, B, "absdiff", 0.002); mat C = 1000 * randu<mat>(5,5); mat D = C + 1; bool same2 = approx_equal(C, D, "reldiff", 0.1); bool same3 = approx_equal(C, D, "both", 2, 0.1);
frowvec r(5, fill::randu); fcolvec q(5, fill::randu); fmat X(5, 5, fill::randu); // examples of expressions which have optimised implementations float a = as_scalar(r*q); float b = as_scalar(r*X*q); float c = as_scalar(r*diagmat(X)*q); float d = as_scalar(r*inv(diagmat(X))*q);
fmat A(5, 5, fill::randu); fmat B = clamp(A, 0.2, 0.8); fmat C = clamp(A, min(min(A)), 0.8); fmat D = clamp(A, 0.2, max(max(A)));
fmat A(5, 5, fill::randu); mat B = conv_to<mat>::from(A); fmat C(10, 1, fill::randu); fcolvec x = conv_to< fcolvec >::from(C); // convert from Armadillo object arma::fmat D = conv_to<arma::fmat>::from(A);
fvec a(3, fill::randu); fvec b(3, fill::randu); fvec c = cross(a, b);
val = det( A ) | (form 1) | |
det( val, A ) | (form 2) |
fmat A(5, 5, fill::randu); float val1 = det(A); // form 1 float val2; bool success = det(val2, A); // form 2
fmat A = randu<fmat>(5, 5); fmat B = diagmat(A); fmat C = diagmat(A,1); fvec v = randu<fvec>(5); fmat D = diagmat(v); fmat E = diagmat(v,1);
fmat A(5, 5, fill::randu); fvec d = diagvec(A);
fvec a(10, fill::randu); fvec b(10, fill::randu); float x = dot(a,b);
fmat A(5, 5, fill::randu); fmat B(5, 5, fill::randu); uvec q1 = find(A > B); uvec q2 = find(A > 0.5); uvec q3 = find(A > 0.5, 3, "last");
fmat A(5, 5, fill::randu); A(1, 1) = datum::inf; // find only finite elements uvec f = find_finite(A);
fmat A(5, 5, fill::randu); A(1, 1) = datum::inf; A(2, 2) = datum::nan; // return indices of two non-finite elements uvec f = find_nonfinite(A);
fmat A(5, 5, fill::randu); A(2, 3) = datum::nan; // indices will be { 17 } uvec indices = find_nan(A);
index_min( V )
index_min( M ) index_min( M, dim ) index_min( Q ) index_min( Q, dim ) |
index_max( V )
index_max( M ) index_max( M, dim ) index_max( Q ) index_max( Q, dim ) |
fvec v(10, fill::randu); uword i = index_max(v); float max_val_in_v = v(i); fmat M(5, 6, fill::randu); urowvec ii = index_max(M); ucolvec jj = index_max(M,1); float max_val_in_col_2 = M( ii(2), 2 ); float max_val_in_row_4 = M( 4, jj(4) );
join_rows( A, B )
join_rows( A, B, C ) join_rows( A, B, C, D ) join_cols( A, B ) join_cols( A, B, C ) join_cols( A, B, C, D ) |
join_horiz( A, B )
join_horiz( A, B, C ) join_horiz( A, B, C, D ) join_vert( A, B ) join_vert( A, B, C ) join_vert( A, B, C, D ) |
fmat A(4, 5, fill::randu); fmat B(4, 6, fill::randu); fmat C(6, 5, fill::randu); fmat AB = join_rows(A, B); fmat AC = join_cols(A, C);
min( V )
min( M ) min( M, dim ) min( Q ) min( Q, dim ) min( A, B ) |
max( V )
max( M ) max( M, dim ) max( Q ) max( Q, dim ) max( A, B ) |
fcolvec v(10, fill::randu); float x = max(v); fmat M(10, 10, fill::randu); frowvec a = max(M); frowvec b = max(M, 0); fcolvec c = max(M, 1); // element-wise maximum fmat X(5, 6, fill::randu); fmat Y(5, 6, fill::randu); fmat Z = coot::max(X, Y); // use coot:: prefix to distinguish from std::max()
"-inf"
, "inf"
, "fro"
"inf"
, "fro"
"-inf"
is the minimum quasi-norm, "inf"
is the maximum norm, "fro"
is the Frobenius norm
accu(X != 0)
fvec q(5, fill::randu); float x = norm(q, 2); float y = norm(q, "inf");
fvec A(10, fill::randu); fvec B = normalise(A); fvec C = normalise(A, 1); fmat X(5, 6, fill::randu); fmat Y = normalise(X); fmat Z = normalise(X, 2, 1);
pow( A, scalar ) |
fmat A(5, 6, fill::randu); fmat B = pow(A, 3.45); frowvec R(6, fill::randu); frowvec S = pow(R, -1.0);
n_rows | = | num_copies_per_row | × | A.n_rows |
n_cols | = | num_copies_per_col | × | A.n_cols |
fmat A(2, 3, fill::randu); fmat B = repmat(A, 4, 5);
fmat A(10, 5, fill::randu); fmat B = reshape(A, 5, 10);
fmat A(4, 5, fill::randu); fmat B = resize(A, 7, 6);
fmat A(4, 5, fill::randu); fmat B = shuffle(A);
fmat A(5,6); fmat B = zeros<fmat>(size(A)); fmat C; C.randu(size(A)); fmat D = ones<fmat>(size(A)); fmat E = ones<fmat>(10, 20); E(3, 4, size(C)) = C; // access submatrix of E fmat F( size(A) + size(E) ); fmat G( size(A) * 2 ); cout << "size of A: " << size(A) << endl; bool is_same_size = (size(A) == size(E));
"ascend"
or "descend"
; by default "ascend"
is usedfmat A(10, 10, fill::randu); fmat B = sort(A);
"ascend"
or "descend"
; by default "ascend"
is usedfvec q(10, fill::randu); uvec indices = sort_index(q);
fcolvec v(10, fill::randu); float x = sum(v); fmat M(10, 10, fill::randu); frowvec a = sum(M); frowvec b = sum(M, 0); fcolvec c = sum(M, 1); float y = accu(M); // find the overall sum regardless of object type
fmat A(5, 5, fill::randu); fmat B = symmatu(A); fmat C = symmatl(A);
fmat A(5, 5, fill::randu); float x = trace(A);
fmat A(5, 10, fill::randu); fmat B = trans(A); fmat C = A.t(); // equivalent to trans(A), but more compact
fmat X(4, 5, fill::randu); fvec v = vectorise(X);
exp | log | square | floor | erf | sign | |||||||
exp2 | log2 | sqrt | ceil | erfc | lgamma | |||||||
exp10 | log10 | round | ||||||||||
trunc_exp | trunc_log | trunc |
exp(A)
|
base-e exponential: e x | |||||||||||||
exp2(A)
|
base-2 exponential: 2 x | |||||||||||||
exp10(A)
|
base-10 exponential: 10 x | |||||||||||||
trunc_exp(A)
|
base-e exponential, truncated to avoid infinity (only for float and double elements) | |||||||||||||
log(A)
|
natural log: loge x | |||||||||||||
log2(A)
|
base-2 log: log2 x | |||||||||||||
log10(A)
|
base-10 log: log10 x | |||||||||||||
trunc_log(A)
|
natural log, truncated to avoid ±infinity (only for float and double elements) | |||||||||||||
square(A)
|
square: x 2 | |||||||||||||
sqrt(A)
|
square root: √x | |||||||||||||
floor(A)
|
largest integral value that is not greater than the input value | |||||||||||||
ceil(A)
|
smallest integral value that is not less than the input value | |||||||||||||
round(A)
|
round to nearest integer, with halfway cases rounded away from zero | |||||||||||||
trunc(A)
|
round to nearest integer, towards zero | |||||||||||||
erf(A)
|
error function (only for float and double elements) | |||||||||||||
erfc(A)
|
complementary error function (only for float and double elements) | |||||||||||||
lgamma(A)
|
natural log of the absolute value of gamma function (only for float and double elements) | |||||||||||||
sign(A)
|
signum function;
for each element a in A, the corresponding element b in B is:
|
fmat A(5, 5, fill::randu); fmat B = exp(A);
fmat X(5, 5, fill::randu); fmat Y = cos(X);
R = chol( X) | ||
chol(R, X) |
fmat A(5, 5, fill::randu); fmat X = A.t() * A; mat R1 = chol(X); mat R2; bool ok = chol(R2, X);
// for matrices with real elements fmat A(50, 50, fill::randu); fmat B = A.t()*A; // generate a symmetric matrix fvec eigval; fmat eigvec; eig_sym(eigval, eigvec, B);
fmat A(5, 5, fill::randu); fmat L, U, P; lu(L, U, P, A); fmat B = P.t() * L * U;
fmat A(4, 5, fill::randu); fmat B = pinv(A); // use default tolerance fmat C = pinv(A, 0.01); // set tolerance to 0.01
fmat A(5, 5, fill::randu); A.diag() += 3; // ensure positive definite fvec b(5, fill::randu); fvec x1 = solve(A, b); fvec x2; bool status = solve(x2, A, b); fmat B(5, 5, fill::randu); fmat X1 = solve(A, B);
fmat X(5, 5, fill::randu); fmat U; fvec s; fmat V; svd(U, s, V, X);
"full" | = | return the full convolution (default setting), with the size equal to A.n_elem + B.n_elem - 1 |
"same" | = | return the central part of the convolution, with the same size as vector A |
fvec A(256, fill::randu); fvec B(16, fill::randu); fvec C = conv(A, B); fvec D = conv(A, B, "same");
"full" | = | return the full convolution (default setting), with the size equal to size(A) + size(B) - 1 |
"same" | = | return the central part of the convolution, with the same size as matrix A |
fmat A(256, 256, fill::randu); fmat B(16, 16, fill::randu); fmat C = conv2(A, B); fmat D = conv2(A, B, "same");
mean( V )
mean( M ) mean( M, dim ) |
⎫ ⎪ ⎬ mean (average value) ⎪ ⎭ |
|
median( V )
median( M ) median( M, dim ) |
⎫ ⎬ median ⎭ |
|
stddev( V )
stddev( V, norm_type ) stddev( M ) stddev( M, norm_type ) stddev( M, norm_type, dim ) |
⎫ ⎪ ⎬ standard deviation ⎪ ⎭ |
|
var( V )
var( V, norm_type ) var( M ) var( M, norm_type ) var( M, norm_type, dim ) |
⎫ ⎪ ⎬ variance ⎪ ⎭ |
|
range( V )
range( M ) range( M, dim ) |
⎫ ⎬ range (difference between max and min) ⎭ |
fmat A(5, 5, fill::randu); fmat B = mean(A); fmat C = var(A); float m = mean(mean(A)); fvec v(5, fill::randu); float x = var(v);
fmat X(4, 5, fill::randu); fmat Y(4, 5, fill::randu); fmat C = cov(X, Y); fmat D = cov(X, Y, 1);
fmat X(4, 5, fill::randu); fmat Y(4, 5, fill::randu); fmat R = cor(X, Y); fmat S = cor(X, Y, 1);
COOT_USE_CUDA
or COOT_USE_OPENCL
macros in the Bandicoot configurationCOOT_DEFAULT_BACKEND
macro to the desired backend (e.g. #define COOT_DEFAULT_BACKEND CL_BACKEND
)coot_init()
function:
coot_init( ) | default initialization | |
coot_init( print_info ) | initialize to default backend, optionally printing information about the chosen GPU device | |
coot_init( "opencl", print_info ) |
initialize to OpenCL backend; COOT_USE_OPENCL must be enabled
|
|
coot_init( "opencl", print_info, platform_id, dev_id ) | use a specific OpenCL platform ID and device ID | |
coot_init( "cuda", print_info ) |
initialize to CUDA backend; COOT_USE_CUDA must be enabled
|
|
coot_init( "cuda", print_info, dev_id ) | use specific CUDA device ID |
datum::pi
|
π, the ratio of any circle's circumference to its diameter | |
datum::tau
|
τ, the ratio of any circle's circumference to its radius (equivalent to 2π) | |
datum::inf
|
∞, infinity | |
datum::nan
|
“not a number” (NaN); caveat: NaN is not equal to anything, even itself | |
datum::eps
|
machine epsilon; approximately 2.2204e-16; difference between 1 and the next representable value | |
datum::e
|
base of the natural logarithm | |
datum::sqrt2
|
square root of 2 | |
datum::log_min
|
log of minimum non-zero value (type and machine dependent) | |
datum::log_max
|
log of maximum value (type and machine dependent) | |
datum::euler
|
Euler's constant, aka Euler-Mascheroni constant | |
datum::gratio
|
golden ratio | |
datum::m_u
|
atomic mass constant (in kg) | |
datum::N_A
|
Avogadro constant | |
datum::k
|
Boltzmann constant (in joules per kelvin) | |
datum::k_evk
|
Boltzmann constant (in eV/K) | |
datum::a_0
|
Bohr radius (in meters) | |
datum::mu_B
|
Bohr magneton | |
datum::Z_0
|
characteristic impedance of vacuum (in ohms) | |
datum::G_0
|
conductance quantum (in siemens) | |
datum::k_e
|
Coulomb's constant (in meters per farad) | |
datum::eps_0
|
electric constant (in farads per meter) | |
datum::m_e
|
electron mass (in kg) | |
datum::eV
|
electron volt (in joules) | |
datum::ec
|
elementary charge (in coulombs) | |
datum::F
|
Faraday constant (in coulombs) | |
datum::alpha
|
fine-structure constant | |
datum::alpha_inv
|
inverse fine-structure constant | |
datum::K_J
|
Josephson constant | |
datum::mu_0
|
magnetic constant (in henries per meter) | |
datum::phi_0
|
magnetic flux quantum (in webers) | |
datum::R
|
molar gas constant (in joules per mole kelvin) | |
datum::G
|
Newtonian constant of gravitation (in newton square meters per kilogram squared) | |
datum::h
|
Planck constant (in joule seconds) | |
datum::h_bar
|
Planck constant over 2 pi, aka reduced Planck constant (in joule seconds) | |
datum::m_p
|
proton mass (in kg) | |
datum::R_inf
|
Rydberg constant (in reciprocal meters) | |
datum::c_0
|
speed of light in vacuum (in meters per second) | |
datum::sigma
|
Stefan-Boltzmann constant | |
datum::R_k
|
von Klitzing constant (in ohms) | |
datum::b
|
Wien wavelength displacement law constant |
datum::nan
is not equal to anything, even itself;
to check whether a scalar x is finite,
use std::isfinite(x)
cout << "speed of light = " << datum::c_0 << endl; cout << "log_max for floats = "; cout << fdatum::log_max << endl; cout << "log_max for doubles = "; cout << datum::log_max << endl;
.tic()
|
|
start the timer |
.toc()
|
|
return the number of seconds since the last call to .tic()
|
wall_clock timer; timer.tic(); // ... do something ... double n = timer.toc(); cout << "number of seconds: " << n << endl;
std::cout
COOT_COUT_STREAM
define; see config.hpp
std::cerr
COOT_CERR_STREAM
define; see config.hpp
COOT_WARN_LEVEL
define; see config.hpp
#define COOT_WARN_LEVEL 1 #include <bandicoot>
Matlab/Octave | Bandicoot | Notes | ||
A(1, 1)
|
A(0, 0)
|
indexing in Bandicoot starts at 0 | ||
A(k, k)
|
A(k-1, k-1)
|
|||
size(A,1)
|
A.n_rows
|
read only | ||
size(A,2)
|
A.n_cols
|
|||
numel(A)
|
A.n_elem
|
|||
A(:, k)
|
A.col(k)
|
this is a conceptual example only; exact conversion from Matlab/Octave to Bandicoot syntax will require taking into account that indexing starts at 0 | ||
A(k, :)
|
A.row(k)
|
|||
A(:, p:q)
|
A.cols(p, q)
|
|||
A(p:q, :)
|
A.rows(p, q)
|
|||
A(p:q, r:s)
|
A( span(p,q), span(r,s) )
|
A( span(first_row, last_row), span(first_col, last_col) ) | ||
A'
|
A.t() or trans(A)
|
matrix transpose / Hermitian transpose | ||
A = zeros(size(A))
|
A.zeros()
|
|||
A = ones(size(A))
|
A.ones()
|
|||
A = zeros(k)
|
A = zeros<fmat>(k,k)
|
|||
A = ones(k)
|
A = ones<fmat>(k,k)
|
|||
A .* B
|
A % B
|
element-wise multiplication | ||
A ./ B
|
A / B
|
element-wise division | ||
A = A + 1;
|
A++
|
|||
A = A - 1;
|
A--
|
|||
X = A(:)
|
X = vectorise(A)
|
|||
X = [ A B ]
|
X = join_horiz(A,B)
|
|||
X = [ A; B ]
|
X = join_vert(A,B)
|
|||
A
|
cout << A << endl;
or A.print("A =");
|
|||
A = randn(2,3);
|
fmat A = randn(2,3);
|
X(i,j)
has an overhead of transferring between the GPU and CPU; when adapting Armadillo code to Bandicoot, direct element access should be avoided
A += 1
instead of for(uword i=0; i<A.n_elem; ++i) { A[i] += 1; }
#include <iostream> #include <bandicoot> using namespace std; using namespace coot; int main() { fmat A(4, 5, fill::randu); fmat B(4, 5, fill::randu); cout << A * B.t() << endl; return 0; }
g++ example.cpp -o example -std=c++11 -O2 -lbandicoot
COOT_DONT_USE_WRAPPER
|
Disable going through the run-time Bandicoot wrapper library (libbandicoot.so) when calling GPU-specific functions.
Overrides COOT_USE_WRAPPER .
You will need to directly link with GPU libraries (e.g. -lOpenCL -lclBLAS or similar depending on backend configuration)
|
|||||||||||||
COOT_USE_WRAPPER
|
Enable use of Bandicoot wrapper library, which allows linking against all enabled backends with -lbandicoot only.
|
|||||||||||||
COOT_USE_OPENCL
|
Enable use of OpenCL as a GPU backend. Note that either COOT_USE_OPENCL or COOT_USE_CUDA must be enabled.
OpenCL headers and clBLAS headers must be available on the system.
|
|||||||||||||
COOT_DONT_USE_OPENCL
|
Disable use of OpenCL; overrides COOT_USE_OPENCL
|
|||||||||||||
COOT_USE_CUDA
|
Enable use of CUDA as a GPU backend. Note that either COOT_USE_OPENCL or COOT_USE_CUDA must be enabled.
The CUDA toolkit must be available on the system.
|
|||||||||||||
COOT_DONT_USE_CUDA
|
Disable use of CUDA; overrides COOT_USE_CUDA
|
|||||||||||||
COOT_DEFAULT_BACKEND
|
Set the backend that Bandicoot will use. This is only necessary if multiple backends are enabled; that is, when both COOT_USE_OPENCL and COOT_USE_CUDA are enabled. This should be set to either CUDA_BACKEND or CL_BACKEND (e.g. #define COOT_BACKEND CUDA_BACKEND ). See also the backend configuration documentation.
|
|||||||||||||
COOT_BLAS_LONG
|
Use "long" instead of "int" when calling BLAS and LAPACK functions. Only relevant when using OpenCL backend. | |||||||||||||
COOT_BLAS_LONG_LONG
|
Use "long long" instead of "int" when calling BLAS and LAPACK functions. Only relevant when using OpenCL backend. | |||||||||||||
COOT_USE_FORTRAN_HIDDEN_ARGS
|
Use so-called "hidden arguments" when calling BLAS and LAPACK functions. Enabled by default. See Fortran argument passing conventions for more details. Only relevant when using OpenCL backend. | |||||||||||||
COOT_DONT_USE_FORTRAN_HIDDEN_ARGS
|
Disable use of so-called "hidden arguments" when calling BLAS and LAPACK functions.
May be necessary when using Bandicoot in conjunction with broken MKL headers (eg. if you have #include "mkl_lapack.h" in your code).
Only relevant when using OpenCL backend.
|
|||||||||||||
COOT_USE_MKL_TYPES
|
If using the OpenCL backend with LAPACK and BLAS, use Intel MKL types for complex numbers.
You will need to include appropriate MKL headers before the Bandicoot header.
You may also need to enable one or more of the following options:
COOT_BLAS_LONG , COOT_BLAS_LONG_LONG , COOT_DONT_USE_FORTRAN_HIDDEN_ARGS .
Only relevant when using OpenCL backend.
|
|||||||||||||
COOT_USE_OPENMP
|
Use OpenMP for parallelisation of some CPU-based parts of Bandicoot functionalities.
Automatically enabled when using a compiler which has OpenMP 3.1+ active (eg. the -fopenmp option for gcc and clang).
Note: this may not have a noticeable effect on performance since most Bandicoot implementations do not use the CPU heavily or at all.
|
|||||||||||||
COOT_DONT_USE_OPENMP
|
Disable use of OpenMP for parallelisation; overrides COOT_USE_OPENMP .
|
|||||||||||||
COOT_KERNEL_CACHE_DIR
|
If defined, specifies a custom directory to use for the kernel cache.
Distribution packagers may choose to specify COOT_SYSTEM_KERNEL_CACHE_DIR , though it is overridden by COOT_KERNEL_CACHE_DIR if specified.
|
|||||||||||||
COOT_BLAS_CAPITALS
|
Use capitalised (uppercase) BLAS and LAPACK function names (eg. DGEMM vs dgemm) | |||||||||||||
COOT_BLAS_UNDERSCORE
|
Append an underscore to BLAS and LAPACK function names (eg. dgemm_ vs dgemm). Enabled by default. | |||||||||||||
COOT_BLAS_LONG
|
Use "long" instead of "int" when calling BLAS and LAPACK functions | |||||||||||||
COOT_BLAS_LONG_LONG
|
Use "long long" instead of "int" when calling BLAS and LAPACK functions | |||||||||||||
COOT_NO_DEBUG
|
Disable all run-time checks, including size conformance and bounds checks. NOT RECOMMENDED. DO NOT USE UNLESS YOU KNOW WHAT YOU ARE DOING AND ARE WILLING TO RISK THE DOWNSIDES. Keeping run-time checks enabled during development and deployment greatly aids in finding mistakes in your code. | |||||||||||||
COOT_EXTRA_DEBUG
|
Print out the trace of internal functions used for evaluating expressions. Not recommended for normal use. This is mainly useful for debugging the library. | |||||||||||||
COOT_COUT_STREAM
|
The default stream used for printing matrices by .print(). Must be always enabled. By default defined to std::cout | |||||||||||||
COOT_CERR_STREAM
|
The default stream used for printing warnings and errors. Must be always enabled. By default defined to std::cerr | |||||||||||||
COOT_WARN_LEVEL
|
The level of warning messages printed to COOT_CERR_STREAM.
Must be an integer ≥ 0. By default defined to 2.
Example usage: #define COOT_WARN_LEVEL 1 #include <bandicoot> |
COOT_USE_WRAPPER
is not defined (or COOT_DONT_USE_WRAPPER
is defined), then Bandicoot will need to be linked against all dependencies of its backendsCOOT_USE_WRAPPER
is the default and is recommended; when COOT_USE_WRAPPER
is enabled, linking only requires -lbandicoot
-lblas
(CPU BLAS support; can use -lopenblas
instead)-llapack
(CPU LAPACK support; can use -lopenblas
instead)COOT_USE_OPENCL
is set (i.e. the OpenCL backend is enabled), these libraries must be linked against:
-lOpenCL
(core OpenCL support)-lclBLAS
(clBLAS for BLAS operations)COOT_USE_CUDA
is set (i.e. the CUDA backend is enabled), these libraries must be linked against:
-lcuda
(core CUDA support)-lcudart
(CUDA runtime library)-lnvrtc
(runtime compilation of CUDA kernels)-lcublas
(cuBLAS for BLAS operations)-lcusolver
(cuSolverDn for decompositions and factorisations)-lcurand
(cuRand for random number generation)~/.bandicoot/cache/
(e.g. /home/user/.bandicoot/cache/)
%APPDATA%\bandicoot\cache
(e.g. C:\Users\Username\AppData\bandicoot\cache
)COOT_KERNEL_CACHE_DIR
configuration variableshuffle()
.front()
, .back()
, .size()
, .clear()
, .empty()
approx_equal()
COOT_USE_FORTRAN_HIDDEN_ARGS
min()
and max()