C++ library for GPU accelerated linear algebra

API Documentation for Bandicoot 2.1.1

Preamble

Bandicoot is a GPU-focused linear algebra library aiming for API compatibility with Armadillo

Bandicoot can use any OpenCL or CUDA device as a backend; see the backend configuration details

For converting Matlab/Octave programs, see the syntax conversion table

For adapting Armadillo programs to Bandicoot, see the Armadillo/Bandicoot adaptation guide

First time users: please see the short example program

If you discover any bugs or regressions, please report them

History of API additions

Please cite the following report if you use Bandicoot in your research and/or software.
Citations are useful for the continued development and maintenance of the library.

Ryan Curtin, Marcus Edel, and Conrad Sanderson.
Bandicoot: C++ Library for GPU Linear Algebra and Scientific Computing.
arXiv:2308.03120, 2023.

Overview

matrix, vector, and cube classes
member functions & variables

generated vectors / matrices / cubes
functions of vectors / matrices / cubes

decompositions, factorisations, and inverses

signal & image processing
statistics and clustering
miscellaneous (constants, configuration)

Matrix, Vector, and Cube Classes

Mat<type>, fmat, mat		dense matrix class
Col<type>, fcolvec, fvec, colvec, vec		dense column vector class
Row<type>, frowvec, rowvec		dense row vector class

Cube<type>, fcube, cube		dense cube class ("3D matrix")

operators		`+ − * % / == != <= >= < > && ||`

Member Functions & Variables

attributes		.n_rows, .n_cols, .n_elem, .n_slices, ...
element access		element/object access via (), [] and .at()
element initialisation		set elements via initialiser lists

.zeros		set all elements to zero
.ones		set all elements to one
.eye		set elements along main diagonal to one and off-diagonal elements to zero
.randu / .randn		set all elements to random values

.fill		set all elements to specified value

.clamp		clamp values to lower and upper limits

.set_size		change size without keeping elements (fast)
.reshape		change size while keeping elements
.resize		change size while keeping elements and preserving layout
.copy_size		change size to be same as given object
.reset		change size to empty

submatrix views		read/write access to contiguous submatrices
subcube views		read/write access to contiguous and non-contiguous subcubes

.get_dev_mem()		get underlying raw GPU memory pointer

compat. container functions		compatibility container functions

.col_as_mat / .row_as_mat		return matrix representation of cube column or cube row

.diag		read/write access to matrix diagonals
.each_col / .each_row		vector operations applied to each column/row of matrix (aka "broadcasting")

.t / .st		return matrix transpose
.min / .max		return extremum value
.index_min / .index_max		return index of extremum value
.eval		force evaluation of delayed expression

.is_empty		check whether object is empty
.is_vec		check whether matrix is a vector

.is_square		check whether matrix is square sized

.is_finite		check whether all elements are finite
.has_inf		check whether any element is ±infinity
.has_nan		check whether any element is NaN

.print		print object to std::cout or user specified stream
.raw_print		print object without formatting

Generated Vectors / Matrices / Cubes

linspace		generate vector with linearly spaced elements
logspace		generate vector with logarithmically spaced elements
regspace		generate vector with regularly spaced elements
eye		generate identity matrix
ones		generate object filled with ones
zeros		generate object filled with zeros
randu		generate object with random values (uniform distribution)
randn		generate object with random values (normal distribution)
randi		generate object with random integer values in specified interval

Functions of Vectors / Matrices / Cubes

abs		obtain magnitude of each element
accu		accumulate (sum) all elements
all		check whether all elements are non-zero, or satisfy a relational condition
any		check whether any element is non-zero, or satisfies a relational condition
approx_equal		approximate equality
as_scalar		convert 1x1 matrix to pure scalar
clamp		obtain clamped elements according to given limits
conv_to		convert/cast between matrix types
cross		cross product
det		determinant
diagmat		generate diagonal matrix from given matrix or vector
diagvec		extract specified diagonal
dot		dot product
find		find indices of non-zero elements, or elements satisfying a relational condition
find_finite		find indices of finite elements
find_nonfinite		find indices of non-finite elements
find_nan		find indices of NaN elements
index_min / index_max		indices of extremum values
join_rows / join_cols		concatenation of matrices
min / max		return extremum values
norm		various norms of vectors and matrices
normalise		normalise vectors to unit p-norm
pow		element-wise power
repmat		replicate matrix in block-like fashion
reshape		change size while keeping elements
resize		change size while keeping elements and preserving layout
shuffle		randomly shuffle elements
size		obtain dimensions of given object
sort		sort elements
sort_index		vector describing sorted order of elements
sum		sum of elements
symmatu / symmatl		generate symmetric matrix from given matrix
trace		sum of diagonal elements
trans		transpose of matrix
vectorise		flatten matrix into vector
misc functions		miscellaneous element-wise functions: exp, log, sqrt, round, sign, ...
trig functions		trigonometric element-wise functions: cos, sin, tan, ...

Decompositions, Factorisations, and Inverses

chol		Cholesky decomposition
eig_sym		eigen decomposition of dense symmetric/hermitian matrix
lu		lower-upper decomposition
pinv		pseudo-inverse / generalised inverse
solve		solve systems of linear equations
svd		singular value decomposition

Signal & Image Processing

conv		1D convolution
conv2		2D convolution

Statistics

stats functions		mean, median, standard deviation, variance
cov		covariance
cor		correlation

Miscellaneous

backend configuration		configuring the use of OpenCL or CUDA backends for Bandicoot
output streams		streams for printing warnings and errors
uword / sword		shorthand for unsigned and signed integers
Matlab/Bandicoot syntax differences		examples of Matlab syntax and conceptually corresponding Bandicoot syntax
Armadillo/Bandicoot differences		conceptual differences between Bandicoot and Armadillo
example program		short example program
config.hpp		configuration options
direct linking		guide to linking without using the wrapper library
kernel cache		infrastructure for caching compiled GPU kernel functions
API additions		API stability and list of API additions

Matrix, Vector, and Cube Classes

Mat<type>
fmat
mat

Classes for dense matrices, with elements stored in column-major ordering (ie. column by column) on the GPU

The root matrix class is Mat<type>, where type is one of:
- float, double, short, int, long, and unsigned versions of short, int, long
- Bandicoot provides convenient u32, u64, s32, and s64 types that can also be used
- Important: not all types are supported on all devices; runtime exceptions will be thrown if a type is not supported

For convenience the following typedefs have been defined:

`fmat`	=	`Mat<float>`
`mat`	=	`Mat<double>`	note: not supported on all devices
`dmat`	=	`Mat<double>`	note: not supported on all devices
`umat`	=	`Mat<uword>`
`imat`	=	`Mat<sword>`
`u32_mat`	=	`Mat<u32>`
`s32_mat`	=	`Mat<s32>`
`u64_mat`	=	`Mat<u64>`
`s64_mat`	=	`Mat<s64>`

Use of matrices with the fmat type is preferred over the mat type, as standard consumer-focused GPUs are considerably more performant with 32-bit floats (ie. float element type) rather than 64-bit floats (ie. double element type)

In this documentation, the type fmat is used for convenience, speed, and portability; it is possible to use other types instead, e.g. mat

Functions which use more complex functionality (generally matrix decompositions) are only valid for the following types: fmat, dmat, mat

Constructors:

`fmat()`
`fmat(n_rows, n_cols)`
`fmat(n_rows, n_cols, fill_form)`		(elements are initialised according to fill_form)
`fmat(size(X))`
`fmat(size(X), fill_form)`		(elements are initialised according to fill_form)
`fmat(initializer_list)`
`fmat(string)`
`fmat(std::vector)`		(treated as a column vector)
`fmat(fmat)`
`fmat(arma::fmat)`		(convert from CPU-based Armadillo matrix)
`fmat(fvec)`
`fmat(frowvec)`

The elements can be explicitly initialised during construction by specifying fill_form, which is one of:

`fill::zeros`	↦	set all elements to 0 (default)
`fill::ones`	↦	set all elements to 1
`fill::eye`	↦	set the elements on the main diagonal to 1 and off-diagonal elements to 0
`fill::randu`	↦	set all elements to random values from a uniform distribution in the [0,1] interval
`fill::randn`	↦	set all elements to random values from a normal/Gaussian distribution with zero mean and unit variance
`fill::none`	↦	do not initialise the elements

Caveats:
- the elements are initialised to zero if fill_form is not specified; use fill::none to disable initialisation of elements
- setting the elements one by one is generally very inefficient and should be avoided if possible

For the fmat(string) constructor, elements are separated by spaces and rows are separated by semicolons; for example, the 2x2 identity matrix can be created using "1 0; 0 1". Note that string-based initialisation is slower than using initialiser lists
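The string format described above can be illustrated with a small CPU-only sketch in standard C++. The function name `parse_matrix_text` is hypothetical and this is not Bandicoot's parser; it only shows how "spaces separate elements, semicolons separate rows" maps onto a list of rows.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical CPU-side parser (not part of Bandicoot): splits the text into
// rows at ';' and into elements at whitespace.
std::vector<std::vector<float>> parse_matrix_text(const std::string& text)
{
  std::vector<std::vector<float>> rows;
  std::istringstream row_stream(text);
  std::string row_text;

  while (std::getline(row_stream, row_text, ';'))
  {
    std::istringstream elem_stream(row_text);
    std::vector<float> row;
    float value;

    while (elem_stream >> value) { row.push_back(value); }

    if (row.empty()) { continue; }  // ignore an empty trailing row

    // every row of a matrix must have the same number of elements
    assert(rows.empty() || row.size() == rows.front().size());
    rows.push_back(row);
  }

  return rows;
}
```

With the input "1 0; 0 1", the sketch yields two rows of two elements each, matching the 2x2 identity matrix mentioned above.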

Each instance of fmat automatically allocates and releases internal memory on the GPU. All internally allocated memory used by an instance of fmat is automatically released as soon as the instance goes out of scope. For example, if an instance of fmat is declared inside a function, it will be automatically destroyed at the end of the function. To forcefully release memory at any point, use .reset(); note that in normal use this is not required.

Advanced constructors:

Examples:

fmat A(5, 5, fill::randu);

float x = A(1, 2); // note: try to avoid repeated individual element accesses!

fmat B = A + A;
fmat C = A * B;
fmat D = A % B;

B.zeros();
B.set_size(10, 10);
B.ones(5, 6);

B.print("B:");

// convert from Armadillo
arma::fmat E(10, 10, arma::fill::randu);  // CPU-based Armadillo matrix
fmat F(E);                                // copy to a GPU-based Bandicoot matrix

// advanced constructors

// when using the OpenCL backend
cl_mem m_cl = clCreateBuffer(get_rt().cl_rt.get_context(), CL_MEM_READ_WRITE, sizeof(float) * 24, NULL, NULL);
fmat H(wrap_mem_cl(m_cl), 4, 6);  // use auxiliary memory

// when using the CUDA backend
float* m_cuda;
cudaMalloc(&m_cuda, sizeof(float) * 24);
fmat J(wrap_mem_cuda(m_cuda), 4, 6);  // use auxiliary memory

// make an alias of another matrix
fmat K(D.get_dev_mem(), D.n_rows, D.n_cols);

See also:
- matrix attributes
- accessing elements
- initialising elements
- math & relational operators
- submatrix views
- printing matrices
- .get_dev_mem()
- .eval()
- conv_to() (convert between matrix types)
- explanation of typedef (cplusplus.com)
- Col class
- Row class
- Cube class
- config.hpp

Col<type>
fvec
vec

Classes for column vectors (dense matrices with one column)

The Col<type> class is derived from the Mat<type> class and inherits most of the member functions

For convenience the following typedefs have been defined:

`fvec`	=	`fcolvec`	=	`Col<float>`
`vec`	=	`colvec`	=	`Col<double>`	note: not supported on all devices
`dvec`	=	`dcolvec`	=	`Col<double>`	note: not supported on all devices
`uvec`	=	`ucolvec`	=	`Col<uword>`
`ivec`	=	`icolvec`	=	`Col<sword>`
`u32_vec`	=	`u32_colvec`	=	`Col<u32>`
`s32_vec`	=	`s32_colvec`	=	`Col<s32>`
`u64_vec`	=	`u64_colvec`	=	`Col<u64>`
`s64_vec`	=	`s64_colvec`	=	`Col<s64>`

Use of vectors with the fvec type is preferred over the vec type, as standard consumer-focused GPUs are considerably more performant with 32-bit floats (ie. float element type) rather than 64-bit floats (ie. double element type)

Functions which take Mat as input can generally also take Col as input; main exceptions are functions which require square matrices

In this documentation, the types fvec or fcolvec are used for convenience, speed, and portability; it is possible to use other types instead, e.g. vec, colvec; note that fvec and fcolvec have the same meaning and are used interchangeably

Constructors:

`fvec()`
`fvec(n_elem)`
`fvec(n_elem, fill_form)`		(elements are initialised according to fill_form)
`fvec(size(X))`
`fvec(size(X), fill_form)`		(elements are initialised according to fill_form)
`fvec(initializer_list)`
`fvec(string)`		(elements separated by spaces)
`fvec(std::vector)`
`fvec(fvec)`
`fvec(arma::fvec)`		(convert from CPU-based Armadillo vector)
`fvec(fmat)`		(std::logic_error exception is thrown if the given matrix has more than one column)

Caveats:
- the elements are initialised to zero if fill_form is not specified; see the Mat class for details on fill_form
- setting the elements one by one is generally very inefficient and should be avoided if possible

Advanced constructors:

Examples:

fvec x(10);
fvec y(10, fill::ones);

fmat A(10, 10, fill::randu);
fvec z = A.col(5); // extract a column vector

// convert from Armadillo
arma::fvec d(100, arma::fill::randu);
fvec e(d);

See also:

Row<type>
frowvec
rowvec

Classes for row vectors (dense matrices with one row)

The template Row<type> class is derived from the Mat<type> class and inherits most of the member functions

For convenience the following typedefs have been defined:

`frowvec`	=	`Row<float>`
`rowvec`	=	`Row<double>`	note: not supported on all devices
`drowvec`	=	`Row<double>`	note: not supported on all devices
`urowvec`	=	`Row<uword>`
`irowvec`	=	`Row<sword>`
`u32_rowvec`	=	`Row<u32>`
`s32_rowvec`	=	`Row<s32>`
`u64_rowvec`	=	`Row<u64>`
`s64_rowvec`	=	`Row<s64>`

Use of vectors with the frowvec type is preferred over the rowvec type, as standard consumer-focused GPUs are considerably more performant with 32-bit floats (ie. float element type) rather than 64-bit floats (ie. double element type)

Functions which take Mat as input can generally also take Row as input; main exceptions are functions which require square matrices

In this documentation, the type frowvec is used for convenience, speed, and portability; it is possible to use other types instead, e.g. rowvec

Constructors:

`frowvec()`
`frowvec(n_elem)`
`frowvec(n_elem, fill_form)`		(elements are initialised according to fill_form)
`frowvec(size(X))`
`frowvec(size(X), fill_form)`		(elements are initialised according to fill_form)
`frowvec(initializer_list)`
`frowvec(string)`		(elements separated by spaces)
`frowvec(std::vector)`
`frowvec(frowvec)`
`frowvec(arma::frowvec)`		(convert from CPU-based Armadillo row vector)
`frowvec(fmat)`		(std::logic_error exception is thrown if the given matrix has more than one row)

Caveats:
- the elements are initialised to zero if fill_form is not specified; see the Mat class for details on fill_form
- setting the elements one by one is generally very inefficient and should be avoided if possible

Advanced constructors:

Examples:

frowvec x(10);
frowvec y(10, fill::ones);

fmat    A(10, 10, fill::randu);
frowvec z = A.row(5); // extract a row vector

// convert from Armadillo
arma::frowvec d(100, arma::fill::randu);
frowvec e(d);

See also:

Cube<type>
fcube
cube

Classes for cubes (quasi 3rd order tensors), also known as "3D matrices"

Data is stored as a set of slices (matrices) stored contiguously within memory; within each slice, elements are stored with column-major ordering (ie. column by column)
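To make this layout concrete, the following CPU-only sketch computes where element (row, col, slice) sits in a flat buffer. The helper name `cube_flat_index` is hypothetical and not part of Bandicoot; it simply restates the layout described above (slices stored one after another, column-major within each slice).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Bandicoot): flat position of element
// (row, col, slice) in a cube whose slices each hold n_rows x n_cols elements.
std::size_t cube_flat_index(std::size_t row, std::size_t col, std::size_t slice,
                            std::size_t n_rows, std::size_t n_cols)
{
  assert(row < n_rows && col < n_cols);

  const std::size_t slice_size = n_rows * n_cols;  // elements per slice

  // walk down a column first, then across columns, then across slices
  return row + col * n_rows + slice * slice_size;
}
```

For a cube with 2 rows and 3 columns, element (0,0,1) is the first element of the second slice and therefore lands at flat position 6. Setting slice to 0 gives the flat position within a single column-major matrix.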

The root cube class is Cube<type>, where type is one of:
- float, double, short, int, long, and unsigned versions of short, int, long

For convenience the following typedefs have been defined:

`fcube`	=	`Cube<float>`
`cube`	=	`Cube<double>`
`dcube`	=	`Cube<double>`
`ucube`	=	`Cube<uword>`
`icube`	=	`Cube<sword>`
`u32_cube`	=	`Cube<u32>`
`s32_cube`	=	`Cube<s32>`
`u64_cube`	=	`Cube<u64>`
`s64_cube`	=	`Cube<s64>`

Use of cubes with the fcube type is preferred over the cube type, as standard consumer-focused GPUs are considerably more performant with 32-bit floats (i.e. float element type) rather than 64-bit floats (i.e. double element type)

In this documentation, the fcube type is used for convenience, speed, and portability; it is possible to use other types instead, e.g. cube

Constructors:

`fcube()`
`fcube(n_rows, n_cols, n_slices)`
`fcube(n_rows, n_cols, n_slices, fill_form)`		(elements are initialised according to fill_form)
`fcube(size(X))`
`fcube(size(X), fill_form)`		(elements are initialised according to fill_form)
`fcube(fcube)`

The elements can be explicitly initialised during construction by specifying fill_form, which is one of:

`fill::zeros`	↦	set all elements to 0 (default)
`fill::ones`	↦	set all elements to 1
`fill::randu`	↦	set all elements to random values from a uniform distribution in the [0,1] interval
`fill::randn`	↦	set all elements to random values from a normal/Gaussian distribution with zero mean and unit variance
`fill::value(scalar)`	↦	set all elements to specified scalar
`fill::none`	↦	do not initialise the elements (cube may have garbage values)

Caveats:
- the elements are initialised to zero if fill_form is not specified; use fill::none to disable initialisation of elements
- setting the elements one by one is generally very inefficient and should be avoided if possible

Each instance of fcube automatically allocates and releases internal memory on the GPU. All internally allocated memory used by an instance of fcube is automatically released as soon as the instance goes out of scope. For example, if an instance of fcube is declared inside a function, it will be automatically destroyed at the end of the function. To forcefully release memory at any point, use .reset(); note that in normal use this is not required.

Advanced constructors:

Examples:

fcube x(1, 2, 3);
fcube y(4, 5, 6, fill::randu);

fmat A = y.slice(1);  // extract a slice from the cube
                      // (each slice is a matrix)

fmat B(4, 5, fill::randu);
y.slice(2) = B;     // set a slice in the cube

fcube q = y + y;     // cube addition
fcube r = y % y;     // element-wise cube multiplication

y.ones();

Notes:
- Each cube slice can be interpreted as a matrix, hence functions which take Mat as input can generally also take cube slices as input
- The size of individual slices can't be changed. For example, assigning a matrix with different dimensions to a slice (e.g. a 10x20 matrix to a slice of a 5x6x7 cube) will not work
See also:

operators: + − * % / == != <= >= < > && ||

Overloaded operators for Mat, Col, Row, and Cube classes

Operations:

`+`		addition of two objects
`−`		subtraction of one object from another or negation of an object

`*`		matrix multiplication of two objects; not applicable to the Cube class unless multiplying by a scalar

`%`		element-wise multiplication of two objects (Schur product)
`/`		element-wise division of an object by another object or a scalar

`==`		element-wise equality evaluation of two non-Cube objects; generates a matrix of type umat
`!=`		element-wise non-equality evaluation of two non-Cube objects; generates a matrix of type umat

`>=`		element-wise "greater than or equal to" evaluation of two non-Cube objects; generates a matrix of type umat
`<=`		element-wise "less than or equal to" evaluation of two non-Cube objects; generates a matrix of type umat

`>`		element-wise "greater than" evaluation of two non-Cube objects; generates a matrix of type umat
`<`		element-wise "less than" evaluation of two non-Cube objects; generates a matrix of type umat

`&&`		element-wise logical AND evaluation of two non-Cube objects; generates a matrix of type umat
`||`		element-wise logical OR evaluation of two non-Cube objects; generates a matrix of type umat

For element-wise relational and logical operations (ie. ==, !=, >=, <=, >, <, &&, ||) each element in the generated object is either 0 or 1, depending on the result of the operation

Caveat: operators involving equality comparison (ie. ==, !=, >=, <=) are not recommended for matrices of type mat or fmat, due to the necessarily limited precision of floating-point element types
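One common way around this caveat is to compare with a tolerance instead of testing for exact equality. The sketch below is CPU-only standard C++ and does not use Bandicoot; the helper name `approx_equal_mask` and the choice of an absolute tolerance are assumptions made for illustration. Like the relational operators above, it produces a 0/1 value per element.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Bandicoot function): element-wise comparison of
// two equally sized buffers with an absolute tolerance; each output element
// is 1 if the inputs differ by at most tol, and 0 otherwise.
std::vector<unsigned> approx_equal_mask(const std::vector<float>& a,
                                        const std::vector<float>& b,
                                        float tol)
{
  assert(a.size() == b.size());

  std::vector<unsigned> mask(a.size());

  for (std::size_t i = 0; i < a.size(); ++i)
  {
    mask[i] = (std::fabs(a[i] - b[i]) <= tol) ? 1u : 0u;
  }

  return mask;
}
```

The function overview also lists approx_equal() for approximate comparisons within the library; its signature is not shown in this section, so the sketch deliberately avoids calling it.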

Broadcasting operations are available via .each_col() and .each_row()

If incompatible object sizes are used, a std::logic_error exception is thrown

Examples:

fmat A(5, 10, fill::randu);
fmat B(5, 10, fill::randu);
fmat C(10, 5, fill::randu);

fmat P = A + B;
fmat Q = A - B;
fmat R = -B;
fmat S = A / 123.0;
fmat T = A % B;
fmat U = A * C;

fmat V = A + B + A + B;

imat AA = linspace<imat>(1, 9, 9);
imat BB = linspace<imat>(9, 1, 9);

// compare elements
umat ZZ = (AA >= BB);

See also:
- pow()
- any()
- all()
- accu()
- as_scalar()
- find()
- .each_col() & .each_row() (vector operations applied to each column or row)
- miscellaneous element-wise functions (exp, log, sqrt, square, round, ...)
- floating point arithmetic in Wikipedia
- floating point representation in MathWorld

Member Functions & Variables

attributes

`.n_rows`		number of rows; present in Mat, Col, Row, and Cube
`.n_cols`		number of columns; present in Mat, Col, Row, and Cube
`.n_elem`		total number of elements; present in Mat, Col, Row, and Cube
`.n_slices`		number of slices; present in Cube

The variables are of type uword

The variables are read-only; to change the size, use .set_size(), .copy_size(), .zeros(), .ones(), or .reset()

For the Col and Row classes, n_elem also indicates vector length

Examples:

fmat X(4,5);
cout << "X has " << X.n_cols << " columns" << endl;

See also:
- .set_size()
- .copy_size()
- .zeros()
- .ones()
- .reset()
- size()

element access via (), [] and .at()

Provide access to individual elements in a Mat, Col, Row, or Cube

`(i)`		For fvec and frowvec, access the element stored at index i. For fmat and fcube, access the element stored at index i under the assumption of a flat layout, with column-major ordering of data (i.e. column by column). An exception is thrown if the requested element is out of bounds.

`.at(i)` or `[i]`		As for `(i)`, but without a bounds check; not recommended; see the caveats below

`(r,c)`		For fmat, access the element stored at row r and column c. An exception is thrown if the requested element is out of bounds.

`.at(r,c)`		As for `(r,c)`, but without a bounds check; not recommended; see the caveats below

`(r,c,s)`		For fcube, access the element stored at row r, column c, and slice s. An exception is thrown if the requested element is out of bounds.

`.at(r,c,s)`		As for `(r,c,s)`, but without a bounds check; not recommended; see the caveats below

Important: every element access involves a transfer from GPU memory to CPU memory; therefore, for efficiency, avoid repeated element access when possible; see the Armadillo adaptation guide for more details and suggestions.

The indices of elements are specified via the uword type, which is a typedef for an unsigned integer type.

Caveats:
- accessing elements without bounds checks is slightly faster, but is not recommended until your code has been thoroughly debugged
- indexing in C++ starts at 0
- accessing elements via [r,c] and [r,c,s] does not work correctly in C++; instead use (r,c) and (r,c,s)

Examples:

// remember that individual element accesses are slow and should be avoided;
// when possible, operate in batch instead of on individual elements!
fmat M(10, 10, fill::randu);

M(9, 9) = 123.0;
float x = M(1, 2);

fvec v(10, fill::randu);

v(9) = 123.0;
float y = v(0);

See also:

element initialisation

Set elements in Mat, Col, and Row via braced initialiser lists

Examples:

fvec v = { 1, 2, 3 };

fmat A = { {1, 3, 5},
           {2, 4, 6} };

See also:

.zeros()			(member function of Mat, Col, Row, and Cube)
.zeros( n_elem )			(member function of Col and Row)
.zeros( n_rows, n_cols )			(member function of Mat)
.zeros( n_rows, n_cols, n_slices )			(member function of Cube)
.zeros( size(X) )			(member function of Mat, Col, Row, and Cube)

Set the elements of an object to zero, optionally first changing the size to specified dimensions

Examples:

fmat A;
A.zeros(5, 10);   // or:  fmat A(5, 10, fill::zeros);

fvec B;
B.zeros(100);

fmat C(5, 10, fill::randu);
C.zeros();

fmat D;
D.zeros( size(C) );

See also:
- zeros() (standalone function)
- .ones()
- .randu()
- .fill()
- .reset()
- .set_size()
- size()

.ones()			(member function of Mat, Col, Row, and Cube)
.ones( n_elem )			(member function of Col and Row)
.ones( n_rows, n_cols )			(member function of Mat)
.ones( n_rows, n_cols, n_slices )			(member function of Cube)
.ones( size(X) )			(member function of Mat, Col, Row, and Cube)

Set all the elements of an object to one, optionally first changing the size to specified dimensions

Examples:

fmat A;
A.ones(5, 10);   // or:  fmat A(5, 10, fill::ones);

fvec B;
B.ones(100);

fmat C(5, 10, fill::randu);
C.ones();

fmat D;
D.ones( size(C) );

See also:
- ones() (standalone function)
- .eye()
- .zeros()
- .fill()
- .randu()
- size()

.eye()
.eye( n_rows, n_cols )
.eye( size(X) )

Member function of Mat

Set the elements along the main diagonal to one and off-diagonal elements to zero, optionally first changing the size to specified dimensions

An identity matrix is generated when n_rows = n_cols

Examples:

fmat A;
A.eye(5, 5);  // or:  fmat A(5, 5, fill::eye);

fmat B;
B.eye( size(A) );

fmat C(5, 5, fill::randu);
C.eye();

See also:
- .ones()
- .diag()
- diagmat()
- diagvec()
- eye() (standalone function)
- size()

.randu()			(member function of Mat, Col, Row, and Cube)
.randu( n_elem )			(member function of Col and Row)
.randu( n_rows, n_cols )			(member function of Mat)
.randu( n_rows, n_cols, n_slices )			(member function of Cube)
.randu( size(X) )			(member function of Mat, Col, Row, and Cube)

.randn()			(member function of Mat, Col, Row, and Cube)
.randn( n_elem )			(member function of Col and Row)
.randn( n_rows, n_cols )			(member function of Mat)
.randn( n_rows, n_cols, n_slices )			(member function of Cube)
.randn( size(X) )			(member function of Mat, Col, Row, and Cube)

Set all the elements to random values, optionally first changing the size to specified dimensions

.randu() uses a uniform distribution in the [0,1] interval

.randn() uses a normal/Gaussian distribution with zero mean and unit variance

To change the RNG seed, use the coot_rng::set_seed(value) or coot_rng::set_seed_random() functions
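For readers unfamiliar with the two distributions and the effect of a fixed seed, here is a CPU-only sketch using the standard `<random>` header. The function names `uniform_values` and `normal_values` are hypothetical, and the sketch says nothing about which generator Bandicoot uses internally; it only illustrates the statistical meaning of the two fills and that a fixed seed gives a repeatable sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical CPU illustration: n values from a uniform distribution on the unit interval.
std::vector<float> uniform_values(std::size_t n, unsigned seed)
{
  std::mt19937 engine(seed);  // same seed gives the same sequence
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);

  std::vector<float> out(n);
  for (float& v : out)
  {
    v = dist(engine);
    assert(v >= 0.0f && v <= 1.0f);  // every value lies in the unit interval
  }
  return out;
}

// Hypothetical CPU illustration: n values from a normal distribution
// with zero mean and unit variance.
std::vector<float> normal_values(std::size_t n, unsigned seed)
{
  std::mt19937 engine(seed);
  std::normal_distribution<float> dist(0.0f, 1.0f);  // mean 0, standard deviation 1

  std::vector<float> out(n);
  for (float& v : out) { v = dist(engine); }
  return out;
}
```

Calling `uniform_values(100, 42)` twice returns identical vectors, which is the behaviour one relies on when fixing a seed for reproducible experiments.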

Examples:

fmat A;
A.randu(5, 10);   // or:  fmat A(5, 10, fill::randu);

fvec B;
B.randu(100);

fmat C(5, 10, fill::zeros);
C.randu();

fmat D;
D.randn( size(C) );

coot_rng::set_seed_random(); // set the seed to a random value
coot_rng::set_seed(42);      // set the seed to a specific value

See also:
- randu() (standalone function)
- randn() (standalone function with extended functionality)
- .fill()
- .ones()
- .zeros()
- size()
- uniform distribution in Wikipedia
- normal distribution in Wikipedia

.fill( value )

Member function of Mat, Col, Row, and Cube

Sets the elements to a specified value

The type of value must match the type of elements used by the container object (e.g. for fmat the type is float)

Examples:

See also:

.clamp( min_value, max_value )

Member function of Mat, Col, Row, and Cube

Clamp each element to the [min_value, max_value] interval; any value lower than min_value will be set to min_value, and any value higher than max_value will be set to max_value
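The clamping rule can be written out for a plain CPU buffer as follows. This is a standard C++ sketch, not Bandicoot code; the name `clamp_in_place` is hypothetical and the precondition that the lower limit does not exceed the upper limit is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical CPU reference (not part of Bandicoot): limit every element
// to the [min_value, max_value] interval.
void clamp_in_place(std::vector<float>& values, float min_value, float max_value)
{
  assert(min_value <= max_value);

  for (float& v : values)
  {
    // values above max_value become max_value; values below min_value become min_value
    v = std::max(min_value, std::min(v, max_value));
  }
}
```

Applied to the values 0.1, 0.5, 0.9 with limits 0.2 and 0.8, the buffer becomes 0.2, 0.5, 0.8: elements already inside the interval are left unchanged.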

Examples:

fmat A(5, 6);
A.randu();

A.clamp(0.2, 0.8);

See also:
- clamp() (standalone function)
- .min() & .max()
- relational operators

.set_size( n_elem )			(member function of Col and Row)
.set_size( n_rows, n_cols )			(member function of Mat)
.set_size( n_rows, n_cols, n_slices )			(member function of Cube)
.set_size( size(X) )			(member function of Mat, Col, Row, and Cube)

Change the size of an object, without explicitly preserving data and without initialising the elements (i.e. elements may contain garbage values, including NaN)

To initialise the elements to zero while changing the size, use .zeros() instead

To explicitly preserve data while changing the size, use .reshape() or .resize() instead;
NOTE: .reshape() and .resize() are considerably slower than .set_size()

Examples:

fmat A;
A.set_size(5, 10);      // or:  fmat A(5, 10, fill::none);

fmat B;
B.set_size( size(A) );  // or:  fmat B(size(A), fill::none);

fvec v;
v.set_size(100);        // or:  fvec v(100, fill::none);

See also:
- .reset()
- .copy_size()
- .reshape()
- .resize()
- .zeros()
- size()

.reshape( n_rows, n_cols )			(member function of Mat)
.reshape( size(X) )			(member function of Mat)

Recreate the object according to given size specifications, with the elements taken from the previous version of the object in a column-wise manner; the elements in the generated object are placed column-wise (i.e. the first column is filled up before filling the second column)

The layout of the elements in the recreated object will be different to the layout in the previous version of the object

If the total number of elements in the previous version of the object is less than the specified size, the extra elements in the recreated object are set to zero

If the total number of elements in the previous version of the object is greater than the specified size, only a subset of the elements is taken
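The padding and truncation rules above can be sketched on the CPU with a flat column-major buffer. The function name `reshape_col_major` is hypothetical and this is not Bandicoot's implementation; it only shows that, with column-major storage, reshaping amounts to copying the flat element sequence and zero-filling any remainder.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not part of Bandicoot): data is a flat
// column-major buffer; the result has new_n_rows * new_n_cols elements.
std::vector<float> reshape_col_major(const std::vector<float>& data,
                                     std::size_t new_n_rows,
                                     std::size_t new_n_cols)
{
  // extra elements (if the new size is larger) stay at zero
  std::vector<float> out(new_n_rows * new_n_cols, 0.0f);

  // only a subset is taken if the new size is smaller
  const std::size_t n_copy = std::min(data.size(), out.size());
  std::copy_n(data.begin(), n_copy, out.begin());

  return out;
}
```

Reshaping the 2x3 buffer 1, 2, 3, 4, 5, 6 to 3x2 keeps the flat sequence unchanged, so the element that was at row 0, column 1 (value 3) is now at row 2, column 0. This is the change of layout mentioned above.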

Caveats:
- to change the size without preserving data, use .set_size() instead, which is much faster
- to grow/shrink the object while preserving the elements as well as the layout of the elements, use .resize() instead
- to flatten a matrix into a vector, use vectorise() instead

Examples:

fmat A(4, 5);
A.randu();

A.reshape(5, 4);

See also:
- .resize()
- .set_size()
- .copy_size()
- .zeros()
- .reset()
- reshape() (standalone function)
- vectorise()
- size()

.resize( n_elem )			(member function of Col and Row)
.resize( n_rows, n_cols )			(member function of Mat)
.resize( size(X) )			(member function of Mat, Col, and Row)

Recreate the object according to given size specifications, while preserving the elements as well as the layout of the elements

Can be used for growing or shrinking an object (i.e. adding/removing rows, and/or columns)

Caveat: to change the size without preserving data, use .set_size() instead, which is much faster
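To contrast this with .reshape(), the following CPU-only sketch keeps each element at its original (row, column) position. The function name `resize_col_major` is hypothetical and this is not Bandicoot's implementation; filling newly created positions with zero is an assumption of the sketch, since the text above does not state what value new elements receive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not part of Bandicoot) for a flat column-major
// buffer: element (r, c) of the old object is copied to (r, c) of the new one.
std::vector<float> resize_col_major(const std::vector<float>& data,
                                    std::size_t old_n_rows, std::size_t old_n_cols,
                                    std::size_t new_n_rows, std::size_t new_n_cols)
{
  assert(data.size() == old_n_rows * old_n_cols);

  // assumption of this sketch: positions absent from the old object are zero
  std::vector<float> out(new_n_rows * new_n_cols, 0.0f);

  const std::size_t rows = std::min(old_n_rows, new_n_rows);
  const std::size_t cols = std::min(old_n_cols, new_n_cols);

  for (std::size_t c = 0; c < cols; ++c)
  {
    for (std::size_t r = 0; r < rows; ++r)
    {
      out[r + c * new_n_rows] = data[r + c * old_n_rows];
    }
  }

  return out;
}
```

Resizing the 2x3 buffer 1, 2, 3, 4, 5, 6 to 3x2 gives 1, 2, 0, 3, 4, 0 under these assumptions: the first two columns keep their contents, the third column is dropped, and the added third row is zero.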

Examples:

fmat A(4, 5);
A.randu();

A.resize(7, 6);

See also:
- .reshape()
- .set_size()
- .copy_size()
- .zeros()
- .reset()
- resize() (standalone function)
- vectorise()
- size()

.copy_size( A )

Set the size to be the same as object A

Object A must be of the same root type as the object being modified (e.g. the size of a matrix can't be set by providing a cube)

Examples:

fmat A(5, 6, fill::randu);

fmat B;
B.copy_size(A);

cout << B.n_rows << endl;
cout << B.n_cols << endl;

See also:
- .reset()
- .set_size()
- .reshape()
- .resize()
- .zeros()
- size()

.reset()

Reset the size to zero (the object will have no elements)

Examples:

See also:

submatrix views

A collection of member functions of Mat, Col and Row classes that provide read/write access to submatrix views
contiguous views for matrix X:

contiguous views for vector V:

related matrix views (documented separately)

Instances of span(start,end) can be replaced by span::all to indicate the entire range

Examples:

fmat A(5, 10, fill::zeros);

A.submat( 0,1, 2,3 )      = randu<fmat>(3, 3);
A( span(0,2), span(1,3) ) = randu<fmat>(3, 3);
A( 0,1, size(3,3) )       = randu<fmat>(3, 3);

fmat B = A.submat( 0,1, 2,3 );
fmat C = A( span(0,2), span(1,3) );
fmat D = A( 0,1, size(3,3) );

A.col(1)        = randu<fmat>(5,1);
A(span::all, 1) = randu<fmat>(5,1);

// add 123 to the last 5 elements of vector a
fvec a(10);
a.randu();
a.subvec(a.n_elem - 5, a.n_elem - 1) += 123.0;

// add 123 to the first 3 elements of column 2 of A
A.col(2).subvec(0, 2) += 123;

See also:
- diagonal views
- .each_col() & .each_row() (vector operations applied to each column or row)
- find()
- join_rows / cols / slices
- size()
- subcube views

subcube views and slices

A collection of member functions of the Cube class that provide subcube views

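contiguous views for cube Q:

Q.slice( slice_number )
Q.slices( first_slice, last_slice )

Q.row( row_number )
Q.rows( first_row, last_row )

Q.col( col_number )
Q.cols( first_col, last_col )

Q.subcube( first_row, first_col, first_slice, last_row, last_col, last_slice )

Q( span(first_row, last_row), span(first_col, last_col), span(first_slice, last_slice) )

Q( first_row, first_col, first_slice, size(n_rows, n_cols, n_slices) )
Q( first_row, first_col, first_slice, size(R) )      [ R is a cube ]

Q.head_slices( number_of_slices )
Q.tail_slices( number_of_slices )

Q.tube( row, col )
Q.tube( first_row, first_col, last_row, last_col )
Q.tube( span(first_row, last_row), span(first_col, last_col) )
Q.tube( first_row, first_col, size(n_rows, n_cols) )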

Instances of span(a,b) can be replaced by:
- span() or span::all, to indicate the entire range
- span(a), to indicate a particular row, column or slice

An individual slice, accessed via .slice(), is an instance of the Mat class (a reference to a matrix is provided)

All .tube() forms are variants of .subcube(), using first_slice = 0 and last_slice = Q.n_slices-1

The .tube(row,col) form uses row = first_row = last_row, and col = first_col = last_col

Examples:

fcube A(2, 3, 4, fill::randu);

fmat  B = A.slice(1); // each slice is a matrix

A.slice(0) = randu<fmat>(2,3);
A.slice(0)(1,2) = 99.0;

A.subcube(0,0,1,  1,1,2)             = randu<fcube>(2,2,2);
A( span(0,1), span(0,1), span(1,2) ) = randu<fcube>(2,2,2);
A( 0,0,1, size(2,2,2) )              = randu<fcube>(2,2,2);

fcube C = A.head_slices(2);  // get first two slices

A.head_slices(2) += 123.0;

See also:

.get_dev_mem()
.get_dev_mem( synchronise )

Member function of Mat, Col, Row, and Cube

Obtain dev_mem_t object that holds raw GPU memory handles

By default, all asynchronous GPU operations are forced to complete, unless synchronise is passed as false

Depending on backend configuration, underlying GPU memory may be accessed as
- .get_dev_mem().cl_mem_ptr for the OpenCL backend; this has type coot_cl_mem
  - The coot_cl_mem struct has a .ptr member of type cl_mem and an .offset member of type size_t
  - By default .offset is 0, but if not, it represents the number of elements into .ptr that the dev_mem_t starts at
- .get_dev_mem().cuda_mem_ptr for the CUDA backend; for a matrix type Mat<eT>, this has type eT* (e.g. for fmat the type will be float*)

Arithmetic operations (+, +=, -, -=) and logical operations (==, !=) are defined for dev_mem_t and can be used to make aliases with the advanced constructors
- A dev_mem_t pointing outside the bounds of the allocated memory will result in undefined behavior and crashes!

Examples:

// when using the OpenCL backend
fmat A(3, 4, fill::randu);
cl_mem A_mem = A.get_dev_mem().cl_mem_ptr.ptr;

// when using the CUDA backend
fmat B(3, 4, fill::randu);
float* B_mem = B.get_dev_mem().cuda_mem_ptr;

// when using any backend
fmat C(3, 4, fill::randu);
dev_mem_t C_mem = C.get_dev_mem();
fmat D(C_mem + 3, 3, 3); // D is an alias starting at the second column of C (element offset 3, as C has 3 rows)
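
The offset arithmetic behind such aliases can be sketched in plain C++. This is only an illustration and makes no Bandicoot calls; it assumes column-major storage and that the offset added to a dev_mem_t counts elements. The names column_offset and alias_fits are hypothetical helpers, not part of the library.

```cpp
#include <cstddef>

// Hypothetical helper: element offset of the first element of column `col`
// in a column-major matrix with `n_rows` rows.
inline std::size_t column_offset(std::size_t n_rows, std::size_t col)
{
  return col * n_rows;
}

// Hypothetical helper: true if an alias of size alias_rows x alias_cols that
// starts `offset` elements into a parent holding `parent_elem` elements
// stays within the parent's memory.
inline bool alias_fits(std::size_t parent_elem, std::size_t offset,
                       std::size_t alias_rows, std::size_t alias_cols)
{
  return offset + alias_rows * alias_cols <= parent_elem;
}
```

For a 3x4 parent (12 elements), column_offset(3, 1) gives 3; a 3x3 alias at offset 3 stays in bounds, while the same alias at offset 4 would run past the end of the allocation.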

See also:

compatibility container functions

Member functions to mimic the functionality of containers in the C++ standard library:

.front()		access the first element in an object (cannot be modified)
.back()		access the last element in an object (cannot be modified)
.clear()		causes an object to have no elements
.empty()		returns true if the object has no elements; returns false if the object has one or more elements
.size()		returns the total number of elements

Important: calling .front() or .back() involves a transfer from GPU memory to CPU memory; therefore, for efficiency, avoid repeated calls to either function when possible; see the Armadillo adaptation guide for more details and suggestions.

Examples:

mat A(5, 5, fill::randu);
cout << A.size() << endl;

A.clear();
cout << A.empty() << endl;

See also:

.col_as_mat( col_number )
.row_as_mat( row_number )

Member functions of any cube expression

.col_as_mat( col_number ):
- return a matrix representation of the specified cube column
- the number of rows is preserved
- given a cube with size R x C x S, the resultant matrix size is R x S

.row_as_mat( row_number ):
- return a matrix representation of the specified cube row
- the number of columns is preserved
- given a cube with size R x C x S, the resultant matrix size is S x C

Examples:

fcube Q(5, 4, 3, fill::randu);

fmat A = Q.col_as_mat(2);  // size of A: 5x3

fmat B = Q.row_as_mat(2);  // size of B: 3x4

See also:
- .slice()
- vectorise()

.diag()
.diag( k )

Member function of Mat

Read/write access to a diagonal in a matrix

The argument k is optional; by default the main diagonal is accessed (k = 0)

For k > 0, the k-th super-diagonal is accessed (top-right corner)

For k < 0, the k-th sub-diagonal is accessed (bottom-left corner)

The diagonal is interpreted as a column vector within expressions

Note: to calculate only the diagonal elements of a compound expression, use diagvec() or diagmat()

Examples:

fmat X(5, 5);
X.randu();

fvec a = X.diag();
fvec b = X.diag(1);
fvec c = X.diag(-2);

X.diag() = randu<fvec>(5);
X.diag() += 6;
X.diag().ones();

See also:

.each_col()		.each_row()		(form 1)
.each_col( vector_of_indices )		.each_row( vector_of_indices )		(form 2)

Member functions of Mat

Apply a vector operation to each column or row of a matrix

Similar to "broadcasting" in Matlab / Octave

Supported operations:

`+`	addition	`+=`	in-place addition
`-`	subtraction	`-=`	in-place subtraction
`%`	element-wise multiplication	`%=`	in-place element-wise multiplication
`/`	element-wise division	`/=`	in-place element-wise division
`=`	assignment (copy)

For form 2, the argument vector_of_indices contains a list of indices of the columns/rows to be used; it must evaluate to a vector of type uvec
The arithmetic operations listed for form 1 are also supported for form 2

Examples:

mat X(6, 5, fill::ones);
vec v = linspace<vec>(10,15,6);

X.each_col() += v;         // in-place addition of v to each column vector of X

mat Y = X.each_col() + v;  // generate Y by adding v to each column vector of X

// subtract v from columns 0 through to 3 in X
X.cols(0,3).each_col() -= v;


uvec indices(2);
indices(0) = 2;
indices(1) = 4;

X.each_col(indices) = v;   // copy v to columns 2 and 4 in X
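
As an illustration of what X.each_col() += v computes, the following plain C++ sketch applies the same operation to a column-major array. It does not use Bandicoot and does not reflect how the library runs the operation on the GPU; add_to_each_col is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: add vector v (length n_rows) to every column of a
// column-major matrix X stored as a flat array of n_rows * n_cols elements.
inline void add_to_each_col(std::vector<double>& X, std::size_t n_rows,
                            std::size_t n_cols, const std::vector<double>& v)
{
  for (std::size_t c = 0; c < n_cols; ++c)
  {
    for (std::size_t r = 0; r < n_rows; ++r)
    {
      X[r + c * n_rows] += v[r];  // element (r, c) receives v[r]
    }
  }
}
```

The vector length must match the number of rows for .each_col(), and the number of columns for .each_row().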

See also:

.t()
.st()

Member functions of any matrix or vector expression

.t() and .st() provide transposed copies of the matrix

Examples:

fmat A(4, 5);
A.randu();

fmat B = A.t();

See also:

.min()
.max()

Return the extremum value of any matrix or cube expression

Examples:

fmat A(5, 5, fill::randu);

float max_val = A.max();

See also:
- .index_min() & .index_max()
- min() & max() (standalone functions with extended functionality)
- clamp()

.index_min()
.index_max()

Return the linear index of the extremum value of any matrix or cube expression

The returned index is of type uword

Examples:

fmat A(5, 5, fill::randu);

uword i = A.index_max();

float max_val = A(i);

See also:
- .min() & .max()
- index_min() & index_max() (standalone functions with extended functionality)
- sort_index()
- find()
- element access

.eval()

Member function of any matrix, vector, or cube expression

Explicitly forces the evaluation of a delayed expression and outputs a matrix

This function should be used sparingly and only in cases where it is absolutely necessary; indiscriminate use can degrade performance

Examples:

fmat A(4, 4, fill::randu);

A.t().eval().print("A.t()");

See also:
- as_scalar()
- Mat class

.is_empty()

Returns true if the object has no elements

Returns false if the object has one or more elements

Examples:

fmat A(5, 5, fill::randu);
cout << A.is_empty() << endl;

A.reset();
cout << A.is_empty() << endl;

See also:

.is_vec()
.is_colvec()
.is_rowvec()

Member functions of Mat

.is_vec():
- returns true if the matrix can be interpreted as a vector (either column or row vector)
- returns false if the matrix does not have exactly one column or one row

.is_colvec():
- returns true if the matrix can be interpreted as a column vector
- returns false if the matrix does not have exactly one column

.is_rowvec():
- returns true if the matrix can be interpreted as a row vector
- returns false if the matrix does not have exactly one row

Caveat: do not assume that the vector has elements if these functions return true; it is possible to have an empty vector (eg. 0x1)

Examples:

fmat A(1, 5, fill::randu);
fmat B(5, 1, fill::randu);
fmat C(5, 5, fill::randu);

cout << A.is_vec() << endl;
cout << B.is_vec() << endl;
cout << C.is_vec() << endl;

See also:

.is_square()

Member function of Mat

Returns true if the matrix is square, ie. number of rows is equal to the number of columns

Returns false if the matrix is not square

Examples:

fmat A(5, 5, fill::randu);
fmat B(6, 7, fill::randu);

cout << A.is_square() << endl;
cout << B.is_square() << endl;

See also:

.is_finite()

Member function of Mat, Col, Row, and Cube

Returns true if all elements of the object are finite

Returns false if at least one of the elements of the object is non-finite (±infinity or NaN)

Examples:

fmat A(5, 5, fill::randu);
fmat B(5, 5, fill::randu);

B(1,1) = fdatum::inf;

cout << A.is_finite() << endl;
cout << B.is_finite() << endl;

See also:

.has_inf()

Member function of Mat, Col, Row, and Cube

Returns true if at least one of the elements of the object is ±infinity

Returns false otherwise

Examples:

fmat A(5, 5, fill::randu);
fmat B(5, 5, fill::randu);

B(1,1) = fdatum::inf;

cout << A.has_inf() << endl;
cout << B.has_inf() << endl;

See also:

.has_nan()

Member function of Mat, Col, Row, and Cube

Returns true if at least one of the elements of the object is NaN (not-a-number)

Returns false otherwise

Caveat: NaN is not equal to anything, even itself

Examples:

fmat A(5, 5, fill::randu);
fmat B(5, 5, fill::randu);

B(1,1) = fdatum::nan;

cout << A.has_nan() << endl;
cout << B.has_nan() << endl;

See also:

.print()
.print( header )

.print( stream )
.print( stream, header )

Member functions of Mat, Col, Row, and Cube

Print the contents of an object to the std::cout stream (default), or a user specified stream, with an optional header string

Objects can also be printed using the << stream operator

Examples:

fmat A(5, 5, fill::randu);
fmat B(6, 6, fill::randu);

A.print();

// print a transposed version of A
A.t().print();

// "B:" is the optional header line
B.print("B:");

cout << A << endl;

cout << "B:" << endl;
cout << B << endl;

See also:

.raw_print()
.raw_print( header )

.raw_print( stream )
.raw_print( stream, header )

Member functions of Mat, Col, Row, and Cube

Similar to the .print() member function, with the difference that no formatting of the output is done; the stream's parameters such as precision, cell width, etc. can be set manually

If the cell width is set to zero, a space is printed between the elements

Examples:

fmat A(5, 5, fill::randu);

cout.precision(11);
cout.setf(ios::fixed);

A.raw_print(cout, "A:");

See also:
- .print()
- std::ios_base::fmtflags (cppreference.com)
- std::ios_base::fmtflags (cplusplus.com)

Generated Vectors / Matrices

linspace( start, end )
linspace( start, end, N )

Generate a vector with N elements; the values of the elements are linearly spaced from start to (and including) end

The argument N is optional; by default N = 100

Usage:
- vector_type v = linspace<vector_type>(start, end, N)
- vec v = linspace(start, end, N)

Caveat: for N = 1, the generated vector will have a single element equal to end

Examples:

    vec a = linspace(0, 5, 6);

frowvec b = linspace<frowvec>(5, 0, 6);

See also:

logspace( A, B )
logspace( A, B, N )

Generate a vector with N elements; the values of the elements are logarithmically spaced from 10^A to (and including) 10^B

The argument N is optional; by default N = 50

Usage:
- vector_type v = logspace<vector_type>(A, B, N)
- vec v = logspace(A, B, N)

Examples:

    vec a = logspace(0, 5, 6);

frowvec b = logspace<frowvec>(5, 0, 6);
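
A plain C++ sketch of the documented rule: the exponents are linearly spaced from A to B, and each element is 10 raised to that exponent. This is not the library's implementation; logspace_sketch is a hypothetical name, and the handling of N = 1 (a single element equal to 10^B) is an assumption carried over from linspace().

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the documented semantics of logspace().
inline std::vector<double> logspace_sketch(double A, double B, std::size_t N = 50)
{
  std::vector<double> out(N);
  for (std::size_t i = 0; i < N; ++i)
  {
    // assumption: for N = 1 the single exponent is B
    const double exponent = (N == 1) ? B : A + (B - A) * double(i) / double(N - 1);
    out[i] = std::pow(10.0, exponent);
  }
  return out;
}
```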

See also:
- linspace()
- regspace()

regspace( start, end )
regspace( start, delta, end )

Generate a vector with regularly spaced elements:
[ (start + 0*delta), (start + 1*delta), (start + 2*delta), ⋯, (start + M*delta) ]
where M = floor((end-start) / delta), so that (start + M*delta) ≤ end

Similar in operation to the Matlab/Octave colon operator, ie. start:end and start:delta:end

If delta is not specified:
- delta = +1, if start ≤ end
- delta = −1, if start > end (caveat: this is different than Matlab/Octave)

An empty vector is generated when one of the following conditions is true:
- start < end, and delta < 0
- start > end, and delta > 0
- delta = 0

Usage:
- vector_type v = regspace<vector_type>(start, end)
- vector_type v = regspace<vector_type>(start, delta, end)
- vec v = regspace(start, end)
- vec v = regspace(start, delta, end)

Examples:

 vec a = regspace(0,  9);             // 0,  1, ...,   9

fvec b = regspace<fvec>(2,  2,  10);  // 2,  4, ...,  10

ivec c = regspace<ivec>(0, -1, -10);  // 0, -1, ..., -10

Caveat: do not use regspace() to specify ranges for contiguous submatrix views; use span() instead
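
The element-generation rule and the empty-vector conditions listed above can be sketched in plain C++. This is an illustration only, not the library's implementation; regspace_sketch is a hypothetical name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring regspace(start, delta, end).
inline std::vector<double> regspace_sketch(double start, double delta, double end)
{
  std::vector<double> out;

  // documented conditions that yield an empty vector
  if (delta == 0.0)               { return out; }
  if (start < end && delta < 0.0) { return out; }
  if (start > end && delta > 0.0) { return out; }

  // M = floor((end - start) / delta), so that start + M*delta does not pass end
  const std::size_t M = static_cast<std::size_t>(std::floor((end - start) / delta));

  out.reserve(M + 1);
  for (std::size_t i = 0; i <= M; ++i) { out.push_back(start + double(i) * delta); }
  return out;
}

// Hypothetical helper mirroring regspace(start, end): delta defaults to +1 or -1.
inline std::vector<double> regspace_sketch(double start, double end)
{
  return regspace_sketch(start, (start <= end) ? 1.0 : -1.0, end);
}
```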

See also:
- linspace()
- logspace()

eye( n_rows, n_cols )
eye( size(X) )

Generate a matrix with the elements along the main diagonal set to one and off-diagonal elements set to zero

An identity matrix is generated when n_rows = n_cols

Usage:
- mat X = eye( n_rows, n_cols )
- matrix_type X = eye<matrix_type>( n_rows, n_cols )
- matrix_type Y = eye<matrix_type>( size(X) )

Examples:

  fmat A = eye(5,5);

  fmat B = 123.0 * eye<fmat>(5,5);

  imat C = eye<imat>( size(B) );

See also:
- .eye() (member function of Mat)
- .diag()
- ones()
- diagmat()
- diagvec()
- size()

ones( n_elem )
ones( n_rows, n_cols )
ones( n_rows, n_cols, n_slices )
ones( size(X) )

Generate a vector, matrix, or cube with all elements set to one

Usage:
- vector_type v = ones<vector_type>( n_elem )
- matrix_type X = ones<matrix_type>( n_rows, n_cols )
- matrix_type Y = ones<matrix_type>( size(X) )
- cube_type Q = ones<cube_type>( n_rows, n_cols, n_slices )
- cube_type R = ones<cube_type>( size(Q) )

Examples:

   fvec v = ones(10);
   uvec u = ones<uvec>(10);
frowvec r = ones<frowvec>(10);

fmat A = ones(5,6);
imat B = ones<imat>(5,6);
umat C = ones<umat>(5,6);

fcube Q = ones(5,6,7);
icube R = ones<icube>(5,6,7);

See also:
- .ones() (member function of Mat, Col, and Row)
- .fill()
- eye()
- linspace()
- regspace()
- zeros()
- randu()

zeros( n_elem )
zeros( n_rows, n_cols )
zeros( n_rows, n_cols, n_slices )
zeros( size(X) )

Generate a vector, matrix, or cube with the elements set to zero

Usage:
- vector_type v = zeros<vector_type>( n_elem )
- matrix_type X = zeros<matrix_type>( n_rows, n_cols )
- matrix_type Y = zeros<matrix_type>( size(X) )
- cube_type Q = zeros<cube_type>( n_rows, n_cols, n_slices )
- cube_type R = zeros<cube_type>( size(Q) )

Examples:

   fvec v = zeros(10);
   uvec u = zeros<uvec>(10);
frowvec r = zeros<frowvec>(10);

fmat A = zeros(5,6);
imat B = zeros<imat>(5,6);
umat C = zeros<umat>(5,6);

fcube Q = zeros(5,6,7);
icube R = zeros<icube>(5, 6, 7);

See also:
- .zeros() (member function of Mat, Col, and Row)
- .fill()
- ones()
- randu()
- size()

randu( n_elem )
randu( n_rows, n_cols )
randu( n_rows, n_cols, n_slices )
randu( size(X) )

Generate a vector, matrix, or cube with the elements set to random floating point values uniformly distributed in the [0,1] interval

Usage:
- vector_type v = randu<vector_type>( n_elem )
- matrix_type X = randu<matrix_type>( n_rows, n_cols )
- matrix_type Y = randu<matrix_type>( size(X) )
- cube_type Q = randu<cube_type>( n_rows, n_cols, n_slices )
- cube_type R = randu<cube_type>( size(Q) )

To change the RNG seed, use coot_rng::set_seed(value) or coot_rng::set_seed_random() functions

Caveat: to generate a matrix with random integer values instead of floating point values, use randi() instead

Examples:

fvec v1 = randu(5);

frowvec r1 = randu<frowvec>(5);

fmat A1 = randu(5, 6);

mat B1 = randu<mat>(5, 6);
mat B2 = randu<mat>(5, 6, distr_param(10,20));

fcube C1 = randu<fcube>(5, 6, 7);

coot_rng::set_seed_random(); // set the seed to a random value
coot_rng::set_seed(42);      // set the seed to a specific value

See also:
- .randu() (member function)
- randn()
- randi()
- ones()
- zeros()
- size()
- uniform distribution in Wikipedia

randn( n_elem )
randn( n_elem, distr_param(mu,sd) )

randn( n_rows, n_cols )
randn( n_rows, n_cols, distr_param(mu,sd) )

randn( n_rows, n_cols, n_slices )
randn( n_rows, n_cols, n_slices, distr_param(mu,sd) )

randn( size(X) )
randn( size(X), distr_param(mu,sd) )

Generate a vector, matrix, or cube with the elements set to random values with normal / Gaussian distribution, parameterised by mean mu and standard deviation sd

The default distribution parameters are mu = 0 and sd = 1

Usage:
- vector_type v = randn<vector_type>( n_elem )
- vector_type v = randn<vector_type>( n_elem, distr_param(mu,sd) )
- matrix_type X = randn<matrix_type>( n_rows, n_cols )
- matrix_type X = randn<matrix_type>( n_rows, n_cols, distr_param(mu,sd) )
- cube_type Q = randn<cube_type>( n_rows, n_cols, n_slices )
- cube_type Q = randn<cube_type>( n_rows, n_cols, n_slices, distr_param(mu,sd) )

To change the RNG seed, use coot_rng::set_seed(value) or coot_rng::set_seed_random() functions

Examples:

fvec v1 = randn(5);
fvec v2 = randn(5, distr_param(10,5));

frowvec r1 = randn<frowvec>(5);
frowvec r2 = randn<frowvec>(5, distr_param(10,5));

fmat A1 = randn(5, 6);
fmat A2 = randn(5, 6, distr_param(10,5));

mat B1 = randn<mat>(5, 6);
mat B2 = randn<mat>(5, 6, distr_param(10,5));

coot_rng::set_seed_random(); // set the seed to a random value
coot_rng::set_seed(42);      // set the seed to a specific value

See also:
- .randn() (member function)
- randu()
- randi()
- size()
- normal distribution in Wikipedia

randi( n_elem )
randi( n_elem, distr_param(a,b) )

randi( n_rows, n_cols )
randi( n_rows, n_cols, distr_param(a,b) )

randi( n_rows, n_cols, n_slices )
randi( n_rows, n_cols, n_slices, distr_param(a,b) )

randi( size(X) )
randi( size(X), distr_param(a,b) )

Generate a vector, matrix, or cube with the elements set to random integer values uniformly distributed in the [a,b] interval

The default distribution parameters are a = 0 and b = maximum_int

Usage:
- vector_type v = randi<vector_type>( n_elem )
- vector_type v = randi<vector_type>( n_elem, distr_param(a,b) )
- matrix_type X = randi<matrix_type>( n_rows, n_cols )
- matrix_type X = randi<matrix_type>( n_rows, n_cols, distr_param(a,b) )
- cube_type Q = randi<cube_type>( n_rows, n_cols, n_slices )
- cube_type Q = randi<cube_type>( n_rows, n_cols, n_slices, distr_param(a,b) )

To change the RNG seed, use coot_rng::set_seed(value) or coot_rng::set_seed_random() functions

Caveat: to generate a matrix with random floating point values (ie. float or double) instead of integers, use randu() instead

Examples:

imat A1 = randi(5, 6);
imat A2 = randi(5, 6, distr_param(-10, +20));

fmat B1 = randi<fmat>(5, 6);
fmat B2 = randi<fmat>(5, 6, distr_param(-10, +20));

coot_rng::set_seed_random(); // set the seed to a random value
coot_rng::set_seed(42);      // set the seed to a specific value

See also:
- randu()
- ones()
- zeros()
- size()

Functions of Vectors / Matrices / Cubes

abs( X )

Obtain the magnitude of each element

In Y = abs(X), X and Y must have the same matrix or cube type, such as fmat or ivec or fcube

Examples:

fmat A = randu<fmat>(5, 5);
fmat B = abs(A);

fvec X = linspace<fvec>(-5, 5, 11);
fvec Y = abs(X);

See also:
- pow()
- miscellaneous element-wise functions

accu( X )

Accumulate (sum) all elements of a vector or matrix

Examples:

fmat A(5, 6, fill::randu);
fmat B(5, 6, fill::randu);

float x = accu(A);

float y = accu(A % B);

See also:
- sum()
- trace()
- mean()
- dot()
- as_scalar()

all( V )
all( X )
all( X, dim )

For vector V, return true if all elements of the vector are non-zero or satisfy a relational condition

For matrix X and
- dim = 0, return a row vector (of type urowvec or umat), with each element (0 or 1) indicating whether the corresponding column of X has all non-zero elements
- dim = 1, return a column vector (of type ucolvec or umat), with each element (0 or 1) indicating whether the corresponding row of X has all non-zero elements

The dim argument is optional; by default dim = 0 is used

Relational operators can be used instead of V or X, eg. A > 0.5

Examples:

fvec V(10, fill::randu);
fmat X(5, 5, fill::randu);

// status1 will be set to true if vector V has all non-zero elements
bool status1 = all(V);

// status2 will be set to true if vector V has all elements greater than 0.5
bool status2 = all(V > 0.5);

// status3 will be set to true if matrix X has all elements greater than 0.6;
// note the use of vectorise()
bool status3 = all(vectorise(X) > 0.6);

// generate a row vector indicating which columns of X have all elements greater than 0.7
umat A = all(X > 0.7);
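
The effect of the dim argument can be illustrated with a plain C++ sketch over a column-major array. It does not use Bandicoot; all_sketch is a hypothetical name, and the 0/1 flags stand in for the elements of the returned urowvec or ucolvec.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: for dim = 0, one flag per column; for dim = 1, one flag per row.
// A flag is 1 only if every element in that column (or row) is non-zero.
inline std::vector<unsigned> all_sketch(const std::vector<double>& X,
                                        std::size_t n_rows, std::size_t n_cols,
                                        int dim = 0)
{
  const std::size_t n_out = (dim == 0) ? n_cols : n_rows;  // number of flags
  const std::size_t len   = (dim == 0) ? n_rows : n_cols;  // elements checked per flag

  std::vector<unsigned> out(n_out, 1u);
  for (std::size_t j = 0; j < n_out; ++j)
  {
    for (std::size_t i = 0; i < len; ++i)
    {
      const double e = (dim == 0) ? X[i + j * n_rows] : X[j + i * n_rows];
      if (e == 0.0) { out[j] = 0u; }
    }
  }
  return out;
}
```

The same traversal applies to any(), with the flag starting at 0 and being set to 1 when a non-zero element is found.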

See also:
- any()
- find()
- conv_to() (convert between matrix/vector types)
- vectorise()

any( V )
any( X )
any( X, dim )

For vector V, return true if any element of the vector is non-zero or satisfies a relational condition

For matrix X and
- dim = 0, return a row vector (of type urowvec or umat), with each element (0 or 1) indicating whether the corresponding column of X has any non-zero elements
- dim = 1, return a column vector (of type ucolvec or umat), with each element (0 or 1) indicating whether the corresponding row of X has any non-zero elements

The dim argument is optional; by default dim = 0 is used

Relational operators can be used instead of V or X, eg. A > 0.9

Examples:

fvec V(10, fill::randu);
fmat X(5, 5, fill::randu);

// status1 will be set to true if vector V has any non-zero elements
bool status1 = any(V);

// status2 will be set to true if vector V has any elements greater than 0.5
bool status2 = any(V > 0.5);

// status3 will be set to true if matrix X has any elements greater than 0.6;
// note the use of vectorise()
bool status3 = any(vectorise(X) > 0.6);

// generate a row vector indicating which columns of X have elements greater than 0.7
umat A = any(X > 0.7);

See also:
- all()
- find()
- conv_to() (convert between matrix/vector types)
- vectorise()

approx_equal( A, B, method, tol )
approx_equal( A, B, method, abs_tol, rel_tol )

Return true if all corresponding elements in A and B are approximately equal

Return false if any of the corresponding elements in A and B are not approximately equal, or if A and B have different dimensions

The argument method controls how the approximate equality is determined; it is one of:

`"absdiff"`	↦	scalars x and y are considered equal if |x − y| ≤ tol
`"reldiff"`	↦	scalars x and y are considered equal if |x − y| / max( |x|, |y| ) ≤ tol
`"both"`	↦	scalars x and y are considered equal if |x − y| ≤ abs_tol or |x − y| / max( |x|, |y| ) ≤ rel_tol

Examples:

mat A(5, 5, fill::randu);
mat B = A + 0.001;

bool same1 = approx_equal(A, B, "absdiff", 0.002);


mat C = 1000 * randu<mat>(5,5);
mat D = C + 1;

bool same2 = approx_equal(C, D, "reldiff", 0.1);

bool same3 = approx_equal(C, D, "both", 2, 0.1);
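
The three per-element tests can be written out for a single pair of scalars. The sketch below is plain C++ and only illustrates the documented criteria; scalar_approx_equal is a hypothetical name, and treating two zeros as relatively equal is an assumption, since the relative difference is undefined in that case.

```cpp
#include <algorithm>
#include <cmath>
#include <string>

// Hypothetical helper: apply the documented criteria to one pair of scalars.
// For "absdiff" only abs_tol is used; for "reldiff" only rel_tol is used.
inline bool scalar_approx_equal(double x, double y, const std::string& method,
                                double abs_tol, double rel_tol)
{
  const double diff  = std::abs(x - y);
  const double scale = std::max(std::abs(x), std::abs(y));

  const bool abs_ok = (diff <= abs_tol);
  // assumption: when both values are zero, the relative test passes
  const bool rel_ok = (scale == 0.0) ? true : (diff / scale <= rel_tol);

  if (method == "absdiff") { return abs_ok; }
  if (method == "reldiff") { return rel_ok; }
  if (method == "both")    { return abs_ok || rel_ok; }
  return false;  // unknown method; how the library reacts is not covered here
}
```

With values near 1000 that differ by 1, the relative test with tolerance 0.1 passes while an absolute test with tolerance 0.5 fails, which is why "reldiff" suits data with large magnitudes.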

See also:

as_scalar( expression )

Evaluate an expression that results in a 1x1 matrix, followed by converting the 1x1 matrix to a pure scalar

Optimised expression evaluations are automatically used when a binary or trinary expression is given (ie. 2 or 3 terms)

Examples:

frowvec r(5, fill::randu);
fcolvec q(5, fill::randu);

fmat X(5, 5, fill::randu);

// examples of expressions which have optimised implementations

float a = as_scalar(r*q);
float b = as_scalar(r*X*q);
float c = as_scalar(r*diagmat(X)*q);
float d = as_scalar(r*inv(diagmat(X))*q);

See also:
- vectorise()
- accu()
- trace()
- dot()
- norm()
- conv_to()

clamp( X, min_val, max_val )

Create a copy of X with each element clamped to the [min_val, max_val] interval;
any value lower than min_val will be set to min_val, and any value higher than max_val will be set to max_val

Examples:

fmat A(5, 5, fill::randu);

fmat B = clamp(A, 0.2,         0.8);
fmat C = clamp(A, min(min(A)), 0.8);
fmat D = clamp(A, 0.2, max(max(A)));
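
The per-element rule can be sketched in plain C++ (no Bandicoot calls; clamp_sketch is a hypothetical name, and min_val <= max_val is assumed):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: return a copy of x with every element limited to
// the [min_val, max_val] interval.
inline std::vector<float> clamp_sketch(std::vector<float> x, float min_val, float max_val)
{
  for (float& e : x)
  {
    e = std::min(std::max(e, min_val), max_val);  // raise to min_val, then cap at max_val
  }
  return x;
}
```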

See also:
- .clamp() (member function)
- .min() & .max()
- find()

conv_to< type >::from( X )

Convert (cast) from one matrix type to another (eg. fmat to imat), or one cube type to another (e.g. fcube to icube)

Conversion between Armadillo and Bandicoot vectors/matrices/cubes is also possible

Conversion of a fmat object into fcolvec or frowvec is possible if the object can be interpreted as a vector

When conv_to is applied to an expression, the conversion operation will be fused with the expression computation when possible

Examples:

fmat A(5, 5, fill::randu);
 mat B = conv_to<mat>::from(A);

fmat C(10, 1, fill::randu);
fcolvec x = conv_to< fcolvec >::from(C);

// convert a Bandicoot object to an Armadillo object
arma::fmat D = conv_to<arma::fmat>::from(A);

See also:

cross( A, B )

Calculate the cross product between A and B, under the assumption that A and B are 3 dimensional vectors

Examples:

fvec a(3, fill::randu);
fvec b(3, fill::randu);

fvec c = cross(a, b);
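
For reference, the standard component formula for the cross product of two 3-dimensional vectors, as a plain C++ sketch (cross_sketch is a hypothetical name; this does not use Bandicoot):

```cpp
#include <array>

// Hypothetical helper: cross product c = a x b for 3-dimensional vectors.
inline std::array<double, 3> cross_sketch(const std::array<double, 3>& a,
                                          const std::array<double, 3>& b)
{
  return {{ a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0] }};
}
```

The result is perpendicular to both inputs, so its dot product with either input is zero up to rounding.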

See also:

val = det( A )		(form 1)
det( val, A )		(form 2)

Calculate the determinant of square matrix A, based on LU decomposition

form 1: return the determinant

form 2: store the calculated determinant in val and return a bool indicating success

If A is not square sized, a std::logic_error exception is thrown

If the calculation fails:
- val = det(A) throws a std::runtime_error exception
- det(val,A) returns a bool set to false (exception is not thrown)

Examples:

fmat A(5, 5, fill::randu);

float val1 = det(A);         // form 1

float val2;
bool success = det(val2, A); // form 2

See also:
- determinant in MathWorld
- determinant in Wikipedia

diagmat( V )
diagmat( V, k )

diagmat( X )
diagmat( X, k )

Generate a diagonal matrix from vector V or matrix X

Given vector V, generate a square matrix with the k-th diagonal containing a copy of the vector; all other elements are set to zero

Given matrix X, generate a matrix with the k-th diagonal containing a copy of the k-th diagonal of X; all other elements are set to zero

If X is an expression, the evaluation of the expression aims to calculate only the diagonal elements

The argument k is optional; by default the main diagonal is used (k = 0)

For k > 0, the k-th super-diagonal is used (above main diagonal, towards top-right corner)

For k < 0, the k-th sub-diagonal is used (below main diagonal, towards bottom-left corner)

Examples:

fmat A = randu<fmat>(5, 5);
fmat B = diagmat(A);
fmat C = diagmat(A,1);

fvec v = randu<fvec>(5);
fmat D = diagmat(v);
fmat E = diagmat(v,1);

See also:

diagvec( X )
diagvec( X, k )

Extract the k-th diagonal from matrix X

If X is an expression, the evaluation of the expression aims to calculate only the diagonal elements

The argument k is optional; by default the main diagonal is extracted (k = 0)

For k > 0, the k-th super-diagonal is extracted (top-right corner)

For k < 0, the k-th sub-diagonal is extracted (bottom-left corner)

The extracted diagonal is interpreted as a column vector

Examples:

fmat A(5, 5, fill::randu);

fvec d = diagvec(A);

See also:
- .diag()
- diagmat()
- trace()
- vectorise()

dot( A, B )

Dot product of A and B, treating A and B as vectors

Caveat: norm() is more robust for calculating the norm, as it handles underflows and overflows

Examples:

fvec a(10, fill::randu);
fvec b(10, fill::randu);

float x = dot(a,b);

See also:

find( X )
find( X, k )
find( X, k, s )

Return a column vector containing the indices of elements of X that are non-zero or satisfy a relational condition

The output vector must have the type uvec (i.e. the indices are stored as unsigned integers of type uword)

X is interpreted as a vector, with column-by-column ordering of the elements of X

Relational operators can be used instead of X, eg. A > 0.5

If k = 0 (default), return the indices of all non-zero elements, otherwise return at most k of their indices

If s = "first" (default), return at most the first k indices of the non-zero elements

If s = "last", return at most the last k indices of the non-zero elements

Caveats:
- to clamp values to an interval, clamp() is more efficient

Examples:

fmat A(5, 5, fill::randu);
fmat B(5, 5, fill::randu);

uvec q1 = find(A > B);
uvec q2 = find(A > 0.5);
uvec q3 = find(A > 0.5, 3, "last");

See also:

find_finite( X )

Return a column vector containing the indices of elements of X that are finite (i.e. not ±Inf and not NaN)

The output vector must have the type uvec (i.e. the indices are stored as unsigned integers of type uword)

X is interpreted as a vector, with column-by-column ordering of the elements of X

Examples:

fmat A(5, 5, fill::randu);

A(1, 1) = datum::inf;

// find only finite elements
uvec f = find_finite(A);

See also:

find_nonfinite( X )

Return a column vector containing the indices of elements of X that are non-finite (i.e. ±Inf or NaN)

The output vector must have the type uvec (i.e. the indices are stored as unsigned integers of type uword)

X is interpreted as a vector, with column-by-column ordering of the elements of X

Examples:

fmat A(5, 5, fill::randu);

A(1, 1) = datum::inf;
A(2, 2) = datum::nan;

// return indices of two non-finite elements
uvec f = find_nonfinite(A);

See also:

find_nan( X )

Return a column vector containing the indices of elements of X that are NaN (not-a-number)

The output vector must have the type uvec (i.e. the indices are stored as unsigned integers of type uword)

X is interpreted as a vector, with column-by-column ordering of the elements of X

Examples:

fmat A(5, 5, fill::randu);

A(2, 3) = datum::nan;

// indices will be { 17 }
uvec indices = find_nan(A);
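
The index 17 in the example follows from the column-by-column ordering: element (row, col) of a matrix with n_rows rows has linear index row + col * n_rows. A plain C++ sketch of the mapping in both directions (linear_index and row_col are hypothetical helper names):

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helper: linear index of element (row, col) under column-major ordering.
inline std::size_t linear_index(std::size_t row, std::size_t col, std::size_t n_rows)
{
  return row + col * n_rows;
}

// Hypothetical helper: recover (row, col) from a linear index.
inline std::pair<std::size_t, std::size_t> row_col(std::size_t index, std::size_t n_rows)
{
  return std::make_pair(index % n_rows, index / n_rows);
}
```

For a 5x5 matrix, linear_index(2, 3, 5) is 2 + 3*5 = 17, and row_col(17, 5) gives (2, 3).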

See also:
- find()
- find_finite()
- find_nonfinite()
- constants (pi, nan, inf, ...)
- NaN in Wikipedia

index_min( V )
index_min( M )
index_min( M, dim )
index_min( Q )
index_min( Q, dim )

index_max( V )
index_max( M )
index_max( M, dim )
index_max( Q )
index_max( Q, dim )

For vector V, return the linear index of the extremum value; the returned index is of type uword

For matrix M and:
- dim = 0, return a row vector (of type urowvec or umat), with each column containing the index of the extremum value in the corresponding column of M
- dim = 1, return a column vector (of type uvec or umat), with each row containing the index of the extremum value in the corresponding row of M

For cube Q, return a cube (of type ucube) containing the indices of extremum values of elements along dimension dim, where dim ∈ { 0, 1, 2 }

For each column, row, or slice, the index starts at zero

The dim argument is optional; by default dim = 0 is used

Examples:

fvec v(10, fill::randu);

uword i = index_max(v);
float max_val_in_v = v(i);


fmat M(5, 6, fill::randu);

urowvec ii = index_max(M);
ucolvec jj = index_max(M,1);

float max_val_in_col_2 = M( ii(2), 2 );

float max_val_in_row_4 = M( 4, jj(4) );

See also:
- min() & max()
- .index_min() & .index_max() (member functions)
- sort_index()
- find()

join_rows( A, B )
join_rows( A, B, C )
join_rows( A, B, C, D )

join_cols( A, B )
join_cols( A, B, C )
join_cols( A, B, C, D )

join_horiz( A, B )
join_horiz( A, B, C )
join_horiz( A, B, C, D )

join_vert( A, B )
join_vert( A, B, C )
join_vert( A, B, C, D )

join_rows() and join_horiz(): horizontal concatenation; join the corresponding rows of the given matrices; the given matrices must have the same number of rows

join_cols() and join_vert(): vertical concatenation; join the corresponding columns of the given matrices; the given matrices must have the same number of columns

Examples:

fmat A(4, 5, fill::randu);
fmat B(4, 6, fill::randu);
fmat C(6, 5, fill::randu);

fmat AB = join_rows(A, B);
fmat AC = join_cols(A, C);

See also:
- submatrix views

min( V )
min( M )
min( M, dim )
min( Q )
min( Q, dim )
min( A, B )

max( V )
max( M )
max( M, dim )
max( Q )
max( Q, dim )
max( A, B )

For vector V, return the extremum value

For matrix M, return the extremum value for each column (dim = 0), or each row (dim = 1)

For cube Q, return the extremum values of elements along dimension dim, where dim ∈ { 0, 1, 2 }

The dim argument is optional; by default dim = 0 is used

For two matrices/cubes A and B, return a matrix/cube containing element-wise extremum values

Examples:

fcolvec v(10, fill::randu);
float x = max(v);

fmat M(10, 10, fill::randu);

frowvec a = max(M);
frowvec b = max(M, 0);
fcolvec c = max(M, 1);

// element-wise maximum
fmat X(5, 6, fill::randu);
fmat Y(5, 6, fill::randu);
fmat Z = coot::max(X, Y); // use coot:: prefix to distinguish from std::max()

See also:
- .min() & .max() (member functions)
- clamp()
- statistics functions

norm( X )
norm( X, p )

Compute the p-norm of X, where X is a vector or matrix

For vectors, p is an integer ≥ 1, or one of: "-inf", "inf", "fro"

For matrices, p is one of: 1, 2, "inf", "fro"

"-inf" is the minimum quasi-norm, "inf" is the maximum norm, "fro" is the Frobenius norm

The argument p is optional; by default p = 2 is used

For vector norm with p = 2 and matrix norm with p = "fro", a robust algorithm is used to reduce the likelihood of underflows and overflows

Caveats:
- to obtain the zero/Hamming pseudo-norm (the number of non-zero elements), use this expression: accu(X != 0)
- matrix 2-norm (spectral norm) is based on SVD, which is computationally intensive for large matrices
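
The overflow concern behind the robust vector 2-norm can be illustrated in plain C++ without Bandicoot. The sketch below assumes one common approach (scaling by the largest magnitude before squaring); it is not a description of Bandicoot's internal algorithm, and the helper names naive_norm2 and scaled_norm2 are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Naive 2-norm: squaring large elements can overflow to infinity.
double naive_norm2(const std::vector<double>& v)
  {
  double acc = 0.0;
  for (double x : v) { acc += x * x; }
  return std::sqrt(acc);
  }

// Scaled 2-norm: divide by the largest magnitude first, so that every
// squared term is at most 1, then multiply the scale back in.
double scaled_norm2(const std::vector<double>& v)
  {
  double scale = 0.0;
  for (double x : v) { scale = std::max(scale, std::abs(x)); }
  if (scale == 0.0) { return 0.0; }

  double acc = 0.0;
  for (double x : v) { const double t = x / scale; acc += t * t; }
  return scale * std::sqrt(acc);
  }
```

For the input { 1e200, 1e200 } the naive version returns infinity, while the scaled version returns a finite value close to 1.414e200.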

Examples:

fvec q(5, fill::randu);

float x = norm(q, 2);
float y = norm(q, "inf");

normalise( V )
normalise( V, p )

normalise( X )
normalise( X, p )
normalise( X, p, dim )

For vector V, return its normalised version (ie. having unit p-norm)

For matrix X, return its normalised version, where each column (dim = 0) or row (dim = 1) has been normalised to have unit p-norm

The p argument is optional; by default p = 2 is used

The dim argument is optional; by default dim = 0 is used

Examples:

fvec A(10, fill::randu);
fvec B = normalise(A);
fvec C = normalise(A, 1);

fmat X(5, 6, fill::randu);
fmat Y = normalise(X);
fmat Z = normalise(X, 2, 1);

pow( A, scalar )

Element-wise power operation: raise all elements in A to the power denoted by the given scalar

Caveat:
- to raise all elements to the power 2, use square() instead

Examples:

fmat A(5, 6, fill::randu);
fmat B = pow(A, 3.45);

frowvec R(6, fill::randu);
frowvec S = pow(R, -1.0);

See also:
- abs()
- miscellaneous element-wise functions

repmat( A, num_copies_per_row, num_copies_per_col )

Generate a matrix by replicating matrix A in a block-like fashion

The generated matrix has the following size:
- n_rows = num_copies_per_row × A.n_rows
- n_cols = num_copies_per_col × A.n_cols

Examples:

fmat A(2, 3, fill::randu);

fmat B = repmat(A, 4, 5);

See also:
- .each_col() & .each_row() (vector operations applied to each column or row)
- reshape()
- resize()

reshape( X, n_rows, n_cols )
reshape( X, size(Y) )

Generate a vector/matrix with given size specifications, whose elements are taken from the given object in a column-wise manner; the elements in the generated object are placed column-wise (i.e. the first column is filled up before filling the second column)

The layout of the elements in the generated object will be different to the layout in the given object

If the total number of elements in the given object is less than the specified size, the remaining elements in the generated object are set to zero

If the total number of elements in the given object is greater than the specified size, only a subset of elements is taken from the given object
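
The column-wise copy, zero-fill, and truncation rules above can be sketched in plain C++ on a column-major buffer. This is an illustration of the described semantics only, not Bandicoot's GPU implementation; the ColMajor struct and reshape_sketch function are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Column-major matrix stored flat: element (r,c) lives at mem[r + c*n_rows].
struct ColMajor
  {
  std::size_t n_rows;
  std::size_t n_cols;
  std::vector<float> mem;
  };

// Copy elements in column-major order into the new shape;
// missing elements stay zero, surplus elements are dropped.
ColMajor reshape_sketch(const ColMajor& X, std::size_t n_rows, std::size_t n_cols)
  {
  ColMajor out{ n_rows, n_cols, std::vector<float>(n_rows * n_cols, 0.0f) };
  const std::size_t n = std::min(X.mem.size(), out.mem.size());
  for (std::size_t i = 0; i < n; ++i) { out.mem[i] = X.mem[i]; }
  return out;
  }
```

Reshaping a 2x3 matrix to 3x2 leaves the flat buffer unchanged, but the element that was at (1,1) now sits at (0,1); this is why the layout differs even though no element is lost.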

Caveats:
- to change the size without preserving data, use .set_size() instead, which is much faster
- to grow/shrink a matrix while preserving the elements as well as the layout of the elements, use resize() instead
- to flatten a matrix into a vector, use vectorise() instead

Examples:

fmat A(10, 5, fill::randu);

fmat B = reshape(A, 5, 10);

See also:
- .reshape() (member function)
- .set_size()
- resize()
- vectorise()
- as_scalar()
- conv_to()
- diagmat()
- repmat()
- size()

resize( X, n_rows, n_cols )
resize( X, size(Y) )

Generate a vector/matrix with given size specifications, whose elements as well as the layout of the elements are taken from the given object

Caveat: to change the size without preserving data, use .set_size() instead, which is much faster
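
In contrast to reshape(), resize() keeps each element at its (row, column) position. A minimal plain C++ sketch of that behaviour on column-major storage follows; resize_sketch is a hypothetical name, and filling newly created positions with zero is an assumption made for the illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Element (r,c) of the input stays at (r,c) in the output.
// Both buffers are column-major: element (r,c) is at r + c*n_rows.
std::vector<float> resize_sketch(const std::vector<float>& in,
                                 std::size_t in_rows,  std::size_t in_cols,
                                 std::size_t out_rows, std::size_t out_cols)
  {
  std::vector<float> out(out_rows * out_cols, 0.0f);   // assumed zero fill
  const std::size_t rows = std::min(in_rows, out_rows);
  const std::size_t cols = std::min(in_cols, out_cols);

  for (std::size_t c = 0; c < cols; ++c)
    for (std::size_t r = 0; r < rows; ++r)
      out[r + c * out_rows] = in[r + c * in_rows];

  return out;
  }
```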

Examples:

fmat A(4, 5, fill::randu);

fmat B = resize(A, 7, 6);

See also:
- .resize() (member function of Mat)
- .set_size() (member function of Mat)
- reshape()
- vectorise()
- as_scalar()
- conv_to()
- repmat()
- size()

shuffle( V )
shuffle( X )
shuffle( X, dim )

For vector V, generate a copy of the vector with the elements shuffled

For matrix X, generate a copy of the matrix with the elements shuffled in each column (dim = 0), or each row (dim = 1)

The dim argument is optional; by default dim = 0 is used

To change the RNG seed, use coot_rng::set_seed(value) or coot_rng::set_seed_random() functions

Examples:

fmat A(4, 5, fill::randu);
fmat B = shuffle(A);

size( X )
size( n_rows, n_cols )

Obtain the dimensions of object X, or explicitly specify the dimensions

The dimensions can be used in conjunction with:
- object constructors: Mat, Col, Row
- functions for changing size: set_size(), reshape(), resize(), etc.
- submatrix views

The dimensions support simple arithmetic operations; they can also be printed and compared for equality/inequality

Caveat: to prevent interference from std::size() in C++17, preface Bandicoot's size() with the coot namespace qualification, eg. coot::size(X)

Examples:

fmat A(5,6);

fmat B = zeros<fmat>(size(A));

fmat C;
C.randu(size(A));

fmat D = ones<fmat>(size(A));

fmat E = ones<fmat>(10, 20);
E(3, 4, size(C)) = C;    // access submatrix of E

fmat F( size(A) + size(E) );

fmat G( size(A) * 2 );

cout << "size of A: " << size(A) << endl;

bool is_same_size = (size(A) == size(E));

See also:
- attributes

sort( V )
sort( V, sort_direction )

sort( X )
sort( X, sort_direction )
sort( X, sort_direction, dim )

For vector V, return a vector which is a sorted version of the input vector

For matrix X, return a matrix with the elements of the input matrix sorted in each column (dim = 0), or each row (dim = 1)

The dim argument is optional; by default dim = 0 is used

The sort_direction argument is optional; sort_direction is either "ascend" or "descend"; by default "ascend" is used

The sorting algorithm used is radix sort

Examples:

fmat A(10, 10, fill::randu);
fmat B = sort(A);

See also:
- sort_index()
- randi()

sort_index( X )
sort_index( X, sort_direction )

stable_sort_index( X )
stable_sort_index( X, sort_direction )

Return a vector which describes the sorted order of the elements of X (i.e. it contains the indices of the elements of X)

The output vector must have the type uvec (i.e. the indices are stored as unsigned integers of type uword)

X is interpreted as a vector, with column-by-column ordering of the elements of X

The sort_direction argument is optional; sort_direction is either "ascend" or "descend"; by default "ascend" is used

The stable_sort_index() variant preserves the relative order of elements with equivalent values

The sorting algorithm used is radix sort
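
What the returned index vector means, and what "stable" guarantees, can be shown with a CPU-side sketch in plain C++. It uses std::stable_sort (a comparison sort) rather than the radix sort mentioned above, so it illustrates only the result, not the algorithm; stable_sort_index_sketch is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Return the indices that would sort x; ties keep their original order.
std::vector<std::size_t> stable_sort_index_sketch(const std::vector<float>& x,
                                                  const std::string& direction = "ascend")
  {
  std::vector<std::size_t> idx(x.size());
  std::iota(idx.begin(), idx.end(), std::size_t(0));   // 0, 1, 2, ...

  if (direction == "descend")
    {
    std::stable_sort(idx.begin(), idx.end(),
                     [&x](std::size_t a, std::size_t b) { return x[a] > x[b]; });
    }
  else
    {
    std::stable_sort(idx.begin(), idx.end(),
                     [&x](std::size_t a, std::size_t b) { return x[a] < x[b]; });
    }

  return idx;
  }
```

For x = { 3, 1, 2, 1 } the ascending result is { 1, 3, 2, 0 }: the two equal values keep their original relative order (index 1 before index 3).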

Examples:

fvec q(10, fill::randu);

uvec indices = sort_index(q);

See also:
- sort()
- find()

sum( V )
sum( M )
sum( M, dim )

For vector V, return the sum of all elements

For matrix M, return the sum of elements in each column (dim = 0), or each row (dim = 1)

The dim argument is optional; by default dim = 0 is used

Caveat: to get a sum of all the elements regardless of the object type (i.e. vector or matrix), use accu() instead

Examples:

fcolvec v(10, fill::randu);
float x = sum(v);

fmat M(10, 10, fill::randu);

frowvec a = sum(M);
frowvec b = sum(M, 0);
fcolvec c = sum(M, 1);

float y = accu(M);   // find the overall sum regardless of object type

See also:
- accu()
- trace()
- mean()
- as_scalar()

symmatu( A )
symmatl( A )

symmatu(A): generate symmetric matrix from square matrix A, by reflecting the upper triangle to the lower triangle

symmatl(A): generate symmetric matrix from square matrix A, by reflecting the lower triangle to the upper triangle

If A is non-square, a std::logic_error exception is thrown
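
The two reflections can be sketched in plain C++ on a column-major buffer. This illustrates the semantics only; symmatu_sketch and symmatl_sketch are hypothetical names, and the size check mirrors the std::logic_error behaviour described above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// A is n x n, column-major: element (r,c) is at A[r + c*n].

// Reflect the upper triangle (r < c) onto the lower triangle.
std::vector<float> symmatu_sketch(std::vector<float> A, std::size_t n)
  {
  if (A.size() != n * n) { throw std::logic_error("matrix must be square"); }
  for (std::size_t c = 0; c < n; ++c)
    for (std::size_t r = 0; r < c; ++r)
      A[c + r * n] = A[r + c * n];   // (c,r) <- (r,c)
  return A;
  }

// Reflect the lower triangle (r > c) onto the upper triangle.
std::vector<float> symmatl_sketch(std::vector<float> A, std::size_t n)
  {
  if (A.size() != n * n) { throw std::logic_error("matrix must be square"); }
  for (std::size_t c = 0; c < n; ++c)
    for (std::size_t r = 0; r < c; ++r)
      A[r + c * n] = A[c + r * n];   // (r,c) <- (c,r)
  return A;
  }
```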

Examples:

fmat A(5, 5, fill::randu);

fmat B = symmatu(A);
fmat C = symmatl(A);

See also:
- diagmat()
- Symmetric matrix in Wikipedia

trace( X )

Sum of the elements on the main diagonal of matrix X

If X is an expression, the evaluation of the expression aims to calculate only the diagonal elements

Examples:

fmat A(5, 5, fill::randu);

float x = trace(A);

See also:
- accu()
- as_scalar()
- .diag()
- diagvec()
- diagmat()
- sum()

trans( A )

Compute a transposed copy of the matrix

Examples:

fmat A(5, 10, fill::randu);

fmat B = trans(A);
fmat C = A.t();    // equivalent to trans(A), but more compact

vectorise( X )
vectorise( X, dim )

Generate a flattened version of matrix X

The argument dim is optional; by default dim = 0 is used

For dim = 0, the elements are copied from X column-wise, resulting in a column vector; equivalent to concatenating all the columns of X

For dim = 1, the elements are copied from X row-wise, resulting in a row vector; equivalent to concatenating all the rows of X

Caveat: column-wise vectorisation is faster than row-wise vectorisation
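
The difference between the two modes can be sketched in plain C++, assuming column-major storage: the column-wise case is a straight copy, while the row-wise case needs strided reads, which is one plausible reason for the speed difference. vectorise_sketch is a hypothetical name and not part of Bandicoot.

```cpp
#include <cstddef>
#include <vector>

// mem holds an n_rows x n_cols matrix in column-major order:
// element (r,c) is at mem[r + c*n_rows].
std::vector<float> vectorise_sketch(const std::vector<float>& mem,
                                    std::size_t n_rows, std::size_t n_cols,
                                    int dim = 0)
  {
  if (dim == 0) { return mem; }   // columns are already contiguous: plain copy

  std::vector<float> out(mem.size());
  for (std::size_t r = 0; r < n_rows; ++r)
    for (std::size_t c = 0; c < n_cols; ++c)
      out[r * n_cols + c] = mem[r + c * n_rows];   // strided read
  return out;
  }
```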

Examples:

fmat X(4, 5, fill::randu);

fvec v = vectorise(X);

miscellaneous element-wise functions:

exp	log	square	floor	erf	sign
exp2	log2	sqrt	ceil	erfc	lgamma
exp10	log10	round
trunc_exp	trunc_log	trunc

Apply a function to each element

Usage:

B = fn(A), where fn(A) is one of the functions below
A and B must have the same vector, matrix, or cube type, such as fmat or ivec or fcube

exp(A) base-e exponential: e^x

exp2(A) base-2 exponential: 2^x

exp10(A) base-10 exponential: 10^x

trunc_exp(A) base-e exponential, truncated to avoid infinity (only for float and double elements)

log(A) natural log: logₑ x

log2(A) base-2 log: log₂ x

log10(A) base-10 log: log₁₀ x

trunc_log(A) natural log, truncated to avoid ±infinity (only for float and double elements)

square(A) square: x²

sqrt(A) square root: √x

floor(A) largest integral value that is not greater than the input value

ceil(A) smallest integral value that is not less than the input value

round(A) round to nearest integer, with halfway cases rounded away from zero

trunc(A) round to nearest integer, towards zero

erf(A) error function (only for float and double elements)

erfc(A) complementary error function (only for float and double elements)

lgamma(A) natural log of the absolute value of gamma function (only for float and double elements)

sign(A) signum function; for each element a in A, the corresponding element b in B is:
- b = −1 if a < 0
- b = 0 if a = 0
- b = +1 if a > 0
- if a is complex and non-zero, then b = a / abs(a)

Caveat: all of the above functions are applied element-wise, where each element is treated independently
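
The rounding and sign rules listed above match the behaviour of the corresponding scalar functions in the C++ standard library, which allows a CPU-side sketch of the element-wise semantics. The helpers sign_sketch and apply_elementwise are hypothetical names for this illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Signum for real values: -1, 0 or +1.
double sign_sketch(double a)
  {
  return double((a > 0.0) - (a < 0.0));
  }

// Apply fn to every element independently, producing a same-sized result.
template<typename Fn>
std::vector<double> apply_elementwise(const std::vector<double>& A, Fn fn)
  {
  std::vector<double> B(A.size());
  for (std::size_t i = 0; i < A.size(); ++i) { B[i] = fn(A[i]); }
  return B;
  }
```

With A = { -2.5, -0.4, 0.0, 0.5, 2.5 }, applying std::round gives { -3, -0, 0, 1, 3 } (halfway cases away from zero), std::trunc gives { -2, -0, 0, 0, 2 }, and sign_sketch gives { -1, -1, 0, 1, 1 }.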

Examples:

fmat A(5, 5, fill::randu);
fmat B = exp(A);

trigonometric element-wise functions (cos, sin, tan, ...)

For single argument functions, B = trig_fn(A), where trig_fn is applied to each element in A, with trig_fn as one of:
- cos, acos, cosh, acosh
- sin, asin, sinh, asinh
- tan, atan, tanh, atanh
- sinc, defined as sinc(x) = sin(πx) / (πx) for x ≠ 0, and sinc(x) = 1 for x = 0

For dual argument functions, apply the function to each tuple of two corresponding elements in X and Y:
- Z = atan2(Y, X)
- Z = hypot(X, Y)

Note: dual-argument functions can only be used on vectors and matrices
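
The sinc definition and the pairing of elements in the dual-argument functions can be sketched in plain C++. The names sinc_sketch and atan2_elementwise are hypothetical, and the sketch assumes X and Y have the same number of elements.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Normalised sinc: sin(pi*x) / (pi*x), with the removable singularity at 0.
double sinc_sketch(double x)
  {
  if (x == 0.0) { return 1.0; }
  const double pi = std::acos(-1.0);
  return std::sin(pi * x) / (pi * x);
  }

// Z(i) = atan2( Y(i), X(i) ) for each pair of corresponding elements.
std::vector<double> atan2_elementwise(const std::vector<double>& Y,
                                      const std::vector<double>& X)
  {
  const std::size_t n = std::min(Y.size(), X.size());
  std::vector<double> Z(n);
  for (std::size_t i = 0; i < n; ++i) { Z[i] = std::atan2(Y[i], X[i]); }
  return Z;
  }
```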

Examples:

fmat X(5, 5, fill::randu);
fmat Y = cos(X);

Decompositions, Factorisations, and Inverses

R = chol( X )
chol( R, X )

Cholesky decomposition of symmetric/hermitian matrix X into triangular matrix R

By default, R is upper triangular

X is required to be positive definite

The decomposition has the form X = R.t() * R

If the decomposition fails:
- the form R = chol(X) resets R and throws a std::runtime_error exception
- the form chol(R, X) resets R and returns a bool set to false (exception is not thrown)

Caveat: there is no explicit check that X is symmetric or positive definite
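
The form X = R.t() * R and the failure behaviour of the bool-returning variant can be sketched with a textbook CPU Cholesky factorisation in plain C++. This is not Bandicoot's GPU implementation; chol_sketch is a hypothetical name, and like the caveat above, the sketch reads only the upper triangle and does not verify symmetry.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// X and R are n x n, column-major: element (r,c) is at index r + c*n.
// On success R is upper triangular with X = R^T * R.
// On failure R is reset (emptied) and false is returned.
bool chol_sketch(std::vector<double>& R, const std::vector<double>& X, std::size_t n)
  {
  R.assign(n * n, 0.0);

  for (std::size_t j = 0; j < n; ++j)
    {
    // diagonal entry: R(j,j) = sqrt( X(j,j) - sum_k R(k,j)^2 )
    double d = X[j + j * n];
    for (std::size_t k = 0; k < j; ++k) { d -= R[k + j * n] * R[k + j * n]; }
    if (!(d > 0.0)) { R.clear(); return false; }   // not positive definite
    R[j + j * n] = std::sqrt(d);

    // remaining entries of row j: R(j,i) for i > j
    for (std::size_t i = j + 1; i < n; ++i)
      {
      double s = X[j + i * n];
      for (std::size_t k = 0; k < j; ++k) { s -= R[k + j * n] * R[k + i * n]; }
      R[j + i * n] = s / R[j + j * n];
      }
    }

  return true;
  }
```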

Examples:

fmat A(5, 5, fill::randu);
fmat X = A.t() * A;

fmat R1 = chol(X);
fmat R2;
bool ok = chol(R2, X);

vec eigval = eig_sym( X )

eig_sym( eigval, X )

eig_sym( eigval, eigvec, X )

Eigendecomposition of symmetric/hermitian matrix X

The eigenvalues and corresponding eigenvectors are stored in eigval and eigvec, respectively

The eigenvalues are in ascending order

The eigenvectors are stored as column vectors

If X is not square sized, a std::logic_error exception is thrown

If the decomposition fails:
- eigval = eig_sym(X) resets eigval and throws a std::runtime_error exception
- eig_sym(eigval,X) resets eigval and returns a bool set to false (exception is not thrown)
- eig_sym(eigval,eigvec,X) resets eigval & eigvec and returns a bool set to false (exception is not thrown)

Caveats:
- there is no explicit check whether X is symmetric/hermitian
- if eigenvectors are not necessary, it is more efficient to use a form that does not compute them (i.e. eig_sym(eigval, X))

Examples:

// for matrices with real elements

fmat A(50, 50, fill::randu);
fmat B = A.t()*A;  // generate a symmetric matrix

fvec eigval;
fmat eigvec;

eig_sym(eigval, eigvec, B);

lu( L, U, P, X )
lu( L, U, X )

Lower-upper decomposition (with partial pivoting) of matrix X

The first form provides a lower-triangular matrix L, an upper-triangular matrix U, and a permutation matrix P, such that P.t()*L*U = X

The second form provides permuted L and U, such that L*U = X; note that in this case L is generally not lower-triangular

If the decomposition fails:
- lu(L,U,P,X) resets L, U, P and returns a bool set to false (exception is not thrown)
- lu(L,U,X) resets L, U and returns a bool set to false (exception is not thrown)

Examples:

fmat A(5, 5, fill::randu);

fmat L, U, P;

lu(L, U, P, A);

fmat B = P.t() * L * U;

B = pinv( A )
B = pinv( A, tolerance )

pinv( B, A )
pinv( B, A, tolerance )

Moore-Penrose pseudo-inverse (generalised inverse) of matrix A

The computation is based on singular value decomposition

The tolerance argument is optional

The default tolerance is set to max_rc · max_sv · epsilon, where:
- max_rc = max(A.n_rows, A.n_cols)
- max_sv = maximum singular value of A
- epsilon = difference between 1 and the least value greater than 1 that is representable

Any singular values less than tolerance are treated as zero
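
The default tolerance formula and the thresholding rule can be written out in plain C++ for a given list of singular values. The sketch assumes the singular values have already been computed elsewhere; default_pinv_tolerance and count_retained are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// tolerance = max_rc * max_sv * epsilon
template<typename T>
T default_pinv_tolerance(std::size_t n_rows, std::size_t n_cols,
                         const std::vector<T>& singular_values)
  {
  const T max_rc = T(std::max(n_rows, n_cols));
  const T max_sv = singular_values.empty()
                     ? T(0)
                     : *std::max_element(singular_values.begin(), singular_values.end());
  return max_rc * max_sv * std::numeric_limits<T>::epsilon();
  }

// Number of singular values that are NOT treated as zero.
template<typename T>
std::size_t count_retained(const std::vector<T>& singular_values, T tolerance)
  {
  return std::size_t(std::count_if(singular_values.begin(), singular_values.end(),
                                   [tolerance](T s) { return !(s < tolerance); }));
  }
```

Because epsilon depends on the element type, the default tolerance for an fmat is much larger than for a mat with the same singular values.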

If the decomposition fails:
- B = pinv(A) resets B and throws a std::runtime_error exception
- pinv(B,A) resets B and returns a bool set to false (exception is not thrown)

Examples:

fmat A(4, 5, fill::randu);

fmat B = pinv(A);        // use default tolerance

fmat C = pinv(A, 0.01);  // set tolerance to 0.01

X = solve( A, B )

solve( X, A, B )

Solve a dense system of linear equations, A*X = B, where X is unknown; similar functionality to the \ operator in Matlab/Octave, ie. X = A \ B

A must be square sized; A cannot be rank-deficient

B can be a vector or matrix

The number of rows in A and B must be the same

NOTE: support for solve() is preliminary and many options that are available in Armadillo are not yet available in Bandicoot; future releases will add additional support

The solution is computed using the LU decomposition

If no solution is found:
- X = solve(A, B) resets X and throws a std::runtime_error exception
- solve(X, A, B) resets X and returns a bool set to false (exception is not thrown)

Examples:

fmat A(5, 5, fill::randu);
A.diag() += 3; // shift the diagonal so that A is unlikely to be singular

fvec b(5, fill::randu);
fvec x1 = solve(A, b);

fvec x2;
bool status = solve(x2, A, b);

fmat B(5, 5, fill::randu);
fmat X1 = solve(A, B);

See also:
- lu()
- pinv()
- linear system of equations in MathWorld
- system of linear equations in Wikipedia
- definiteness of a matrix in Wikipedia
- positive definite matrix in MathWorld
- ensmallen - library for solving arbitrary optimisation problems

s = svd( X )

svd( s, X )

svd( U, s, V, X )

Singular value decomposition of matrix X into vector of singular values s and matrices of left/right singular vectors U, V

If X is square, it can be reconstructed using X = U*diagmat(s)*V.t()

The singular values are in descending order

If the decomposition fails:
- s = svd(X) resets s and throws a std::runtime_error exception
- svd(s,X) resets s and returns a bool set to false (exception is not thrown)
- svd(U,s,V,X) resets U, s, V and returns a bool set to false (exception is not thrown)

Examples:

fmat X(5, 5, fill::randu);

fmat U;
fvec s;
fmat V;

svd(U, s, V, X);

Signal & Image Processing

conv( A, B )
conv( A, B, shape )

1D convolution of vectors A and B

The orientation of the result vector is the same as the orientation of A (ie. either column or row vector)

The shape argument is optional; it is one of:

`"full"`	=	return the full convolution (default setting), with the size equal to A.n_elem + B.n_elem - 1
`"same"`	=	return the central part of the convolution, with the same size as vector A

The convolution operation is also equivalent to FIR filtering
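
The two shape settings can be illustrated with a direct (non-accelerated) 1D convolution in plain C++. conv_sketch is a hypothetical name; for "same", the sketch assumes the central part starts at offset floor(B.n_elem / 2) of the full result, following the Matlab/Octave convention.

```cpp
#include <cstddef>
#include <string>
#include <vector>

std::vector<double> conv_sketch(const std::vector<double>& A,
                                const std::vector<double>& B,
                                const std::string& shape = "full")
  {
  if (A.empty() || B.empty()) { return {}; }

  // full convolution: A.n_elem + B.n_elem - 1 outputs
  std::vector<double> full(A.size() + B.size() - 1, 0.0);
  for (std::size_t i = 0; i < A.size(); ++i)
    for (std::size_t j = 0; j < B.size(); ++j)
      full[i + j] += A[i] * B[j];

  if (shape == "full") { return full; }

  // "same": central part, with as many elements as A (assumed offset)
  const std::ptrdiff_t start = std::ptrdiff_t(B.size() / 2);
  const std::ptrdiff_t len   = std::ptrdiff_t(A.size());
  return std::vector<double>(full.begin() + start, full.begin() + start + len);
  }
```

With A = { 1, 2, 3 } and B = { 1, 1 }, the full result is { 1, 3, 5, 3 } and the "same" result is { 3, 5, 3 }. Viewing B as the coefficients of an FIR filter, each output is a weighted sum of neighbouring input samples.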

Examples:

fvec A(256, fill::randu);

fvec B(16, fill::randu);

fvec C = conv(A, B);

fvec D = conv(A, B, "same");

conv2( A, B )
conv2( A, B, shape )

2D convolution of matrices A and B

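The shape argument is optional; it is one of:

`"full"`	=	return the full 2D convolution (default setting), with the size equal to size(A) + size(B) - 1
`"same"`	=	return the central part of the convolution, with the same size as matrix A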

Examples:

fmat A(256, 256, fill::randu);

fmat B(16, 16, fill::randu);

fmat C = conv2(A, B);

fmat D = conv2(A, B, "same");

Statistics

mean, median, stddev, var, range

mean( V ), mean( M ), mean( M, dim )		mean (average value)
median( V ), median( M ), median( M, dim )		median
stddev( V ), stddev( V, norm_type ), stddev( M ), stddev( M, norm_type ), stddev( M, norm_type, dim )		standard deviation
var( V ), var( V, norm_type ), var( M ), var( M, norm_type ), var( M, norm_type, dim )		variance
range( V ), range( M ), range( M, dim )		range (difference between max and min)

For vector V, return the statistic calculated using all the elements of the vector

For matrix M, find the statistic for each column (dim = 0), or each row (dim = 1)

The dim argument is optional; by default dim = 0 is used

The norm_type argument is optional; by default norm_type = 0 is used

For the var() and stddev() functions:
- the default norm_type = 0 performs normalisation using N-1 (where N is the number of samples), providing the best unbiased estimator
- using norm_type = 1 performs normalisation using N, which provides the second moment around the mean
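
The effect of norm_type can be seen in a plain C++ sketch of the variance of a vector. var_sketch and stddev_sketch are hypothetical names; returning 0 for fewer than two samples is an assumption made to avoid division by zero in the sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

double var_sketch(const std::vector<double>& v, int norm_type = 0)
  {
  const std::size_t N = v.size();
  if (N < 2) { return 0.0; }   // assumption for the degenerate case

  double mean = 0.0;
  for (double x : v) { mean += x; }
  mean /= double(N);

  double acc = 0.0;
  for (double x : v) { acc += (x - mean) * (x - mean); }

  // norm_type = 0: divide by N-1;  norm_type = 1: divide by N
  return acc / double((norm_type == 0) ? (N - 1) : N);
  }

double stddev_sketch(const std::vector<double>& v, int norm_type = 0)
  {
  return std::sqrt(var_sketch(v, norm_type));
  }
```

For v = { 1, 2, 3, 4 } the sum of squared deviations is 5, so the result is 5/3 with norm_type = 0 and 1.25 with norm_type = 1.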

Caveat: to obtain statistics for integer matrices/vectors (eg. umat, imat, uvec, ivec), convert to a matrix/vector with floating point values (eg. mat, vec) using the conv_to() function

Examples:

fmat A(5, 5, fill::randu);

fmat B  = mean(A);
fmat C  = var(A);
float m = mean(mean(A));

fvec v(5, fill::randu);
float x = var(v);

cov( X, Y )
cov( X, Y, norm_type )

cov( X )
cov( X, norm_type )

For two matrix arguments X and Y, if each row of X and Y is an observation and each column is a variable, the (i,j)-th entry of cov(X,Y) is the covariance between the i-th variable in X and the j-th variable in Y

For vector arguments, the type of vector is ignored and each element in the vector is treated as an observation

For matrices, X and Y must have the same dimensions

For vectors, X and Y must have the same number of elements

cov(X) is equivalent to cov(X, X)

The norm_type argument is optional; by default norm_type = 0 is used

The norm_type argument controls the type of normalisation used, with N denoting the number of observations:
- for norm_type = 0, normalisation is done using N-1, providing the best unbiased estimation of the covariance matrix (if the observations are from a normal distribution)
- for norm_type = 1, normalisation is done using N, which provides the second moment matrix of the observations about their mean

Examples:

fmat X(4, 5, fill::randu);
fmat Y(4, 5, fill::randu);

fmat C = cov(X, Y);
fmat D = cov(X, Y, 1);

cor( X, Y )
cor( X, Y, norm_type )

cor( X )
cor( X, norm_type )

For two matrix arguments X and Y, if each row of X and Y is an observation and each column is a variable, the (i,j)-th entry of cor(X,Y) is the correlation coefficient between the i-th variable in X and the j-th variable in Y

For vector arguments, the type of vector is ignored and each element in the vector is treated as an observation

For matrices, X and Y must have the same dimensions

For vectors, X and Y must have the same number of elements

cor(X) is equivalent to cor(X, X)

The norm_type argument is optional; by default norm_type = 0 is used

The norm_type argument controls the type of normalisation used, with N denoting the number of observations:
- for norm_type = 0, normalisation is done using N-1
- for norm_type = 1, normalisation is done using N
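
For the vector case, covariance and the correlation coefficient can be sketched in plain C++ as follows. cov_sketch and cor_sketch are hypothetical names, and the sketch assumes both vectors have the same length N ≥ 2 with non-zero variance. When the same norm_type is used for the covariance and for both variances, the normalisation factor cancels in the correlation coefficient.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Covariance of two equally sized vectors of observations.
double cov_sketch(const std::vector<double>& x, const std::vector<double>& y,
                  int norm_type = 0)
  {
  const std::size_t N = x.size();   // assumes y.size() == N and N >= 2

  double mean_x = 0.0;
  double mean_y = 0.0;
  for (std::size_t i = 0; i < N; ++i) { mean_x += x[i]; mean_y += y[i]; }
  mean_x /= double(N);
  mean_y /= double(N);

  double acc = 0.0;
  for (std::size_t i = 0; i < N; ++i) { acc += (x[i] - mean_x) * (y[i] - mean_y); }

  return acc / double((norm_type == 0) ? (N - 1) : N);
  }

// Correlation coefficient: covariance scaled by both standard deviations.
double cor_sketch(const std::vector<double>& x, const std::vector<double>& y,
                  int norm_type = 0)
  {
  return cov_sketch(x, y, norm_type)
         / std::sqrt(cov_sketch(x, x, norm_type) * cov_sketch(y, y, norm_type));
  }
```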

Examples:

fmat X(4, 5, fill::randu);
fmat Y(4, 5, fill::randu);

fmat R = cor(X, Y);
fmat S = cor(X, Y, 1);

Miscellaneous

backend configuration

Bandicoot can use either CUDA or OpenCL as a hardware backend

To enable CUDA or OpenCL, set the COOT_USE_CUDA or COOT_USE_OPENCL macros in the Bandicoot configuration

If both backends are enabled, select the default backend by setting the COOT_DEFAULT_BACKEND macro to the desired backend (e.g. #define COOT_DEFAULT_BACKEND CL_BACKEND)

By default, at the time of first usage, Bandicoot will automatically initialise to use the first available device with the default backend

Bandicoot can also be manually initialised using the coot_init() function:

coot_init( )		default initialisation
coot_init( print_info )		initialise to default backend, optionally printing information about the chosen GPU device
coot_init( "opencl", print_info )		initialise to OpenCL backend; `COOT_USE_OPENCL` must be enabled
coot_init( "opencl", print_info, platform_id, dev_id )		use a specific OpenCL platform ID and device ID
coot_init( "cuda", print_info )		initialise to CUDA backend; `COOT_USE_CUDA` must be enabled
coot_init( "cuda", print_info, dev_id )		use a specific CUDA device ID

coot_init() returns a boolean indicating whether or not initialisation was successful

If print_info is set to true, information about the selected GPU device will be printed

For the "opencl" initialisations, platform_id and dev_id specify the desired OpenCL platform and device IDs; available platforms and devices can be listed using the clinfo command-line utility, available in most package managers: clinfo -l

For the "cuda" initialisations, dev_id specifies the desired CUDA device; available device IDs can be listed with the nvidia-smi command-line utility

Caveats:
- calling coot_init() manually must be done before any other Bandicoot operations
- coot_init() can only be called once
- if either caveat above is violated when calling coot_init(), a std::runtime_error exception will be thrown

At any time, all asynchronous operations can be forced to complete by calling coot_synchronise()

See also:
- CUDA in Wikipedia
- OpenCL in Wikipedia

constants (pi, inf, eps, ...)

`datum::pi`		π, the ratio of any circle's circumference to its diameter
`datum::tau`		τ, the ratio of any circle's circumference to its radius (equivalent to 2π)
`datum::inf`		∞, infinity
`datum::nan`		“not a number” (NaN); caveat: NaN is not equal to anything, even itself

`datum::eps`		machine epsilon; approximately 2.2204e-16; difference between 1 and the next representable value
`datum::e`		base of the natural logarithm
`datum::sqrt2`		square root of 2

`datum::log_min`		log of minimum non-zero value (type and machine dependent)
`datum::log_max`		log of maximum value (type and machine dependent)
`datum::euler`		Euler's constant, aka Euler-Mascheroni constant

`datum::gratio`		golden ratio
`datum::m_u`		atomic mass constant (in kg)
`datum::N_A`		Avogadro constant

`datum::k`		Boltzmann constant (in joules per kelvin)
`datum::k_evk`		Boltzmann constant (in eV/K)
`datum::a_0`		Bohr radius (in meters)

`datum::mu_B`		Bohr magneton
`datum::Z_0`		characteristic impedance of vacuum (in ohms)
`datum::G_0`		conductance quantum (in siemens)

`datum::k_e`		Coulomb's constant (in meters per farad)
`datum::eps_0`		electric constant (in farads per meter)
`datum::m_e`		electron mass (in kg)

`datum::eV`		electron volt (in joules)
`datum::ec`		elementary charge (in coulombs)
`datum::F`		Faraday constant (in coulombs)

`datum::alpha`		fine-structure constant
`datum::alpha_inv`		inverse fine-structure constant
`datum::K_J`		Josephson constant

`datum::mu_0`		magnetic constant (in henries per meter)
`datum::phi_0`		magnetic flux quantum (in webers)
`datum::R`		molar gas constant (in joules per mole kelvin)

`datum::G`		Newtonian constant of gravitation (in newton square meters per kilogram squared)
`datum::h`		Planck constant (in joule seconds)
`datum::h_bar`		Planck constant over 2 pi, aka reduced Planck constant (in joule seconds)

`datum::m_p`		proton mass (in kg)
`datum::R_inf`		Rydberg constant (in reciprocal meters)
`datum::c_0`		speed of light in vacuum (in meters per second)

`datum::sigma`		Stefan-Boltzmann constant
`datum::R_k`		von Klitzing constant (in ohms)
`datum::b`		Wien wavelength displacement law constant

The constants are stored in the Datum<type> class, where type is either float or double;
for convenience, Datum<double> is typedefed as datum, and Datum<float> is typedefed as fdatum

Caveat: datum::nan is not equal to anything, even itself; to check whether a scalar x is finite, use std::isfinite(x)
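
The NaN and epsilon caveats follow standard IEEE floating-point behaviour and can be checked with the C++ standard library alone. The sketch below does not use the Datum class; equals_itself and eps_of are hypothetical helper names. Note that the value 2.2204e-16 quoted above applies to double; for float the corresponding value is approximately 1.1921e-07.

```cpp
#include <cmath>
#include <limits>

// NaN compares unequal to everything, including itself,
// so this returns false only for NaN.
bool equals_itself(double x)
  {
  return x == x;
  }

// Difference between 1 and the next representable value of type T.
template<typename T>
T eps_of()
  {
  return std::numeric_limits<T>::epsilon();
  }
```

For that reason a test such as (x == datum::nan) is always false; std::isfinite(x) rejects both NaN and ±infinity.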

The physical constants were mainly taken from NIST 2018 CODATA values, and some from WolframAlpha (as of 2009-06-23)

Examples:

cout << "speed of light = " << datum::c_0 << endl;

cout << "log_max for floats = ";
cout << fdatum::log_max << endl;

cout << "log_max for doubles = ";
cout << datum::log_max << endl;

See also:
- .is_finite()
- .fill()
- NaN in Wikipedia
- physical constant in Wikipedia
- replacement of 2π with τ in Wikipedia
- The Tau Manifesto by Michael Hartl
- std::numeric_limits in cplusplus.com
- std::numeric_limits in cppreference.com

wall_clock

Simple timer class for measuring the number of elapsed seconds

An instance of the class has two member functions:

`.tic()`		start the timer
`.toc()`		return the number of seconds since the last call to `.tic()`
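
The tic/toc pattern can be reproduced with std::chrono, which may help when the same timing code must also run without Bandicoot. The class below is a hypothetical stand-in (wall_clock_sketch), not the implementation of wall_clock. Since GPU operations may run asynchronously, it is presumably sensible to call coot_synchronise() before .toc() when timing Bandicoot operations.

```cpp
#include <chrono>

class wall_clock_sketch
  {
  public:

  // start (or restart) the timer
  void tic()
    {
    start = std::chrono::steady_clock::now();
    }

  // seconds elapsed since the last call to tic()
  double toc() const
    {
    const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return elapsed.count();
    }

  private:

  std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
  };
```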

Examples:

wall_clock timer;

timer.tic();

// ... do something ...

double n = timer.toc();

cout << "number of seconds: " << n << endl;

See also:
- elapsed real time in Wikipedia

output streams

The default stream for printing matrices is std::cout; the stream can be changed via the COOT_COUT_STREAM define (see config.hpp)

The default stream for printing warnings and errors is std::cerr; the stream can be changed via the COOT_CERR_STREAM define (see config.hpp)

The degree of printed warnings is controlled by the COOT_WARN_LEVEL define; see config.hpp

Example of changing the warning level:

#define COOT_WARN_LEVEL 1
#include <bandicoot>

uword, sword

uword is a typedef for an unsigned integer type; it is used for matrix indices as well as all internal counters and loops

sword is a typedef for a signed integer type

The minimum width of both uword and sword is either 32 or 64 bits:
- the default width is 32 bits on 32-bit platforms
- the default width is 64 bits on 64-bit platforms
- uword is a typedef for std::size_t

Caveat: the Bandicoot uword and sword types are not guaranteed to be the same as the Armadillo arma::uword and arma::sword types

See also:
- C++ variable types
- explanation of typedef
- imat & umat matrix types
- ivec & uvec vector types

Examples of Matlab/Octave syntax and conceptually corresponding Bandicoot syntax

Matlab/Octave	Bandicoot	Notes

`A(1, 1)`	`A(0, 0)`	indexing in Bandicoot starts at 0
`A(k, k)`	`A(k-1, k-1)`

`size(A,1)`	`A.n_rows`	read only
`size(A,2)`	`A.n_cols`
`numel(A)`	`A.n_elem`

`A(:, k)`	`A.col(k)`	this is a conceptual example only; exact conversion from Matlab/Octave to Bandicoot syntax will require taking into account that indexing starts at 0
`A(k, :)`	`A.row(k)`
`A(:, p:q)`	`A.cols(p, q)`
`A(p:q, :)`	`A.rows(p, q)`
`A(p:q, r:s)`	`A( span(p,q), span(r,s) )`	A( span(first_row, last_row), span(first_col, last_col) )
`Q(:, :, k)`	`Q.slice(k)`	Q is a cube (3D array)
`Q(:, :, t:u)`	`Q.slices(t, u)`
`Q(p:q, r:s, t:u)`	`Q( span(p,q), span(r,s), span(t,u) )`

`A'`	`A.t() or trans(A)`	matrix transpose / Hermitian transpose

`A = zeros(size(A))`	`A.zeros()`
`A = ones(size(A))`	`A.ones()`
`A = zeros(k)`	`A = zeros<fmat>(k,k)`
`A = ones(k)`	`A = ones<fmat>(k,k)`

`A .* B`	`A % B`	element-wise multiplication
`A ./ B`	`A / B`	element-wise division
`A = A + 1;`	`A++`
`A = A - 1;`	`A--`

`A = [ 1 2; 3 4; ]`	`A = { { 1, 2 }, { 3, 4 } };`	element initialisation

`X = A(:)`	`X = vectorise(A)`
`X = [ A B ]`	`X = join_horiz(A,B)`
`X = [ A; B ]`	`X = join_vert(A,B)`

`A`	`cout << A << endl;` or `A.print("A =");`
`A = randn(2,3); B = randn(4,5);`	`fmat A = randn<fmat>(2,3); fmat B = randn<fmat>(4,5);`

Armadillo/Bandicoot adaptation guide

Bandicoot is a GPU-focused linear algebra library aiming for API compatibility with Armadillo; however, due to inherent architectural differences between GPUs and CPUs, the following caveats apply:
- GPUs are best suited for operations on large matrices, so small matrices (e.g. with size ≤ 100x100) may not obtain speedups
- Individual element access such as X(i,j) has an overhead of transferring between the GPU and CPU; when adapting Armadillo code to Bandicoot, direct element access should be avoided
- Where possible, use batch operations with Bandicoot; e.g., use A += 1 instead of for(uword i=0; i<A.n_elem; ++i) { A[i] += 1; }
- If direct element access cannot be avoided, consider temporarily transferring the entire Bandicoot matrix to CPU-accessible memory by creating an Armadillo matrix via conv_to<arma::fmat>(X)
- Due to the overhead of direct element access, Bandicoot does not provide iterators
- Consumer-level GPUs typically obtain better performance with 32-bit floating point elements rather than 64-bit (e.g. float instead of double), so using fmat instead of mat is preferable
- The first run of any Bandicoot program requires compiling all Bandicoot kernel functions for the given device, which can be a time-consuming process; kernels are cached and subsequent runs will use the cache
  - Upgrading Bandicoot versions may incur recompilation of kernels
  - Using a new backend for the first time may incur recompilation of kernels
  - Using a new device may incur recompilation of kernels
  - For more information see the kernel cache documentation

If a specific Armadillo function is not implemented in Bandicoot for your use case, please file a bug report so that adding the function can be prioritised

example program

#include <iostream>
#include <bandicoot>

using namespace std;
using namespace coot;

int main()
  {
  fmat A(4, 5, fill::randu);
  fmat B(4, 5, fill::randu);

  cout << A * B.t() << endl;

  return 0;
  }

If the above program is stored as example.cpp, under Linux and macOS it can be compiled using:

g++ example.cpp -o example -O2 -lbandicoot

Bandicoot uses template meta-programming, so it's recommended to enable optimisation when compiling programs (eg. use the -O2 or -O3 options for GCC or clang)

See the Questions page for more info on compiling and linking

If coming from Armadillo, see the Armadillo/Bandicoot adaptation guide for advice on writing efficient code

See also the example program that comes with the Bandicoot archive

config.hpp

Bandicoot can be configured via editing the file include/bandicoot_bits/config.hpp

Specific functionality can be enabled or disabled by uncommenting or commenting out a particular #define, listed below.

Some options can also be specified by explicitly defining them before including the bandicoot header.

COOT_DONT_USE_WRAPPER Disable going through the run-time Bandicoot wrapper library (libbandicoot.so) when calling GPU-specific functions. Overrides COOT_USE_WRAPPER. You will need to directly link with GPU libraries (e.g. -lOpenCL -lclBLAS or similar depending on backend configuration)

COOT_USE_WRAPPER Enable use of Bandicoot wrapper library, which allows linking against all enabled backends with -lbandicoot only.

COOT_USE_OPENCL Enable use of OpenCL as a GPU backend. Note that at least one of COOT_USE_OPENCL and COOT_USE_CUDA must be enabled. OpenCL headers and clBLAS headers must be available on the system.

COOT_DONT_USE_OPENCL Disable use of OpenCL; overrides COOT_USE_OPENCL

COOT_USE_CUDA Enable use of CUDA as a GPU backend. Note that either COOT_USE_OPENCL or COOT_USE_CUDA must be enabled. The CUDA toolkit must be available on the system.

COOT_DONT_USE_CUDA Disable use of CUDA; overrides COOT_USE_CUDA
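
The override rule for the USE / DONT_USE pairs above can be pictured with a small stand-alone preprocessor sketch. This is only an illustration of the documented behaviour, not a copy of Bandicoot's config.hpp; the names sketch_opencl_enabled and sketch_cuda_enabled are invented for the example, and the file deliberately does not include the bandicoot header.

```cpp
// Stand-alone sketch of the override rule; NOT Bandicoot's actual config.hpp.

// Settings a user might have requested:
#define COOT_USE_OPENCL
#define COOT_USE_CUDA
#define COOT_DONT_USE_CUDA   // expected to win over COOT_USE_CUDA

// A COOT_DONT_USE_* macro cancels the matching COOT_USE_* macro.
#if defined(COOT_DONT_USE_OPENCL)
  #undef COOT_USE_OPENCL
#endif

#if defined(COOT_DONT_USE_CUDA)
  #undef COOT_USE_CUDA
#endif

// The documentation requires at least one backend to remain enabled.
#if !defined(COOT_USE_OPENCL) && !defined(COOT_USE_CUDA)
  #error "either COOT_USE_OPENCL or COOT_USE_CUDA must be enabled"
#endif

// Hypothetical flags (names invented for this sketch) exposing the result.
#if defined(COOT_USE_OPENCL)
constexpr bool sketch_opencl_enabled = true;
#else
constexpr bool sketch_opencl_enabled = false;
#endif

#if defined(COOT_USE_CUDA)
constexpr bool sketch_cuda_enabled = true;
#else
constexpr bool sketch_cuda_enabled = false;
#endif
```

With the three defines shown, only the OpenCL flag ends up true, so COOT_DEFAULT_BACKEND would not be needed in that configuration.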

COOT_DEFAULT_BACKEND Set the backend that Bandicoot will use by default. This is only necessary if multiple backends are enabled; that is, when both COOT_USE_OPENCL and COOT_USE_CUDA are enabled. This should be set to either CUDA_BACKEND or CL_BACKEND (e.g. #define COOT_DEFAULT_BACKEND CUDA_BACKEND). See also the backend configuration documentation.

COOT_BLAS_LONG Use "long" instead of "int" when calling BLAS and LAPACK functions. Only relevant when using OpenCL backend.

COOT_BLAS_LONG_LONG Use "long long" instead of "int" when calling BLAS and LAPACK functions. Only relevant when using OpenCL backend.

COOT_USE_FORTRAN_HIDDEN_ARGS Use so-called "hidden arguments" when calling BLAS and LAPACK functions. Enabled by default. See Fortran argument passing conventions for more details. Only relevant when using OpenCL backend.

COOT_DONT_USE_FORTRAN_HIDDEN_ARGS Disable use of so-called "hidden arguments" when calling BLAS and LAPACK functions. May be necessary when using Bandicoot in conjunction with broken MKL headers (eg. if you have #include "mkl_lapack.h" in your code). Only relevant when using OpenCL backend.

COOT_USE_MKL_TYPES If using the OpenCL backend with LAPACK and BLAS, use Intel MKL types for complex numbers. You will need to include appropriate MKL headers before the Bandicoot header. You may also need to enable one or more of the following options: COOT_BLAS_LONG, COOT_BLAS_LONG_LONG, COOT_DONT_USE_FORTRAN_HIDDEN_ARGS. Only relevant when using OpenCL backend.

COOT_USE_OPENMP Use OpenMP for parallelisation of some CPU-based parts of Bandicoot functionalities. Automatically enabled when using a compiler which has OpenMP 3.1+ active (eg. the -fopenmp option for gcc and clang). Note: this may not have a noticeable effect on performance since most Bandicoot implementations do not use the CPU heavily or at all.

COOT_DONT_USE_OPENMP Disable use of OpenMP for parallelisation; overrides COOT_USE_OPENMP.

COOT_KERNEL_CACHE_DIR If defined, specifies a custom directory to use for the kernel cache. Distribution packagers may choose to specify COOT_SYSTEM_KERNEL_CACHE_DIR, though it is overridden by COOT_KERNEL_CACHE_DIR if specified.

COOT_BLAS_CAPITALS Use capitalised (uppercase) BLAS and LAPACK function names (eg. DGEMM vs dgemm)

COOT_BLAS_UNDERSCORE Append an underscore to BLAS and LAPACK function names (eg. dgemm_ vs dgemm). Enabled by default.

COOT_NO_DEBUG Disable all run-time checks, including size conformance and bounds checks. NOT RECOMMENDED. DO NOT USE UNLESS YOU KNOW WHAT YOU ARE DOING AND ARE WILLING TO RISK THE DOWNSIDES. Keeping run-time checks enabled during development and deployment greatly aids in finding mistakes in your code.

COOT_EXTRA_DEBUG Print out the trace of internal functions used for evaluating expressions. Not recommended for normal use. This is mainly useful for debugging the library.

COOT_COUT_STREAM The default stream used for printing matrices by .print(). Must always be defined. By default defined to std::cout

COOT_CERR_STREAM The default stream used for printing warnings and errors. Must always be defined. By default defined to std::cerr

COOT_WARN_LEVEL

The level of warning messages printed to COOT_CERR_STREAM.
Must be an integer ≥ 0. By default defined to 2.

0	=	no warnings; generally not recommended
1	=	only critical warnings about arguments and/or data which are likely to lead to incorrect results
2	=	as per level 1, and warnings about poorly conditioned systems (low rcond) detected by solve()
3	=	as per level 2, and warnings about failed decompositions

Example usage:

#define COOT_WARN_LEVEL 1
#include <bandicoot>
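
Because each level includes the ones below it, the table above can be read as a simple threshold test. The following stand-alone sketch illustrates that reading; sketch_warning_enabled is a hypothetical helper written for this example and is not part of Bandicoot, and the file does not include the bandicoot header.

```cpp
// Stand-alone illustration of the cumulative warning levels.

#ifndef COOT_WARN_LEVEL
  #define COOT_WARN_LEVEL 2   // documented default
#endif

// Hypothetical helper (not a Bandicoot function): returns true when a warning
// belonging to message_level (1, 2 or 3 in the table) would be printed under
// the configured COOT_WARN_LEVEL.
constexpr bool sketch_warning_enabled(int message_level)
  {
  return (message_level >= 1) && (message_level <= COOT_WARN_LEVEL);
  }
```

Under the default level of 2, this reading means warnings about failed decompositions (level 3) stay silent, while critical warnings and low rcond warnings from solve() are printed.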

direct linking

If COOT_USE_WRAPPER is not defined (or COOT_DONT_USE_WRAPPER is defined), then Bandicoot will need to be linked against all dependencies of its backends

Depending on the configuration options, this can be a large number of dependencies. For this reason COOT_USE_WRAPPER is enabled by default and is recommended; when COOT_USE_WRAPPER is enabled, linking requires only -lbandicoot

Regardless of backend configuration, these libraries must always be linked against:

-lblas (CPU BLAS support; can use -lopenblas instead)
-llapack (CPU LAPACK support; can use -lopenblas instead)

If COOT_USE_OPENCL is set (i.e. the OpenCL backend is enabled), these libraries must be linked against:
- -lOpenCL (core OpenCL support)
- -lclBLAS (clBLAS for BLAS operations)

If COOT_USE_CUDA is set (i.e. the CUDA backend is enabled), these libraries must be linked against:
- -lcuda (core CUDA support)
- -lcudart (CUDA runtime library)
- -lnvrtc (runtime compilation of CUDA kernels)
- -lcublas (cuBLAS for BLAS operations)
- -lcusolver (cuSolverDn for decompositions and factorisations)
- -lcurand (cuRand for random number generation)
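
The linking rules above can be summarised as a small decision procedure. The sketch below only restates the lists from this section; sketch_link_flags is a hypothetical helper invented for illustration, not something Bandicoot provides, and the order of the flags is not meant to be significant.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Bandicoot): returns the linker flags named
// in this section for a given configuration.
std::vector<std::string> sketch_link_flags(bool use_wrapper, bool use_opencl, bool use_cuda)
  {
  // With the wrapper enabled, one library covers all enabled backends.
  if (use_wrapper)  { return { "-lbandicoot" }; }

  // Direct linking: CPU BLAS and LAPACK are always required.
  std::vector<std::string> flags;
  flags.push_back("-lblas");
  flags.push_back("-llapack");

  if (use_opencl)
    {
    for (const char* lib : { "-lOpenCL", "-lclBLAS" })  { flags.push_back(lib); }
    }

  if (use_cuda)
    {
    for (const char* lib : { "-lcuda", "-lcudart", "-lnvrtc", "-lcublas", "-lcusolver", "-lcurand" })
      {
      flags.push_back(lib);
      }
    }

  return flags;
  }
```

For example, sketch_link_flags(false, true, false) yields the four flags needed for a direct OpenCL-only build, whereas any call with use_wrapper set to true yields just -lbandicoot.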

kernel cache

In order to perform GPU-based linear algebra, Bandicoot must first compile its kernel functions for the specific GPU in use

The first time Bandicoot is run on a system, all GPU kernel functions will be compiled; this may take a considerable amount of time, depending on the underlying system (usually less than 5 minutes)

Compiled kernels are stored on disk in the kernel cache for later reuse

Compiled kernels are specific to Bandicoot version, backend, and device; thus, if any of those three factors change, recompilation will be triggered; see the backend configuration documentation for more details
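
One way to picture the invalidation rule is as a comparison of three-part keys. The sketch below is an assumption-laden illustration rather than Bandicoot's actual cache logic: the kernel_cache_key struct, its field names, and needs_recompile are all hypothetical.

```cpp
#include <string>
#include <tuple>

// Hypothetical record of what a cached set of kernels was compiled for.
// The struct and its field names are invented for this sketch.
struct kernel_cache_key
  {
  std::string coot_version;  // e.g. "2.1.1"
  std::string backend;       // e.g. "opencl" or "cuda"
  std::string device;        // e.g. a device name reported by the driver
  };

// Hypothetical check: recompilation is needed when any of the three
// factors differs between the cached kernels and the current run.
inline bool needs_recompile(const kernel_cache_key& cached, const kernel_cache_key& current)
  {
  return std::tie(cached.coot_version, cached.backend, cached.device)
      != std::tie(current.coot_version, current.backend, current.device);
  }
```

Under this model, upgrading from one Bandicoot version to another with the same backend and device changes only the first field, which is already enough to trigger recompilation.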

The default location to store the kernel cache is
- on Linux, macOS, and UNIX-like systems: ~/.bandicoot/cache/ (e.g. /home/user/.bandicoot/cache/)
- on Windows: %APPDATA%\bandicoot\cache (e.g. C:\Users\Username\AppData\Roaming\bandicoot\cache)

Custom locations can be specified with the COOT_KERNEL_CACHE_DIR configuration variable

History of API Additions, Changes and Deprecations

API Stability and Version Policy:
- Each release of Bandicoot has its public API (functions, classes, constants) described in the accompanying API documentation specific to that release.
- Each release of Bandicoot has its full version specified as A.B.C, where A is a major version number, B is a minor version number, and C is a patch level (indicating bug fixes). The version specification has explicit meaning, similar to Semantic Versioning.
- Caveat: the above policy applies only to the public API described in the documentation. Any functionality within Bandicoot which is not explicitly described in the public API documentation is considered as internal implementation details, and may be changed or removed without notice.

List of additions and changes for each version:
- Version 2.1.0:
  - add .is_finite() member function for matrices and cubes
  - add .has_inf() member function for matrices and cubes
  - add .has_nan() member function for matrices and cubes
  - add .copy_size() member function for matrices and cubes
  - bugfix for copy and move operators for Cube and Mat aliases
  - add min() and max() for cubes
  - add index_min() and index_max() for cubes
  - add constructors for Mat, Col, and Row that accept strings and std::vectors
  - add element initialisation to handle nested initialiser lists
  - bugfix for diagmat() matrix multiplication on subviews
- Version 2.0.0:
  - add Cube class and basic functionality
  - add .each_row() and .each_col() member functions
  - add regspace() for generating vectors with regularly spaced elements
  - better support for Armadillo objects in conv_to
- Version 1.16.2:
  - fix linking issues when system OpenCL version does not match the version used when compiling the Bandicoot library
- Version 1.16.1:
  - adapt dev_mem_t to support arithmetic and aliases
  - CUDA GPU architecture version handling fixes
  - add logspace() function
- Version 1.16.0:
- Version 1.15.1:
  - bugfix for memory leak
- Version 1.15.0:
  - bugfix for subview element access
  - better support for AMD OpenCL drivers
- Version 1.14.0:
  - add shuffle()
- Version 1.13.0:
  - bugfixes for OpenCL backend on macOS
- Version 1.12.4:
  - add compatibility container functions: .front(), .back(), .size(), .clear(), .empty()
  - fix bug in two-dimensional CUDA grid size computation
  - add approx_equal()
- Version 1.11.1:
  - additional efficiency improvements for subview handling
  - handle Fortran "hidden arguments"; see COOT_USE_FORTRAN_HIDDEN_ARGS
- Version 1.11.0:
  - significant efficiency improvements for subview handling
  - removed redundant GPU kernels, improving compilation time
  - added element-wise min() and max()
- Version 1.10.0:
  - first stable release!
