![]() |
Ginkgo Generated from branch based on main. Ginkgo version 1.9.0
A numerical linear algebra library targeting many-core architectures
|
Public Member Functions | |
template<typename... Args> | |
auto | with_max_block_size (Args &&... _value) -> std::decay_t< decltype(*(this->self()))> & |
template<typename... Args> | |
auto | with_max_block_stride (Args &&... _value) -> std::decay_t< decltype(*(this->self()))> & |
template<typename... Args> | |
auto | with_skip_sorting (Args &&... _value) -> std::decay_t< decltype(*(this->self()))> & |
template<typename... Args> | |
auto | with_block_pointers (Args &&... _value) -> std::decay_t< decltype(*(this->self()))> & |
template<typename... Args> | |
auto | with_storage_optimization (Args &&... _value) -> std::decay_t< decltype(*(this->self()))> & |
template<typename... Args> | |
auto | with_accuracy (Args &&... _value) -> std::decay_t< decltype(*(this->self()))> & |
![]() | |
parameters_type & | with_loggers (Args &&... _value) |
Provides the loggers to be added to the factory and its generated objects in a fluent interface. | |
std::unique_ptr< Factory > | on (std::shared_ptr< const Executor > exec) const |
Creates a new factory on the specified executor. | |
Public Attributes | |
uint32 | max_block_size {32u} |
Maximal size of diagonal blocks. | |
uint32 | max_block_stride {0u} |
Stride between two columns of a block (as number of elements). | |
bool | skip_sorting {false} |
true means it is known that the matrix given to this factory will be sorted first by row, then by column index, false means it is unknown or not sorted, so an additional sorting step will be performed during the preconditioner generation (it will not change the matrix given). | |
gko::array< index_type > | block_pointers {nullptr} |
Starting (row / column) indexes of individual blocks. | |
storage_optimization_type | storage_optimization {precision_reduction(0, 0)} |
The precisions to use for the blocks of the matrix. | |
remove_complex< value_type > | accuracy {static_cast<remove_complex<value_type>>(1e-1)} |
The relative accuracy of the adaptive Jacobi variant. | |
Additional Inherited Members | |
![]() | |
using | factory |
remove_complex<value_type> gko::preconditioner::Jacobi< ValueType, IndexType >::parameters_type::accuracy {static_cast<remove_complex<value_type>>(1e-1)} |
The relative accuracy of the adaptive Jacobi variant.
This parameter is only used if the adaptive version of the algorithm is selected (see storage_optimization parameter for more details). The parameter is used when detecting the optimal precisions of blocks whose precision has been set to precision_reduction::autodetect().
The parameter represents the number of correct digits in the result of Jacobi::apply() operation of the adaptive variant, compared to the non-adaptive variant. In other words, the total preconditioning error will be:
where c
is some constant depending on the problem size and roundoff error and dropout
the error introduced by disregarding off-diagonal elements.
Larger values reduce the volume of memory transfer, but increase the error compared to using full precision storage. Thus, tuning the accuracy to a value as close as possible to dropout
will result in optimal memory savings, while not degrading the quality of solution.
gko::array<index_type> gko::preconditioner::Jacobi< ValueType, IndexType >::parameters_type::block_pointers {nullptr} |
Starting (row / column) indexes of individual blocks.
An index past the last block has to be supplied as the last value. I.e. the size of the array has to be the number of blocks plus 1, where the first value is 0, and the last value is the number of rows / columns of the matrix.
n
use Jacobi::get_num_blocks(). The starting indexes of the blocks are stored in the first n+1
values of this array. uint32 gko::preconditioner::Jacobi< ValueType, IndexType >::parameters_type::max_block_size {32u} |
Maximal size of diagonal blocks.
uint32 gko::preconditioner::Jacobi< ValueType, IndexType >::parameters_type::max_block_stride {0u} |
Stride between two columns of a block (as number of elements).
Should be a multiple of cache line size for best performance.
bool gko::preconditioner::Jacobi< ValueType, IndexType >::parameters_type::skip_sorting {false} |
true
means it is known that the matrix given to this factory will be sorted first by row, then by column index, false
means it is unknown or not sorted, so an additional sorting step will be performed during the preconditioner generation (it will not change the matrix given).
The matrix must be sorted for this preconditioner to work.
The system_matrix
, which will be given to this factory, must be sorted (first by row, then by column) in order for the algorithm to work. If it is known that the matrix will be sorted, this parameter can be set to true
to skip the sorting (therefore, shortening the runtime). However, if it is unknown or if the matrix is known to be not sorted, it must remain false
, otherwise, this preconditioner might be incorrect.
storage_optimization_type gko::preconditioner::Jacobi< ValueType, IndexType >::parameters_type::storage_optimization {precision_reduction(0, 0)} |
The precisions to use for the blocks of the matrix.
This parameter can either be a single instance of precision_reduction or an array of precision_reduction values. If set to precision_reduction(0, 0)
(this is the default), a regular full-precision block-Jacobi will be used. Any other value (or an array of values) will map to the adaptive variant.
The best starting point when evaluating the potential of the adaptive version is to set this parameter to precision_reduction::autodetect()
. This option will cause the preconditioner to reduce the memory transfer volume as much as possible, while trying to maintain the quality of the preconditioner similar to that of the full precision block-Jacobi.
For finer control, specific instances of precision_reduction can be used. Supported values are precision_reduction(0, 0)
, precision_reduction(0, 1)
and precision_reduction(0, 2)
. Any other value will have the same effect as precision_reduction(0, 0)
.
If the ValueType template parameter is set to double
(or the complex variant std::complex<double>
), precision_reduction(0, 0)
will use IEEE double precision for preconditioner storage, precision_reduction(0, 1)
will use IEEE single precision, and precision_reduction(0, 2)
will use IEEE half precision.
It ValueType is set to float
(or std::complex<float>
), precision_reduction(0, 0)
will use IEEE single precision for preconditioner storage, and both precision_reduction(0, 1)
and precision_reduction(0, 2)
will use IEEE half precision.
Instead of specifying the same precision for all blocks, the precision of the elements can be specified on per-block basis by passing an array of precision_reduction objects. All values discussed above are supported, with the same meaning. It is worth mentioning that a value of precision_reduction::autodetect()
will cause autodetection on the per-block basis, so blocks whose precisions are autodetected can end up having different precisions once the preconditioner is generated. The detected precision generally depends on the conditioning of the block.
If the number of diagonal blocks is larger than the number of elements in the passed array, the entire array will be replicated until enough values are available. For example, if the original array contained two precisions (x, y)
and the preconditioner contains 5 blocks, the array will be transformed into (x, y, x, y, x)
before generating the preconditioner. As a consequence, specifying a single value for this property is exactly equivalent to specifying an array with a single element set to that value.
Once an instance of the Jacobi linear operator is generated, the precisions used for the blocks can be obtained by reading this property. Whether the parameter was set to a single value or to an array of values can be queried by reading the storage_optimization.is_block_wise
boolean sub-property. If it is set to false
, the precision used for all blocks can be obtained using storage_optimization.of_all_blocks
or by casting storage_optimization
to precision_reduction
. Independently of the value of storage_optimization.is_block_wise
, the storage_optimization.block_wise
property will return an array of precisions used for each block. All values set to precision_reduction::autodetect()
will be replaced with the value representing the precision used for the corresponding block. If the non-adaptive version of Jacobi is used, the storage_optimization.block_wise
array will be empty.