SIMD library
The SIMD library provides portable types for explicitly stating data-parallelism and structuring data for more efficient SIMD access.
An object of type simd<T> behaves analogously to an object of type T. But while T stores and manipulates one value, simd<T> stores and manipulates multiple values. The number of values is called the width, but is identified as size for consistency with the rest of the standard library (cf. simd_size).
Every operator and operation on simd<T> acts element-wise (except for horizontal operations, which are clearly marked as such). This simple rule expresses data-parallelism and will be used by the compiler to generate SIMD instructions and/or independent execution streams.
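As a minimal sketch of the element-wise rule, the following applies ordinary arithmetic and a comparison to all lanes at once. The alias float4 and the function names axpy and is_negative are hypothetical, chosen only for this illustration; a fixed width of 4 is assumed so that the result does not depend on the target.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical alias for this sketch: four float lanes on every target.
using float4 = stdx::fixed_size_simd<float, 4>;

// Computes a * x[i] + y[i] for every lane i.
// The scalar a is broadcast to all lanes before the multiplication.
inline float4 axpy(float a, float4 x, float4 y)
{
    return a * x + y;
}

// Comparisons are element-wise as well: the result is a simd_mask
// holding one bool per lane, not a single bool.
inline float4::mask_type is_negative(float4 x)
{
    return x < 0.0f;
}
```

Calling axpy(2.0f, x, y) with x holding 0 1 2 3 and y holding 1 in every lane yields 1 3 5 7.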
The width of the types simd<T> and native_simd<T> is determined by the implementation at compile-time. In contrast, the width of the type fixed_size_simd<T, N> is fixed by the developer to a certain size.
A recommended pattern for using a mix of different SIMD types with high efficiency uses native_simd and rebind_simd:
#include <experimental/simd>
namespace stdx = std::experimental;
using floatv  = stdx::native_simd<float>;
using doublev = stdx::rebind_simd_t<double, floatv>;
using intv    = stdx::rebind_simd_t<int, floatv>;
This ensures that the set of types all have the same width and thus can be interconverted. A conversion with mismatching width is not defined because it would either drop values or have to invent values. For resizing operations, the SIMD library provides the split and concat functions.
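A sketch of what interconversion between such equal-width types can look like, using static_simd_cast. The helper names widen and truncate are hypothetical; the aliases repeat the pattern described above.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

using floatv  = stdx::native_simd<float>;
using doublev = stdx::rebind_simd_t<double, floatv>;
using intv    = stdx::rebind_simd_t<int, floatv>;

// All three types have the same number of elements by construction.
static_assert(floatv::size() == doublev::size());
static_assert(floatv::size() == intv::size());

// Converts every float lane to double (hypothetical helper).
inline doublev widen(floatv x)
{
    return stdx::static_simd_cast<doublev>(x);
}

// Converts every float lane to int, like static_cast<int> per lane
// (hypothetical helper).
inline intv truncate(floatv x)
{
    return stdx::static_simd_cast<intv>(x);
}
```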
| Defined in header <experimental/simd> |
Main classes
| simd (parallelism TS v2) | data-parallel vector type (class template) |
| simd_mask (parallelism TS v2) | data-parallel type with the element type bool (class template) |
ABI tags
| Defined in namespace std::experimental::simd_abi |
| scalar (parallelism TS v2) | tag type for storing a single element (typedef) |
| fixed_size (parallelism TS v2) | tag type for storing a specified number of elements (alias template) |
| compatible (parallelism TS v2) | tag type that ensures ABI compatibility (alias template) |
| native (parallelism TS v2) | tag type that is most efficient (alias template) |
| max_fixed_size (parallelism TS v2) | the maximum number of elements guaranteed to be supported by fixed_size (constant) |
| deduce (parallelism TS v2) | obtains an ABI type for a given element type and number of elements (class template) |
Alignment tags
| element_aligned_tag, element_aligned (parallelism TS v2) | flag indicating alignment of the load/store address to element alignment (class) |
| vector_aligned_tag, vector_aligned (parallelism TS v2) | flag indicating alignment of the load/store address to vector alignment (class) |
| overaligned_tag, overaligned (parallelism TS v2) | flag indicating alignment of the load/store address to the specified alignment (class template) |
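The alignment flags are passed to the load constructor, copy_from, and copy_to. The following sketch processes an array in chunks of the native width and assumes only element alignment; the function name scale is hypothetical.

```cpp
#include <cstddef>
#include <experimental/simd>

namespace stdx = std::experimental;

// Multiplies n floats in place by factor (hypothetical helper).
inline void scale(float* data, std::size_t n, float factor)
{
    using floatv = stdx::native_simd<float>;

    std::size_t i = 0;
    // Full chunks: load, multiply, store.
    for (; i + floatv::size() <= n; i += floatv::size()) {
        floatv v(data + i, stdx::element_aligned);  // load
        v *= factor;                                // element-wise multiply
        v.copy_to(data + i, stdx::element_aligned); // store
    }
    // Scalar tail for the remaining elements.
    for (; i < n; ++i)
        data[i] *= factor;
}
```

If the buffer is known to be aligned suitably for the vector type, stdx::vector_aligned could be passed instead; element_aligned is the conservative choice for arbitrary pointers.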
Where expression
| const_where_expression (parallelism TS v2) | selected elements with non-mutating operations (class template) |
| where_expression (parallelism TS v2) | selected elements with mutating operations (class template) |
| where (parallelism TS v2) | produces const_where_expression and where_expression (function template) |
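A small sketch of masked assignment through where: only the lanes selected by the mask are written, the others keep their value. The alias float4 and the function name relu are hypothetical.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical alias for this sketch.
using float4 = stdx::fixed_size_simd<float, 4>;

// Replaces every negative lane with zero and leaves the other lanes untouched.
inline float4 relu(float4 x)
{
    stdx::where(x < 0.0f, x) = 0.0f;
    return x;
}
```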
Casts
| simd_cast, static_simd_cast (parallelism TS v2) | element-wise static_cast (function template) |
| to_fixed_size, to_compatible, to_native (parallelism TS v2) | element-wise ABI cast (function template) |
| split (parallelism TS v2) | splits a single simd object into multiple ones (function template) |
| concat (parallelism TS v2) | concatenates multiple simd objects into a single one (function template) |
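The resizing functions can be sketched as follows, assuming fixed-size types so that the widths are known. The aliases int4 and int8 and the function names add_halves and join are hypothetical.

```cpp
#include <array>
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical aliases for this sketch.
using int4 = stdx::fixed_size_simd<int, 4>;
using int8 = stdx::fixed_size_simd<int, 8>;

// Splits an 8-element vector into two 4-element halves and adds them.
inline int4 add_halves(int8 v)
{
    const auto halves = stdx::split<int4>(v); // array of two int4 objects
    return halves[0] + halves[1];
}

// Joins two 4-element vectors into one 8-element vector.
// The ABI tag of the result is chosen by the implementation, hence auto.
inline auto join(int4 lo, int4 hi)
{
    return stdx::concat(lo, hi);
}
```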
Algorithms
| min (parallelism TS v2) | element-wise min operation (function template) |
| max (parallelism TS v2) | element-wise max operation (function template) |
| minmax (parallelism TS v2) | element-wise minmax operation (function template) |
| clamp (parallelism TS v2) | element-wise clamp operation (function template) |
Reduction
| reduce, hmin, hmax (parallelism TS v2) | reduces the vector to a single element (function template) |
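Reductions are horizontal operations: they combine the elements of one vector into a single scalar. A minimal sketch, with hypothetical names int4, lane_sum, lane_product, lane_min, and lane_max:

```cpp
#include <experimental/simd>
#include <functional>

namespace stdx = std::experimental;

// Hypothetical alias for this sketch.
using int4 = stdx::fixed_size_simd<int, 4>;

// Sum of all lanes (reduce uses addition by default).
inline int lane_sum(int4 v) { return stdx::reduce(v); }

// Product of all lanes, using a custom binary operation.
inline int lane_product(int4 v) { return stdx::reduce(v, std::multiplies<>()); }

// Smallest and largest lane.
inline int lane_min(int4 v) { return stdx::hmin(v); }
inline int lane_max(int4 v) { return stdx::hmax(v); }
```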
Mask reduction
| all_of, any_of, none_of, some_of (parallelism TS v2) | reductions of simd_mask to bool (function template) |
| popcount (parallelism TS v2) | reduction of simd_mask to the number of true values (function template) |
| find_first_set, find_last_set (parallelism TS v2) | reductions of simd_mask to the index of the first or last true value (function template) |
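The mask reductions turn the result of an element-wise comparison back into scalar control flow. A sketch with hypothetical names int4, has_negative, count_negative, and first_negative:

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical alias for this sketch.
using int4 = stdx::fixed_size_simd<int, 4>;

// True if at least one lane is negative.
inline bool has_negative(int4 v) { return stdx::any_of(v < 0); }

// Number of negative lanes.
inline int count_negative(int4 v) { return stdx::popcount(v < 0); }

// Index of the first negative lane, or -1 if there is none.
// find_first_set requires at least one true value, so the mask is checked first.
inline int first_negative(int4 v)
{
    const auto negative = v < 0;
    return stdx::any_of(negative) ? stdx::find_first_set(negative) : -1;
}
```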
Traits
| is_simd, is_simd_mask (parallelism TS v2) | checks if a type is a simd or simd_mask type (class template) |
| is_abi_tag (parallelism TS v2) | checks if a type is an ABI tag type (class template) |
| is_simd_flag_type (parallelism TS v2) | checks if a type is a simd flag type (class template) |
| simd_size (parallelism TS v2) | obtains the number of elements of a given element type and ABI tag (class template) |
| memory_alignment (parallelism TS v2) | obtains an appropriate alignment for vector_aligned (class template) |
| rebind_simd, resize_simd (parallelism TS v2) | changes the element type or the number of elements of simd or simd_mask (class template) |
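The traits are mostly useful in generic code and compile-time checks. The sketch below exercises a few of them with static_assert; the alias float4 and the variable template lane_count are hypothetical.

```cpp
#include <cstddef>
#include <experimental/simd>
#include <type_traits>

namespace stdx = std::experimental;

// Hypothetical alias for this sketch.
using float4 = stdx::fixed_size_simd<float, 4>;

static_assert(stdx::is_simd_v<float4>);
static_assert(!stdx::is_simd_v<float>);
static_assert(stdx::is_simd_mask_v<float4::mask_type>);
static_assert(stdx::is_abi_tag_v<stdx::simd_abi::scalar>);

// simd_size maps an element type and an ABI tag to the number of elements.
static_assert(stdx::simd_size_v<float, stdx::simd_abi::scalar> == 1);
static_assert(stdx::simd_size_v<float, stdx::simd_abi::fixed_size<4>> == 4);

// rebind_simd changes the element type and keeps the width;
// resize_simd changes the width and keeps the element type.
static_assert(stdx::rebind_simd_t<double, float4>::size() == 4);
static_assert(stdx::resize_simd_t<8, float4>::size() == 8);
static_assert(std::is_same_v<stdx::resize_simd_t<8, float4>::value_type, float>);

// Number of elements of any simd type V (hypothetical helper).
template <class V>
constexpr std::size_t lane_count =
    stdx::simd_size_v<typename V::value_type, typename V::abi_type>;
```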
Math functions
All functions in <cmath>, except for the special math functions, are overloaded for simd.
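For illustration, a sketch that combines element-wise arithmetic with the simd overload of sqrt. The alias float4 and the function name length are hypothetical.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical alias for this sketch.
using float4 = stdx::fixed_size_simd<float, 4>;

// Euclidean length of four 2-D vectors at once:
// lane i of the result is sqrt(x[i] * x[i] + y[i] * y[i]).
inline float4 length(float4 x, float4 y)
{
    return stdx::sqrt(x * x + y * y);
}
```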
Example
#include <cstddef>
#include <experimental/simd>
#include <iostream>
#include <iterator>
namespace stdx = std::experimental;

// Prints all elements of a simd object on one line.
void print(auto const& a)
{
    for (std::size_t i{}; i != std::size(a); ++i)
        std::cout << a[i] << ' ';
    std::cout << '\n';
}

// Element-wise absolute value: negates only the negative elements.
template<class A>
stdx::simd<int, A> my_abs(stdx::simd<int, A> x)
{
    where(x < 0, x) = -x;
    return x;
}

int main()
{
    const stdx::native_simd<int> a = 1; // broadcast: every element is 1
    print(a);

    const stdx::native_simd<int> b([](int i) { return i - 2; }); // element i is i - 2
    print(b);

    const auto c = a + b;
    print(c);

    const auto d = my_abs(c);
    print(d);

    const auto e = d * d;
    print(e);

    const auto inner_product = stdx::reduce(e); // sum of all elements of e
    std::cout << inner_product << '\n';
}
Output:
1 1 1 1
-2 -1 0 1
-1 0 1 2
1 0 1 2
1 0 1 4
6