Extend the documentation with more information about multidimensional ranges #1569
Changes from 4 commits
f478845
9f51006
9a0fc35
368859a
0534363
7017e59
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -72,4 +72,39 @@ along its longest axis. When used with ``parallel_for``, it causes the
loop to be "recursively blocked" in a way that improves cache usage.
This nice cache behavior means that using ``parallel_for`` over a
``blocked_range2d<T>`` can make a loop run faster than the sequential
equivalent, even on a single processor.

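To make the "recursively blocked" idea above more concrete, here is a toy sketch that uses only the standard library. It is not oneTBB's implementation: ``Block2D`` and ``split_recursively`` are hypothetical names, and the splitting rule (halve the longest axis until both extents fit within a grain size) is an assumption chosen to mirror the description above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a 2D half-open block:
// [row_begin, row_end) x [col_begin, col_end).
struct Block2D {
    int row_begin, row_end;
    int col_begin, col_end;
    int rows() const { return row_end - row_begin; }
    int cols() const { return col_end - col_begin; }
};

// Recursively halves the block along its longest axis until both extents
// are no larger than `grain`, appending the resulting leaf blocks to `leaves`.
inline void split_recursively(const Block2D& b, int grain,
                              std::vector<Block2D>& leaves) {
    assert(grain >= 1 && "grain must be positive, otherwise splitting never stops");
    if (b.rows() <= grain && b.cols() <= grain) {
        leaves.push_back(b);
        return;
    }
    if (b.rows() >= b.cols()) {
        // Rows are the longest axis: split them in half.
        const int mid = b.row_begin + b.rows() / 2;
        split_recursively({b.row_begin, mid, b.col_begin, b.col_end}, grain, leaves);
        split_recursively({mid, b.row_end, b.col_begin, b.col_end}, grain, leaves);
    } else {
        // Columns are the longest axis: split them in half.
        const int mid = b.col_begin + b.cols() / 2;
        split_recursively({b.row_begin, b.row_end, b.col_begin, mid}, grain, leaves);
        split_recursively({b.row_begin, b.row_end, mid, b.col_end}, grain, leaves);
    }
}
```

Because each split halves the longer side, the leaves stay roughly square, which is what keeps the working set of each leaf compact in both dimensions.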
Also, ``blocked_range2d`` allows you to use different value types across
its first dimension (called "rows") and the second one ("columns").
This makes it possible to combine indexes, pointers, and iterators into a joint
iteration space. The method functions ``rows()`` and ``cols()`` return
the corresponding dimensions in the form of a ``blocked_range``.

Review comment: To get the range for each dimension, use the

The ``blocked_range3d`` class template extends this approach to 3D by adding
``pages()`` as the first dimension, followed by ``rows()`` and ``cols()``.

The ``blocked_nd_range<T,N>`` class template represents a blocked iteration
space of any dimensionality, but in a slightly different way. All dimensions
of ``blocked_nd_range`` must be specified over the same value type, and the
constructor takes N instances of ``blocked_range<T>``, not individual boundary
values. A different naming pattern was chosen to indicate these distinctions.

Review comment: I'm not sure that I got it right, but maybe something like: The

An Example of a Multidimensional Iteration Space
------------------------------------------------

The example demonstrates the calculation of a 3-dimensional filter over a pack
of feature maps, applying a kernel to a subrange of features.

The ``convolution3d`` function iterates over the output cells and sets each cell
value to the result of the ``kernel3d`` function, which sums values
from the feature maps.

For the computation to be performed in parallel, ``tbb::parallel_for`` is called
with ``tbb::blocked_nd_range<int,3>`` as an argument. The body function then
iterates over the received 3-dimensional subrange in a loop nest, using
the ``dim`` method function to obtain loop boundaries for each dimension.

Review comment: Within the body function, a nested loop iterates over the 3D subrange received. The

.. literalinclude:: ./snippets/blocked_nd_range_example.h
    :language: c++
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@
#include "blocked_nd_range_example.h"
#include <vector>
#include <cassert>

int main() {
    const int kernel_length = 9;
    const int kernel_width = 5;
    const int kernel_height = 5;

    const int feature_maps_length = 128;
    const int feature_maps_width = 16;
    const int feature_maps_heigth = 16;

    const int out_length = feature_maps_length - kernel_length + 1;
    const int out_width = feature_maps_width - kernel_width + 1;
    const int out_heigth = feature_maps_heigth - kernel_height + 1;

    // Initializes feature maps with 1 in each cell and out with zeros.
    std::vector<std::vector<std::vector<float>>> feature_maps(feature_maps_length, std::vector<std::vector<float>>(feature_maps_width, std::vector<float>(feature_maps_heigth, 1.0f)));
    std::vector<std::vector<std::vector<float>>> out(out_length, std::vector<std::vector<float>>(out_width, std::vector<float>(out_heigth, 0.f)));

    // The 3D convolution calculates the sum of all elements covered by the kernel
    convolution3d(feature_maps, out,
                  out_length, out_width, out_heigth,
                  kernel_length, kernel_width, kernel_height);

    // Checks correctness of the convolution by comparing each cell to the expected sum of elements
    float expected = float(kernel_length * kernel_height * kernel_width);
    for (const auto& i : out) {
        for (const auto& j : i) {
            for (auto k : j) {
                assert(k == expected && "convolution failed to calculate correctly");
            }
        }
    }
    return 0;
}
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@
#include "oneapi/tbb/blocked_nd_range.h"
#include "oneapi/tbb/parallel_for.h"

// Sums the feature-map cells covered by the kernel whose corner is at (i, j, k).
template<typename Features>
float kernel3d(const Features& feature_maps, int i, int j, int k,
               int kernel_length, int kernel_width, int kernel_height) {
    float result = 0.f;

    for (int feature_i = i; feature_i < i + kernel_length; ++feature_i)
        for (int feature_j = j; feature_j < j + kernel_width; ++feature_j)
            for (int feature_k = k; feature_k < k + kernel_height; ++feature_k)
                result += feature_maps[feature_i][feature_j][feature_k];

    return result;
}

// Fills every output cell in parallel over a 3-dimensional blocked range.
template<typename Features, typename Output>
void convolution3d(const Features& feature_maps, Output& out,
                   int out_length, int out_width, int out_heigth,
                   int kernel_length, int kernel_width, int kernel_height) {
    using range_t = oneapi::tbb::blocked_nd_range<int, 3>;

    oneapi::tbb::parallel_for(
        range_t({0, out_length}, {0, out_width}, {0, out_heigth}),
        [&](const range_t& out_range) {
            // Each dim(n) is a blocked_range<int> for the n-th dimension of the subrange.
            auto out_x = out_range.dim(0);
            auto out_y = out_range.dim(1);
            auto out_z = out_range.dim(2);

            for (int i = out_x.begin(); i < out_x.end(); ++i)
                for (int j = out_y.begin(); j < out_y.end(); ++j)
                    for (int k = out_z.begin(); k < out_z.end(); ++k)
                        out[i][j][k] = kernel3d(feature_maps, i, j, k,
                                                kernel_length, kernel_width, kernel_height);
        }
    );
}
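As a cross-check for a parallel version like the one above, a sequential reference can be handy. The following is a minimal sketch that uses only the standard library and assumes the same nested-vector layout as the test program; ``Grid3`` and ``convolution3d_reference`` are hypothetical names that are not part of the PR.

```cpp
#include <cassert>
#include <vector>

// Hypothetical alias for the nested-vector layout used by the example.
using Grid3 = std::vector<std::vector<std::vector<float>>>;

// Hypothetical sequential reference: each output cell is the sum of the
// kernel-sized box of feature-map cells that starts at the same index.
inline Grid3 convolution3d_reference(const Grid3& feature_maps,
                                     int kernel_length, int kernel_width,
                                     int kernel_height) {
    assert(!feature_maps.empty() && !feature_maps[0].empty());
    const int out_length = static_cast<int>(feature_maps.size()) - kernel_length + 1;
    const int out_width = static_cast<int>(feature_maps[0].size()) - kernel_width + 1;
    const int out_height = static_cast<int>(feature_maps[0][0].size()) - kernel_height + 1;
    assert(out_length > 0 && out_width > 0 && out_height > 0);

    Grid3 out(out_length,
              std::vector<std::vector<float>>(out_width,
                                              std::vector<float>(out_height, 0.f)));

    for (int i = 0; i < out_length; ++i)
        for (int j = 0; j < out_width; ++j)
            for (int k = 0; k < out_height; ++k) {
                float sum = 0.f;
                for (int di = 0; di < kernel_length; ++di)
                    for (int dj = 0; dj < kernel_width; ++dj)
                        for (int dk = 0; dk < kernel_height; ++dk)
                            sum += feature_maps[i + di][j + dj][k + dk];
                out[i][j][k] = sum;
            }
    return out;
}
```

Comparing this result cell by cell with the parallel output, ideally with a kernel whose three extents all differ, exercises each dimension's bounds separately.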
Review comment: The `blocked_range2d` allows you to use different value types for its two dimensions: rows (the first dimension) and columns (the second dimension).