Optimize CPU and Memory performance for Resize linear mode parser #3731

Open: wants to merge 2 commits into base: develop
111 changes: 74 additions & 37 deletions src/onnx/parse_resize.cpp
@@ -33,49 +33,86 @@ namespace migraphx {
 inline namespace MIGRAPHX_INLINE_NS {
 namespace onnx {
 
+/*
+ * Algorithm of calc_neighbor_points():
+ * Input:  vvv_ind, a 3-layer vector from which the index vectors are composed.
+ *         in_s, the shape used to map each composed index vector to a space index.
+ * Output: a vector holding the resulting space indices.
+ *
+ * Layers of vvv_ind:
+ * layer-1: one entry per dimension; the caller passes n_dim entries
+ * layer-2: always 2 entries (index 0 = low, index 1 = high), hardcoded by the caller
+ * layer-3: a vector of out_elements integers (out_elements is passed by the caller)
+ * vvv_ind = {
+ *     {{...}, {...}},
+ *     {{...}, {...}},
+ *     {{...}, {...}},
+ *     ...
+ *     {{...}, {...}}
+ * };
+ *
+ * Goal: compose a series of index vectors, each of which is then used to get a space index from
+ * the input shape.
+ * indices{} has (2^n_dim) * out_elements members; each member is a vector of n_dim indices.
+ * indices = {
+ *     {...},
+ *     {...},
+ *     {...},
+ *     ...
+ *     {...}
+ * };
+ *
+ * Denote the rows of vvv_ind (one per dimension, columns 0 and 1) as:
+ *     0 1
+ *     A B
+ *     C D
+ *     E F
+ *     G H
+ * Denote A' as the transpose of A,
+ * e.g. A  = {0,1,1,0,1};
+ *      A' = {{0},
+ *            {1},
+ *            {1},
+ *            {0},
+ *            {1}
+ *           };
+ *
+ * For each value in [0, 2^n_dim) (outer loop), map each bit (inner loop, LSB to MSB) to a
+ * dimension, and pick A|B, C|D, E|F, G|H according to whether the bit is 0|1.
+ * e.g. 0110b -> A'D'F'G'
+ * Transpose A to A' and repeat for each element within A' (medium loop).
+ * Use the newly composed vector of n_dim indices to get the mapping from shape in_s.
+ *
+ * Outer loop:  all values within range [0, 2^n_dim)
+ * Medium loop: all elements within layer-3, range [0, m_elements)
+ * Inner loop:  all bits of the current outer-loop value
+ */
+
 static std::vector<int>
 calc_neighbor_points(const std::vector<std::vector<std::vector<std::size_t>>>& vvv_ind,
-                     int i_dim,
-                     std::vector<std::vector<std::size_t>> vec_dims,
                      const shape& in_s)
 {
-    if(i_dim == vvv_ind.size())
-    {
-        std::vector<int> vec_ind(vec_dims.size());
-        std::transform(vec_dims.begin(), vec_dims.end(), vec_ind.begin(), [&](auto idx) {
-            return static_cast<int>(in_s.index(idx));
-        });
-        return vec_ind;
-    }
+    std::size_t n_bits = vvv_ind.size();
+    std::size_t m_elements = vvv_ind[0][0].size();
+    std::vector<int> vec_ind;
 
-    const auto& vv_lo = vvv_ind[i_dim][0];
-    std::vector<std::vector<std::size_t>> vec_dims1;
-    for(std::size_t start = 0; start < vec_dims.size(); start += vv_lo.size())
+    for(std::size_t val = 0; val < (std::size_t{1} << n_bits); val++)
     {
-        std::transform(vv_lo.begin(),
-                       vv_lo.end(),
-                       vec_dims.begin() + start,
-                       std::back_inserter(vec_dims1),
-                       [](auto i, auto dim) {
-                           dim.push_back(i);
-                           return dim;
-                       });
+        std::vector<std::size_t> indices(n_bits);
+        for(std::size_t i_element = 0; i_element < m_elements; i_element++)
+        {
+            std::size_t bits_val = val;
+            indices.clear();
+            for(std::size_t dim = 0; dim < n_bits; dim++)
+            {
+                indices.push_back(vvv_ind[dim][bits_val & std::size_t{1}][i_element]);
+                bits_val >>= std::size_t{1};
+            }
+            vec_ind.push_back(in_s.index(indices));
+        }
     }
 
-    const auto& vv_hi = vvv_ind[i_dim][1];
-    for(std::size_t start = 0; start < vec_dims.size(); start += vv_hi.size())
-    {
-        std::transform(vv_hi.begin(),
-                       vv_hi.end(),
-                       vec_dims.begin() + start,
-                       std::back_inserter(vec_dims1),
-                       [](auto i, auto dim) {
-                           dim.push_back(i);
-                           return dim;
-                       });
-    }
-    vec_dims.clear();
-    return calc_neighbor_points(vvv_ind, i_dim + 1, std::move(vec_dims1), in_s);
+    return vec_ind;
 }
 
 static std::string get_coord_trans_mode(const onnx_parser::attribute_map& attr)
@@ -375,8 +412,8 @@ struct parse_resize : op_parser<parse_resize>
             }
         });
 
-        auto ind = calc_neighbor_points(
-            vvv_ind, 0, std::vector<std::vector<std::size_t>>(out_elements), in_s);
+        auto ind = calc_neighbor_points(vvv_ind, in_s);
+
         auto ind_lens = out_lens;
         ind_lens[0] *= (std::size_t{1} << n_dim);
         shape ind_s{shape::int32_type, ind_lens};
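
To make the ordering produced by the iterative version concrete, here is a minimal standalone sketch of the same bit-mask enumeration. It is not the PR's code. `row_major_index` is a hypothetical stand-in for `shape::index` that assumes a packed row-major shape, and `index_table` and `enumerate_neighbors` are illustrative names. The sketch also reserves the output vector up front, which the diff as shown does not do.

```cpp
#include <cstddef>
#include <vector>

// Illustrative alias: [dimension][0 = low, 1 = high][output element]
using index_table = std::vector<std::vector<std::vector<std::size_t>>>;

// Hypothetical stand-in for shape::index(), assuming a packed row-major layout.
static std::size_t row_major_index(const std::vector<std::size_t>& lens,
                                   const std::vector<std::size_t>& idx)
{
    std::size_t offset = 0;
    for(std::size_t d = 0; d < lens.size(); d++)
        offset = offset * lens[d] + idx[d];
    return offset;
}

// For every bit mask in [0, 2^n_dim) and every output element, bit `dim` of the
// mask selects the low (0) or high (1) neighbor index for dimension `dim`.
static std::vector<int> enumerate_neighbors(const index_table& vvv_ind,
                                            const std::vector<std::size_t>& lens)
{
    const std::size_t n_bits     = vvv_ind.size();
    const std::size_t m_elements = vvv_ind[0][0].size();
    const std::size_t n_masks    = std::size_t{1} << n_bits;

    std::vector<int> result;
    result.reserve(n_masks * m_elements); // final size is known in advance

    std::vector<std::size_t> indices(n_bits); // reused across iterations
    for(std::size_t val = 0; val < n_masks; val++)
    {
        for(std::size_t e = 0; e < m_elements; e++)
        {
            for(std::size_t dim = 0; dim < n_bits; dim++)
                indices[dim] = vvv_ind[dim][(val >> dim) & 1][e];
            result.push_back(static_cast<int>(row_major_index(lens, indices)));
        }
    }
    return result;
}
```

For a 2x2 input with a single output element, low index 0 and high index 1 in both dimensions, the masks 0 to 3 visit (0,0), (1,0), (0,1), (1,1), so the result is {0, 2, 1, 3}. Bit 0 of the mask drives dimension 0, which matches the chunk order the removed recursive version built up.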