Commit 2d907bc

ethansfng authored and facebook-github-bot committed
Use Memcpy in copy_utils (pytorch#11430)
Summary: The standard elementwise copy in copy_utils is inefficient. Use memcpy instead when the innermost dimension is contiguous (stride 1).
Rollback Plan:
Differential Revision: D76061894
1 parent aed9c7e commit 2d907bc

1 file changed: 9 additions, 3 deletions

kernels/portable/cpu/util/copy_ops_util.h

@@ -28,9 +28,15 @@ void _as_strided_copy(
     int64_t dim) {
   // the last dimension, copy data
   if (dim == static_cast<int64_t>(size.size()) - 1) {
-    for (const auto i : c10::irange(size.at(dim))) {
-      output_data[i] = *input_data;
-      input_data += stride.at(dim);
+    const size_t num_elements = size.at(dim);
+    // use memcpy for contiguous memory
+    if (stride.at(dim) == 1) {
+      memcpy(output_data, input_data, num_elements * sizeof(CTYPE));
+    } else {
+      for (const auto i : c10::irange(num_elements)) {
+        output_data[i] = *input_data;
+        input_data += stride.at(dim);
+      }
     }
     return;
   }
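
For readers who want to try the idea outside the kernel, the innermost-dimension branch in this change can be approximated by the standalone sketch below. The helper name `copy_last_dim` and its signature are hypothetical and are not part of ExecuTorch. It also assumes a trivially copyable element type, which `memcpy` requires, and it indexes the source rather than advancing the pointer as the diff does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not ExecuTorch API): gathers `n` elements from a
// strided source into a densely packed destination.
template <typename T>
void copy_last_dim(const T* src, T* dst, std::size_t n, std::int64_t stride) {
  // memcpy is only valid for trivially copyable types.
  static_assert(std::is_trivially_copyable<T>::value,
                "copy_last_dim requires a trivially copyable element type");
  if (n == 0) {
    return;  // nothing to copy; also avoids calling memcpy on null pointers
  }
  assert(src != nullptr && dst != nullptr);

  if (stride == 1) {
    // Contiguous source: one bulk copy replaces the per-element loop.
    std::memcpy(dst, src, n * sizeof(T));
    return;
  }

  // Non-contiguous source: fall back to the elementwise strided gather.
  for (std::size_t i = 0; i < n; ++i) {
    dst[i] = src[static_cast<std::int64_t>(i) * stride];
  }
}
```

Both branches produce the same output for stride 1, so the fast path is purely an optimization. Whether it is measurably faster depends on the compiler and target, since some compilers already turn the unit-stride loop into a bulk copy.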
