
Commit 67597d1

committed: wrote about Tensor and TensorImpl and Type; need to finish the code generation part

1 parent f5a2862

File tree

2 files changed: +50 -2 lines changed


.gitignore (+2)

@@ -1,3 +1,5 @@
+/.vscode/
+
 # torchvision datasets cache
 /data/

libtorch/1_libtorch_classes.md (+48 -2)
# The origin of PyTorch methods

## Tensor

At the top of the high-level libtorch C++ API is the `at::Tensor` class. It provides a comprehensive set of methods for tensor storage management, data initialization, auto-differentiation, and all basic tensor operations. It is defined in
[ATen/core/Tensor.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/Tensor.h).

The `Tensor` class itself does not implement much functionality. Storage management is done in the `TensorImpl` class, and `Tensor` just holds a reference-counting pointer to it. That allows several tensors to reference the same storage (or slices of it).
`Tensor` dispatches tensor methods to the proper implementation, depending on data type and backend. The routing happens in the `TensorImpl::type()` method, and at the top level all tensor operations look like, e.g.,

```C++
inline Tensor Tensor::log10() const {
  return type().log10(*this);
}
```

(implemented in
[ATen/core/TensorMethods.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/TensorMethods.h))
## TensorImpl

Source:
[ATen/core/TensorImpl.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/TensorImpl.h).

`TensorImpl` holds the tensor data (as the `at::Storage storage_` member) along with the tensor's dimensions and strides.
`TensorImpl` also implements the routing for tensor operations in its `type()` method. The actual routing is defined in the `LegacyTypeDispatch` singleton, which holds a rectangular matrix of `Type` elements. One dimension of that matrix corresponds to scalar types (e.g. `int` or `float`), and the other to backends (sparse or dense, CUDA or CPU).

(source at
[ATen/core/LegacyTypeDispatch.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/LegacyTypeDispatch.h))
## Type

`Type` is the base class for backend- and type-specific operations on tensors. Its source is a good reference on what operations a tensor should support:
[ATen/core/Type.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/Type.h).

Code for the classes that inherit from `Type` is **generated** at build time. There is a `Type`-derived class for each combination of backend and scalar type. The entire `Type` hierarchy looks like this:

![libtorch Type hierarchy](structat_1_1Type__inherit__graph.png)
## Code generation

Code generation is a stage of the CMake build; that is, one has to run the build to see the generated files. A Python script, `gen.py`, takes templates of C++ `.h` and `.cpp` files from the `ATen/templates/` directory and expands `${...}` placeholders with type- and backend-specific code (and comments!).
