# The origin of PyTorch methods

## Tensor

At the top of the high-level libtorch C++ API is the `at::Tensor` class. It provides a comprehensive
set of methods for tensor storage management, data initialization, auto-differentiation, as well as
all basic tensor operations. It is defined in
[ATen/core/Tensor.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/Tensor.h).

The `Tensor` class itself does not implement much functionality. Storage management is done in the
`TensorImpl` class, and `Tensor` just holds a reference-counted pointer to it. This allows several
tensors to reference the same storage (or slices of it).

`Tensor` dispatches tensor methods to the proper implementation, depending on data type and backend.
The routing happens in the `TensorImpl::type()` method, and at the top level all tensor operations
look like, e.g.:

```C++
inline Tensor Tensor::log10() const {
  return type().log10(*this);
}
```

(implemented in
[ATen/core/TensorMethods.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/TensorMethods.h))

## TensorImpl

Source:
[ATen/core/TensorImpl.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/TensorImpl.h).

`TensorImpl` holds the tensor data (as the `at::Storage storage_` member) along with the tensor's
dimensions and strides.

`TensorImpl` also implements the routing for tensor operations in the `TensorImpl::type()` method.
The actual routing is defined in the `LegacyTypeDispatch` singleton, which holds a rectangular
matrix of `Type` elements. One dimension of the matrix corresponds to scalar types (e.g. `int` or
`float`), and the other to backends (sparse or dense, CUDA or CPU).

(source at
[ATen/core/LegacyTypeDispatch.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/LegacyTypeDispatch.h))

## Type

`Type` is the base class for backend- and type-specific operations on tensors. Its source is a good
reference on what operations a tensor should support:
[ATen/core/Type.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/Type.h).

Code for the classes that inherit from `Type` is **generated** at build time. There is a
`Type`-derived class for each combination of backend and scalar type. The entire `Type` hierarchy
looks like this:



## Code generation

Code generation is a stage in the CMake build; that is, one has to run the build to see the
generated files. A Python script, `gen.py`, takes templates of C++ `.h` and `.cpp` files from the
`ATen/templates/` directory and expands `${...}` placeholders with type- and backend-specific code
(and comments!).