Phase 1: Core Refactoring ✅ COMPLETE
- All 24 tensor operations extracted to modular files
- Test suite reorganized (unit/cpu/mps)
- 470 tests passing across 32 test files
- Clean, maintainable architecture established
Phase 2: Essential ML Methods 🚧 IN PROGRESS
- Phase 2A (Shape Operations): ✅ COMPLETE - All 6 shape operations implemented
- Phase 2B (Autograd & Activations): ✅ COMPLETE - Autograd ✅ (all 6 utilities); Activations: `relu()` ✅, `sigmoid()` ✅, `tanh()` ✅, `softmax()` ✅, `log_softmax()` ✅
- Phase 2C (Loss Functions): ✅ COMPLETE - `mse_loss()` ✅, `cross_entropy()` ✅, `nll_loss()` ✅, `binary_cross_entropy()` ✅
- Focus on methods needed for real machine learning work
Arithmetic Operations:
- ✅ `add()` - Element-wise addition (tensor + tensor, tensor + scalar)
- ✅ `add_()` - In-place element-wise addition
- ✅ `sub()` - Element-wise subtraction
- ✅ `sub_()` - In-place element-wise subtraction
- ✅ `mul()` - Element-wise multiplication
- ✅ `mul_()` - In-place element-wise multiplication
- ✅ `div()` - Element-wise division
- ✅ `div_()` - In-place element-wise division
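As a usage sketch (hedged: the `torch` module name is a placeholder for this package's actual import, and signatures are assumed to mirror PyTorch), the underscore variants mutate their receiver while the plain forms return new tensors:

```js
const torch = require('torch'); // placeholder: use this package's actual import

const a = torch.tensor([1.0, 2.0, 3.0]);
const b = torch.tensor([10.0, 20.0, 30.0]);

const sum = a.add(b);      // new tensor: [11, 22, 33]
const scaled = a.mul(2.0); // scalar broadcast: [2, 4, 6]
a.add_(1.0);               // in-place: `a` is now [2, 3, 4]
console.log(a.toArray());
```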
Matrix Operations:
- ✅ `matmul()` - Matrix multiplication (dot products, matrix-vector, matrix-matrix)
Reduction Operations:
- ✅ `sum()` - Sum all elements to a scalar
- ✅ `mean()` - Arithmetic mean of all elements
Device Management:
- ✅ `cpu()` - Move tensor to CPU
- ✅ `cuda()` - Move tensor to CUDA (with availability check)
- ✅ `mps()` - Move tensor to Apple Silicon MPS
- ✅ `to()` - Generic device/dtype conversion
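A hedged sketch of the device API (assuming, as in PyTorch, that each call returns a tensor on the target device, and that `to()` accepts a device string; the exact argument form is an assumption):

```js
const torch = require('torch'); // placeholder import

const x = torch.tensor([1.0, 2.0, 3.0]);

const onCpu = x.cpu();     // copy/no-op if already on CPU
const onMps = x.mps();     // Apple Silicon GPU; cuda() checks availability per the list above
const moved = x.to('mps'); // generic form; the string argument is an assumed signature
console.log(moved.device);
```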
Dtype Conversions:
- ✅ `float()` - Convert to float32
- ✅ `double()` - Convert to float64
- ✅ `int()` - Convert to int32
- ✅ `long()` - Convert to int64
Property Accessors:
- ✅ `shape` - Get tensor shape
- ✅ `dtype` - Get tensor dtype
- ✅ `device` - Get tensor device
Utility Methods:
- ✅ `toString()` - Convert to PyTorch string format
- ✅ `toArray()` - Convert to JavaScript array
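A sketch of dtype conversion plus the JS interop utilities (assumptions: `toArray()` returns nested plain arrays matching the tensor's shape, and float-to-int casts truncate toward zero as in PyTorch):

```js
const torch = require('torch'); // placeholder import

const x = torch.tensor([1.9, -2.1]);

const asInt = x.int();     // truncating cast -> [1, -2] as int32
console.log(asInt.dtype);  // dtype accessor from the list above
console.log(x.toString()); // PyTorch-style string, e.g. "tensor([ 1.9000, -2.1000])"
console.log(x.toArray());  // plain JS array: [1.9, -2.1]
```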
Shape Operations:
- ✅ `reshape()` - Reshape tensor to a new shape without copying data
- ✅ `flatten()` - Flatten tensor to 1D or flatten specific dimensions
- ✅ `unsqueeze()` - Add a dimension of size 1 at a specified position
- ✅ `squeeze()` - Remove dimensions of size 1
- ✅ `transpose()` - Swap two dimensions of the tensor
- ✅ `permute()` - Reorder dimensions according to a specified order
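A shape-manipulation sketch (hedged: dimension arguments are assumed to follow PyTorch's semantics, and the array-style arguments match how `reshape()` is called elsewhere in this document):

```js
const torch = require('torch'); // placeholder import

const x = torch.tensor([[1, 2, 3], [4, 5, 6]]); // shape [2, 3]

const flat = x.flatten();          // shape [6]
const col = flat.reshape([6, 1]);  // shape [6, 1], no data copy
const batched = x.unsqueeze(0);    // shape [1, 2, 3] -- adds a batch dimension
const swapped = x.transpose(0, 1); // shape [3, 2]
const back = batched.squeeze();    // removes the size-1 dim -> shape [2, 3]
console.log(back.shape);
```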
Autograd Operations:
- ✅ `requires_grad` - Property to mark tensors for gradient tracking
- ✅ `backward()` - Compute gradients via backpropagation
- ✅ `grad` - Property to access computed gradients
- ✅ `zero_grad()` - Clear gradients (sets grad to None)
- ✅ `detach()` - Detach tensor from the computation graph
- ✅ `no_grad()` - Context function to disable gradient tracking
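A minimal gradient sketch, assuming autograd mirrors PyTorch: for y = Σ x², the gradient is dy/dx = 2x. That `requires_grad` is settable as a plain property is an assumption here:

```js
const torch = require('torch'); // placeholder import

const x = torch.tensor([1.0, 2.0, 3.0]);
x.requires_grad = true;        // assumed writable property

const y = x.mul(x).sum();      // scalar: 1 + 4 + 9 = 14
y.backward();                  // backpropagate
console.log(x.grad.toArray()); // expected: [2, 4, 6]

x.zero_grad();                 // reset before the next backward pass
const frozen = x.detach();     // same data, outside the computation graph
torch.no_grad(() => {
  // ops in here are not tracked (no_grad is described above as a context function)
});
```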
Activation Functions:
- ✅ `relu()` - ReLU activation function (max(0, x))
- ✅ `sigmoid()` - Sigmoid activation function (σ(x) = 1 / (1 + exp(-x)))
- ✅ `tanh()` - Hyperbolic tangent activation (tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x)))
- ✅ `softmax()` - Softmax activation with optional dim parameter (for classification)
- ✅ `log_softmax()` - Log-softmax activation (numerically stable log(softmax(x)))
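An activation sketch (the `dim` argument to `softmax()` is described above as optional; that `-1` means the last dimension is an assumption carried over from PyTorch):

```js
const torch = require('torch'); // placeholder import

const logits = torch.tensor([[2.0, 1.0, 0.1]]);

const probs = logits.softmax(-1);        // row sums to 1, roughly [0.66, 0.24, 0.10]
const logProbs = logits.log_softmax(-1); // stable alternative to log(softmax(x))
const rectified = torch.tensor([-1.0, 2.0]).relu(); // [0, 2]
```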
Loss Functions:
- ✅ `mse_loss()` - Mean Squared Error loss (for regression)
- ✅ `cross_entropy()` - Cross Entropy loss (for classification)
- ✅ `nll_loss()` - Negative Log-Likelihood loss (for classification with log-probabilities)
- ✅ `binary_cross_entropy()` - Binary Cross Entropy loss (for binary classification)
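A loss sketch for a 3-class batch of two samples. That `cross_entropy()` takes raw logits with integer targets (fusing log-softmax and NLL, as PyTorch's does) and is exposed as `torch.cross_entropy(...)` are both assumptions:

```js
const torch = require('torch'); // placeholder import

const logits = torch.tensor([[2.0, 0.5, 0.1],
                             [0.2, 3.0, 0.3]]);
const targets = torch.tensor([0, 1]).long(); // true class index per sample

const loss = torch.cross_entropy(logits, targets); // scalar mean loss (assumed API)
const mse = torch.mse_loss(torch.tensor([1.0, 2.0]),
                           torch.tensor([1.5, 2.0])); // ((0.5)^2 + 0) / 2 = 0.125
```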
These are the most critical methods needed to build, train, and use real ML models.
Shape Manipulation:
- `reshape()` / `view()` - Reshape without copying (e.g., flatten for FC layers)
- `unsqueeze()` - Add a dimension (e.g., add a batch dimension)
- `squeeze()` - Remove singleton dimensions
- `transpose()` - Swap two dimensions (e.g., for convolutions)
- `permute()` - Arbitrary dimension reordering
- `flatten()` - Flatten to 1D (convenience method)
Why: Every ML model needs to manipulate tensor shapes for layers, batches, etc.
Autograd:
- `requires_grad` property - Mark tensors for gradient tracking
- `backward()` - Compute gradients via backpropagation
- `grad` property - Access computed gradients
- `zero_grad()` - Clear gradients (sets grad to None)
- `detach()` - Detach from the computation graph
- `no_grad()` context - Disable gradient tracking
Why: This is THE foundation of training. No autograd = no learning!
Activation Functions:
- `relu()` - ReLU activation (most common)
- `sigmoid()` - Sigmoid activation (σ(x) = 1 / (1 + exp(-x)))
- `tanh()` - Hyperbolic tangent
- `softmax()` - Softmax (for classification)
- `log_softmax()` - Log-softmax (numerically stable)
Why: Essential for any neural network layer.
Loss Functions:
- `mse_loss()` - Mean squared error (regression)
- `cross_entropy()` - Cross-entropy loss (classification)
- `nll_loss()` - Negative log-likelihood
- `binary_cross_entropy()` - Binary classification loss
Why: Can't train without computing loss!
Indexing & Slicing:
- `slice()` - Slice tensor along a dimension
- `index_select()` - Select specific indices
- `[]` operator - Bracket indexing (e.g., `tensor[0:2, :]`)
- `narrow()` - Narrow a dimension
Why: Access batches, specific samples, extract predictions.
Concatenation & Splitting:
- `cat()` / `concat()` - Concatenate along an existing dimension
- `stack()` - Stack tensors along a new dimension
- `split()` - Split tensor into chunks
- `chunk()` - Split into equal-sized chunks
Why: Combine data batches, build datasets, split for mini-batches (see the sketch below).
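These joining ops are not implemented yet; this sketch shows the intended usage under the assumption that the API will mirror PyTorch's `torch.cat` / `torch.stack` / `torch.split`:

```js
const torch = require('torch'); // placeholder import

const a = torch.tensor([[1, 2]]); // shape [1, 2]
const b = torch.tensor([[3, 4]]); // shape [1, 2]

const joined = torch.cat([a, b], 0);      // existing dim -> shape [2, 2]
const stacked = torch.stack([a, b], 0);   // new dim -> shape [2, 1, 2]
const halves = torch.split(joined, 1, 0); // two tensors of shape [1, 2]
```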
Reductions:
- `max()` - Maximum value (with optional dimension)
- `min()` - Minimum value
- `argmax()` - Index of the maximum value
- `argmin()` - Index of the minimum value
- `sum()` with dimension - Sum along a specific dimension (global sum already implemented)
- `mean()` with dimension - Mean along a specific dimension
Why: Model predictions, accuracy computation, batch statistics (see the sketch below).
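A sketch of how the planned dim-aware reductions would support accuracy computation, assuming PyTorch-style semantics (the `targets` tensor mentioned in the comment is hypothetical, for illustration only):

```js
const torch = require('torch'); // placeholder import

const logits = torch.tensor([[0.1, 2.0],
                             [3.0, 0.2]]);

const preds = logits.argmax(1); // index of max per row -> [1, 0]
const rowMean = logits.mean(1); // per-sample mean -> [1.05, 1.6]
// With the planned eq(): accuracy = preds.eq(targets).sum() / numSamples
```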
Element-wise Math:
- `pow()` - Raise to a power
- `sqrt()` - Square root
- `exp()` - Exponential
- `log()` - Natural logarithm
- `abs()` - Absolute value
- `neg()` - Negate (multiply by -1)
- `clamp()` - Clamp values to a range
Why: Custom loss functions, normalization, numerical stability.
Comparison Operations:
- `eq()` - Element-wise equality
- `ne()` - Element-wise not-equal
- `gt()` / `ge()` - Greater than / greater-or-equal
- `lt()` / `le()` - Less than / less-or-equal
- `all()` - Check if all elements are true
- `any()` - Check if any element is true
Why: Masking, filtering, conditional operations (see the sketch below).
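A sketch of the intended comparison API (again assuming PyTorch semantics, with boolean-valued result tensors):

```js
const torch = require('torch'); // placeholder import

const x = torch.tensor([-1.0, 0.5, 2.0]);

const mask = x.gt(0.0);  // element-wise: [false, true, true]
console.log(mask.any()); // true  -- at least one positive element
console.log(mask.all()); // false -- not all elements are positive
```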
Tensor Creation Utilities:
- `clone()` - Deep copy a tensor
- `randint()` - Random integers (for labels, indices)
- `arange()` - Range of values
- `linspace()` - Linearly spaced values
- `eye()` - Identity matrix
- `full()` - Fill with a specific value
Why: Data generation, initialization, testing.
These methods are useful but not critical for basic ML workflows.
Expansion & Gather/Scatter:
- `expand()` - Expand tensor to a larger size
- `repeat()` - Repeat tensor along dimensions
- `tile()` - Tile tensor
- `gather()` - Gather values along a dimension
- `scatter()` - Scatter values
Neural Network Operations:
- Convolution operations (`conv1d`, `conv2d`, `conv3d`)
- Pooling operations (`max_pool2d`, `avg_pool2d`)
- Normalization (`batch_norm`, `layer_norm`)
- Dropout operations
- RNN/LSTM operations
Linear Algebra:
- `inverse()` - Matrix inverse
- `svd()` - Singular value decomposition
- `eig()` - Eigenvalues and eigenvectors
- `qr()` - QR decomposition
- `cholesky()` - Cholesky decomposition
Statistics:
- `std()` - Standard deviation
- `var()` - Variance
- `median()` - Median value
- `mode()` - Most common value
- `histogram()` - Compute histogram
Conditional & Masking:
- `where()` - Conditional selection
- `masked_select()` - Select with a boolean mask
- `nonzero()` - Indices of non-zero elements
- `unique()` - Unique elements
- `sort()` - Sort tensor values
Goal: Enable basic tensor shape manipulation
- ✅ Implemented `reshape()`, `unsqueeze()`, `squeeze()`
- ✅ Implemented `flatten()`, `transpose()`, `permute()`
- ✅ 117 new tests added (27 unit, 23 CPU, 24 MPS for `permute()`; similar coverage for the others)
- ✅ All shape operations working across CPU and MPS devices
Next: Phase 2B - Autograd implementation
Goal: Enable model training
- Implement activation functions (`relu`, `sigmoid`, `tanh`, `softmax`)
- Implement loss functions (`mse_loss`, `cross_entropy`)
- Complete autograd (`backward()`, gradient tracking)
Goal: Enable real data loading and batching
- Implement indexing/slicing
- Implement `cat()`, `stack()`, `split()`
- Implement `max()`, `min()`, `argmax()`, `argmin()`
Goal: Complete training loop support
- Implement element-wise math (`pow`, `sqrt`, `exp`, `log`)
- Implement comparison operations
- Implement tensor creation utilities (`clone`, `randint`, etc.)
- Convolution and pooling layers
- Normalization layers
- Recurrent operations
- Advanced linear algebra
Priority: HIGH - This is a serious bug that can crash Node.js
32 out of 47 C++ operations are missing try-catch blocks around PyTorch calls. This means:
- Invalid inputs cause uncaught C++ exceptions
- These exceptions terminate the Node.js process instead of throwing JavaScript errors
- Users cannot catch these errors with try-catch in their code
- The application crashes instead of handling errors gracefully
Example of the bug:

```js
// This will CRASH the entire Node.js process:
const x = torch.tensor([1.0, 2.0, 3.0, 4.0]);
const y = x.reshape([2, 3]); // Invalid: 4 elements can't reshape to 6

// Error message:
// libc++abi: terminating due to uncaught exception of type std::runtime_error
// [Node.js process exits]
```

Expected behavior:

```js
// Should throw a catchable JavaScript error:
try {
  const x = torch.tensor([1.0, 2.0, 3.0, 4.0]);
  const y = x.reshape([2, 3]);
} catch (e) {
  console.error('Caught error:', e.message);
  // Program continues...
}
```

Arithmetic Operations (8/8 missing):
- `add.cpp`
- `add_.cpp`
- `sub.cpp`
- `sub_.cpp`
- `mul.cpp`
- `mul_.cpp`
- `div.cpp`
- `div_.cpp`
Matrix Operations (1/1 missing):
- `matmul.cpp`
Reduction Operations (2/2 missing):
- `sum.cpp`
- `mean.cpp`
Device Management (4/4 missing):
- `cpu.cpp`
- `cuda.cpp`
- `mps.cpp`
- `to.cpp`
Dtype Conversions (5/5 missing):
- `float.cpp`
- `double.cpp`
- `int.cpp`
- `long.cpp`
- `to.cpp` (duplicate from device management)
Property Accessors (3/3 missing):
- `shape.cpp`
- `dtype.cpp`
- `device.cpp`
Utility Methods (2/2 missing):
- `to_string.cpp`
- `to_array.cpp`
Shape Operations (6/6 missing):
- `reshape.cpp`
- `flatten.cpp`
- `unsqueeze.cpp`
- `squeeze.cpp`
- `transpose.cpp`
- `permute.cpp`
Autograd Operations (2/6 missing):
- `requires_grad.cpp`
- `grad.cpp`
- `backward.cpp` ✅ Has try-catch
- `zero_grad.cpp` ✅ Has try-catch
- `detach.cpp` ✅ Has try-catch
- `no_grad.cpp` ✅ Has try-catch
Activation Functions (0/5 missing):
- `relu.cpp` ✅ Has try-catch
- `sigmoid.cpp` ✅ Has try-catch
- `tanh.cpp` ✅ Has try-catch
- `softmax.cpp` ✅ Has try-catch
- `log_softmax.cpp` ✅ Has try-catch
Loss Functions (0/4 missing):
- `mse_loss.cpp` ✅ Has try-catch
- `cross_entropy.cpp` ✅ Has try-catch
- `nll_loss.cpp` ✅ Has try-catch
- `binary_cross_entropy.cpp` ✅ Has try-catch
For each operation listed above:
- Open the .cpp file (e.g., `src/native/ops/reshape.cpp`)
- Wrap PyTorch calls in try-catch:

```cpp
// Before:
torch::Tensor result = torch::reshape(self->tensor, new_shape);
return Tensor::NewInstance(env, result);

// After:
try {
  torch::Tensor result = torch::reshape(self->tensor, new_shape);
  return Tensor::NewInstance(env, result);
} catch (const std::exception& e) {
  Napi::Error::New(env, e.what()).ThrowAsJavaScriptException();
  return env.Undefined();
}
```

- Test error handling:
  - Add test cases that intentionally cause errors
  - Verify errors are catchable with JavaScript try-catch
  - Verify error messages are helpful
- Build and test:

```bash
pnpm build
pnpm test
```
Option A: Fix all at once
- Create a single PR fixing all 32 files
- Pros: Comprehensive fix, ensures consistency
- Cons: Large changeset, harder to review
Option B: Fix by category
- Fix one category at a time (e.g., all shape operations)
- Pros: Easier to review, can prioritize high-risk operations
- Cons: Takes longer, inconsistency during transition
Option C: Fix as we go
- Fix operations when we add tests or modify them
- Pros: Minimal disruption, natural progression
- Cons: Bug persists in unfixed operations
Recommendation: Option B - Fix by category, starting with highest-risk operations:
- Shape operations (most likely to get invalid inputs)
- Arithmetic operations (commonly used)
- Device/dtype conversions
- Matrix operations
- Reductions and utilities
- Property accessors (lowest risk - rarely throw)
For each fixed operation, add tests like:
```js
describe('Error Handling', () => {
  it('should throw catchable error on invalid input', () => {
    expect(() => {
      const x = torch.tensor([1.0, 2.0, 3.0, 4.0]);
      x.reshape([2, 3]); // Invalid shape
    }).toThrow();
  });

  it('should allow error to be caught', () => {
    try {
      const x = torch.tensor([1.0, 2.0, 3.0, 4.0]);
      x.reshape([2, 3]);
      expect(true).toBe(false); // Should not reach here
    } catch (e) {
      expect(e.message).toContain('invalid');
    }
  });
});
```

All new operations MUST include try-catch blocks. This is now part of the standard implementation pattern.
Phase 2 Complete When:
- ✅ Can define a simple feedforward neural network
- ✅ Can perform forward pass with activations
- ✅ Can compute loss
- ✅ Can perform backward pass (compute gradients)
- ✅ Can update weights (gradient descent step)
- ✅ Can train on MNIST or similar dataset
- ✅ Can achieve reasonable accuracy (>90% on MNIST)
- Unit Tests (`test/unit/`) - Device-agnostic core functionality
- CPU Tests (`test/cpu/`) - CPU-specific behavior
- MPS Tests (`test/mps/`) - Apple Silicon GPU tests

- `pnpm test` - Run all tests (unit + cpu + mps)
- `pnpm test:unit` - Run unit tests only
- `pnpm test:cpu` - Run CPU tests only
- `pnpm test:mps` - Run MPS tests only
- All methods should have comprehensive tests
- Test edge cases (empty tensors, single elements, large tensors)
- Test device transfers (CPU ↔ MPS)
- Test gradient computation for autograd methods
- Test numerical accuracy against PyTorch
- Clean modular structure: one C++ file per operation
- Operations in `src/native/ops/` directory
- Consistent TensorOps namespace pattern
- Comprehensive test coverage (430 tests passing across 42 test files)
- REQUIRED: Wrap all PyTorch C++ calls in try-catch blocks (see error handling section above)
- Follow established extraction pattern for new methods
- Add both CPU and MPS tests for each method
- Include error handling tests for invalid inputs
- Ensure all tests pass before committing
- Document complex operations with comments
- Match PyTorch's API as closely as possible
- Use PyTorch's libtorch C++ API as backend
- Match PyTorch's method signatures where possible
- Follow PyTorch's broadcasting rules
- Maintain PyTorch's device model (CPU/CUDA/MPS)
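To make the broadcasting rule concrete, a hedged sketch (assuming element-wise ops broadcast exactly as PyTorch does: dimensions are aligned from the right, and size-1 dimensions stretch to match):

```js
const torch = require('torch'); // placeholder import

const m = torch.tensor([[1.0, 2.0], [3.0, 4.0]]); // shape [2, 2]
const v = torch.tensor([10.0, 20.0]);             // shape [2]

const out = m.add(v); // v broadcasts across rows -> [[11, 22], [13, 24]]
console.log(out.toArray());
```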