Commit cd3bde4

committed
up
1 parent a4ff8d9 commit cd3bde4

File tree

9 files changed, +251 -57 lines changed

NOTES.md

Lines changed: 13 additions & 0 deletions
# Notes

## Inside Deep Learning Book

7.6. Convolutional Neural Networks (LeNet)
@ <https://d2l.ai/chapter_convolutional-neural-networks/lenet.html>

8.1. Deep Convolutional Neural Networks (AlexNet)
@ <https://d2l.ai/chapter_convolutional-modern/alexnet.html>

README.md

Lines changed: 9 additions & 1 deletion
@@ -7,7 +7,7 @@ Various variants of LeNet v5 & friends

-## LeNet v5
+## LeNet v5 (Anno 1995)

 The classic (original) LeNet v5
 has two convolution layers

@@ -115,6 +115,14 @@ That's it.

 The award-winning AlexNet is basically a LeNet5 scaled up 1000x and
 introduces ReLU activation, dropout layers, and more to the world (of deep neural networks).
+
+![](lenet-to-alexnet.png)
+
+> AlexNet is much deeper than the comparatively small LeNet-5.
+> AlexNet consists of eight layers: five convolutional layers,
+> two fully connected hidden layers, and one fully connected output layer.
+
 The summary of the model reads:
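
To make the eight-layer claim above concrete, here is a minimal sketch (assuming a recent torchvision; this snippet is not part of the commit) that loads the stock AlexNet and counts its convolutional and fully connected layers:

```python
import torch.nn as nn
from torchvision import models

# stock torchvision AlexNet; no pretrained weights needed just for counting layers
alexnet = models.alexnet(weights=None)

convs   = [m for m in alexnet.modules() if isinstance(m, nn.Conv2d)]
linears = [m for m in alexnet.modules() if isinstance(m, nn.Linear)]

print(len(convs), "convolutional layers")      # 5
print(len(linears), "fully connected layers")  # 3 (two hidden + one output)
```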

alexnet/SUMMARY.md

Lines changed: 69 additions & 1 deletion
@@ -6,7 +6,7 @@ to generate - try

 resulting in:

-## AlexNet input_size=(3, 224, 224)
+## AlexNet (via Torchvision) input_size=(3, 224, 224)

@@ -78,5 +78,73 @@ Estimated Total Size (MB): 242.03

with the newly added section:

## AlexNet input_size=(3, 224, 224)

```
AlexNet(
  (layers): Sequential(
    (0): Conv2d(3, 96, kernel_size=(11, 11), stride=(4, 4), padding=(1, 1))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(96, 256, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): ReLU()
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(256, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU()
    (8): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU()
    (10): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU()
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (13): Flatten(start_dim=1, end_dim=-1)
    (14): Linear(in_features=6400, out_features=4096, bias=True)
    (15): ReLU()
    (16): Dropout(p=0.5, inplace=False)
    (17): Linear(in_features=4096, out_features=4096, bias=True)
    (18): ReLU()
    (19): Dropout(p=0.5, inplace=False)
    (20): Linear(in_features=4096, out_features=1000, bias=True)
  )
)
Total number of trainable model parameters: 50844008
about 192.45 MBs, 197836.61 KBs

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 96, 54, 54]          34,944
              ReLU-2           [-1, 96, 54, 54]               0
         MaxPool2d-3           [-1, 96, 26, 26]               0
            Conv2d-4          [-1, 256, 26, 26]         614,656
              ReLU-5          [-1, 256, 26, 26]               0
         MaxPool2d-6          [-1, 256, 12, 12]               0
            Conv2d-7          [-1, 384, 12, 12]         885,120
              ReLU-8          [-1, 384, 12, 12]               0
            Conv2d-9          [-1, 384, 12, 12]       1,327,488
             ReLU-10          [-1, 384, 12, 12]               0
           Conv2d-11          [-1, 256, 12, 12]         884,992
             ReLU-12          [-1, 256, 12, 12]               0
        MaxPool2d-13            [-1, 256, 5, 5]               0
          Flatten-14                 [-1, 6400]               0
           Linear-15                 [-1, 4096]      26,218,496
             ReLU-16                 [-1, 4096]               0
          Dropout-17                 [-1, 4096]               0
           Linear-18                 [-1, 4096]      16,781,312
             ReLU-19                 [-1, 4096]               0
          Dropout-20                 [-1, 4096]               0
           Linear-21                 [-1, 1000]       4,097,000
================================================================
Total params: 50,844,008
Trainable params: 50,844,008
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 10.23
Params size (MB): 193.95
Estimated Total Size (MB): 204.76
----------------------------------------------------------------
```
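
The spatial sizes in the summary above follow from the standard convolution/pooling output formula; a small sketch (plain Python, not part of the commit, with the kernel/stride/padding values copied from the layer list) reproduces the 224 → 54 → 26 → 12 → 5 progression and the 6400 flatten size:

```python
def conv_out(size, kernel, stride=1, padding=0):
    # floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

s = 224
s = conv_out(s, kernel=11, stride=4, padding=1)   # Conv2d(3, 96)     -> 54
s = conv_out(s, kernel=3, stride=2)               # MaxPool2d         -> 26
s = conv_out(s, kernel=5, padding=2)              # Conv2d(96, 256)   -> 26
s = conv_out(s, kernel=3, stride=2)               # MaxPool2d         -> 12
s = conv_out(s, kernel=3, padding=1)              # 3x Conv2d, pad=1  -> 12
s = conv_out(s, kernel=3, stride=2)               # MaxPool2d         -> 5
print(s, "->", 256 * s * s)                       # 5 -> 6400 (Linear in_features)
```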

alexnet/eval.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@

 ### local imports
-from model import model
+from models import model

alexnet/model.py

Lines changed: 0 additions & 38 deletions
This file was deleted.

alexnet/models.py

Lines changed: 82 additions & 0 deletions
####
# alexnet models

import torch
import torch.nn as nn
from torchvision import models
from torchvision.models import AlexNet_Weights

## pre-built via torchvision including weights
model = models.alexnet( weights=AlexNet_Weights.IMAGENET1K_V1 )



####
# do-it-yourself version via
#   https://d2l.ai/chapter_convolutional-modern/alexnet.html


class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()

        self.layers = nn.Sequential(
            nn.LazyConv2d(96, kernel_size=11, stride=4, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.LazyConv2d(256, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.LazyConv2d(384, kernel_size=3, padding=1), nn.ReLU(),
            nn.LazyConv2d(384, kernel_size=3, padding=1), nn.ReLU(),
            nn.LazyConv2d(256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Flatten(),
            nn.LazyLinear(4096), nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.LazyLinear(4096), nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.LazyLinear(num_classes)
        )

    def forward(self, x):
        # Forward pass through the network
        return self.layers( x )



if __name__ == '__main__':
    # Print the model summaries
    from torchsummary import summary

    def print_model( model, input_size ):
        print( "="*20,
               f"\n= {model.__class__.__name__} input_size={input_size}" )

        print()
        print(model)

        num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
        print("Total number of trainable model parameters:", num_params)
        total_bytes = num_params * 4   # assume float32 (4 bytes per parameter)
        print( f"about {total_bytes / (1028 * 1028):.2f} MBs, {total_bytes / 1028:.2f} KBs" )

        print( "\nsummary:" )
        summary(model, input_size)


    print_model( model, input_size=(3, 224, 224) )

    model = AlexNet()
    x = torch.randn(3, 224, 224).unsqueeze(0)
    y = model.forward( x )
    ## note - (auto-)summary not working for now (not working with Lazy) e.g.
    ##   ValueError: Attempted to use an uninitialized parameter in
    ##   <method 'numel' of 'torch._C.TensorBase' objects>.
    ##   This error happens when you are using a `LazyModule`
    ##   or explicitly manipulating `torch.nn.parameter.UninitializedParameter` objects.
    ##   When using LazyModules, call `forward` with a dummy batch to initialize
    ##   the parameters before calling torch functions.
    print_model( model, input_size=(3, 224, 224) )
    print("bye")
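
As the closing comments note, torchsummary struggles with lazily initialized parameters. A possible workaround (an assumption, not part of this commit) is to materialize the lazy modules with one dummy forward pass and then use the torchinfo package instead, which takes the batch dimension as part of input_size:

```python
import torch
from torchinfo import summary   # assumes `pip install torchinfo` (not used in this repo)

from models import AlexNet      # the do-it-yourself class above

model = AlexNet()
# one dummy batch turns every Lazy* module into a concrete, initialized layer
model(torch.randn(1, 3, 224, 224))

summary(model, input_size=(1, 3, 224, 224))
```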

lenet-to-alexnet.png

56.6 KB

models.py

Lines changed: 8 additions & 16 deletions
@@ -50,19 +50,15 @@ def __init__(self):

         self.layers = nn.Sequential(
             # Layer 1: Convolutional layer with 6 filters of size 5x5
-            nn.Conv2d(1, 6, kernel_size=5),
-            nn.Tanh(),
+            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(),
             nn.AvgPool2d(kernel_size=2, stride=2),  # Subsampling (avg pooling)
             # Layer 2: Convolutional layer with 16 filters of size 5x5
-            nn.Conv2d(6, 16, kernel_size=5),
-            nn.Tanh(),
+            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),
             nn.AvgPool2d(kernel_size=2, stride=2),  # Subsampling (avg pooling)
             nn.Flatten(),
             # Fully connected layers
-            nn.Linear(16 * 5 * 5, 120),  # Flattened output from previous layer
-            nn.Tanh(),
-            nn.Linear(120, 84),
-            nn.Tanh(),
+            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),  # Flattened output from previous layer
+            nn.Linear(120, 84), nn.Tanh(),
             nn.Linear(84, 10)  # Output layer with 10 classes
         )

@@ -148,19 +144,15 @@ def __init__(self):

         self.layers = nn.Sequential(
             # Layer 1: Convolutional layer with 6 filters of size 5x5
-            nn.Conv2d(1, 6, kernel_size=5, padding=2),
-            nn.ReLU(),
+            nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.ReLU(),
             nn.MaxPool2d(kernel_size=2, stride=2),  # Subsampling (max pooling)
             # Layer 2: Convolutional layer with 16 filters of size 5x5
-            nn.Conv2d(6, 16, kernel_size=5),
-            nn.ReLU(),
+            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(),
             nn.MaxPool2d(kernel_size=2, stride=2),  # Subsampling (max pooling)
             nn.Flatten(),
             # Fully connected layers
-            nn.Linear(16 * 5 * 5, 120),  # Flattened output from previous layer
-            nn.ReLU(),
-            nn.Linear(120, 84),
-            nn.ReLU(),
+            nn.Linear(16 * 5 * 5, 120), nn.ReLU(),  # Flattened output from previous layer
+            nn.Linear(120, 84), nn.ReLU(),
             nn.Linear(84, 10)  # Output layer with 10 classes
         )
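
The `16 * 5 * 5` fed into the first fully connected layer comes from tracing the spatial size through the two conv/pool stages. A minimal sketch (not part of the commit, assuming a 1x32x32 input for the classic tanh variant and a 1x28x28 MNIST input for the padded ReLU variant) confirms both reach a 16x5x5 feature map:

```python
import torch
import torch.nn as nn

# tanh / avg-pool variant: classic LeNet-5 geometry, assumed 1x32x32 input
classic = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
)
print(classic(torch.randn(1, 1, 32, 32)).shape)  # torch.Size([1, 16, 5, 5]) -> 16*5*5 = 400

# ReLU / max-pool variant: padding=2 on the first conv, assumed 1x28x28 input
padded = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
)
print(padded(torch.randn(1, 1, 28, 28)).shape)   # torch.Size([1, 16, 5, 5]) -> 16*5*5 = 400
```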

prompts/LAZY.md

Lines changed: 69 additions & 0 deletions
# Q: pytorch - what is nn.LazyLinear and nn.LazyConv2d?


A: In PyTorch, `nn.LazyLinear` and `nn.LazyConv2d` are layers designed to provide **lazy initialization** of parameters. These layers are part of the `torch.nn` module and are useful when you don't want to specify the exact input dimensions at the time of model construction. Instead, the dimensions are inferred dynamically during the first forward pass.

### 1. `nn.LazyLinear`
`nn.LazyLinear` is a variant of `nn.Linear` where the input size is not explicitly specified when the layer is created. Instead, it automatically infers the input size the first time the layer is used in a forward pass.

#### Usage:
- **Lazy initialization**: You don't need to know the input size of the layer in advance.
- The first time the layer receives an input, it computes the input size and initializes the weight and bias accordingly.

#### Example:
```python
import torch
import torch.nn as nn

# Define a model with a LazyLinear layer
model = nn.Sequential(
    nn.LazyLinear(10),  # lazy linear layer, output size = 10
)

# Example input tensor (e.g., batch_size=5, input_features=20)
x = torch.randn(5, 20)  # (batch_size, input_features)
output = model(x)

print(output.shape)  # Will print: torch.Size([5, 10])
```

In this case, the `nn.LazyLinear` layer does not require you to specify the input features in the constructor. When the model is first run with the input `x`, it infers that the input has `20` features (from the shape of `x`) and initializes the weights accordingly.

### 2. `nn.LazyConv2d`
`nn.LazyConv2d` is the analogous variant of the convolutional layer (`nn.Conv2d`): you still specify the number of output channels and the kernel size, but not the number of input channels. Like `LazyLinear`, the missing input dimension (the channel count) is inferred dynamically when the model first performs a forward pass.

#### Example:
```python
import torch
import torch.nn as nn

# Define a model with a LazyConv2d layer
model = nn.Sequential(
    nn.LazyConv2d(16, kernel_size=3),  # lazy Conv2d, output channels = 16, kernel size = 3
)

# Example input tensor (e.g., batch_size=5, channels=3, height=28, width=28)
x = torch.randn(5, 3, 28, 28)  # (batch_size, input_channels, height, width)
output = model(x)

print(output.shape)  # Will print: torch.Size([5, 16, 26, 26])
```

In this example, the `nn.LazyConv2d` layer does not require `in_channels` to be specified. It automatically infers that the input has `3` channels (based on the shape of `x`) and initializes the weights accordingly.

### Key Differences Between Regular and Lazy Layers:

- **Regular layers (`nn.Linear`, `nn.Conv2d`)**: You must specify the input size when defining the layer, e.g. `nn.Linear(in_features, out_features)` or `nn.Conv2d(in_channels, out_channels, kernel_size)`.

- **Lazy layers (`nn.LazyLinear`, `nn.LazyConv2d`)**: You only define the output size (`out_features` or `out_channels`, plus the kernel size for convolutions); the input size is inferred automatically from the first forward pass. This is useful when you don't know the input shape ahead of time or want more flexible model definitions (see the side-by-side sketch below).
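
A short side-by-side sketch (a minimal illustration, not from the repo) of the same two-layer network written once with regular layers and once with lazy layers:

```python
import torch
import torch.nn as nn

# regular layers: input sizes spelled out up front
explicit = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3),   # in_channels=3 must be known here
    nn.Flatten(),
    nn.Linear(16 * 26 * 26, 10),       # in_features must be pre-computed
)

# lazy layers: only output sizes given, input sizes filled in on first use
lazy = nn.Sequential(
    nn.LazyConv2d(16, kernel_size=3),  # in_channels inferred
    nn.Flatten(),
    nn.LazyLinear(10),                 # in_features inferred
)

x = torch.randn(5, 3, 28, 28)
print(explicit(x).shape)  # torch.Size([5, 10])
print(lazy(x).shape)      # torch.Size([5, 10]) -- parameters created on this first call
```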

### Use Cases for Lazy Layers:
- **Flexible architectures**: Lazy layers help when the input shape isn't known at the time you define the architecture (note that the inferred sizes are fixed after the first forward pass).
- **Prototyping**: If you're experimenting with a model and the input dimensions aren't settled yet, lazy layers let you iterate without recomputing the exact sizes by hand.

### Summary:
- `nn.LazyLinear`: A version of `nn.Linear` where `in_features` is inferred during the first forward pass.
- `nn.LazyConv2d`: A version of `nn.Conv2d` where `in_channels` is inferred during the first forward pass.

Both layers let you build more flexible and adaptable models without specifying the input size explicitly.
