
Commit c30a160

Merge remote-tracking branch 'origin/main' into focalnet_and_swin_refactor
2 parents 947c1d7 + cd3ee78

18 files changed: +547 -246 lines

.github/workflows/tests.yml

Lines changed: 4 additions & 3 deletions
```diff
@@ -19,6 +19,7 @@ jobs:
         python: ['3.10']
         torch: ['1.13.0']
         torchvision: ['0.14.0']
+        testmarker: ['-k "not test_models"', '-m base', '-m cfg', '-m torchscript', '-m features', '-m fxforward', '-m fxbackward']
     runs-on: ${{ matrix.os }}
 
     steps:
@@ -30,7 +31,7 @@ jobs:
     - name: Install testing dependencies
       run: |
         python -m pip install --upgrade pip
-        pip install pytest pytest-timeout pytest-xdist pytest-forked expecttest
+        pip install -r requirements-dev.txt
     - name: Install torch on mac
       if: startsWith(matrix.os, 'macOS')
       run: pip install --no-cache-dir torch==${{ matrix.torch }} torchvision==${{ matrix.torchvision }}
@@ -54,10 +55,10 @@ jobs:
         PYTHONDONTWRITEBYTECODE: 1
       run: |
         pytest -vv tests
-    - name: Run tests on Linux / Mac
+    - name: Run '${{ matrix.testmarker }}' tests on Linux / Mac
       if: ${{ !startsWith(matrix.os, 'windows') }}
       env:
         LD_PRELOAD: /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
         PYTHONDONTWRITEBYTECODE: 1
       run: |
-        pytest -vv --forked --durations=0 tests
+        pytest -vv --forked --durations=0 ${{ matrix.testmarker }} tests
```
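
For reference, each `testmarker` entry becomes its own CI job, and pytest's `-m` flag selects tests tagged with the matching marker (e.g. `@pytest.mark.base`). Below is a minimal sketch of such a marked test, assuming `timm`, `torch`, and `pytest` are installed; the test body is hypothetical and not taken from the timm suite:

```
import pytest
import timm
import torch


@pytest.mark.base  # picked up by the '-m base' shard in the matrix above
def test_resnet18_forward():
    # Hypothetical smoke test: build a small model and run one forward pass.
    model = timm.create_model('resnet18', pretrained=False, num_classes=10)
    out = model(torch.randn(1, 3, 224, 224))
    assert out.shape == (1, 10)
```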

CONTRIBUTING.md

Lines changed: 107 additions & 0 deletions
*This guideline is very much a work-in-progress.*

Contributions to `timm` for code, documentation, or tests are more than welcome!

There haven't been any formal guidelines to date, so please bear with me, and feel free to add to this guide.

# Coding style

Code linting and auto-format (black) are not currently in place, but are open to consideration. In the meantime, the style to follow is (mostly) aligned with Google's guide: https://google.github.io/styleguide/pyguide.html.

A few specific differences from Google style (or black):
1. Line length is 120 characters. Going over is okay in some cases (e.g. I prefer not to break URLs across lines).
2. Hanging indents are always preferred; please avoid aligning arguments with closing brackets or braces.

Example from the Google guide, but this is a NO here:

```
# Aligned with opening delimiter.
foo = long_function_name(var_one, var_two,
                         var_three, var_four)
meal = (spam,
        beans)

# Aligned with opening delimiter in a dictionary.
foo = {
    'long_dictionary_key': value1 +
                           value2,
    ...
}
```

This is a YES:

```
# 4-space hanging indent; nothing on first line,
# closing parenthesis on a new line.
foo = long_function_name(
    var_one, var_two, var_three,
    var_four
)
meal = (
    spam,
    beans,
)

# 4-space hanging indent in a dictionary.
foo = {
    'long_dictionary_key':
        long_dictionary_value,
    ...
}
```

When there is a discrepancy in a given source file (there are many origins for various bits of code, and not all have been updated to what I consider the current goal), please follow the existing style in that file.

In general, if you add new code, formatting it with black using the following options should result in a style that is compatible with the rest of the code base:

```
black --skip-string-normalization --line-length 120 <path-to-file>
```

Avoid formatting code that is unrelated to your PR though.

PRs with pure formatting / style fixes will be accepted, but only in isolation from functional changes; it is best to ask before starting such a change.

# Documentation

As with code style, the docstring style is based on the Google guide: https://google.github.io/styleguide/pyguide.html

The goal for the code is to eventually have all major functions and `__init__` methods use PEP484 type annotations.

When type annotations are used for a function, as per the Google pyguide, they should **NOT** be duplicated in the docstrings; please keep the annotations as the one source of truth re typing.
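
For illustration, a docstring in that style might look like the following (a hypothetical helper, not an excerpt from `timm`; note the types appear only in the signature):

```
import torch
import torch.nn.functional as F


def resize_mask(mask: torch.Tensor, size: int) -> torch.Tensor:
    """Resize a 2D binary mask to a square spatial size.

    Args:
        mask: Input mask of shape (H, W).
        size: Target height and width.

    Returns:
        The resized mask of shape (size, size).
    """
    # Types are annotated in the signature only and not repeated in the docstring.
    return F.interpolate(mask[None, None].float(), size=(size, size), mode='nearest')[0, 0]
```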

There are a LOT of gaps in the current documentation relative to the functionality in `timm`, so please, document away!

# Installation

Create a Python virtual environment using Python 3.10. Inside the environment, install `torch` and `torchvision` using the instructions matching your system as listed on the [PyTorch website](https://pytorch.org/).

Then install the remaining dependencies:

```
python -m pip install -r requirements.txt
python -m pip install -r requirements-dev.txt  # for testing
python -m pip install --no-cache-dir git+https://github.com/mapillary/inplace_abn.git
python -m pip install -e .
```
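
A quick way to sanity-check the editable install (this command is just a suggestion, not part of the official instructions):

```
python -c "import timm; print(timm.__version__, len(timm.list_models()))"
```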

## Unit tests

Run the tests using:

```
pytest tests/
```

Since the whole test suite takes a lot of time to run locally (a few hours), you may want to select a subset of tests relating to the changes you made by using the `-k` option of [`pytest`](https://docs.pytest.org/en/7.1.x/example/markers.html#using-k-expr-to-select-tests-based-on-their-name). Moreover, running tests in parallel (in this example 4 processes) with the `-n` option may help:

```
pytest -k "substring-to-match" -n 4 tests/
```

## Building documentation

Please refer to [this document](https://github.com/huggingface/pytorch-image-models/tree/main/hfdocs).

# Questions

If you have any questions about contributing, or where / how to contribute, please ask in the [Discussions](https://github.com/huggingface/pytorch-image-models/discussions/categories/contributing) (there is a `Contributing` topic).

README.md

Lines changed: 25 additions & 20 deletions
```diff
@@ -24,6 +24,15 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * ❗Updates after Oct 10, 2022 are available in 0.8.x pre-releases (`pip install --pre timm`) or cloning main❗
 * Stable releases are 0.6.x and available by normal pip install or clone from [0.6.x](https://github.com/rwightman/pytorch-image-models/tree/0.6.x) branch.
 
+### Feb 26, 2023
+* Add ConvNeXt-XXLarge CLIP pretrained image tower weights for fine-tune & features (fine-tuning TBD) -- see [model card](https://huggingface.co/laion/CLIP-convnext_xxlarge-laion2B-s34B-b82K-augreg-soup)
+* Update `convnext_xxlarge` default LayerNorm eps to 1e-5 (for CLIP weights, improved stability)
+* 0.8.15dev0
+
+### Feb 20, 2023
+* Add 320x320 `convnext_large_mlp.clip_laion2b_ft_320` and `convnext_large_mlp.clip_laion2b_ft_soup_320` CLIP image tower weights for features & fine-tune
+* 0.8.13dev0 pypi release for latest changes w/ move to huggingface org
+
 ### Feb 16, 2023
 * `safetensor` checkpoint support added
 * Add ideas from 'Scaling Vision Transformers to 22 B. Params' (https://arxiv.org/abs/2302.05442) -- qk norm, RmsNorm, parallel block
@@ -112,7 +121,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * Finally got around to adding `--model-kwargs` and `--opt-kwargs` to scripts to pass through rare args directly to model classes from cmd line
   * `train.py /imagenet --model resnet50 --amp --model-kwargs output_stride=16 act_layer=silu`
   * `train.py /imagenet --model vit_base_patch16_clip_224 --img-size 240 --amp --model-kwargs img_size=240 patch_size=12`
-* Cleanup some popular models to better support arg passthrough / merge with model configs, more to go.
+* Cleanup some popular models to better support arg passthrough / merge with model configs, more to go.
 
 ### Jan 5, 2023
 * ConvNeXt-V2 models and weights added to existing `convnext.py`
@@ -142,7 +151,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 | eva_large_patch14_196.in22k_ft_in1k | 87.9 | 304.1 | 61.6 | 63.5 | [link](https://huggingface.co/BAAI/EVA) |
 
 ### Dec 6, 2022
-* Add 'EVA g', BEiT style ViT-g/14 model weights w/ both MIM pretrain and CLIP pretrain to `beit.py`.
+* Add 'EVA g', BEiT style ViT-g/14 model weights w/ both MIM pretrain and CLIP pretrain to `beit.py`.
   * original source: https://github.com/baaivision/EVA
   * paper: https://arxiv.org/abs/2211.07636
 
@@ -237,7 +246,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * `maxxvit_rmlp_small_rw_256` - 84.6 @ 256, 84.9 @ 288 (G) -- could be trained better, hparams need tuning (uses ConvNeXt block, no BN)
 * `coatnet_rmlp_2_rw_224` - 84.6 @ 224, 85 @ 320 (T)
 * NOTE: official MaxVit weights (in1k) have been released at https://github.com/google-research/maxvit -- some extra work is needed to port and adapt since my impl was created independently of theirs and has a few small differences + the whole TF same padding fun.
-
+
 ### Sept 23, 2022
 * LAION-2B CLIP image towers supported as pretrained backbones for fine-tune or features (no classifier)
   * vit_base_patch32_224_clip_laion2b
@@ -268,7 +277,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * `coatnet_bn_0_rw_224` - 82.4 (T)
 * `maxvit_nano_rw_256` - 82.9 @ 256 (T)
 * `coatnet_rmlp_1_rw_224` - 83.4 @ 224, 84 @ 320 (T)
-* `coatnet_1_rw_224` - 83.6 @ 224 (G)
+* `coatnet_1_rw_224` - 83.6 @ 224 (G)
 * (T) = TPU trained with `bits_and_tpu` branch training code, (G) = GPU trained
 * GCVit (weights adapted from https://github.com/NVlabs/GCVit, code 100% `timm` re-write for license purposes)
 * MViT-V2 (multi-scale vit, adapted from https://github.com/facebookresearch/mvit)
@@ -283,7 +292,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * `convnext_atto_ols` - 75.9 @ 224, 77.2 @ 288
 
 ### Aug 5, 2022
-* More custom ConvNeXt smaller model defs with weights
+* More custom ConvNeXt smaller model defs with weights
   * `convnext_femto` - 77.5 @ 224, 78.7 @ 288
   * `convnext_femto_ols` - 77.9 @ 224, 78.9 @ 288
   * `convnext_pico` - 79.5 @ 224, 80.4 @ 288
@@ -304,7 +313,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * `cs3sedarknet_x` - 82.2 @ 256, 82.7 @ 288
 * `cs3edgenet_x` - 82.2 @ 256, 82.7 @ 288
 * `cs3se_edgenet_x` - 82.8 @ 256, 83.5 @ 320
-* `cs3*` weights above all trained on TPU w/ `bits_and_tpu` branch. Thanks to TRC program!
+* `cs3*` weights above all trained on TPU w/ `bits_and_tpu` branch. Thanks to TRC program!
 * Add output_stride=8 and 16 support to ConvNeXt (dilation)
 * deit3 models not being able to resize pos_emb fixed
 * Version 0.6.7 PyPi release (/w above bug fixes and new weighs since 0.6.5)
@@ -337,8 +346,8 @@ More models, more fixes
 * Hugging Face Hub support fixes verified, demo notebook TBA
 * Pretrained weights / configs can be loaded externally (ie from local disk) w/ support for head adaptation.
 * Add support to change image extensions scanned by `timm` datasets/readers. See (https://github.com/rwightman/pytorch-image-models/pull/1274#issuecomment-1178303103)
-* Default ConvNeXt LayerNorm impl to use `F.layer_norm(x.permute(0, 2, 3, 1), ...).permute(0, 3, 1, 2)` via `LayerNorm2d` in all cases.
-  * a bit slower than previous custom impl on some hardware (ie Ampere w/ CL), but overall fewer regressions across wider HW / PyTorch version ranges.
+* Default ConvNeXt LayerNorm impl to use `F.layer_norm(x.permute(0, 2, 3, 1), ...).permute(0, 3, 1, 2)` via `LayerNorm2d` in all cases.
+  * a bit slower than previous custom impl on some hardware (ie Ampere w/ CL), but overall fewer regressions across wider HW / PyTorch version ranges.
   * previous impl exists as `LayerNormExp2d` in `models/layers/norm.py`
 * Numerous bug fixes
 * Currently testing for imminent PyPi 0.6.x release
@@ -435,9 +444,7 @@ The work of many others is present here. I've tried to make sure all source mate
 
 ## Models
 
-All model architecture families include variants with pretrained weights. There are specific model variants without any weights, it is NOT a bug. Help training new or better weights is always appreciated. Here are some example [training hparams](https://rwightman.github.io/pytorch-image-models/training_hparam_examples) to get you started.
-
-A full version of the list below with source links can be found in the [documentation](https://rwightman.github.io/pytorch-image-models/models/).
+All model architecture families include variants with pretrained weights. There are specific model variants without any weights, it is NOT a bug. Help training new or better weights is always appreciated.
 
 * Aggregating Nested Transformers - https://arxiv.org/abs/2105.12723
 * BEiT - https://arxiv.org/abs/2106.08254
@@ -538,15 +545,15 @@ Several (less common) features that I often utilize in my projects are included.
 
 * All models have a common default configuration interface and API for
   * accessing/changing the classifier - `get_classifier` and `reset_classifier`
-  * doing a forward pass on just the features - `forward_features` (see [documentation](https://rwightman.github.io/pytorch-image-models/feature_extraction/))
+  * doing a forward pass on just the features - `forward_features` (see [documentation](https://huggingface.co/docs/timm/feature_extraction))
   * these makes it easy to write consistent network wrappers that work with any of the models
-* All models support multi-scale feature map extraction (feature pyramids) via create_model (see [documentation](https://rwightman.github.io/pytorch-image-models/feature_extraction/))
+* All models support multi-scale feature map extraction (feature pyramids) via create_model (see [documentation](https://huggingface.co/docs/timm/feature_extraction))
   * `create_model(name, features_only=True, out_indices=..., output_stride=...)`
   * `out_indices` creation arg specifies which feature maps to return, these indices are 0 based and generally correspond to the `C(i + 1)` feature level.
   * `output_stride` creation arg controls output stride of the network by using dilated convolutions. Most networks are stride 32 by default. Not all networks support this.
   * feature map channel counts, reduction level (stride) can be queried AFTER model creation via the `.feature_info` member
 * All models have a consistent pretrained weight loader that adapts last linear if necessary, and from 3 to 1 channel input if desired
-* High performance [reference training, validation, and inference scripts](https://rwightman.github.io/pytorch-image-models/scripts/) that work in several process/GPU modes:
+* High performance [reference training, validation, and inference scripts](https://huggingface.co/docs/timm/training_script) that work in several process/GPU modes:
   * NVIDIA DDP w/ a single GPU per process, multiple processes with APEX present (AMP mixed-precision optional)
   * PyTorch DistributedDataParallel w/ multi-gpu, single process (AMP disabled as it crashes when enabled)
   * PyTorch w/ single GPU single process (AMP optional)
@@ -573,7 +580,7 @@ Several (less common) features that I often utilize in my projects are included.
 * AutoAugment (https://arxiv.org/abs/1805.09501) and RandAugment (https://arxiv.org/abs/1909.13719) ImageNet configurations modeled after impl for EfficientNet training (https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/autoaugment.py)
 * AugMix w/ JSD loss (https://arxiv.org/abs/1912.02781), JSD w/ clean + augmented mixing support works with AutoAugment and RandAugment as well
 * SplitBachNorm - allows splitting batch norm layers between clean and augmented (auxiliary batch norm) data
-* DropPath aka "Stochastic Depth" (https://arxiv.org/abs/1603.09382)
+* DropPath aka "Stochastic Depth" (https://arxiv.org/abs/1603.09382)
 * DropBlock (https://arxiv.org/abs/1810.12890)
 * Blur Pooling (https://arxiv.org/abs/1904.11486)
 * Space-to-Depth by [mrT23](https://github.com/mrT23/TResNet/blob/master/src/models/tresnet/layers/space_to_depth.py) (https://arxiv.org/abs/1801.04590) -- original paper?
@@ -600,19 +607,17 @@ Model validation results can be found in the [results tables](results/README.md)
 
 ## Getting Started (Documentation)
 
-My current [documentation](https://rwightman.github.io/pytorch-image-models/) for `timm` covers the basics.
-
-Hugging Face [`timm` docs](https://huggingface.co/docs/hub/timm) will be the documentation focus going forward and will eventually replace the `github.io` docs above.
+The official documentation can be found at https://huggingface.co/docs/hub/timm. Documentation contributions are welcome.
 
 [Getting Started with PyTorch Image Models (timm): A Practitioner’s Guide](https://towardsdatascience.com/getting-started-with-pytorch-image-models-timm-a-practitioners-guide-4e77b4bf9055) by [Chris Hughes](https://github.com/Chris-hughes10) is an extensive blog post covering many aspects of `timm` in detail.
 
-[timmdocs](http://timm.fast.ai/) is quickly becoming a much more comprehensive set of documentation for `timm`. A big thanks to [Aman Arora](https://github.com/amaarora) for his efforts creating timmdocs.
+[timmdocs](http://timm.fast.ai/) is an alternate set of documentation for `timm`. A big thanks to [Aman Arora](https://github.com/amaarora) for his efforts creating timmdocs.
 
 [paperswithcode](https://paperswithcode.com/lib/timm) is a good resource for browsing the models within `timm`.
 
 ## Train, Validation, Inference Scripts
 
-The root folder of the repository contains reference train, validation, and inference scripts that work with the included models and other features of this repository. They are adaptable for other datasets and use cases with a little hacking. See [documentation](https://rwightman.github.io/pytorch-image-models/scripts/) for some basics and [training hparams](https://rwightman.github.io/pytorch-image-models/training_hparam_examples) for some train examples that produce SOTA ImageNet results.
+The root folder of the repository contains reference train, validation, and inference scripts that work with the included models and other features of this repository. They are adaptable for other datasets and use cases with a little hacking. See [documentation](https://huggingface.co/docs/timm/training_script).
 
 ## Awesome PyTorch Resources
 
```
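
The feature-extraction and kwargs-passthrough behavior referenced in the README hunks above can be sketched as follows. This is a minimal example assuming `timm` (0.8.x pre-release) and `torch` are installed; the model name and shapes are illustrative only:

```
import timm
import torch

# Extra constructor args can be passed straight through create_model
# (analogous to what the --model-kwargs train.py examples do from the command line).
model = timm.create_model('resnet50', pretrained=False, output_stride=16)

# Multi-scale feature maps (feature pyramid) instead of classification logits.
backbone = timm.create_model('resnet50', pretrained=False, features_only=True, out_indices=(2, 3, 4))
feats = backbone(torch.randn(1, 3, 224, 224))
print(backbone.feature_info.channels())  # channel counts for the selected feature levels
print([f.shape for f in feats])

# Forward pass on just the (unpooled) features of a standard classification model.
features = model.forward_features(torch.randn(1, 3, 224, 224))
```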