
Commit c30a160

Merge remote-tracking branch 'origin/main' into focalnet_and_swin_refactor
2 parents 947c1d7 + cd3ee78

18 files changed: +547 -246 lines

.github/workflows/tests.yml

Lines changed: 4 additions & 3 deletions
```diff
@@ -19,6 +19,7 @@ jobs:
         python: ['3.10']
         torch: ['1.13.0']
         torchvision: ['0.14.0']
+        testmarker: ['-k "not test_models"', '-m base', '-m cfg', '-m torchscript', '-m features', '-m fxforward', '-m fxbackward']
     runs-on: ${{ matrix.os }}
 
     steps:
@@ -30,7 +31,7 @@ jobs:
     - name: Install testing dependencies
       run: |
         python -m pip install --upgrade pip
-        pip install pytest pytest-timeout pytest-xdist pytest-forked expecttest
+        pip install -r requirements-dev.txt
     - name: Install torch on mac
       if: startsWith(matrix.os, 'macOS')
       run: pip install --no-cache-dir torch==${{ matrix.torch }} torchvision==${{ matrix.torchvision }}
@@ -54,10 +55,10 @@ jobs:
         PYTHONDONTWRITEBYTECODE: 1
       run: |
         pytest -vv tests
-    - name: Run tests on Linux / Mac
+    - name: Run '${{ matrix.testmarker }}' tests on Linux / Mac
       if: ${{ !startsWith(matrix.os, 'windows') }}
       env:
         LD_PRELOAD: /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
         PYTHONDONTWRITEBYTECODE: 1
       run: |
-        pytest -vv --forked --durations=0 tests
+        pytest -vv --forked --durations=0 ${{ matrix.testmarker }} tests
```
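
For reference, each `testmarker` entry becomes its own CI job, and pytest's `-m` flag selects tests tagged with the matching marker (e.g. `@pytest.mark.base`). Below is a minimal sketch of such a marked test, assuming `timm`, `torch`, and `pytest` are installed; the test body is hypothetical and not taken from the timm suite:

```
import pytest
import timm
import torch


@pytest.mark.base  # picked up by the '-m base' shard in the matrix above
def test_resnet18_forward():
    # Hypothetical smoke test: build a small model and run one forward pass.
    model = timm.create_model('resnet18', pretrained=False, num_classes=10)
    out = model(torch.randn(1, 3, 224, 224))
    assert out.shape == (1, 10)
```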

CONTRIBUTING.md

Lines changed: 107 additions & 0 deletions
*This guideline is very much a work-in-progress.*

Contributions to `timm` for code, documentation, or tests are more than welcome!

There haven't been any formal guidelines to date, so please bear with me, and feel free to add to this guide.

# Coding style

Code linting and auto-format (black) are not currently in place, but are open to consideration. In the meantime, the style to follow is (mostly) aligned with Google's guide: https://google.github.io/styleguide/pyguide.html.

A few specific differences from Google style (or black):
1. Line length is 120 characters. Going over is okay in some cases (e.g. I prefer not to break URLs across lines).
2. Hanging indents are always preferred; please avoid aligning arguments with closing brackets or braces.

Example from the Google guide, but this is a NO here:

```
# Aligned with opening delimiter.
foo = long_function_name(var_one, var_two,
                         var_three, var_four)
meal = (spam,
        beans)

# Aligned with opening delimiter in a dictionary.
foo = {
    'long_dictionary_key': value1 +
                           value2,
    ...
}
```

This is a YES:

```
# 4-space hanging indent; nothing on first line,
# closing parenthesis on a new line.
foo = long_function_name(
    var_one, var_two, var_three,
    var_four
)
meal = (
    spam,
    beans,
)

# 4-space hanging indent in a dictionary.
foo = {
    'long_dictionary_key':
        long_dictionary_value,
    ...
}
```

When there is a discrepancy in a given source file (there are many origins for various bits of code, and not all have been updated to what I consider the current goal), please follow the existing style in that file.

In general, if you add new code, formatting it with black using the following options should result in a style that is compatible with the rest of the code base:

```
black --skip-string-normalization --line-length 120 <path-to-file>
```

Avoid formatting code that is unrelated to your PR though.

PRs with pure formatting / style fixes will be accepted, but only in isolation from functional changes; it is best to ask before starting such a change.

# Documentation

As with code style, the docstring style is based on the Google guide: https://google.github.io/styleguide/pyguide.html

The goal for the code is to eventually have all major functions and `__init__` methods use PEP484 type annotations.

When type annotations are used for a function, as per the Google pyguide, they should **NOT** be duplicated in the docstrings; please keep the annotations as the one source of truth re typing.
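
For illustration, a docstring in that style might look like the following (a hypothetical helper, not an excerpt from `timm`; note the types appear only in the signature):

```
import torch
import torch.nn.functional as F


def resize_mask(mask: torch.Tensor, size: int) -> torch.Tensor:
    """Resize a 2D binary mask to a square spatial size.

    Args:
        mask: Input mask of shape (H, W).
        size: Target height and width.

    Returns:
        The resized mask of shape (size, size).
    """
    # Types are annotated in the signature only and not repeated in the docstring.
    return F.interpolate(mask[None, None].float(), size=(size, size), mode='nearest')[0, 0]
```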

There are a LOT of gaps in the current documentation relative to the functionality in `timm`, so please, document away!

# Installation

Create a Python virtual environment using Python 3.10. Inside the environment, install `torch` and `torchvision` using the instructions matching your system as listed on the [PyTorch website](https://pytorch.org/).

Then install the remaining dependencies:

```
python -m pip install -r requirements.txt
python -m pip install -r requirements-dev.txt  # for testing
python -m pip install --no-cache-dir git+https://github.com/mapillary/inplace_abn.git
python -m pip install -e .
```
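
A quick way to sanity-check the editable install (this command is just a suggestion, not part of the official instructions):

```
python -c "import timm; print(timm.__version__, len(timm.list_models()))"
```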

## Unit tests

Run the tests using:

```
pytest tests/
```

Since the whole test suite takes a lot of time to run locally (a few hours), you may want to select a subset of tests relating to the changes you made by using the `-k` option of [`pytest`](https://docs.pytest.org/en/7.1.x/example/markers.html#using-k-expr-to-select-tests-based-on-their-name). Moreover, running tests in parallel (in this example 4 processes) with the `-n` option may help:

```
pytest -k "substring-to-match" -n 4 tests/
```

## Building documentation

Please refer to [this document](https://github.com/huggingface/pytorch-image-models/tree/main/hfdocs).

# Questions

If you have any questions about contributing, or where / how to contribute, please ask in the [Discussions](https://github.com/huggingface/pytorch-image-models/discussions/categories/contributing) (there is a `Contributing` topic).

README.md

Lines changed: 25 additions & 20 deletions
```diff
@@ -24,6 +24,15 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * ❗Updates after Oct 10, 2022 are available in 0.8.x pre-releases (`pip install --pre timm`) or cloning main❗
 * Stable releases are 0.6.x and available by normal pip install or clone from [0.6.x](https://github.com/rwightman/pytorch-image-models/tree/0.6.x) branch.
 
+### Feb 26, 2023
+* Add ConvNeXt-XXLarge CLIP pretrained image tower weights for fine-tune & features (fine-tuning TBD) -- see [model card](https://huggingface.co/laion/CLIP-convnext_xxlarge-laion2B-s34B-b82K-augreg-soup)
+* Update `convnext_xxlarge` default LayerNorm eps to 1e-5 (for CLIP weights, improved stability)
+* 0.8.15dev0
+
+### Feb 20, 2023
+* Add 320x320 `convnext_large_mlp.clip_laion2b_ft_320` and `convnext_large_mlp.clip_laion2b_ft_soup_320` CLIP image tower weights for features & fine-tune
+* 0.8.13dev0 pypi release for latest changes w/ move to huggingface org
+
 ### Feb 16, 2023
 * `safetensor` checkpoint support added
 * Add ideas from 'Scaling Vision Transformers to 22 B. Params' (https://arxiv.org/abs/2302.05442) -- qk norm, RmsNorm, parallel block
@@ -112,7 +121,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * Finally got around to adding `--model-kwargs` and `--opt-kwargs` to scripts to pass through rare args directly to model classes from cmd line
   * `train.py /imagenet --model resnet50 --amp --model-kwargs output_stride=16 act_layer=silu`
   * `train.py /imagenet --model vit_base_patch16_clip_224 --img-size 240 --amp --model-kwargs img_size=240 patch_size=12`
-* Cleanup some popular models to better support arg passthrough / merge with model configs, more to go.
+* Cleanup some popular models to better support arg passthrough / merge with model configs, more to go.
 
 ### Jan 5, 2023
 * ConvNeXt-V2 models and weights added to existing `convnext.py`
@@ -142,7 +151,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 | eva_large_patch14_196.in22k_ft_in1k | 87.9 | 304.1 | 61.6 | 63.5 | [link](https://huggingface.co/BAAI/EVA) |
 
 ### Dec 6, 2022
-* Add 'EVA g', BEiT style ViT-g/14 model weights w/ both MIM pretrain and CLIP pretrain to `beit.py`.
+* Add 'EVA g', BEiT style ViT-g/14 model weights w/ both MIM pretrain and CLIP pretrain to `beit.py`.
   * original source: https://github.com/baaivision/EVA
   * paper: https://arxiv.org/abs/2211.07636
 
@@ -237,7 +246,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * `maxxvit_rmlp_small_rw_256` - 84.6 @ 256, 84.9 @ 288 (G) -- could be trained better, hparams need tuning (uses ConvNeXt block, no BN)
 * `coatnet_rmlp_2_rw_224` - 84.6 @ 224, 85 @ 320 (T)
 * NOTE: official MaxVit weights (in1k) have been released at https://github.com/google-research/maxvit -- some extra work is needed to port and adapt since my impl was created independently of theirs and has a few small differences + the whole TF same padding fun.
-
+
 ### Sept 23, 2022
 * LAION-2B CLIP image towers supported as pretrained backbones for fine-tune or features (no classifier)
   * vit_base_patch32_224_clip_laion2b
@@ -268,7 +277,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * `coatnet_bn_0_rw_224` - 82.4 (T)
 * `maxvit_nano_rw_256` - 82.9 @ 256 (T)
 * `coatnet_rmlp_1_rw_224` - 83.4 @ 224, 84 @ 320 (T)
-* `coatnet_1_rw_224` - 83.6 @ 224 (G)
+* `coatnet_1_rw_224` - 83.6 @ 224 (G)
 * (T) = TPU trained with `bits_and_tpu` branch training code, (G) = GPU trained
 * GCVit (weights adapted from https://github.com/NVlabs/GCVit, code 100% `timm` re-write for license purposes)
 * MViT-V2 (multi-scale vit, adapted from https://github.com/facebookresearch/mvit)
@@ -283,7 +292,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * `convnext_atto_ols` - 75.9 @ 224, 77.2 @ 288
 
 ### Aug 5, 2022
-* More custom ConvNeXt smaller model defs with weights
+* More custom ConvNeXt smaller model defs with weights
   * `convnext_femto` - 77.5 @ 224, 78.7 @ 288
   * `convnext_femto_ols` - 77.9 @ 224, 78.9 @ 288
   * `convnext_pico` - 79.5 @ 224, 80.4 @ 288
@@ -304,7 +313,7 @@ And a big thanks to all GitHub sponsors who helped with some of my costs before
 * `cs3sedarknet_x` - 82.2 @ 256, 82.7 @ 288
 * `cs3edgenet_x` - 82.2 @ 256, 82.7 @ 288
 * `cs3se_edgenet_x` - 82.8 @ 256, 83.5 @ 320
-* `cs3*` weights above all trained on TPU w/ `bits_and_tpu` branch. Thanks to TRC program!
+* `cs3*` weights above all trained on TPU w/ `bits_and_tpu` branch. Thanks to TRC program!
 * Add output_stride=8 and 16 support to ConvNeXt (dilation)
 * deit3 models not being able to resize pos_emb fixed
 * Version 0.6.7 PyPi release (/w above bug fixes and new weighs since 0.6.5)
@@ -337,8 +346,8 @@ More models, more fixes
 * Hugging Face Hub support fixes verified, demo notebook TBA
 * Pretrained weights / configs can be loaded externally (ie from local disk) w/ support for head adaptation.
 * Add support to change image extensions scanned by `timm` datasets/readers. See (https://github.com/rwightman/pytorch-image-models/pull/1274#issuecomment-1178303103)
-* Default ConvNeXt LayerNorm impl to use `F.layer_norm(x.permute(0, 2, 3, 1), ...).permute(0, 3, 1, 2)` via `LayerNorm2d` in all cases.
-  * a bit slower than previous custom impl on some hardware (ie Ampere w/ CL), but overall fewer regressions across wider HW / PyTorch version ranges.
+* Default ConvNeXt LayerNorm impl to use `F.layer_norm(x.permute(0, 2, 3, 1), ...).permute(0, 3, 1, 2)` via `LayerNorm2d` in all cases.
+  * a bit slower than previous custom impl on some hardware (ie Ampere w/ CL), but overall fewer regressions across wider HW / PyTorch version ranges.
   * previous impl exists as `LayerNormExp2d` in `models/layers/norm.py`
 * Numerous bug fixes
 * Currently testing for imminent PyPi 0.6.x release
@@ -435,9 +444,7 @@ The work of many others is present here. I've tried to make sure all source mate
 
 ## Models
 
-All model architecture families include variants with pretrained weights. There are specific model variants without any weights, it is NOT a bug. Help training new or better weights is always appreciated. Here are some example [training hparams](https://rwightman.github.io/pytorch-image-models/training_hparam_examples) to get you started.
-
-A full version of the list below with source links can be found in the [documentation](https://rwightman.github.io/pytorch-image-models/models/).
+All model architecture families include variants with pretrained weights. There are specific model variants without any weights, it is NOT a bug. Help training new or better weights is always appreciated.
 
 * Aggregating Nested Transformers - https://arxiv.org/abs/2105.12723
 * BEiT - https://arxiv.org/abs/2106.08254
@@ -538,15 +545,15 @@ Several (less common) features that I often utilize in my projects are included.
 
 * All models have a common default configuration interface and API for
   * accessing/changing the classifier - `get_classifier` and `reset_classifier`
-  * doing a forward pass on just the features - `forward_features` (see [documentation](https://rwightman.github.io/pytorch-image-models/feature_extraction/))
+  * doing a forward pass on just the features - `forward_features` (see [documentation](https://huggingface.co/docs/timm/feature_extraction))
   * these makes it easy to write consistent network wrappers that work with any of the models
-* All models support multi-scale feature map extraction (feature pyramids) via create_model (see [documentation](https://rwightman.github.io/pytorch-image-models/feature_extraction/))
+* All models support multi-scale feature map extraction (feature pyramids) via create_model (see [documentation](https://huggingface.co/docs/timm/feature_extraction))
   * `create_model(name, features_only=True, out_indices=..., output_stride=...)`
   * `out_indices` creation arg specifies which feature maps to return, these indices are 0 based and generally correspond to the `C(i + 1)` feature level.
   * `output_stride` creation arg controls output stride of the network by using dilated convolutions. Most networks are stride 32 by default. Not all networks support this.
   * feature map channel counts, reduction level (stride) can be queried AFTER model creation via the `.feature_info` member
 * All models have a consistent pretrained weight loader that adapts last linear if necessary, and from 3 to 1 channel input if desired
-* High performance [reference training, validation, and inference scripts](https://rwightman.github.io/pytorch-image-models/scripts/) that work in several process/GPU modes:
+* High performance [reference training, validation, and inference scripts](https://huggingface.co/docs/timm/training_script) that work in several process/GPU modes:
   * NVIDIA DDP w/ a single GPU per process, multiple processes with APEX present (AMP mixed-precision optional)
   * PyTorch DistributedDataParallel w/ multi-gpu, single process (AMP disabled as it crashes when enabled)
   * PyTorch w/ single GPU single process (AMP optional)
@@ -573,7 +580,7 @@ Several (less common) features that I often utilize in my projects are included.
 * AutoAugment (https://arxiv.org/abs/1805.09501) and RandAugment (https://arxiv.org/abs/1909.13719) ImageNet configurations modeled after impl for EfficientNet training (https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/autoaugment.py)
 * AugMix w/ JSD loss (https://arxiv.org/abs/1912.02781), JSD w/ clean + augmented mixing support works with AutoAugment and RandAugment as well
 * SplitBachNorm - allows splitting batch norm layers between clean and augmented (auxiliary batch norm) data
-* DropPath aka "Stochastic Depth" (https://arxiv.org/abs/1603.09382)
+* DropPath aka "Stochastic Depth" (https://arxiv.org/abs/1603.09382)
 * DropBlock (https://arxiv.org/abs/1810.12890)
 * Blur Pooling (https://arxiv.org/abs/1904.11486)
 * Space-to-Depth by [mrT23](https://github.com/mrT23/TResNet/blob/master/src/models/tresnet/layers/space_to_depth.py) (https://arxiv.org/abs/1801.04590) -- original paper?
@@ -600,19 +607,17 @@ Model validation results can be found in the [results tables](results/README.md)
 
 ## Getting Started (Documentation)
 
-My current [documentation](https://rwightman.github.io/pytorch-image-models/) for `timm` covers the basics.
-
-Hugging Face [`timm` docs](https://huggingface.co/docs/hub/timm) will be the documentation focus going forward and will eventually replace the `github.io` docs above.
+The official documentation can be found at https://huggingface.co/docs/hub/timm. Documentation contributions are welcome.
 
 [Getting Started with PyTorch Image Models (timm): A Practitioner’s Guide](https://towardsdatascience.com/getting-started-with-pytorch-image-models-timm-a-practitioners-guide-4e77b4bf9055) by [Chris Hughes](https://github.com/Chris-hughes10) is an extensive blog post covering many aspects of `timm` in detail.
 
-[timmdocs](http://timm.fast.ai/) is quickly becoming a much more comprehensive set of documentation for `timm`. A big thanks to [Aman Arora](https://github.com/amaarora) for his efforts creating timmdocs.
+[timmdocs](http://timm.fast.ai/) is an alternate set of documentation for `timm`. A big thanks to [Aman Arora](https://github.com/amaarora) for his efforts creating timmdocs.
 
 [paperswithcode](https://paperswithcode.com/lib/timm) is a good resource for browsing the models within `timm`.
 
 ## Train, Validation, Inference Scripts
 
-The root folder of the repository contains reference train, validation, and inference scripts that work with the included models and other features of this repository. They are adaptable for other datasets and use cases with a little hacking. See [documentation](https://rwightman.github.io/pytorch-image-models/scripts/) for some basics and [training hparams](https://rwightman.github.io/pytorch-image-models/training_hparam_examples) for some train examples that produce SOTA ImageNet results.
+The root folder of the repository contains reference train, validation, and inference scripts that work with the included models and other features of this repository. They are adaptable for other datasets and use cases with a little hacking. See [documentation](https://huggingface.co/docs/timm/training_script).
 
 ## Awesome PyTorch Resources
 
```
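
The feature-extraction and kwargs-passthrough behavior referenced in the README hunks above can be sketched as follows. This is a minimal example assuming `timm` (0.8.x pre-release) and `torch` are installed; the model name and shapes are illustrative only:

```
import timm
import torch

# Extra constructor args can be passed straight through create_model
# (analogous to what the --model-kwargs train.py examples do from the command line).
model = timm.create_model('resnet50', pretrained=False, output_stride=16)

# Multi-scale feature maps (feature pyramid) instead of classification logits.
backbone = timm.create_model('resnet50', pretrained=False, features_only=True, out_indices=(2, 3, 4))
feats = backbone(torch.randn(1, 3, 224, 224))
print(backbone.feature_info.channels())  # channel counts for the selected feature levels
print([f.shape for f in feats])

# Forward pass on just the (unpooled) features of a standard classification model.
features = model.forward_features(torch.randn(1, 3, 224, 224))
```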