Releases: deepmodeling/deepmd-kit
v2.0.0
Breaking changes to v1.3
- Training parameters: Several training parameters have been updated. The original training data is now split into training data and validation data. Please read the documentation to apply the changes. Old styles still work but are not recommended.
- Model inference: Old models trained by v1 will not work in v2. Run `dp convert-from` to convert old models to v2.
- Python interface: `deepmd.DeepPot` has been moved to `deepmd.infer.DeepPot`.
- C++ interface: `NNPInter` has been renamed to `deepmd::DeepPot` and `NNPInter.h` has been renamed to `DeepPot.h`. Use `-ldeepmd_cc` to link instead.
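For scripts that have to run against both v1 and v2 installations, the Python interface move can be absorbed by a small import shim. The following is a minimal sketch and not part of deepmd-kit: the helper name `load_deeppot_class` is hypothetical, and it relies only on the two module locations named in the list above.

```python
import importlib


def load_deeppot_class():
    """Return the DeepPot class, preferring the v2 location.

    Hypothetical helper: tries deepmd.infer (v2) first, then falls back
    to the top-level deepmd module (v1).
    """
    for module_name in ("deepmd.infer", "deepmd"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # this location is not available, try the next one
        cls = getattr(module, "DeepPot", None)
        if cls is not None:
            return cls
    raise ImportError("DeepPot not found; is deepmd-kit installed?")
```

The shim only resolves where the class lives. Models trained by v1 still need to be converted before v2 can load them.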
New features
- Model compression (#350 #586 #610 #921 #948 #956 #1000 #1008 #1020 #1043)
- Parallel training (#892 #905 #913 #1030 #1032) (Bytedance)
- ROCm device support (#656 )
- New descriptor: three body embedding (`se_e3`)
- Hybridization of descriptors (`hybrid`)
- Type embedding
- Training and inference of the dipole (vector) and polarizability (matrix). (#495 #538 #927)
- Support derivatives of the tensor properties. (#805)
- Split of training and validation dataset.
- Model deviation for virial
- Add subcommand and python interface to calculate model-deviation (#715)
- Automatically determine the sel from the training data. (#831)
- Building with LAMMPS in plugin mode (#930 #945)
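To illustrate what a force model deviation measures, here is a minimal sketch. It assumes the commonly used definition: for each atom, take the standard deviation of the predicted force vector across an ensemble of models, then report the maximum over atoms. The function name and the nested-list data layout are hypothetical, and this is not the code behind the model-deviation subcommand.

```python
import math


def max_force_deviation(forces_per_model):
    """Maximum over atoms of the force standard deviation across models.

    forces_per_model[m][i] is the (fx, fy, fz) force on atom i predicted
    by model m. Assumed definition, see the paragraph above.
    """
    n_models = len(forces_per_model)
    n_atoms = len(forces_per_model[0])
    max_dev = 0.0
    for i in range(n_atoms):
        # Ensemble-averaged force on atom i, component by component.
        mean = [
            sum(forces_per_model[m][i][k] for m in range(n_models)) / n_models
            for k in range(3)
        ]
        # Mean squared distance of each model's force from that average.
        variance = sum(
            sum((forces_per_model[m][i][k] - mean[k]) ** 2 for k in range(3))
            for m in range(n_models)
        ) / n_models
        max_dev = max(max_dev, math.sqrt(variance))
    return max_dev
```

A large value means the models disagree on at least one atom.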
Performance improvement:
- More efficient training: all customized OPs are implemented with GPU.
- MPI support for atomic model deviation #628
- speedup ROCm kernels which use atomicAdd (#809 #815 ) (from ByteDance)
- speedup CUDA kernels (use atomicAdd inside) by reducing the global memory write (#811)
- speedup tabulate cuda kernel by reducing shared memory usage (#830) (Bytedance)
- speedup `format_nlist_b` (#832 #845)
- speedup `scan_nlist` kernel (#1028)
Enhancements
- Strict argument check in the input script.
- Auto conversion of input file to v2.0 compatibility
- Append out_file when lammps restarts #640
- Document and examples for the C++ interface #652 #663
- Instructions for the i-pi #660
- Document for the network size and sel #657
- Use fmod to wrap the coord of atoms (solve slow PBC) (#741)
- bit operations to encode neighbor information
- add CUDA/ROCM building documents (#739)
- add type-embedding developer doc (#762 #967)
- add model compression support for models with exclude_types feature (#754)
- improve the doc and user interface of model compression (#772)
- support converting models generated in v1.3 to 2.0 compatibility (#725)
- give a default value to T and convert models from v1.2 to 2.0 compatibility (#789)
- improved documents for conda (#798 #925)
- throw a message if tf runtime is incompatible (#797)
- capture OOM and print debug message (#801)
- add message for DecodeError raised when using model compression (#839)
- Passing error to TF instead of exit (#918)
- refactor docs (#952)
- add an example of `nopbc` and related docs (#994)
- add `dp --version` (#995)
- add the argument `tensorboard_freq` to control sampling ratio during training. (#996)
- add sphinx plugins `viewcode` and `intersphinx` (#997)
- generate Python API document automatically (#998)
- give a clear message if `model.get_ntypes()<data.get_ntypes()` (#1016)
- add docstring for `descrpt/se_e2_a` (#1017)
- add docstring for `fit/ener` (#1024)
- add `InputNlist` into API doc (#1009)
- save checkpoint files with step and keep recent files (#1031)
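The fmod-based coordinate wrapping mentioned in the list above (#741) can be sketched as follows. This is an illustration only, assuming an orthorhombic box wrapped one axis at a time. The function name is hypothetical and the actual implementation in deepmd-kit may differ.

```python
import math


def wrap_coordinate(x, box_length):
    """Map x into the periodic interval [0, box_length) along one axis."""
    wrapped = math.fmod(x, box_length)  # result keeps the sign of x
    if wrapped < 0.0:
        wrapped += box_length
    if wrapped >= box_length:  # rounding guard for tiny negative x
        wrapped = 0.0
    return wrapped
```

A single `fmod` call brings an atom back into the box however far outside it has drifted, so the cost does not grow with the distance from the box.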
Improvement of the code for developers
- Support version of the model. Easily check model compatibility
- Clear and pythonic python interface
- C++ lib that can be tested independently
- C++ API that can be tested independently
- OP supports multi-device.
- Added `deepmd` namespace for the C++ API
- UT for Cuda/ROCm code (#569)
- UT for model compression (#586)
- UT for prod_force/virial ops (#703 #741)
- CI test Lammps build (#600)
- allow c++ tests to run without internet (#785)
- build low and high precision at the same time (#879)
- support to specify CUDA/ROCm root in python pkg building (#834) (Bytedance)
- use cached Session to speed up py tests (#833)
- remove cub include for CUDA>=11 (#866)
- Add Errcheck after every kernel function runs And merge redundant code (#855)
- adapt changes to auditwheel directory in manylinux (#889)
- enhance the cli to generate doc json file (#891)
- raise warning before training if `sel` is not enough (#914)
- make native MD compatible with v2.0 (#950)
- fix type hints and add doc for `exclude_types` (#1005)
- use TF's built-in method to get numpy dtype (#1035)
Bug fixings:
- Remove `using namespace std`. Solve compiling compatibility problem.
- cuda memory access error #566
- Relative force model deviation is not copied back at single precision #599
- Correct way of allocating memory in float precision #612
- Fix TB logdir remove bug #617
- Illegal nlist #680
- Bug in `prod_virial_grad` that causes wrong results when training with virials #685
- Uniform random seed #691
- fix bug of adding int to a None random seed (#705)
- reuse the zero layer rather than building a new one (#714)
- fix bug in CI (#739)
- fix bug 824 and synchronize updates to CUDA code (#828)
- Fix the empty neighbor distance array in neighbor_stat.py (#882)
- fix InvalidArgumentError caused by zero sel and optimize zero matrix (#900)
- fix 'NoneType' has no len() in auto_sel (#911)
- set input `DeepmdData.type_map` to input `type_map` (#924)
- Fix member declaration of `deepmd` and `deepmd.entrypoints`. (#922)
- add aliases to Arguments (#933)
- fix bug of gelu activation function (#939)
- convert `decay_rate` to `stop_lr` from old inputs (#949)
- only enable link what you use on GNU compilers (#962)
- Do not find protobuf for python (#963)
- fix an error in stress by ase interface (#964)
- remove bare `except` and limit the `try` clause (#977)
- fix python cmake error (#976)
- Instantiate RunOptions first when training. (#1019)
- Fix compiler type in cmake: `CMAKE_COMPILER_IS_GNUCXX` (#1038)
- other cleanups of the code (#968 #970 #975 #999 #1004 #1002 #1001 #1010 #1014 #1012 #1011 #1021 #1036 #1037)
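The conversion from `decay_rate` to `stop_lr` listed above (#949) can be understood through a short sketch. It assumes a plain exponential schedule, lr(t) = start_lr * decay_rate ** (t / decay_steps), with no staircase rounding. The function name is hypothetical, and `start_lr`, `decay_steps` and `stop_batch` are assumed here to be the relevant input-script quantities. The converter in deepmd-kit may handle details differently.

```python
def stop_lr_from_decay_rate(start_lr, decay_rate, decay_steps, stop_batch):
    """Learning rate reached at stop_batch under exponential decay.

    Assumes lr(t) = start_lr * decay_rate ** (t / decay_steps),
    ignoring any staircase rounding of the exponent.
    """
    return start_lr * decay_rate ** (stop_batch / decay_steps)


def decay_rate_from_stop_lr(start_lr, stop_lr, decay_steps, stop_batch):
    """Inverse of the relation above, under the same assumption."""
    return (stop_lr / start_lr) ** (decay_steps / stop_batch)
```

Specifying the final learning rate rather than the per-step rate keeps the end point of the schedule fixed when the number of training steps changes.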
Contributors
- Han Bao
- Roberto Car
- Junhan Chang
- Yixiao Chen
- Ye Ding
- Weinan E
- Jiequn Han
- Li'ang Huang
- Weile Jia
- Zeyu Li
- Ziyao Li
- Yinnian Lin
- Yihao Liu
- Xinzijian Liu
- Denghui Lu
- Marián Rynik
- Shaochen Shi
- Ping Tuo
- Bo Wang
- Haidi Wang
- Han Wang
- Yingze Wang
- Yu Xia
- Fengbo Yuan
- Jiabin Yang
- Haotian Ye
- Jinzhe Zeng
- Duo Zhang
- Linfeng Zhang
- Yuzhi Zhang
v2.0.0-beta.4
New features:
- parallel training (#892 #905 #913) (Bytedance)
- automatically determine the `sel` from the training data. (#831)
- build low and high precision at the same time (#879)
Performance improvement:
- speedup tabulate cuda kernel by reducing shared memory usage (#830) (Bytedance)
- speedup format_nlist_b (#832 #845)
Enhancements:
- support to specify CUDA/ROCm root in python pkg building (#834) (Bytedance)
- use cached Session to speed up py tests (#833)
- add message for DecodeError raised when using model compression (#839)
- remove cub include for CUDA>=11 (#866)
- Add Errcheck after every kernel function runs And merge redundant code (#855)
- adapt changes to auditwheel directory in manylinux (#889)
- enhance the cli to generate doc json file (#891)
- raise warning before training if sel is not enough (#914)
Bug fixings:
v2.0.0-beta.3
New feature:
- derivatives for deep tensor (#805)
Performance improvement:
- speedup ROCm kernels which use atomicAdd (#809 #815 ) (from ByteDance)
- speedup CUDA kernels (use atomicAdd inside) by reducing the global memory write (#811)
Enhancement:
- add type-embedding developer doc (#762)
- add model compression support for models with exclude_types feature (#754)
- improve the doc and user interface of model compression (#772)
- allow c++ tests to run without internet (#785)
- support converting models generated in v1.3 to 2.0 compatibility (#725)
- give a default value to T and convert models from v1.2 to 2.0 compatibility (#789)
- improved documents for conda (#798)
- throw a message if tf runtime is incompatible (#797)
- capture OOM and print debug message (#801)
Bug fixings
v2.0.0-beta.2
New features:
- Add subcommand and python interface to calculate model-deviation (#715)
Enhancements
- Use fmod to wrap the coord of atoms. UT for force/virial ops (#741)
- UT for model devi C++ interface (#731)
- add CUDA/ROCM building documents (#739)
- add op unittests for prod_force, prod_virial, prod_force_grad and prod_virial_grad (#703)
Bug fixings:
v2.0.0-beta.1
v2.0.0-beta.0
Increment to v2.0.0-alpha:
New features:
- Atom type embedding
- Model deviation for virial
Enhancement:
- Improved documentation
- Better support for dipole and polarizability learning
- bit operations to encode neighbor information
- MPI support for atomic model deviation #628
- UT for GPU code #569
- UT for model compression #586
- Test Lammps build #600
Bug fixings
v2.0.0-alpha.1
What's new to v2.0.0-alpha.0
- Training and inference of the dipole (vector).
- Split of training and validation dataset.
Enhancement:
- Strict argument check in the input script.
- Update readme for v2.0
- Auto conversion of input file to v2.0 compatibility
Bug fixings:
- Fix bugs of broken examples.
v2.0.0-alpha.0
The very first alpha release of deepmd-kit version 2.0.0. It includes the following new features
- Model compression
- New descriptor: three body embedding
- Hybridization of descriptors
- Long-range modification
- Type embedding (under development)
- Training and inference of the dipole (vector) and polarizability (matrix). (under development)
- Split of training and validation dataset. (under development)
- ROCm device support (under development)
Enhancements
- More efficient training: all customized OPs are implemented with GPU.
- Parallel training with multiple GPU support (under development)
Improvement of the code for developers
- Supports version of the model. Easily check model compatibility
- Clear and pythonic python interface
- C++ API that can be tested independently
- OP supports multi-device.
Bug fixings:
- remove `using namespace std`. Solves compiling compatibility problem.
- added `deepmd` namespace for the C++ API