Finetune MatterSim
==================

Finetune Script
---------------

MatterSim provides a finetuning script for adapting the pre-trained
MatterSim model to a custom dataset. You can find the script in the
``training`` folder or on
`GitHub <https://github.com/microsoft/mattersim/blob/main/src/mattersim/training/finetune_mattersim.py>`_.

Finetune Parameters
-------------------

The finetune script accepts several command-line arguments to customize the training process. Below is a list of the available parameters:

- **run_name**: (str) The name of the run. Default is "example".

- **train_data_path**: (str) Path to the training data file. Supports file formats readable by ASE (e.g., ``.xyz``, ``.traj``, ``.cif``) as well as ``.pkl`` files; see the data-preparation sketch after this list. Default is "./sample.xyz".

- **valid_data_path**: (str) Path to the validation data file. Default is None.

- **load_model_path**: (str) Path to the pre-trained model to load. Default is "mattersim-v1.0.0-1m".

- **save_path**: (str) Path to save the trained model. Default is "./results".

- **save_checkpoint**: (bool) Whether to save checkpoints during training. Default is False.

- **ckpt_interval**: (int) Interval (in epochs) at which checkpoints are saved. Default is 10.

- **device**: (str) Device to use for training, either "cuda" or "cpu". Default is "cuda".

- **cutoff**: (float) Cutoff radius for two-body interactions. Default is 5.0.

- **threebody_cutoff**: (float) Cutoff radius for three-body interactions; should be smaller than the two-body cutoff. Default is 4.0.

- **epochs**: (int) Number of training epochs. Default is 1000.

- **batch_size**: (int) Batch size for training. Default is 16.

- **lr**: (float) Learning rate for the optimizer. Default is 2e-4.

- **step_size**: (int) Step size for the learning rate scheduler. Default is 10.

- **include_forces**: (bool) Whether to include forces in the training loss. Default is True.

- **include_stresses**: (bool) Whether to include stresses in the training loss. Default is False.

- **force_loss_ratio**: (float) Weight of the force loss in the total loss. Default is 1.0.

- **stress_loss_ratio**: (float) Weight of the stress loss in the total loss. Default is 0.1.

- **early_stop_patience**: (int) Patience (in epochs) for early stopping. Default is 10.

- **seed**: (int) Random seed for reproducibility. Default is 42.

- **re_normalize**: (bool) Whether to re-normalize energies and forces according to the new data. Default is False.

- **scale_key**: (str) Key for scaling forces. Only used when ``re_normalize`` is True. Default is "per_species_forces_rms".

- **shift_key**: (str) Key for shifting energies. Only used when ``re_normalize`` is True. Default is "per_species_energy_mean_linear_reg".

- **init_scale**: (float) Initial scale value. Only used when ``re_normalize`` is True. Default is None.

- **init_shift**: (float) Initial shift value. Only used when ``re_normalize`` is True. Default is None.

- **trainable_scale**: (bool) Whether the scale is trainable. Only used when ``re_normalize`` is True. Default is False.

- **trainable_shift**: (bool) Whether the shift is trainable. Only used when ``re_normalize`` is True. Default is False.

- **wandb**: (bool) Whether to use Weights & Biases for logging. Default is False.

- **wandb_api_key**: (str) API key for Weights & Biases. Default is None.

- **wandb_project**: (str) Project name for Weights & Biases. Default is "wandb_test".

These parameters allow you to customize the finetuning process to suit your specific dataset and computational resources.
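
Before launching a run, the labelled structures need to be written in one of the formats listed above, with reference energies and forces attached to each structure. The snippet below is a minimal sketch, assuming an extended-XYZ file in which the labels are attached via ASE's ``SinglePointCalculator``; the placeholder energies and forces and the ``xyz_files/train.xyz`` path are purely illustrative.

.. code-block:: python

    # Minimal sketch (assumption): assemble an extended-XYZ training file with ASE,
    # attaching reference energies (eV) and forces (eV/Å) to each structure.
    from pathlib import Path

    import numpy as np
    from ase.build import bulk
    from ase.calculators.singlepoint import SinglePointCalculator
    from ase.io import write

    structures = []
    for a in (3.55, 3.60, 3.65):                   # toy lattice-constant scan
        atoms = bulk("Cu", "fcc", a=a, cubic=True)
        energy = -3.5 * len(atoms)                 # placeholder for a DFT energy
        forces = np.zeros((len(atoms), 3))         # placeholder for DFT forces
        atoms.calc = SinglePointCalculator(atoms, energy=energy, forces=forces)
        structures.append(atoms)

    # Write all structures to a single extended-XYZ file for --train_data_path.
    Path("xyz_files").mkdir(exist_ok=True)
    write("xyz_files/train.xyz", structures, format="extxyz")

A validation file for ``--valid_data_path`` can be produced the same way from a held-out subset of structures.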

Finetune Example
----------------

Replace the data paths below with the paths to your own training and validation files.

.. code-block:: bash

    torchrun --nproc_per_node=1 src/mattersim/training/finetune_mattersim.py \
        --load_model_path mattersim-v1.0.0-1m \
        --train_data_path xyz_files/train.xyz \
        --valid_data_path xyz_files/valid.xyz \
        --batch_size 16 \
        --lr 2e-4 \
        --step_size 20 \
        --epochs 200 \
        --save_path ./finetune_result \
        --save_checkpoint \
        --ckpt_interval 20 \
        --include_stresses \
        --include_forces
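
Once training finishes, the weights written under ``--save_path`` can be loaded back into the ASE calculator interface that MatterSim provides. The snippet below is a minimal sketch; the checkpoint filename ``best_model.pth`` is an assumption, so substitute whatever file the script actually writes to ``./finetune_result``.

.. code-block:: python

    # Minimal sketch (assumption): load the finetuned checkpoint into the ASE
    # calculator interface and evaluate a structure with it.
    import torch
    from ase.build import bulk
    from mattersim.forcefield import MatterSimCalculator

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # "best_model.pth" is an assumed filename; use the checkpoint actually
    # written under --save_path (./finetune_result in the example above).
    calc = MatterSimCalculator(load_path="./finetune_result/best_model.pth", device=device)

    atoms = bulk("Cu", "fcc", a=3.6, cubic=True)
    atoms.calc = calc
    print("Potential energy (eV):", atoms.get_potential_energy())
    print("Forces (eV/Å):", atoms.get_forces())

From here the finetuned model behaves like any other ASE calculator and can drive relaxations or molecular dynamics.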