v3.0.0a0 #3401
- Great work! 🎉
- Are there any examples of fine-tuning a DPA-2 pretrained model?
- Nice work! In the DPA-2 descriptor, the outputs of the …
- When discussing representations of the rotation group, specifically SO(3), it is necessary to introduce the concept of a representation space. In the majority of "equivariant" models, this space is spanned by spherical harmonics. To preserve as much information about the atomic environment as possible, one must increase the completeness of the basis (denoted l in the context of spherical harmonics), but this comes at the cost of increased computational expense. The DP models take a different approach: instead of spanning the atomic environment in a representation space, they utilize the original (equivariant) information without any approximation. This may be why DPA-2 exhibits higher generalizability in our benchmarks, as evidenced by Supplementary S2 of https://arxiv.org/abs/2312.15492.
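For context, a minimal sketch of the truncation this comment refers to: an equivariant model expands the angular content of atom i's neighborhood in spherical harmonics up to a chosen cutoff $l_{\max}$, and whatever lies above $l_{\max}$ is discarded:

$$\rho_i(\hat{\mathbf{r}}) \;\approx\; \sum_{l=0}^{l_{\max}} \sum_{m=-l}^{l} c^{(i)}_{lm}\, Y_{lm}(\hat{\mathbf{r}})$$

Raising $l_{\max}$ recovers more angular detail, at a computational cost that grows rapidly with $l_{\max}$.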
- Hello. Thank you for developing the new version of DeePMD-kit. When will the stable version of this release be available? Will it be possible to install it via conda? Thanks a lot.
-
DeePMD-kit v3: A multiple-backend framework for deep potentials
We are excited to announce the first alpha version of DeePMD-kit v3. DeePMD-kit v3 allows you to train and run deep potential models on top of TensorFlow or PyTorch. DeePMD-kit v3 also supports the DPA-2 model, a novel architecture for large atomic models.
Highlights
Multiple-backend framework
DeePMD-kit v3 adds a pluggable multiple-backend framework to provide consistent training and inference experiences between different backends. You can:
- run inference with any model through the existing interfaces, including `dp test`, the Python/C++/C interfaces, and third-party packages (dpdata, ASE, LAMMPS, AMBER, GROMACS, i-PI, CP2K, OpenMM, ABACUS, etc.); take LAMMPS as an example (see the sketch after this list);
- convert a model between backends with `dp convert-backend`, if both backends support that model (see the second sketch below).
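A minimal sketch of the LAMMPS usage, assuming hypothetical frozen models named `frozen_model.pb` (TensorFlow) and `frozen_model.pth` (PyTorch); the backend is selected by the model file extension:

```lammps
# the same pair_style works for a model from either backend;
# the file extension picks the backend (.pb: TensorFlow, .pth: PyTorch)
pair_style   deepmd frozen_model.pth
pair_coeff   * *
```

And a sketch of converting a model between backends (same hypothetical file names):

```sh
# TensorFlow -> PyTorch, and back again
dp convert-backend frozen_model.pb frozen_model.pth
dp convert-backend frozen_model.pth frozen_model.pb
```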
PyTorch backend: a backend designed for large atomic models and new research
We added the PyTorch backend in DeePMD-kit v3 to support the development of new models, especially for large atomic models.
DPA-2 model: Towards a universal large atomic model for molecular and material simulation
The DPA-2 model is a novel architecture for a Large Atomic Model (LAM): it can accurately represent a diverse range of chemical systems and materials, enabling high-quality simulations and predictions with significantly reduced effort compared to traditional methods. The DPA-2 model is implemented only in the PyTorch backend. An example configuration is in the `examples/water/dpa2` directory.
The DPA-2 descriptor includes two primary components: `repinit` and `repformer`. (The detailed architecture is shown in a figure in the original release notes.)
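As a minimal sketch, training the DPA-2 water example with the PyTorch backend looks like the following (assuming the example input file is named `input_torch.json`, in line with the other PyTorch examples):

```sh
# train the DPA-2 water example on the PyTorch backend
cd examples/water/dpa2
dp --pt train input_torch.json
```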
Training strategies for large atomic models
The PyTorch backend supports multiple training strategies for developing large atomic models.
- Parallel training: large atomic models have many hyper-parameters and a complex architecture, so training on multiple GPUs is necessary. Benefiting from the PyTorch community ecosystem, parallel training for the PyTorch backend can be driven by `torchrun`, a launcher for distributed data parallel training (sketched after this list).
- Multi-task training: large atomic models are trained against data in a wide scope and at different DFT levels, which requires multi-task training. The PyTorch backend supports multi-task training, sharing the descriptor between different tasks. An example is given in `examples/water_multi_task/pytorch_example/input_torch.json`.
- Fine-tuning: fine-tuning is useful for training a pre-trained large model on a smaller, task-specific dataset. The PyTorch backend supports the `--finetune` argument in the `dp --pt train` command line (sketched after this list).
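A minimal sketch of these strategies, assuming a hypothetical input file `input.json` and a hypothetical pre-trained checkpoint `pretrained.pt`:

```sh
# parallel training: launch distributed data parallel training via torchrun
# (--no-python is needed because dp is a console script, not a .py file)
torchrun --nproc_per_node=4 --no-python dp --pt train input.json

# multi-task training: the strategy is configured in the input file itself
dp --pt train examples/water_multi_task/pytorch_example/input_torch.json

# fine-tuning: continue training from a pre-trained large model
dp --pt train input.json --finetune pretrained.pt
```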
Developing new models using Python and dynamic graphs
The static graph and the custom C++ OPs of the TensorFlow backend can be painful for researchers, sacrificing research convenience for computational performance. The PyTorch backend has a well-designed code structure built on the dynamic graph and is currently written entirely in Python, which makes extending and debugging new deep potential models easier than with a static graph.
Supporting traditional deep potential models
People may still want to use, in the PyTorch backend, the traditional models already supported by the TensorFlow backend, and to compare the same model among different backends. We have rewritten almost all of the traditional models in the PyTorch backend; their status is listed below:

|  | Descriptors | Fitting | Models |
| --- | --- | --- | --- |
| Supported | `se_e2_a`, `se_e2_r`, `se_atten`, `hybrid` |  | `standard`, DPRc |
| Not supported yet | `se_e3`, `se_atten_v2`, `se_e2_a_mask` | `dos` | `linear_ener`, DPLR, `pairtab`, `frozen`, `pairwise_dprc`, ZBL, Spin |
| Not planned to support | `loc_frame`, `se_e2_a` + type embedding, `se_a_ebd_v2` |  |  |
Warning
As part of an alpha release, the PyTorch backend's API or user input arguments may change before the first stable version.
DP backend and format: reference backend for other backends
DP is a reference backend for development that uses pure NumPy to implement models, without any heavy deep-learning framework. It cannot be used for training, only for Python inference. As a reference backend, it aims not at the best performance but at correct results. The DP backend uses HDF5 to store model serialization data, which is backend-independent.
The DP backend and the serialization data are used in the unit tests to ensure that different backends produce consistent results and that models can be converted between them.
In the current version, the DP backend has a support status similar to the PyTorch backend, although DPA-1 and DPA-2 are not supported yet.
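As a minimal sketch of backend-independent Python inference (the model file name is hypothetical; the backend is selected by the extension, e.g. `.pb` for TensorFlow, `.pth` for PyTorch, `.hdf5` for the DP backend):

```py
import numpy as np
from deepmd.infer import DeepPot  # universal inference interface

# the backend is inferred from the model file extension
dp = DeepPot("frozen_model.hdf5")

# one frame of a water molecule: coordinates flattened to (nframes, natoms * 3)
coord = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 0.96, 0.93, 0.0, -0.24]])
cell = 10.0 * np.eye(3).reshape(1, 9)  # a 10 Å cubic box
atype = [0, 1, 1]  # atom types following the model's type map, e.g. O, H, H

e, f, v = dp.eval(coord, cell, atype)  # energy, forces, virial
```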
Authors
The above highlights were mainly contributed by the authors of the following pull requests:
- `deepmd_utils` #3173
- add dpdata driver #3174
- docs: rewrite README; deprecate manually written TOC #3179
- docs: document PyTorch backend #3193
- allow disabling TensorFlow backend during Python installation #3200
- pt: add tensorboard and profiler support #3204
- pt: set nthreads from env #3205
- docs: install pytorch in RTD #3333
- docs: dpmodel, model conversion #3360
- docs: apply type_one_side=True to `se_a` and `se_r` #3364
- Hybrid descriptor #3365
- setup PyTorch C++ interface build environement #3169
- add universal Python inference interface DeepPot #3164
- add more abstractmethods to universal `DeepPot` #3175
- cc: reimplement read_file_to_string without calling TensorFlow #3176
- Merge TF and PT CLI #3187
- Fix PT `DeepPot` and replace ASE calculator #3186
- PT: keep the same checkpoint behavior as TF #3191
- add size and replace arguments to deepmd.utils.random.choice #3195
- drop tqdm #3194
- reorganize tests directory #3196
- fix: install CU11 PyTorch in the CU11 docker image #3198
- throw errors when PyTorch CXX11 ABI is different from TensorFlow #3201
- fix GPU test OOM problem #3207
- pt: rename atomic_virial to atom_virial in the model output #3226
- pt: apply global logger to pt #3222
- pt: apply global user set precision to pt #3220
- fix DP_ENABLE_TENSORFLOW support #3229
- drop `deepmd.tf.cluster.slurm` #3239
- add category property to OutputVariableDef #3228
- issue template: change TF version to backend version #3244
- refactor print summary #3243
- refactor DeepEval #3213
- backend-indepedent dp test #3249
- pt: infer model type from ModelOutputDef #3250
- tf: support checkpoint path (instead of directory) in dp freeze #3254
- add `get_type_map` method to model; export model methods #3247
- pt: support loading frozen models in DeepEval #3253
- add neighbor stat support with NumPy and PyTorch implementation #3271
- support TF se_e2_a serialization; add a common test fixture to compare TF, PT, and DP models #3263
- pt: add exported methods to BaseAtomicModel #3258
- pt: fix torchscript converage #3276
- pt: refactor data stat #3285
- consistent energy fitting #3286
- dp&pt: let DPAtomicModel fetch attributes from Fitting #3292
- pluggable backend #3294
- pt: process frames in parallel for env mat stat #3293
- pt: avoid `set_default_dtype` in tests #3303
- fix neighbor stat mixed_types input #3304
- allow both absulute and relative tolerance when testing consistency #3308
- pt: explicitly set device #3307
- consistent energy model #3306
- merge common subcommands in cli #3316
- pt: fix se_e2_a precision cast #3315
- pt: export `model_output_type` instead of `model_output_def` #3318
- feat: convert model files between backends #3323
- store type in descriptor serialization data #3325
- feat: add NumPy DeepPot #3332
- store type in fitting serialization data #3331
- pt: remove env.DEVICE in all `forward` functions #3330
- feat(pt/dpmodel): support type_one_side in se_e2_a #3339
- add BaseModel; store type in serialization #3335
- apply PluginVariant and make_plugin_registry to classes #3346
- add @version to serialization data #3349
- pt: support `--init-frz-model` #3350
- merge compute_output_stat #3310
- feat(pt): support fparam/aparam in DeepEval #3356
- pt: fix se_a type_one_side performance degradation #3361
- pt: apply argcheck to pt #3342
- feat: update sel by statistics #3348
- feat(pt): support fparam/aparam in C++ DeepPot #3358
- fix se_r consistency #3366
- sync descriptor alias #3374
- feat: atom_ener in energy fitting #3370
- docs: DPRc for PT, DPModel #3373
- pt: supprot `--output` in `dp train` #3377
- fix: prevent deepmd.tf be imported globally #3382
- pt: print data summary #3383
- pt: expand systems before training #3384
- pt: add fparam/aparam data requirements #3386
- breaking: change DeepTensor output dim from nsel_atoms to natoms #3390
- pt: ban `torch.testing.assert_allclose` #3395
- allow loading either nsel or natoms atomic tensor data #3394
- do not return g2, h2, sw in hybrid descriptors #3396
- format training logging #3397
- `change_energy_bias` and fix finetune #3378
Breaking changes
- … `.pb` extension.
- `DeepTensor` (including `DeepDipole` and `DeepPolar`) now returns the atomic tensor in the dimension of `natoms` instead of `nsel_atoms`. by @njzjz in breaking: change DeepTensor output dim from nsel_atoms to natoms #3390
- The `deepmd` module was moved to `deepmd.tf` without other API changes, and `deepmd_utils` was moved to `deepmd` without other API changes. by @njzjz in breaking: move deepmd to deepmd.tf #3177 and breaking: move deepmd_utils to deepmd #3178
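A minimal sketch of what the module move means for imports (hedged: the exact class paths are inferred from the PRs above):

```py
# before v3, the TensorFlow implementation lived in the top-level deepmd
# package; in v3 it moved to deepmd.tf, while the backend-independent code
# formerly in deepmd_utils now lives under deepmd
from deepmd.tf.infer import DeepPot as TFDeepPot  # TensorFlow-backend class
from deepmd.infer import DeepPot                  # backend-independent class
```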
Other changes
Enhancement
CI/CD
Bugfix
New Contributors
Full Changelog: v2.2.8...v3.0.0a0
This discussion was created from the release v3.0.0a0.