I successfully trained a model with deepmd-kit and now want to run dpgen autotest to calculate physical properties.
Following the dpgen documentation, I prepared relaxation_T.json and machine_local.json.
I ran: dpgen autotest make relaxation_T.json
This completed successfully.
Then I ran: dpgen autotest run relaxation_T.json machine_local.json
This failed with an error.
dpgen appears to be trying to submit jobs, but I am running it in my local shell, so I did not expect any job submission.
The error message refers to an "unexpected submission state".
pymatgen unknown version or path
monty 2024.4.17 /home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/monty
ase 3.22.1 /home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/ase
paramiko 3.4.0 /home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/paramiko
custodian 2024.4.18 /home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/custodian
Reference
Please cite:
Yuzhi Zhang, Haidi Wang, Weijie Chen, Jinzhe Zeng, Linfeng Zhang, Han Wang, and Weinan E,
DP-GEN: A concurrent learning platform for the generation of reliable deep learning
based potential energy models, Computer Physics Communications, 2020, 107206.
Description
/home/lijh/HfO2/4phase-200w/autotest --> Runing...
2024-05-14 15:53:04,453 - INFO : info:check_all_finished: False
2024-05-14 15:53:04,457 - INFO : job: b910e4a6be4620f8b89f5ed1af23cab264b0e786 submit; job_id is 31369
2024-05-14 15:53:35,592 - INFO : job: b910e4a6be4620f8b89f5ed1af23cab264b0e786 31369 terminated; fail_cout is 1; resubmitting job
2024-05-14 15:53:35,642 - INFO : job:b910e4a6be4620f8b89f5ed1af23cab264b0e786 re-submit after terminated; new job_id is 31708
2024-05-14 15:53:35,851 - INFO : job:b910e4a6be4620f8b89f5ed1af23cab264b0e786 job_id:31708 after re-submitting; the state now is <JobStatus.running: 3>
2024-05-14 15:54:05,986 - INFO : job: b910e4a6be4620f8b89f5ed1af23cab264b0e786 31708 terminated; fail_cout is 2; resubmitting job
2024-05-14 15:54:06,029 - INFO : job:b910e4a6be4620f8b89f5ed1af23cab264b0e786 re-submit after terminated; new job_id is 32098
2024-05-14 15:54:06,238 - INFO : job:b910e4a6be4620f8b89f5ed1af23cab264b0e786 job_id:32098 after re-submitting; the state now is <JobStatus.running: 3>
2024-05-14 15:54:36,367 - INFO : job: b910e4a6be4620f8b89f5ed1af23cab264b0e786 32098 terminated; fail_cout is 3; resubmitting job
Traceback (most recent call last):
  File "/home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/dpdispatcher/submission.py", line 358, in handle_unexpected_submission_state
    job.handle_unexpected_job_state()
  File "/home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/dpdispatcher/submission.py", line 862, in handle_unexpected_job_state
    raise RuntimeError(err_msg)
RuntimeError: job:b910e4a6be4620f8b89f5ed1af23cab264b0e786 32098 failed 3 times.
Possible remote error message: ==> /home/lijh/HfO2/4phase-200w/autotest/work/b279de7e9ede8ee9b4d5502ff7df4cc95cbe3866/confs/T_phase/relaxation/relax_task/errlog <==
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/lijh/anaconda3/envs/deepmd/bin/dpgen", line 8, in <module>
    sys.exit(main())
  File "/home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/dpgen/main.py", line 255, in main
    args.func(args)
  File "/home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/dpgen/auto_test/run.py", line 58, in gen_test
    run_task(args.TASK, args.PARAM, args.MACHINE)
  File "/home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/dpgen/auto_test/run.py", line 34, in run_task
    run_equi(confs, inter_parameter, mdata)
  File "/home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/dpgen/auto_test/common_equi.py", line 197, in run_equi
    submission.run_submission()
  File "/home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/dpdispatcher/submission.py", line 261, in run_submission
    self.handle_unexpected_submission_state()
  File "/home/lijh/anaconda3/envs/deepmd/lib/python3.10/site-packages/dpdispatcher/submission.py", line 362, in handle_unexpected_submission_state
    raise RuntimeError(
RuntimeError: Meet errors will handle unexpected submission state.
Debug information: remote_root==/home/lijh/HfO2/4phase-200w/autotest/work/b279de7e9ede8ee9b4d5502ff7df4cc95cbe3866.
Debug information: submission_hash==b279de7e9ede8ee9b4d5502ff7df4cc95cbe3866.
Please check error messages above and in remote_root. The submission information is saved in /home/lijh/.dpdispatcher/submission/b279de7e9ede8ee9b4d5502ff7df4cc95cbe3866.json.
For furthur actions, run the following command with proper flags: dpdisp submission b279de7e9ede8ee9b4d5502ff7df4cc95cbe3866
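The output above points to two places where the underlying failure may be recorded: the task's errlog file and the remote_root work directory. As a rough sketch, the helper below walks a directory and returns the last lines of every non-empty log file it finds. The function name `collect_error_logs` and its parameters are hypothetical and not part of dpgen or dpdispatcher; the only file name taken from the output above is `errlog`.

```python
from pathlib import Path


def collect_error_logs(root, names=("errlog",), tail_lines=20):
    """Return {path: tail} for each non-empty file under root whose name is in names.

    Hypothetical helper for illustration; the default file name "errlog" is
    taken from the error message, everything else is an assumption.
    """
    found = {}
    for path in sorted(Path(root).rglob("*")):
        # Skip directories, unrelated files, and empty logs.
        if path.is_file() and path.name in names and path.stat().st_size > 0:
            lines = path.read_text(errors="replace").splitlines()
            found[str(path)] = "\n".join(lines[-tail_lines:])
    return found


# Example call, using the remote_root printed in the debug information:
# for path, tail in collect_error_logs(
#     "/home/lijh/HfO2/4phase-200w/autotest/work/b279de7e9ede8ee9b4d5502ff7df4cc95cbe3866"
# ).items():
#     print(f"==> {path} <==")
#     print(tail)
```

If the result is empty, every matching log file under that directory is empty or missing, and the cause has to be looked for elsewhere in the work directory.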
Summary
My JSON files are attached:
machine_local.json
relaxation_T.json
I would like to know whether there are any mistakes in the JSON files and how I can resolve this error.
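One way to see how a machine file asks jobs to be dispatched is to list every `batch_type` and `context_type` entry it contains. The sketch below does this with a generic recursive search. `find_dispatch_settings` and `load_machine_file` are hypothetical helpers, and the sketch assumes only that the file stores those two dpdispatcher keys somewhere in its JSON tree; it does not validate the file against the dpgen schema.

```python
import json


def find_dispatch_settings(node, path="$"):
    """Recursively collect (json_path, key, value) for batch_type and context_type."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key in ("batch_type", "context_type"):
                hits.append((child, key, value))
            else:
                hits.extend(find_dispatch_settings(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_dispatch_settings(item, f"{path}[{index}]"))
    return hits


def load_machine_file(filename):
    """Load a JSON machine file from disk."""
    with open(filename) as handle:
        return json.load(handle)


# Example call on the attached file:
# for json_path, key, value in find_dispatch_settings(load_machine_file("machine_local.json")):
#     print(json_path, "=", value)
```

The printed paths show which section of the file each setting belongs to, which makes it easier to compare the file with the examples in the documentation.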
DP-GEN Version
0.12.1
Platform, Python Version, etc
Platform: WSL Ubuntu 22.04
Python version: 3.10.13
Details