Details
I tested an HSE SCF calculation on a magnetic system. The example is the bcc Fe conventional cell:
ATOMIC_SPECIES
Fe 55.845 Fe_ONCV_PBE-1.0.upf upf201
NUMERICAL_ORBITAL
Fe_gga_8au_100Ry_4s2p2d1f.orb
LATTICE_CONSTANT
1.889726
LATTICE_VECTORS
2.8301511117 0.0000000000 0.0000000000 #latvec1
0.0000000000 2.8301511117 0.0000000000 #latvec2
0.0000000000 -0.0000000000 2.8301511117 #latvec3
ATOMIC_POSITIONS
Direct
Fe #label
2 #magnetism
2 #number of atoms
0.0000000000 0.0000000000 0.0000000000 m 1 1 1
0.5000000000 0.5000000000 0.5000000000 m 1 1 1
The k-point mesh in KPT is 9 9 9.
Fe-HSE.tar.gz
Information:
ABACUS version: 3.4.4: Commit: 5f9d472 (Mon Dec 4 14:10:21 2023 +0800)
Dependence: Intel-OneAPI and Intel-toolchain
LibRI and LibComm: latest version before Nov 18
My initial INPUT was:
#Parameters (1.General)
suffix Fe # suffix of OUTPUT DIR
nspin 2 # 1/2/4 4 for SOC
symmetry 0 # 0/1 1 for open, default
esolver_type ksdft # ksdft, ofdft, sdft, tddft, lj, dp
dft_functional hse # same as upf file, can be lda/pbe/scan/hf/pbe0/hse
ks_solver genelpa # default for ksdft-lcao
vdw_method none # none, d3, d3_bj
#Parameters (2.Iteration)
calculation scf # scf relax cell-relax md
ecutwfc 100
scf_thr 1e-7
scf_nmax 300
#Parameters (3.Basis)
basis_type lcao # lcao or pw
#Parameters (4.Smearing)
smearing_method mp # mp/gaussian/fixed
smearing_sigma 0.002 # Rydberg
#Parameters (5.Mixing)
mixing_type broyden # pulay/broyden
#Parameters (6.Calculation)
cal_force 1
cal_stress 1
out_stru 1 # print STRU in OUT
out_chg 1 # print CHG or not
out_bandgap 1
out_mul 1
With this INPUT it is very hard to converge to scf_thr 1e-7; the run could not even reach scf_thr 1e-6 within 5 days using OMP_NUM_THREADS=16 mpirun -np 4 abacus on Intel-8358.
# After more than 700 lines of output and 4 days of calculation
Updating EXX and rerun SCF
GE1 5.32e+00 5.81e+00 -6.437418e+03 0.000000e+00 1.291e-06 9.196e+00
GE2 5.32e+00 5.81e+00 -6.437418e+03 1.364268e-09 7.188e-07 8.863e+00
GE3 5.32e+00 5.81e+00 -6.437418e+03 1.468676e-09 3.518e-07 8.800e+00
GE4 5.32e+00 5.81e+00 -6.437418e+03 5.839128e-10 2.326e-07 8.802e+00
GE5 5.32e+00 5.81e+00 -6.437418e+03 -2.100539e-09 3.236e-08 8.843e+00
Updating EXX and rerun SCF
GE1 5.32e+00 5.81e+00 -6.437418e+03 0.000000e+00 1.058e-06 9.111e+00
GE2 5.32e+00 5.81e+00 -6.437418e+03 -1.546015e-09 5.929e-07 8.805e+00
GE3 5.32e+00 5.81e+00 -6.437418e+03 -1.840679e-10 2.948e-07 8.871e+00
GE4 5.32e+00 5.81e+00 -6.437418e+03 1.423819e-09 4.995e-08 8.820e+00
After seeing #3103, I added a parameter to my INPUT:
After that, convergence improved: in a 2-day calculation with OMP_NUM_THREADS=24 mpirun -np 2 abacus on Intel-8162, the SCF converged to scf_thr 1e-6, but not to scf_thr 1e-7.
START CHARGE : atomic
DONE(177.792 SEC) : INIT SCF
ITER TMAG AMAG ETOT(eV) EDIFF(eV) DRHO TIME(s)
GE1 4.01e+00 4.01e+00 -6.440073e+03 0.000000e+00 4.826e-02 4.429e+00
GE2 4.31e+00 4.41e+00 -6.440405e+03 -3.311553e-01 1.996e-02 3.688e+00
GE3 4.33e+00 4.43e+00 -6.440409e+03 -4.691903e-03 5.726e-03 3.677e+00
GE4 4.33e+00 4.43e+00 -6.440409e+03 2.332581e-04 3.079e-03 3.684e+00
GE5 4.33e+00 4.43e+00 -6.440409e+03 -5.472160e-05 1.219e-03 3.626e+00
GE6 4.33e+00 4.43e+00 -6.440409e+03 -1.579811e-05 1.703e-04 3.681e+00
GE7 4.33e+00 4.43e+00 -6.440409e+03 -2.383246e-07 6.439e-05 3.724e+00
GE8 4.33e+00 4.43e+00 -6.440409e+03 -6.277874e-08 2.805e-05 3.635e+00
GE9 4.33e+00 4.43e+00 -6.440409e+03 -2.755682e-08 9.261e-06 3.668e+00
GE10 4.33e+00 4.43e+00 -6.440409e+03 1.987624e-10 9.984e-07 3.717e+00
GE11 4.33e+00 4.43e+00 -6.440409e+03 1.256766e-09 1.477e-07 3.667e+00
GE12 4.33e+00 4.43e+00 -6.440409e+03 -2.078884e-09 8.750e-08 3.641e+00
Updating EXX and rerun SCF
GE1 5.07e+00 5.25e+00 -6.432274e+03 0.000000e+00 6.975e-02 1.732e+01
GE2 5.12e+00 5.38e+00 -6.437178e+03 -4.903432e+00 5.335e-02 1.714e+01
GE3 5.08e+00 5.37e+00 -6.437337e+03 -1.595823e-01 2.761e-02 1.717e+01
GE4 5.08e+00 5.36e+00 -6.436762e+03 5.755460e-01 2.955e-02 1.724e+01
GE5 5.18e+00 5.45e+00 -6.437070e+03 -3.075961e-01 1.282e-02 1.730e+01
GE6 5.20e+00 5.46e+00 -6.437078e+03 -8.548606e-03 8.137e-03 1.715e+01
GE7 5.19e+00 5.45e+00 -6.437053e+03 2.523551e-02 9.021e-03 1.717e+01
GE8 5.22e+00 5.47e+00 -6.437049e+03 4.194422e-03 4.162e-03 1.725e+01
GE9 5.25e+00 5.49e+00 -6.437052e+03 -2.974158e-03 3.035e-04 1.720e+01
GE10 5.25e+00 5.49e+00 -6.437052e+03 1.164049e-05 3.154e-04 1.713e+01
GE11 5.25e+00 5.49e+00 -6.437052e+03 -1.927004e-05 9.251e-05 1.714e+01
GE12 5.25e+00 5.49e+00 -6.437052e+03 4.742927e-06 1.342e-04 1.723e+01
GE13 5.25e+00 5.49e+00 -6.437052e+03 -3.654831e-06 1.064e-04 1.724e+01
GE14 5.25e+00 5.49e+00 -6.437052e+03 -1.292602e-06 2.761e-06 1.720e+01
GE15 5.25e+00 5.49e+00 -6.437052e+03 -4.918788e-10 1.088e-06 1.726e+01
GE16 5.25e+00 5.49e+00 -6.437052e+03 -1.480277e-09 5.592e-07 1.722e+01
GE17 5.25e+00 5.49e+00 -6.437052e+03 3.408349e-09 1.277e-07 1.720e+01
GE18 5.25e+00 5.49e+00 -6.437052e+03 3.209587e-10 1.536e-08 1.722e+01
Updating EXX and rerun SCF
GE1 5.30e+00 5.66e+00 -6.437386e+03 0.000000e+00 7.783e-03 1.756e+01
GE2 5.30e+00 5.70e+00 -6.437389e+03 -2.905916e-03 3.097e-03 1.776e+01
GE3 5.30e+00 5.69e+00 -6.437389e+03 -1.426553e-04 3.709e-04 1.768e+01
GE4 5.30e+00 5.69e+00 -6.437389e+03 -3.933669e-07 1.830e-04 1.763e+01
GE5 5.30e+00 5.69e+00 -6.437389e+03 -1.916378e-07 6.337e-05 1.758e+01
GE6 5.30e+00 5.69e+00 -6.437389e+03 -5.438509e-08 6.068e-06 1.773e+01
GE7 5.30e+00 5.69e+00 -6.437389e+03 6.844540e-10 4.172e-06 1.765e+01
GE8 5.30e+00 5.69e+00 -6.437389e+03 -2.401390e-09 2.932e-06 1.760e+01
GE9 5.30e+00 5.69e+00 -6.437389e+03 -1.980663e-09 3.465e-07 1.768e+01
GE10 5.30e+00 5.69e+00 -6.437389e+03 1.095900e-09 4.516e-08 1.761e+01
Updating EXX and rerun SCF
GE1 5.30e+00 5.75e+00 -6.437412e+03 0.000000e+00 2.970e-03 1.772e+01
GE2 5.30e+00 5.77e+00 -6.437412e+03 -5.071874e-04 1.115e-03 1.761e+01
GE3 5.30e+00 5.76e+00 -6.437412e+03 -3.600643e-05 3.660e-04 1.766e+01
GE4 5.30e+00 5.76e+00 -6.437412e+03 1.002332e-06 1.333e-04 1.767e+01
GE5 5.30e+00 5.76e+00 -6.437412e+03 -3.536508e-07 3.344e-05 1.765e+01
GE6 5.30e+00 5.76e+00 -6.437412e+03 -1.065892e-08 3.677e-06 1.761e+01
GE7 5.30e+00 5.76e+00 -6.437412e+03 1.508119e-10 2.340e-06 1.777e+01
GE8 5.30e+00 5.76e+00 -6.437412e+03 -2.848412e-09 1.372e-06 1.762e+01
GE9 5.30e+00 5.76e+00 -6.437412e+03 -6.697595e-10 5.157e-07 1.766e+01
GE10 5.30e+00 5.76e+00 -6.437412e+03 2.343385e-10 2.126e-07 1.772e+01
GE11 5.30e+00 5.76e+00 -6.437412e+03 2.143849e-09 3.190e-08 1.777e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 8.249e-04 1.792e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -3.188180e-05 3.782e-04 1.772e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 -7.317703e-08 1.303e-04 1.774e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 -3.181451e-07 5.785e-05 1.770e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 1.346944e-08 1.282e-05 1.783e+01
GE6 5.29e+00 5.78e+00 -6.437418e+03 -2.061869e-09 2.488e-06 1.767e+01
GE7 5.29e+00 5.78e+00 -6.437418e+03 2.597832e-09 4.422e-07 1.771e+01
GE8 5.29e+00 5.78e+00 -6.437418e+03 3.727761e-10 1.378e-07 1.783e+01
GE9 5.29e+00 5.78e+00 -6.437418e+03 -5.916467e-10 5.191e-08 1.774e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 2.048e-04 1.776e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -1.439515e-06 9.945e-05 1.772e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 2.207036e-08 4.256e-05 1.779e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 1.388243e-09 1.148e-05 1.771e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 -1.028305e-08 4.965e-06 1.766e+01
GE6 5.29e+00 5.78e+00 -6.437418e+03 -4.977566e-09 4.509e-07 1.777e+01
GE7 5.29e+00 5.78e+00 -6.437418e+03 8.770292e-10 1.506e-07 1.783e+01
GE8 5.29e+00 5.78e+00 -6.437418e+03 -9.288467e-10 6.491e-08 1.770e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 6.077e-05 1.769e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -1.134252e-07 3.019e-05 1.777e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 8.373541e-09 1.607e-05 1.777e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 2.111367e-09 2.498e-06 1.768e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 -3.221961e-09 4.133e-07 1.770e+01
GE6 5.29e+00 5.78e+00 -6.437418e+03 4.555293e-10 1.491e-07 1.771e+01
GE7 5.29e+00 5.78e+00 -6.437418e+03 2.135342e-09 4.883e-08 1.783e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 2.286e-05 1.784e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -2.029078e-08 1.168e-05 1.772e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 1.047176e-09 6.660e-06 1.773e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 2.583137e-10 1.001e-06 1.789e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 -1.795822e-09 4.420e-07 1.777e+01
GE6 5.29e+00 5.78e+00 -6.437418e+03 1.625675e-09 7.378e-08 1.779e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 1.176e-05 1.768e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -6.389011e-09 5.673e-06 1.767e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 1.296982e-09 3.038e-06 1.780e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 3.175557e-09 3.255e-06 1.771e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 -3.668210e-09 2.879e-07 1.769e+01
GE6 5.29e+00 5.78e+00 -6.437418e+03 7.262173e-10 4.905e-08 1.772e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 6.956e-06 1.776e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -2.089712e-09 3.181e-06 1.795e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 -1.856147e-11 1.484e-06 1.772e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 4.439284e-10 7.081e-07 1.771e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 -5.777256e-10 2.457e-07 1.781e+01
GE6 5.29e+00 5.78e+00 -6.437418e+03 6.627990e-10 3.775e-08 1.774e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 3.987e-06 1.776e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 5.614843e-10 1.749e-06 1.779e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 -1.237431e-11 7.742e-07 1.771e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 -1.139210e-09 9.115e-07 1.785e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 2.412990e-10 6.300e-08 1.778e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 2.462e-06 1.779e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 6.380504e-10 1.072e-06 1.781e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 1.345706e-09 5.480e-07 1.785e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 2.142302e-10 4.647e-07 1.776e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 -2.590871e-10 4.279e-08 1.778e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 1.403e-06 1.777e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 1.235111e-09 6.003e-07 1.775e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 8.615614e-10 2.236e-07 1.787e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 -1.662025e-09 1.244e-07 1.777e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 8.360393e-10 4.124e-08 1.779e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 9.645e-07 1.775e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -1.137663e-09 6.704e-07 1.771e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 9.613292e-10 4.905e-07 1.784e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 6.372770e-10 1.410e-07 1.780e+01
GE5 5.29e+00 5.78e+00 -6.437418e+03 -5.181742e-10 2.714e-08 1.778e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 5.150e-07 1.781e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -2.575403e-10 3.110e-07 1.782e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 1.438514e-09 2.384e-07 1.790e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 -8.399063e-10 7.898e-08 1.780e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 3.857e-07 1.780e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -1.633409e-09 5.688e-07 1.778e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 -1.924205e-09 1.518e-07 1.777e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 3.443925e-09 5.881e-08 1.782e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 3.686e-07 1.778e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -1.023974e-09 1.722e-07 1.784e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 2.084298e-09 7.166e-08 1.777e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 2.508e-07 1.841e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -7.285375e-10 5.146e-07 1.832e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 1.717709e-09 9.339e-08 1.835e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 2.401e-07 1.783e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 2.266046e-10 2.545e-07 1.782e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 -5.599375e-10 1.674e-07 1.791e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 -1.832945e-10 5.861e-08 1.780e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 2.153e-07 1.786e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -2.714614e-10 2.968e-07 1.779e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 -6.473311e-10 1.489e-07 1.779e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 2.733949e-09 4.245e-08 1.788e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 2.514e-07 1.780e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 1.046403e-09 2.553e-07 1.787e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 1.063417e-09 1.525e-07 1.774e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 -5.251348e-10 4.837e-08 1.787e+01
Memory consumption is 50G during the calculation. Is this performance normal for this system? Can it be improved?
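To make the convergence behaviour in the log above easier to judge, here is a minimal Python sketch (not part of ABACUS) that groups the SCF lines by outer EXX loop and reports the inner step count, the first and last DRHO, and the summed TIME(s) for each loop. It assumes the seven-column nspin 2 layout shown in the header line (ITER TMAG AMAG ETOT(eV) EDIFF(eV) DRHO TIME(s)) and the marker line "Updating EXX and rerun SCF". The names parse_scf_log, OuterLoop and summarize are hypothetical.

```python
import re
from dataclasses import dataclass, field

# One SCF line: "GE<n>" followed by six numbers in scientific notation.
# Assumed column order (nspin 2): TMAG AMAG ETOT(eV) EDIFF(eV) DRHO TIME(s)
GE_LINE = re.compile(r"^\s*GE(\d+)((?:\s+[-+]?\d\.\d+e[-+]\d+){6})\s*$")
EXX_MARK = "Updating EXX and rerun SCF"

SAMPLE = """\
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 2.508e-07 1.841e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 -7.285375e-10 5.146e-07 1.832e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 1.717709e-09 9.339e-08 1.835e+01
Updating EXX and rerun SCF
GE1 5.29e+00 5.78e+00 -6.437418e+03 0.000000e+00 2.401e-07 1.783e+01
GE2 5.29e+00 5.78e+00 -6.437418e+03 2.266046e-10 2.545e-07 1.782e+01
GE3 5.29e+00 5.78e+00 -6.437418e+03 -5.599375e-10 1.674e-07 1.791e+01
GE4 5.29e+00 5.78e+00 -6.437418e+03 -1.832945e-10 5.861e-08 1.780e+01
"""


@dataclass
class OuterLoop:
    """DRHO and TIME(s) values of the inner SCF steps of one outer loop."""
    drho: list = field(default_factory=list)
    time_s: list = field(default_factory=list)


def parse_scf_log(text):
    """Split the log into outer loops at each EXX marker line."""
    loops = [OuterLoop()]
    for line in text.splitlines():
        if EXX_MARK in line:
            loops.append(OuterLoop())
            continue
        match = GE_LINE.match(line)
        if match:
            cols = [float(x) for x in match.group(2).split()]
            loops[-1].drho.append(cols[4])    # DRHO column
            loops[-1].time_s.append(cols[5])  # TIME(s) column
    # Drop empty groups, e.g. when the excerpt starts with a marker line.
    return [loop for loop in loops if loop.drho]


def summarize(loops):
    """Print one summary line per outer loop."""
    for i, loop in enumerate(loops):
        print(f"outer {i:2d}: {len(loop.drho):2d} inner steps, "
              f"first DRHO {loop.drho[0]:.3e}, last DRHO {loop.drho[-1]:.3e}, "
              f"SCF time {sum(loop.time_s):7.1f} s")


if __name__ == "__main__":
    summarize(parse_scf_log(SAMPLE))
```

Read this way, the first DRHO of each outer loop in the log above falls from 6.975e-02 and then stalls between 2.2e-07 and 2.5e-07 over the last four loops, which is above scf_thr 1e-7. If the outer loop is only considered converged once its first inner step already meets scf_thr (an assumption about the stopping rule, not confirmed here), that plateau is what prevents convergence. The TIME(s) column of the printed steps also sums to well under an hour, far less than the 2-day wall time, which suggests that most of the time is spent outside the printed SCF steps.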
Also, there are some usability problems with HSE from a user's point of view:
- Nothing is printed to stdout or running*.log during the EXX step (apart from the "Updating EXX and rerun SCF" notice), which gives the impression that the calculation is stuck. Could more output be added, such as the time consumed in the EXX step and some of its key stages?
- How can I properly restart an HSE SCF calculation if a complete SCF was not finished? Because the total SCF did not finish, the charge file is not written, so I am trying to use the wavefunction file and the restart file. However, the HSE run always performs a PBE SCF first, whether exx_separate_loop is 0 or 1. If I directly use the wfc or restart file from a half-finished HSE run, will that initialization be useless because of the initial PBE step?
- How should I set the number of MPI processes and OMP threads for the best performance (assuming memory permits and the number of physical cores is fixed)? More OMP threads reduce the memory cost, but from my observation of the CPU status on the HPC server, the EXX step sometimes seems to be parallelized mainly by MPI.
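For the MPI/OMP question, one way to compare settings on a fixed number of physical cores is to list every ranks-by-threads split that uses all cores and time a short run of each. The sketch below only builds launch lines in the same form as the two quoted above; it does not model memory or speed, and the helper names layouts and command are hypothetical.

```python
def layouts(physical_cores):
    """Return every (mpi_ranks, omp_threads) pair with ranks * threads == physical_cores."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be positive")
    return [(ranks, physical_cores // ranks)
            for ranks in range(1, physical_cores + 1)
            if physical_cores % ranks == 0]


def command(ranks, threads, exe="abacus"):
    """Build a launch line in the same form as the ones quoted in this issue."""
    return f"OMP_NUM_THREADS={threads} mpirun -np {ranks} {exe}"


if __name__ == "__main__":
    # 48 cores matches the second run above (24 threads x 2 ranks).
    for ranks, threads in layouts(48):
        print(command(ranks, threads))
```

The two runs reported here correspond to the splits (4, 16) on 64 cores and (2, 24) on 48 cores.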
Task list for Issue attackers (only for developers)