Update `FermiDos.get_doping()` to be more robust #3879

kavanase · 2024-06-15T20:07:53Z

FermiDos.get_doping performs an integral over the density of states with the Fermi Dirac occupations to determine the carrier concentrations. However, it starts the conduction band integral (cb_integral) from self.idx_cbm and vice versa with self.idx_vbm for the valence band integral.

But, these attributes are determined upon initialisation with self.get_cbm_vbm() (https://github.com/materialsproject/pymatgen/blob/master/pymatgen/electronic_structure/dos.py#L427), where (1) the user cannot control the occupation tolerance settings for get_cbm_vbm() (as they can when otherwise calling this function), and (2) even when tuning these settings the VBM/CBM/gap is often not correct. E.g. for CdTe with a band gap of 1.497 eV (as given by the VASP EIGENVAL and vasprun.xml outputs, and Vasprun.eigenvalue_band_properties), this gives a different result of 1.58 eV, mainly due to the disperse CBM and the way this is estimated from the DOS in get_cbm_vbm():

In this PR, the integrals are adjusted to start from the mid-gap position (intermediate between idx_vbm and idx_cbm). This means that the result now is essentially independent of the calculated VBM/CBM positions and indices, as it's just integrating the DOS with the Fermi-Dirac distribution, and the DOS is zero in the gap until it reaches the (true) CBM/VBM, now giving the correct result.

Notebook with this test code and example files attached.

Uploading pymatgen_get_doping_PR.zip…

mkhorton · 2024-06-16T14:14:51Z

Thanks @kavanase, happy to merge this once the tests are updated. Really appreciate the detailed issue description.

mkhorton · 2024-06-16T14:17:40Z

If I understand correctly, this does not fix the issue of the inaccurate CBM itself however?

kavanase · 2024-06-17T15:44:02Z

Hi @mkhorton, yes this doesn't fix the issue of the inaccurate CBM, but just makes the calculation of doping concentrations essentially independent of any inaccuracies in the CBM/VBM from the DOS.
Generally we've found that the Vasprun.eigenvalue_band_properties property, with the default occu_tol=1e-8, is quite reliable for extracting the band gap and edges, but the DOS methods not so much. Tuning the tolerances with the DOS methods improves this but there are still some inaccuracies.

I've dug into this a bit more to see where this comes from. The main issue seems to be that the default occupation tolerance in the DOS methods of 1e-3 (c.f. 1e-8 for occu_tol with Vasprun) is too large, especially when you have disperse band edges like with CdTe, GaAs etc. Dropping this from the default 1e-3 (gap = 1.58 eV) to 1e-4 (gap = 1.53 eV) improves performance, compared to the correct Vasprun.eigenvalue_band_properties value of 1.497 eV, but you can't (meaningfully) drop this tolerance any further, as the DOS output in VASP is truncated to 4 decimal places – I've seen this discussed somewhere before but can't remember where, might've been in internal group discussions.
e.g. here in CdTe the CBM starts at 3.1435 eV eigenvalue, but only shows up in the vasprun.xml DOS output (screenshot) at 3.1617 eV:

Not sure if it's worth noting this somewhere in docstrings? – added this in the latest commit.

I would suggest updating the default tolerance for this to 1e-4 in that case, as it should be more accurate and also is closer to the behaviour of the Vasprun defaults – added this in the latest commit.

Also just to note this currently passes all tests, there were just one or two random failures due to issues with the API key for Materials Project tests

…n` DOS parsing docstrings

kavanase · 2024-06-17T15:51:47Z

Just to note, this rounding to 4 decimal places by VASP is also why nelecs needs to be provided to FermiDos to normalize the DOS densities, because this rounding summed over many values (particularly in large systems) can give inaccurate total electron counts; https://github.com/materialsproject/pymatgen/blob/master/pymatgen/electronic_structure/dos.py#L376

…htly faster, cleaner and no masking required to avoid overflow warnings

)

…matgen#3879) so `pymatgen` version constraint not required

…n` DOS parsing docstrings

…htly faster, cleaner and no masking required to avoid overflow warnings

Signed-off-by: Seán Kavanagh <[email protected]>

kavanase · 2024-08-10T18:54:42Z

Just bumping so it doesn't get lost, I think this PR is all ready to go.
Fixed the merge conflict that had been introduced with the updated master branch

kavanase · 2024-08-11T14:03:57Z

All tests passing, but I'm not sure why the lobster_basis functions are appearing as changed files here?

janosh · 2024-08-30T06:13:45Z

sorry for keeping you waiting and repeatedly addressing merge conflicts @kavanase! could you resolve the conflict one more time and then i'll merge?

kavanase · 2024-08-30T14:43:06Z

Merge conflict resolved!

Just to note about this point I mentioned above:

All tests passing, but I'm not sure why the lobster_basis functions are appearing as changed files here?

I looked into this, and it seems to be an issue with the line endings. On the current master branch, it seems the line endings are CRLF ('DOS' style) rather than LF ('Unix' style). If I delete my version of these files and download the current master version, it suggests this is the case (below). LF is the desired file ending I believed, as stated in the editorconfig and best for Mac/Linux compatibility etc, and also my git tries to automatically convert CRLF files to LF:

❯ git diff
warning: in the working copy of 'src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_max.yaml', CRLF will be replaced by LF the next time Git touches it

So in this PR, those 3 lobster yaml files are LF endings, causing them to be picked up as changed too.

[FermiDos_update e9fd84cbd] Remove yaml files to force re-eval
 3 files changed, 567 deletions(-)
 delete mode 100644 src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_max.yaml
 delete mode 100644 src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_min.yaml
 delete mode 100644 src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_standard.yaml
❯ wget https://raw.githubusercontent.com/materialsproject/pymatgen/master/src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_standard.yaml
--2024-08-30 10:35:14--  https://raw.githubusercontent.com/materialsproject/pymatgen/master/src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_standard.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3863 (3.8K) [text/plain]
Saving to: ‘BASIS_PBE_54_standard.yaml’

BASIS_PBE_54_standard.yaml                                             100%[============================================================================================================================================================================>]   3.77K  --.-KB/s    in 0s

2024-08-30 10:35:14 (35.1 MB/s) - ‘BASIS_PBE_54_standard.yaml’ saved [3863/3863]

❯ wget https://raw.githubusercontent.com/materialsproject/pymatgen/master/src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_min.yaml
--2024-08-30 10:35:18--  https://raw.githubusercontent.com/materialsproject/pymatgen/master/src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_min.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3734 (3.6K) [text/plain]
Saving to: ‘BASIS_PBE_54_min.yaml’

BASIS_PBE_54_min.yaml                                                  100%[============================================================================================================================================================================>]   3.65K  --.-KB/s    in 0s

2024-08-30 10:35:18 (26.2 MB/s) - ‘BASIS_PBE_54_min.yaml’ saved [3734/3734]

❯ wget https://raw.githubusercontent.com/materialsproject/pymatgen/master/src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_max.yaml
--2024-08-30 10:35:21--  https://raw.githubusercontent.com/materialsproject/pymatgen/master/src/pymatgen/io/lobster/lobster_basis/BASIS_PBE_54_max.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3863 (3.8K) [text/plain]
Saving to: ‘BASIS_PBE_54_max.yaml’

BASIS_PBE_54_max.yaml                                                  100%[============================================================================================================================================================================>]   3.77K  --.-KB/s    in 0s

2024-08-30 10:35:21 (27.7 MB/s) - ‘BASIS_PBE_54_max.yaml’ saved [3863/3863]

❯ ll
.rw-r--r-- 3.9k kavanase 30 Aug 10:35  BASIS_PBE_54_max.yaml
.rw-r--r-- 3.7k kavanase 30 Aug 10:35  BASIS_PBE_54_min.yaml
.rw-r--r-- 3.9k kavanase 30 Aug 10:35  BASIS_PBE_54_standard.yaml
❯ git status
On branch FermiDos_update
Your branch is ahead of 'origin/FermiDos_update' by 1 commit.
  (use "git push" to publish your local commits)

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	./

nothing added to commit but untracked files present (use "git add" to track)
❯ file *yaml
BASIS_PBE_54_max.yaml:      ASCII text, with CRLF line terminators
BASIS_PBE_54_min.yaml:      ASCII text, with CRLF line terminators
BASIS_PBE_54_standard.yaml: ASCII text, with CRLF line terminators
❯ dos2unix *yaml
dos2unix: converting file BASIS_PBE_54_max.yaml to Unix format...
dos2unix: converting file BASIS_PBE_54_min.yaml to Unix format...
dos2unix: converting file BASIS_PBE_54_standard.yaml to Unix format...
❯ file *yaml
BASIS_PBE_54_max.yaml:      ASCII text
BASIS_PBE_54_min.yaml:      ASCII text
BASIS_PBE_54_standard.yaml: ASCII text
❯ git add *yaml

Update FermiDos.get_doping() to be more robust.

e8ee982

kavanase requested review from shyuep, janosh and mkhorton as code owners June 15, 2024 20:07

Merge branch 'master' into master

8b31dec

Update tol defaults for DOS methods, and add small notes to `Vaspru…

5b07d57

…n` DOS parsing docstrings

kavanase and others added 4 commits June 17, 2024 11:52

Merge branch 'master' of https://github.com/kavanase/pymatgen

e3b588d

pre-commit auto-fixes

30c8acd

Use scipy.special.expit function for Fermi-Dirac distribution, slig…

00e9e8c

…htly faster, cleaner and no masking required to avoid overflow warnings

Merge branch 'master' of https://github.com/kavanase/pymatgen

5224957

kavanase added a commit to SMTG-Bham/doped that referenced this pull request Jun 18, 2024

Update get_e_h_concs to be more robust (materialsproject/pymatgen#3879

a3672c1

)

kavanase added a commit to SMTG-Bham/doped that referenced this pull request Jun 19, 2024

Add get_doping function with updated code (from materialsproject/py…

015cb5f

…matgen#3879) so `pymatgen` version constraint not required

kavanase and others added 4 commits June 20, 2024 10:52

Update FermiDos.get_doping() to be more robust.

192e3b9

Update tol defaults for DOS methods, and add small notes to `Vaspru…

36f4b0c

…n` DOS parsing docstrings

pre-commit auto-fixes

a89730b

Use scipy.special.expit function for Fermi-Dirac distribution, slig…

3b89303

…htly faster, cleaner and no masking required to avoid overflow warnings

kavanase force-pushed the master branch from 5224957 to 3b89303 Compare June 20, 2024 14:52

kavanase and others added 5 commits June 20, 2024 10:54

Merge branch 'master' into master

0c6f84d

Merge branch 'master' of https://github.com/kavanase/pymatgen

96fa980

Merge branch 'materialsproject:master' into master

f99a4d2

Merge branch 'master' of https://github.com/kavanase/pymatgen

c3f8c0b

Merge pymatgen master

1d34ac2

kavanase requested review from JaGeo and Andrew-S-Rosen as code owners July 18, 2024 19:12

kavanase and others added 2 commits July 18, 2024 20:17

Merge pymatgen master (pt 2)

eebda8c

Merge branch 'master' into master

55561ab

pre-commit-ci bot and others added 2 commits July 18, 2024 19:18

pre-commit auto-fixes

b46644b

Merge branch 'materialsproject:master' into master

78e7e0a

janosh force-pushed the master branch 2 times, most recently from e3fbc67 to 41e6d99 Compare August 3, 2024 19:01

kavanase and others added 3 commits August 10, 2024 19:48

Merge branch 'master' into FermiDos_update

67aafb9

Signed-off-by: Seán Kavanagh <[email protected]>

pre-commit auto-fixes

b44590d

Fix default tol update

fc2d82a

kavanase added 3 commits August 30, 2024 10:15

Merge updated pymatgen master

0d7e967

Remove yaml files to force re-eval

e9fd84c

Ensure LF line endings for yamls

67b28ac

janosh approved these changes Aug 31, 2024

View reviewed changes

janosh merged commit c5296b3 into materialsproject:master Aug 31, 2024
43 checks passed

kavanase deleted the FermiDos_update branch August 31, 2024 18:50

janosh added fix Bug fix PRs electronic structure Electronic structure functionality related labels Sep 10, 2024

kavanase mentioned this pull request Sep 26, 2024

Different bandgap values from py-sc-fermi and doped SMTG-Bham/doped#90

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update `FermiDos.get_doping()` to be more robust #3879

Update `FermiDos.get_doping()` to be more robust #3879

kavanase commented Jun 15, 2024

mkhorton commented Jun 16, 2024

mkhorton commented Jun 16, 2024

kavanase commented Jun 17, 2024 •

edited

Loading

kavanase commented Jun 17, 2024

kavanase commented Aug 10, 2024

kavanase commented Aug 11, 2024

janosh commented Aug 30, 2024

kavanase commented Aug 30, 2024

Update FermiDos.get_doping() to be more robust #3879

Update FermiDos.get_doping() to be more robust #3879

Conversation

kavanase commented Jun 15, 2024

mkhorton commented Jun 16, 2024

mkhorton commented Jun 16, 2024

kavanase commented Jun 17, 2024 • edited Loading

kavanase commented Jun 17, 2024

kavanase commented Aug 10, 2024

kavanase commented Aug 11, 2024

janosh commented Aug 30, 2024

kavanase commented Aug 30, 2024

Update `FermiDos.get_doping()` to be more robust #3879

Update `FermiDos.get_doping()` to be more robust #3879

kavanase commented Jun 17, 2024 •

edited

Loading