Set up PySCF the performant way on Debian 12 Bookworm
Published at Oct 6, 2023
TL;DR
In high-performance computing, speed and reproducibility matter. In this article I want to shed light on which way of installing the quantum chemistry package PySCF performs best. The result might be surprising.

Preliminaries
Install a BLAS library. For a scientist OpenBLAS is a good choice because we focus more on reproducibility than performance. We might need cmake, too.
sudo apt install -y cmake libopenblas-dev
Install via pip
In a directory inside your home directory (I use pyscf-pip):
python3 -m venv venv
source venv/bin/activate
pip install --prefer-binary pyscf
Install from source
Compiling from source might speed up the calculations for you. In your home directory:
git clone https://github.com/pyscf/pyscf.git
cd pyscf
python3 -m venv venv
source venv/bin/activate
pip install h5py scipy numpy
cd pyscf/lib
mkdir build
cd build
cmake ..
make
That’s it.
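Before benchmarking, it is worth confirming which copy of PySCF each environment actually imports. A source checkout is only picked up if it is on the import path, for example through PYTHONPATH or by starting Python from the repository root. Below is a minimal sketch using only the standard library; the helper name locate_package is made up for illustration and is not part of PySCF.

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level package would be imported from, or None.

    The lookup does not import the package itself, so it is cheap and has
    no side effects on a later benchmark.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # package is not visible in this environment
    return spec.origin


if __name__ == "__main__":
    # In the pip environment this should point into venv/.../site-packages,
    # in the source environment into the cloned repository.
    print("pyscf would be loaded from:", locate_package("pyscf"))
```

If the printed path is None or points to the wrong environment, the timings further down would not measure the installation you think they do.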
Comparing the installations
In a Jupyter notebook for each environment I ran two cells.
First, a setup cell (I don’t want to measure the import):
from pyscf import gto, scf
The first cell runs standard Hartree-Fock calculations on two systems with a good-sized basis set:
%%timeit
mol = gto.M(
atom = '''
O 0.000000 0.000000 0.117790
H 0.000000 0.755453 -0.471161
H 0.000000 -0.755453 -0.471161''',
basis = 'ccpvdz',
charge = 1,
spin = 1 # = 2S = spin_up - spin_down
)
#
# 1. open-shell system
#
mf = scf.RHF(mol)
mf.kernel()
mf = scf.ROHF(mol)
mf.kernel()
mf = scf.UHF(mol)
mf.kernel()
#
# 2. closed-shell system
#
mol = gto.M(
atom = '''
O 0 0 0
H 0 -2.757 2.587
H 0 2.757 2.587''',
basis = 'ccpvdz',
)
#
# Using restricted closed shell solver
#
mf = scf.RHF(mol)
mf.kernel()
#
# Using restricted open shell solver
#
mf = scf.ROHF(mol)
mf.kernel()
mf = scf.UHF(mol)
The second cell is a CISD calculation on the same molecule without specifying the spin state:
%%timeit
mol = gto.M(
atom = '''
O 0.000000 0.000000 0.117790
H 0.000000 0.755453 -0.471161
H 0.000000 -0.755453 -0.471161''',
basis = 'ccpvdz',
)
mf = mol.HF().run()
mycc1 = mf.CISD().run()
mf = mol.UHF().run()
mycc2 = mf.CISD().run()
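# mycc1 comes from the restricted reference (mol.HF()), mycc2 from the unrestricted one (mol.UHF())
print('RCISD correlation energy', mycc1.e_corr)
print('UCISD correlation energy', mycc2.e_corr)
Result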
Let’s dive into the results.
Binary from pip

Result from the binary installation on HF calculation

Result of the binary installation on the CISD calculations
Compiled from source

Result from the source installation on HF

Result from the source installation on CISD
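The timings above came from the %%timeit cell magic. Outside a notebook, a comparable measurement could be sketched with the standard timeit module. The helper time_callable and the placeholder workload below are hypothetical names for illustration; to reproduce the benchmark, the body of workload would be replaced by the Hartree-Fock or CISD cell.

```python
import statistics
import timeit


def time_callable(func, repeat=7, number=1):
    """Time func over `repeat` rounds of `number` calls each.

    Returns (mean, standard deviation) in seconds per call, similar in
    spirit to the summary line printed by %%timeit.
    """
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)


def workload():
    # Placeholder only: swap in the PySCF calculation you want to time.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    mean, stdev = time_callable(workload)
    print(f"{mean * 1e3:.3f} ms ± {stdev * 1e3:.3f} ms per call")
```

Running the same script in both virtual environments keeps the measurement procedure identical, so any difference comes from the installation.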
Conclusion
The binary installation was as fast as the source build on the HF calculation and much faster on the CISD calculation.
Thus, we choose the binary installation, and we might have learned something for other HPC programs as well.