Unable to construct Hamiltonian

Hi All,

I tried to create the Hamiltonian for the molecule reported in this work (https://www.nature.com/articles/s41524-021-00540-6). That work was carried out in IBM Qiskit. I wanted to create the ground-state Hamiltonian of the molecule with an active space, but I got the following error.

“Unable to allocate 255. GiB for an array with shape (430, 430, 430, 430) and data type float64”.
I am running the calculation on a system with 370 GB of RAM and 48 cores. I also noticed that not all the cores were occupied during the job, so I am not sure whether the process is parallelized. Kindly help me in this regard.
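
The figure in the error message can be checked by hand. The short sketch below is only an illustration and not part of the original notebook; the helper name eri_tensor_gib is made up. It computes the size of a dense float64 array with shape (n, n, n, n), which is how a full two-electron integral tensor over n orbitals would be stored.

```python
def eri_tensor_gib(n_orbitals, bytes_per_element=8):
    """Size in GiB of a dense (n, n, n, n) array, e.g. float64 two-electron integrals."""
    n_elements = n_orbitals ** 4
    return n_elements * bytes_per_element / 2 ** 30


# Shape (430, 430, 430, 430) with float64, as in the error message
print(f"{eri_tensor_gib(430):.1f} GiB")  # 254.7 GiB, reported as "255. GiB"
```

Because the size grows as n^4, halving the number of orbitals cuts the memory needed for this array by a factor of 16.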

I have attached the Jupyter notebook I used to run the calculation for your reference. I have also attached the files that log the number of cores and the memory in use.

Hi @raghavv, thank you for your question!

I don’t see any code attached though. Could you please copy-paste the part of your code where you create the device?

Thanks!

CPUDataDump_1Xanad.txt (20 KB) DataDumpMemoryfil_171221.txt (56 KB) PSPCz-Copy1.txt (24.3 KB)
Thanks for the reply @CatalinaAlbornoz. Attached the files.
Kindly rename PSPCz-Copy1.txt to a .ipynb file to view it in Jupyter Notebook.

Thank you.

Hi @raghavv!

The molecule you’re trying to simulate is way too big for your computer. I would suggest starting with a very small example and then moving on to progressively larger ones. Have you tried smaller molecules?

Hi @raghavv. Thank you for reaching out! I went through your Jupyter notebook and it seems this error is due to the overcommit handling mode in your OS. If you are using Linux, setting overcommit_memory to 1 may fix the problem:

$ echo 1 > /proc/sys/vm/overcommit_memory
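
Before changing the setting, it can help to see what the current mode is. The sketch below is an assumption-laden illustration (the helper name read_overcommit_mode is made up): it reads the same /proc file from Python and returns None on systems where the file does not exist. Note that writing to this file, as in the command above, requires root privileges.

```python
from pathlib import Path

OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

# Meaning of the three values accepted by the Linux kernel
MODES = {0: "heuristic (default)", 1: "always overcommit", 2: "never overcommit"}


def read_overcommit_mode(path=OVERCOMMIT_PATH):
    """Return the overcommit mode as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


mode = read_overcommit_mode()
print(mode, MODES.get(mode, "unknown (not Linux, or file not readable)"))
```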

Moreover, it is also worth trying psi4 instead of pyscf, since the error occurred in pyscf. To do so, update your code as follows:

H, qubits = qchem.molecular_hamiltonian(
    symbols,
    coordinates,
    charge=0,
    name='PSPCz',
    mult=1,
    basis='6-31G*',
    package='psi4',
    active_electrons=2,
    active_orbitals=1,
)

Both pyscf and psi4 are quantum chemistry packages used to solve the mean-field electronic structure problem; you can find more information here. I hope this helps to fix the problem.

@maliasadi. Thank you for your kind response. It worked fine for a small molecule like NH3.

When I run the same molecule I reported earlier, I get the following error.

Error Starts

/home/subbu/Raghav/2021/9_Sep/XANADU/OpenFermion-Psi4/openfermionpsi4/_run_psi4.py:224: Warning: No calculation saved. Psi4 segmentation fault possible.
warnings.warn('No calculation saved. '

FileNotFoundError Traceback (most recent call last)
in

~/psi4conda/envs/penny/lib/python3.8/site-packages/pennylane_qchem/qchem/structure.py in molecular_hamiltonian(symbols, coordinates, name, charge, mult, basis, package, active_electrons, active_orbitals, mapping, outpath, wires)
829 hf_file = meanfield(symbols, coordinates, name, charge, mult, basis, package, outpath)
830
--> 831 molecule = openfermion.MolecularData(filename=hf_file)
832
833 core, active = active_space(

~/psi4conda/envs/penny/lib/python3.8/site-packages/openfermion/chem/molecular_data.py in __init__(self, geometry, basis, multiplicity, charge, description, filename, data_directory)
343 else:
344 self.filename = filename
--> 345 self.load()
346 self.init_lazy_properties()
347 return

~/psi4conda/envs/penny/lib/python3.8/site-packages/openfermion/chem/molecular_data.py in load(self)
720 geometry =
721
--> 722 with h5py.File("{}.hdf5".format(self.filename), "r") as f:
723 # Load geometry:
724 data = f["geometry/atoms"]

~/psi4conda/envs/penny/lib/python3.8/site-packages/h5py/_hl/files.py in __init__(self, name, mode, driver, libver, userblock_size, swmr, rdcc_nslots, rdcc_nbytes, rdcc_w0, track_order, fs_strategy, fs_persist, fs_threshold, **kwds)
440 with phil:
441 fapl = make_fapl(driver, libver, rdcc_nslots, rdcc_nbytes, rdcc_w0, **kwds)
--> 442 fid = make_fid(name, mode, userblock_size,
443 fapl, fcpl=make_fcpl(track_order=track_order, fs_strategy=fs_strategy,
444 fs_persist=fs_persist, fs_threshold=fs_threshold),

~/psi4conda/envs/penny/lib/python3.8/site-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
193 if swmr and swmr_support:
194 flags |= h5f.ACC_SWMR_READ
--> 195 fid = h5f.open(name, flags, fapl=fapl)
196 elif mode == 'r+':
197 fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5f.pyx in h5py.h5f.open()

FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = './PSPCz_631Gd_psi4_6-31G*.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

Error Ends

I am not sure what I am missing here, as this molecule ran fine with the 3-21G basis set. I also noticed that the job runs on only one core and uses only 8 GB of memory. Is there a keyword I can use to increase the memory and assign more cores to the job? This is just a guess, but do you think running the job in parallel with more memory could solve the problem? I am working on a Linux system with 48 cores and about 380 GB of RAM. Kindly help me in this regard.
Thank you. @CatalinaAlbornoz .

Hi @raghavv. This is a FileNotFoundError; to fix it, make sure that PSPCz_631Gd_psi4_6-31G*.hdf5 exists in the expected path. I will do my best to address your other concerns below:

  1. To run the job in parallel: PennyLane qchem uses pyscf through openfermion, and you may want to try mpi4pyscf. I am not sure how to set that up, though!

  2. To fix the memory problem: as I mentioned earlier, this comes from pyscf, which is trying to load too many two-electron integrals. If setting overcommit_memory to 1 didn’t fix the problem, please try a smaller basis set such as sto-3g or maybe 6-31g.

I hope that using a smaller basis set fixes this problem. Please let us know if you have any concerns or run into further problems.
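
On the question of cores: pyscf's compiled routines are multithreaded with OpenMP, so the thread count is usually controlled through the OMP_NUM_THREADS environment variable rather than through a PennyLane keyword. The following is a minimal sketch under that assumption. The helper name set_thread_count is hypothetical, the variables need to be set before numpy or pyscf are imported, and whether this speeds up a given calculation depends on how the libraries were built.

```python
import os

# Environment variables commonly honoured by OpenMP and BLAS backends
THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def set_thread_count(n_threads):
    """Set the thread-count variables; call this before importing numpy or pyscf."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    for var in THREAD_VARS:
        os.environ[var] = str(n_threads)
    return {var: os.environ[var] for var in THREAD_VARS}


# Use every core the OS reports (48 on the machine described above)
settings = set_thread_count(os.cpu_count() or 1)
print(settings)
```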

PS. You may find the "Building molecular Hamiltonians" tutorial helpful too.

Thank you for your kind reply @maliasadi and @CatalinaAlbornoz. I will summarize the list of calculations I have done below over the past few days.

I ran the calculation systematically with 3-21G, 6-31G, 6-31G* and cc-pVDZ. 6-31G* and cc-pVDZ gave the FileNotFoundError. Initially my guess was that the Linux shell is unable to write the file PSPCz_Psi4_631GD_psi4_6-31G* because the shell can read the name only as PSPCz_Psi4_631GD_psi4_6-31G * .out. However, when I ran the same molecule using cc-pVDZ, it gave the same error. To verify the observation, I also constructed the Hamiltonian for the molecule NH3 using the basis sets STO-3G, 3-21G, 6-31G, 6-31G* and cc-pVDZ, and all of them ran successfully. At this moment, I cannot work out why the molecule PSPCz (45 atoms, 200 electrons, with active space = 2 and number of active electrons = 2) gives errors with larger basis sets such as 6-31G* or cc-pVDZ. Kindly suggest how to proceed.

Please let me know if I am not clear.
Also, I wish you a merry Christmas and a happy New Year 2022.

Thank you

Hi @raghavv, thanks for the question and sorry for the late reply.

It seems to me that, for larger basis sets, the pyscf output hdf5 file is not created at all, either because the Hartree-Fock iterations crash or because you ran out of disk space. Could you please run independent pyscf calculations for the same molecule, following the examples provided here, and see whether you can successfully compute the Hartree-Fock energy with pyscf? If pyscf runs successfully, we can then track the problem through the rest of the pipeline. Please let me know if you face any issues with the calculations.
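
Both possibilities can be checked quickly from Python before rerunning anything. The sketch below is only an illustration: the helper name check_hdf5_output is made up, and the base name is taken from the FileNotFoundError above. It reports whether the expected hdf5 file exists, how large it is, and how much free space is left in its directory.

```python
import os
import shutil


def check_hdf5_output(basename, directory="."):
    """Report on the hdf5 file that openfermion would try to load."""
    path = os.path.join(directory, basename + ".hdf5")
    exists = os.path.isfile(path)
    return {
        "path": path,
        "exists": exists,
        "size_bytes": os.path.getsize(path) if exists else 0,
        "free_bytes": shutil.disk_usage(directory).free,
    }


# Base name from the traceback; adjust it to match your own run
report = check_hdf5_output("PSPCz_631Gd_psi4_6-31G*")
print(report)
```

If the file is missing while plenty of disk space is free, the mean-field calculation itself most likely failed before anything was written.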

Dear @sjahangiri, thank you for the kind response, and sorry for the delayed reply. I have been trying to construct the Hamiltonian outside of PennyLane as you suggested, but I have not had much success yet. I am working on it and will post an update once I figure it out. If you have any other suggestions or directions, I kindly request you to share them. Thank you very much.

Thanks @raghavv for the update. If you post the molecular geometry here, in xyz format for example, I can also give it a try.

Dear Sir @sjahangiri, below are the coordinates of water, for which I am trying to use PySCF or any other chemistry package to construct the Hamiltonian and write it to an HDF5 file.

H -0.0399 -0.0038 0.0
O 1.5780 0.8540 0.0
H 2.7909 -0.5159 0.0

Thanks for your kind support sir.

Hi @raghavv. The following code creates the molecular Hamiltonian for water step by step. Could you please run the code for both water and PSPCz and let me know at what stage the code breaks? Please modify the molecular info given between the dashed lines for the new molecule. Thanks!

import pennylane as qml
import openfermion
from pennylane import numpy as np
from openfermionpyscf import run_pyscf

#------------------------
symbols = ["O", "H", "H"]
x = np.array([0.0000000000,  0.0000000000,  0.1501668459,
              0.0000000000,  0.7580826553, -0.4856334229,
              0.0000000000, -0.7580826553, -0.4856334229])
mult = 1
charge = 0
basis = 'sto-3g'
core = [0, 1, 2, 3]
active = [4, 5]
filename = './water'
#------------------------

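# Build the geometry as [[symbol, (x, y, z)], ...]
geometry = [[s, tuple(x[3 * i : 3 * i + 3])] for i, s in enumerate(symbols)]

# Describe the molecule; results are stored in filename + '.hdf5'
molecule = openfermion.MolecularData(geometry, basis, mult, charge, filename=filename)

# Run Hartree-Fock with pyscf and save the results to the hdf5 file
run_pyscf(molecule, run_scf=1, verbose=0)

# Reload the saved calculation from disk
molecule_ = openfermion.MolecularData(filename=filename)

# Restrict the Hamiltonian to the chosen core and active orbitals
ht = molecule_.get_molecular_hamiltonian(occupied_indices=core, active_indices=active)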

hf = openfermion.transforms.get_fermion_operator(ht) # fermion hamiltonian

hq = openfermion.transforms.jordan_wigner(hf) # qubit hamiltonian

hp = qml.qchem.convert_observable(hq) # PennyLane hamiltonian

print(hp)

The output for water should be

  (-74.19370628334303) [I0]
+ (-0.15460393042639708) [Z2]
+ (-0.15460393042639708) [Z3]
+ (0.13038589407848444) [Z0]
+ (0.13038589407848444) [Z1]
+ (0.138284189609364) [Z0 Z2]
+ (0.138284189609364) [Z1 Z3]
+ (0.14767808197773874) [Z0 Z3]
+ (0.14767808197773874) [Z1 Z2]
+ (0.14966950988795605) [Z2 Z3]
+ (0.22003977334376112) [Z0 Z1]
+ (-0.00939389236837476) [Y0 Y1 X2 X3]
+ (-0.00939389236837476) [X0 X1 Y2 Y3]
+ (0.00939389236837476) [Y0 X1 X2 Y3]
+ (0.00939389236837476) [X0 Y1 Y2 X3]
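
When adapting the block between the dashed lines to a new molecule, the core and active lists have to be chosen consistently with the electron count. The helper below is a hypothetical sketch, not part of PennyLane or openfermion. It assumes a closed-shell molecule in which the lowest orbitals are doubly occupied and the active orbitals sit directly above the core, which reproduces the values used for water above.

```python
def active_space_indices(n_electrons, active_electrons, active_orbitals):
    """Return (core, active) orbital index lists for a closed-shell molecule."""
    if active_electrons > n_electrons:
        raise ValueError("more active electrons than electrons in the molecule")
    if (n_electrons - active_electrons) % 2 != 0:
        raise ValueError("core electrons must fill orbitals in pairs")
    n_core = (n_electrons - active_electrons) // 2
    core = list(range(n_core))
    active = list(range(n_core, n_core + active_orbitals))
    return core, active


# Water: 10 electrons, 2 active electrons in 2 active orbitals
print(active_space_indices(10, 2, 2))  # ([0, 1, 2, 3], [4, 5])

# PSPCz as described in this thread: 200 electrons, 2 active electrons, 2 active orbitals
core, active = active_space_indices(200, 2, 2)
print(len(core), active)  # 99 [99, 100]
```

With 2 active orbitals, the Jordan-Wigner mapping uses one qubit per spin orbital, which is why the Hamiltonian printed above acts on wires 0 to 3.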

Dear Sir @sjahangiri, thanks for sharing the code. It ran smoothly for both water and PSPCz. Earlier, I had tried within PennyLane using “qchem.molecular_hamiltonian” (as shown above), and it took a very long time to complete the job and create the corresponding HDF5 file with the Hamiltonian written to it. It had run on only one core. The code you suggested ran like a gem and completed within a few minutes. Thank you so much for sharing it. I have attached the code with the PSPCz molecular coordinates for your reference. The qubit Hamiltonian it produced is pasted below.

PSPCz_Hamiltonian_Externally.py (3.2 KB)

  (-322.1043666152533) [I0]
+ (8.87674183162617) [Z2]
+ (8.87674183162617) [Z3]
+ (10.218927690523312) [Z0]
+ (10.218927690523312) [Z1]
+ (-0.006447940283844799) [Y1 Y3]
+ (-0.006447940283844799) [X1 X3]
+ (0.004959753825945462) [Y0 Y2]
+ (0.004959753825945462) [X0 X2]
+ (0.02545094913478638) [Z0 Z2]
+ (0.02545094913478638) [Z1 Z3]
+ (0.02674734205767252) [Z0 Z3]
+ (0.02674734205767252) [Z1 Z2]
+ (0.07156347202814556) [Z0 Z1]
+ (0.08691649983952954) [Z2 Z3]
+ (0.14484917836670383) [Y0 Z1 Y2]
+ (0.14484917836670383) [X0 Z1 X2]
+ (0.14484917836670383) [Y1 Z2 Y3]
+ (0.14484917836670383) [X1 Z2 X3]
+ (-0.0064479402838447985) [Y0 Z1 Y2 Z3]
+ (-0.0064479402838447985) [X0 Z1 X2 Z3]
+ (-0.0012963929228861414) [Y0 Y1 X2 X3]
+ (-0.0012963929228861414) [X0 X1 Y2 Y3]
+ (0.0012963929228861414) [Y0 X1 X2 Y3]
+ (0.0012963929228861414) [X0 Y1 Y2 X3]
+ (0.004959753825945462) [Z0 Y1 Z2 Y3]
+ (0.004959753825945462) [Z0 X1 Z2 X3]

I am not sure why “qchem.molecular_hamiltonian” took such a long time to create the Hamiltonian, or why it even crashed. I assume that calling ‘qchem.molecular_hamiltonian’ still goes to PySCF and requests a Hartree-Fock calculation. Why, then, does it take longer? Could you kindly clarify?

Please let me know if I am not clear.

Thank you.

Hi @raghavv. This is a very interesting molecule!

Yes, qml.qchem.molecular_hamiltonian just calls openfermion under the hood.

I computed the PSPCz Hamiltonian with both qml.qchem.molecular_hamiltonian and the stepwise code, using the sto-3g basis, 2 active electrons and 2 active orbitals. Both methods take the same time, ~11 minutes, and both hdf5 files are 4.8G in size. I am not sure why qml.qchem.molecular_hamiltonian takes longer in your examples with larger basis sets.

However, there is no fundamental difference between the two approaches. We are actually considering refactoring qml.qchem.molecular_hamiltonian and asking users to call the external libraries on their side. So, you can safely use this code example if it works better for your system.

Please feel free to let us know if you have any other questions.