Unable to construct Hamiltonian

Hi All,

I tried to create Hamiltonian for the molecule reported in the work (https://www.nature.com/articles/s41524-021-00540-6). The work was carried out in IBM Qiskit, I wanted to create a ground state Hamiltonian of the molecule with Active Space invoked. It gave me a following error.

“Unable to allocate 255. GiB for an array with shape (430, 430, 430, 430) and data type float64” .
I am running the calc. on 370GB RAM and 48 core system. I also noticed that throughout the job, not all the cores were occupied. I am not sure if the process is parallelized or not. Kindly help me in this regard.

I have attached the jupyter notebook I used to run the calculation for your kind reference. Also attached the files where no. of cores and memory occupied is noted down.

1 Like

Hi @raghavv, thank you for your question!

I don’t see any code attached though. Could you please copy-paste the part of your code where you create the device?

Thanks!

CPUDataDump_1Xanad.txt (20 KB) DataDumpMemoryfil_171221.txt (56 KB) PSPCz-Copy1.txt (24.3 KB)
Thanks for the reply @CatalinaAlbornoz. Attached the files.
Kindly change PSPCz-Copy1.txt into .ipynb file to view on Jupyter notebook.

Thank you.

Hi @raghavv!

The molecule you’re trying to simulate is way too big for your computer. I would suggest starting with a very small example and then start growing to larger ones. Have you tried with smaller molecules?

Hi @raghavv. Thank you for reaching out! I went through your Jupyter notebook and it seems this error is due to the overcommit handling mode in your OS. If you are using Linux, setting overcommit_memory to 1 may fix the problem:

$ echo 1 > /proc/sys/vm/overcommit_memory

Moreover, this is also worth to try psi4 instead of pyscf as the error occurred in pyscf, to do so you need to update your code as follows,

H, qubits = qchem.molecular_hamiltonian(
    symbols,
    coordinates,
    charge=0,
    name='PSPCz',
    mult=1,
    basis='6-31G*',
    package='psi4',
   active_electrons=2,
    active_orbitals=1,
)

These are quantum chemistry packages used to solve the mean field electronic structure problem; you can find more information here. I hope these help to fix the problem.

2 Likes

@maliasadi. Thank you for kind response. It worked fine for a small molecule like NH3.

When I run the same molecule I had reported earlier , I am getting the following error.

Error Starts

/home/subbu/Raghav/2021/9_Sep/XANADU/OpenFermion-Psi4/openfermionpsi4/_run_psi4.py:224: Warning: No calculation saved. Psi4 segmentation fault possible.
warnings.warn('No calculation saved. ’

FileNotFoundError Traceback (most recent call last)
in

~/psi4conda/envs/penny/lib/python3.8/site-packages/pennylane_qchem/qchem/structure.py in molecular_hamiltonian(symbols, coordinates, name, charge, mult, basis, package, active_electrons, active_orbitals, mapping, outpath, wires)
829 hf_file = meanfield(symbols, coordinates, name, charge, mult, basis, package, outpath)
830
–> 831 molecule = openfermion.MolecularData(filename=hf_file)
832
833 core, active = active_space(

~/psi4conda/envs/penny/lib/python3.8/site-packages/openfermion/chem/molecular_data.py in init(self, geometry, basis, multiplicity, charge, description, filename, data_directory)
343 else:
344 self.filename = filename
–> 345 self.load()
346 self.init_lazy_properties()
347 return

~/psi4conda/envs/penny/lib/python3.8/site-packages/openfermion/chem/molecular_data.py in load(self)
720 geometry =
721
–> 722 with h5py.File("{}.hdf5".format(self.filename), “r”) as f:
723 # Load geometry:
724 data = f[“geometry/atoms”]

~/psi4conda/envs/penny/lib/python3.8/site-packages/h5py/_hl/files.py in init(self, name, mode, driver, libver, userblock_size, swmr, rdcc_nslots, rdcc_nbytes, rdcc_w0, track_order, fs_strategy, fs_persist, fs_threshold, **kwds)
440 with phil:
441 fapl = make_fapl(driver, libver, rdcc_nslots, rdcc_nbytes, rdcc_w0, **kwds)
–> 442 fid = make_fid(name, mode, userblock_size,
443 fapl, fcpl=make_fcpl(track_order=track_order, fs_strategy=fs_strategy,
444 fs_persist=fs_persist, fs_threshold=fs_threshold),

~/psi4conda/envs/penny/lib/python3.8/site-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
193 if swmr and swmr_support:
194 flags |= h5f.ACC_SWMR_READ
–> 195 fid = h5f.open(name, flags, fapl=fapl)
196 elif mode == ‘r+’:
197 fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5f.pyx in h5py.h5f.open()

FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = ‘./PSPCz_631Gd_psi4_6-31G*.hdf5’, errno = 2, error message = ‘No such file or directory’, flags = 0, o_flags = 0)

Error Ends

I am not sure what I am missing here, as this molecule did well with the basis set 3-21G. Also, I noticed that the job is running on only one core and the memory being used is 8 GB. Is there a keyword that I can use to increase the memory and assign more no. of cores to the job. I am not sure, just a guess, do you think making the job run in parallel and occupy more memory can solve the problem. I am working on linux system with 48 cores and about 380 GB RAM. Kindly help me in this regard.
Thank you. @CatalinaAlbornoz .

Hi @raghavv. This is de facto a FileNotFoundError and to fix this you should ensure that you have PSPCz_631Gd_psi4_6-31G*.hdf5 in the right path. In the following, I will do my best to address your concerns,

  1. To run the job in parallel: PennyLane qchem uses pyscf through openfermion and you may want to try mpi4pyscf. Not sure how to do though!

  2. To fix the memory problem: as I mentioned earlier this is on pyscf that is trying to load too many 2e integrals. If setting overcommit_memory to 1 didn’t fix the problem, please try to decrease the basis set to sto-3g or maybe 6-31g.

I hope taking advantage of smaller basis set fix this problem. Please let us know if you get any concerns or problems.

PS. You may find " Building molecular Hamiltonians" tutorial helpful too.

1 Like

Thank you for your kind reply @maliasadi and @CatalinaAlbornoz. I will summarize the list of calculations I have done below over the past few days.

I ran the calculation systematically with 321G,631G and 631G* and cc-PVDZ. 6-31G* and ccPVDZ gave FileNotFoundError error. Initially my guess was that , in the linux shell , it is unable to write the file PSPCz_Psi4_631GD_psi4_6-31G* because in linux shell it can read it only as PSPCz_Psi4_631GD_psi4_6-31G * .out. However, when I tried to run the same molecule usign ccPVDZ, it gave the same error. To verify the observation, I also did the same Hamiltonian construction for the molecule NH3 using basis sets, STO3G,321G,6-31G,6-31G,ccPVDZ. And of them ran successfully. At this moment, I am unable to reason out why for the molecule PSPCz (45 Atoms, 200 electrons with Active Space=2 and no. of active electrons=2) is giving errors while using higher basis sets such as 6-31G or ccPVDZ. Kindly suggest how to proceed about this.

Please let me know if I am not clear.
Also, wish you a merry Christmas and happy a New Year 2022.

Thank you

Hi @raghavv, thanks for the question and sorry for the late reply.

It seems to me that, for larger basis sets, the pyscf output hdf5 file is not created at all either because the Hartree-Fock iteerations crash or you ran out of disc space. Could you please run independednt pyscf calculations for the same molecule follwoing the examples provided here and see if you can successfully compute the Hartree-Fock energy with pyscf? If you can run pyscf successfully, then we can track the problem through the pipeline. Please let me know if you face any issues with the calculations.

Dear @sjahangiri, thank you for the kind response. Sorry for the delayed response. I have been trying to construct the Hamiltonian from outside of Pennylane as you have suggested. I have not had much of a success yet. I am working on it and will update once I figure it out. If you there are any other suggestions/directions, kindly request you to share. Will keep things updated sir. Thank you very much.

Thanks @raghavv for the update. If you post the molecular geometry here, in xyz format for example, I can also give it a try.

Dear Sir @sjahangiri, Below I have given the coordinates of water which I am trying to use PySCF or any other chemistry package to construct Hamiltonian and have it written in the form of .HDF5 file format.

H -0.0399 -0.0038 0.0
O 1.5780 0.8540 0.0
H 2.7909 -0.5159 0.0

Thanks for your kind support sir.

Hi @raghavv. The following code creates the molecular Hamiltonian for water step by step. Could you please run the code for both water and PSPCz and let me know at what stage the code breaks? Please modify the molecular info given between the dashed lines for the new molecule. Thanks!

import pennylane as qml
import openfermion
from pennylane import numpy as np
from openfermionpyscf import run_pyscf

#------------------------
symbols = ["O", "H", "H"]
x = np.array([0.0000000000,  0.0000000000,  0.1501668459,
              0.0000000000,  0.7580826553, -0.4856334229,
              0.0000000000, -0.7580826553, -0.4856334229])
mult = 1
charge = 0
basis = 'sto-3g'
core = [0, 1, 2, 3]
active = [4, 5]
filename = './water'
#------------------------

geometry = [[s, tuple(x[3 * i : 3 * i + 3])] for i, s in enumerate(symbols)]

molecule = openfermion.MolecularData(geometry, basis, mult, charge, filename=filename)

run_pyscf(molecule, run_scf=1, verbose=0)

molecule_ = openfermion.MolecularData(filename=filename)

ht = molecule_.get_molecular_hamiltonian(occupied_indices=core, active_indices=active)

hf = openfermion.transforms.get_fermion_operator(ht) # fermion hamiltonian

hq = openfermion.transforms.jordan_wigner(hf) # qubit hamiltonian

hp = qml.qchem.convert_observable(hq) # PennyLane hamiltonian

print(hp)

The output for water should be

  (-74.19370628334303) [I0]
+ (-0.15460393042639708) [Z2]
+ (-0.15460393042639708) [Z3]
+ (0.13038589407848444) [Z0]
+ (0.13038589407848444) [Z1]
+ (0.138284189609364) [Z0 Z2]
+ (0.138284189609364) [Z1 Z3]
+ (0.14767808197773874) [Z0 Z3]
+ (0.14767808197773874) [Z1 Z2]
+ (0.14966950988795605) [Z2 Z3]
+ (0.22003977334376112) [Z0 Z1]
+ (-0.00939389236837476) [Y0 Y1 X2 X3]
+ (-0.00939389236837476) [X0 X1 Y2 Y3]
+ (0.00939389236837476) [Y0 X1 X2 Y3]
+ (0.00939389236837476) [X0 Y1 Y2 X3]

Dear Sir, @sjahangiri . Thanks for sharing the code. It ran smooth for both water and PSPCz. Earlier, I had tried within Pennylane using “qchem.molecular_hamiltonian” (as shown above) and it took me a very long time to complete the job and create a corresponding HDF5 with Hamiltonian written to it. It had ran on only one core. The code you suggested ran like a gem and completed within few minutes. Thank you so much for sharing the code. I have attached the code with PSPCz molecular coordinates for your kind reference. The qubit Hamiltonian that was output is pasted below for your quick reference.

PSPCz_Hamiltonian_Externally.py (3.2 KB)

(-322.1043666152533) [I0]

  • (8.87674183162617) [Z2]
  • (8.87674183162617) [Z3]
  • (10.218927690523312) [Z0]
  • (10.218927690523312) [Z1]
  • (-0.006447940283844799) [Y1 Y3]
  • (-0.006447940283844799) [X1 X3]
  • (0.004959753825945462) [Y0 Y2]
  • (0.004959753825945462) [X0 X2]
  • (0.02545094913478638) [Z0 Z2]
  • (0.02545094913478638) [Z1 Z3]
  • (0.02674734205767252) [Z0 Z3]
  • (0.02674734205767252) [Z1 Z2]
  • (0.07156347202814556) [Z0 Z1]
  • (0.08691649983952954) [Z2 Z3]
  • (0.14484917836670383) [Y0 Z1 Y2]
  • (0.14484917836670383) [X0 Z1 X2]
  • (0.14484917836670383) [Y1 Z2 Y3]
  • (0.14484917836670383) [X1 Z2 X3]
  • (-0.0064479402838447985) [Y0 Z1 Y2 Z3]
  • (-0.0064479402838447985) [X0 Z1 X2 Z3]
  • (-0.0012963929228861414) [Y0 Y1 X2 X3]
  • (-0.0012963929228861414) [X0 X1 Y2 Y3]
  • (0.0012963929228861414) [Y0 X1 X2 Y3]
  • (0.0012963929228861414) [X0 Y1 Y2 X3]
  • (0.004959753825945462) [Z0 Y1 Z2 Y3]
  • (0.004959753825945462) [Z0 X1 Z2 X3]

I am not sure why using “qchem.molecular_hamiltonian” took such a long time to create the Hamiltonian or it even crashed. Here, I am assuming calling the function ‘qchem.molecular_hamiltonian’ would still go to PySCF and ask for a Hartree-Fock level of calculation to be run. Then why does it take a longer time. Can you kindly clarify.

Please let me know if I am not clear.

Thank you.

Hi @raghavv. This is a very interesting molecule!

Yes, qml.qchem.molecular_hamiltonian just calls openfermion under the hood.

I computed the PSPCz Hamiltonian with both
qml.qchem.molecular_hamiltonian and the stepwise code using sto-3g basis and 2 active electrons and 2 active orbitals. Both methods take the same time, ~11 minutes, and the size of the hdf5 files are both 4.8G. I am not sure why qml.qchem.molecular_hamiltonian takes longer time in your examples with larger basis sets.

However, there is no fundamental difference between the two approaches. We are actually considering to refactor qml.qchem.molecular_hamiltonian and ask the users to call the external libraries on their side. So, you can safely use this code example if it works better for your system.

Please feel free to let us know if you have any other questions.

1 Like

Dear Sir @sjahangiri. Sorry for the delayed response. I was trying to understand the code and the physics behind it all. As of now working on it will update here. Thank you very much for kind responses. Will keep you posted.

Dear Sir @sjahangiri, hope you are well.

I tried for quite sometime to combine the Pennylane’s mol-geom-opt (Optimization of molecular geometries — PennyLane documentation) and adaptive circuit (Adaptive circuits for quantum chemistry — PennyLane documentation) tutorials to try to optimized molecule using Full Configuration Interaction method. After several attempts, I managed to combine the two codes , however I am unable to run optimization on the molecule. It only runs a single point calculation. I am not sure what I am missing here. It would be really helpful if you could kindly look into the code and suggest what I am missing.

I have downloaded the jupyter notebook file in .py format to be able to upload here.

Kindly let me know if I am not clear enough.

Thank youCombined_Code_That_Run-For-Question.py (7.8 KB)

Dear @raghavv,

Thank you very much for sharing your question.

I noted that you are using/mixing functionalities from PennyLane, OpenFermion and PySCF. This is typically not a good strategy, it makes the code less readable and harder to debug. All the functionalities you need for molecular geometry optimization are available in PennyLane.

I suggest the following:

1- For now, please keep the adaptive method to select the excitation operations independent from the geometry optization code. I understand from your message that this is working fine.

2- For the geometry optimization I recommend following the tutorial Optimization of molecular geometries.

  • I noted you have used multiple functions to build the Hamiltonian of LiH molecule. This should be simpler. For the LiH molecule in a minimal basis sto-3g, you have 4 electrons and a total of 6 orbitals ( => 12 qubits). That would give you a relatively large Hamiltonian with 631 Pauli strings. The latter can be reduced by defining an active space. For example, you could chose 2 electrons and 3 active orbitals (=> 6 qubits, 62 Pauli strings). This can be done as follows:
import pennylane as qml

symbols = ["Li", "H"]
x = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 2.969280527], requires_grad=True)

def H(x):
    return qml.qchem.molecular_hamiltonian(symbols, x, active_electrons=2,  active_orbitals=3))[0]

Please, note that from this point on your algorithm (including the quantum circuit) to optimize the geometry of LiH, within this approximation, is identical to the one in the above mentioned tutorial.

Could you please try this and let us know whether it works for you or not? If it works, you can proceed to refine the simulation by including more orbitals and excitation operations.

Please, do not hesitate to come back to us. It is always a pleasure to help!

Best regards.

2 Likes

Dear sir @Alain_Delgado_Gran. Thank you for the kind response. I tried the method you suggested to calculate the Hamiltonian of the molecule.
I carried out the calculation for LiH and it worked. I was unable to construct Hamiltonian for the bigger molecule such as PSPCz which I had mentioned in the first comment. That is why @sjahangiri had suggested me a work around. However, as you mentioned when the Hamiltonian is borrowed from Openfermion, I guess the nuclear gradient did not run.

I am trying to optimize molecule PSPCz using Pennylane. I will keep you updated on that.

Meanwhile, I have a couple questions regarding the XYZ format that the code prints out and the molecular orbitals. I kindly request you to help me understand these.

  1. The following code has the number ‘3’ in it which properly prints out the coordinate in XYZ format. I am not sure what is the significance of the no. ‘3’ in it. Also, it goes like 3i , 3i+1 and 3*i+2. I am not sure what these help us achieve, can you please help me understand.

Also looks like the coordinates being printed are in Bohr units. I would like to convert them into angstrom units. I tried something like ({x[3 * i]:.4f} *0.529177210903), but it did not work.

for i, atom in enumerate(symbols):
print(f" {atom} {x[3 * i]:.4f} {x[3 * i + 1]:.4f} {x[3 * i + 2]:.4f}")

  1. Is there a way to extract molecular orbital information and their energies in Pennylane.

Thank you.

Dear @raghavv,

I am glad to hear that the geometry optimization of the LiH molecule went well.

Regarding your questions:

  1. The following code has the number ‘3’ in it which properly prints out the coordinate in XYZ format. I am not sure what is the significance of the no. ‘3’ in it. Also, it goes like 3 i , 3 i+1 and 3*i+2. I am not sure what these help us achieve, can you please help me understand.

This 3 comes from the fact that the array x is a one-dimensional array storing 3N coordinates [x_0, x_1, ... , x_{3N-1}] where N is the number of atoms in your molecule. Thus, if you want to print the nuclear coordinates in the XYZ format, you have to account for this. For example, you can access the coordinates of the first atom (i=0) as: x = x[0], y = x[1], z=x[2] .

For printing the coordinates in Angstroms you could simply do:

x_a = x*0.529177210903

print("%s %4s %8s %8s" % ("symbol", "x", "y", "z"))
for i, atom in enumerate(symbols):
    print(f"  {atom}    {x_a[3 * i]:.4f}   {x_a[3 * i + 1]:.4f}   {x_a[3 * i + 2]:.4f}")
  1. Is there a way to extract molecular orbital information and their energies in Pennylane.

Yes. You can use first the meanfield function to generate the file containing the molecular orbitals data for LiH. Next, you can use OpenFermion MolecularData class to access the energies and the wave functions of the orbitals. Please, see the example below:

import pennylane as qml
from pennylane import numpy as np
from openfermion import MolecularData

symb = ["Li", "H"]
x = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 2.969280527], requires_grad=True)

lih = qml.qchem.meanfield(symb, x)
mol = MolecularData(filename=lih)

print("Number of orbitals: ", mol.n_orbitals)
print("Orbital energies: ", mol.orbital_energies)
print("Expansion coefficients: ", mol.canonical_orbitals)

FYI: New utility functions are being implemented in PennyLane to access this information natively, and also for visualizing atomic and molecular orbitals and much more! They will be released soon. Stay tuned!

Hope this helps. Do not hesitate to get back to us if you need further clarifications.

Thank you.

1 Like