Using Python's "concurrent.futures" module for Multi-Processing VQE Code

Hello!
Given this small code for a VQE simulation:

    # Optimize the nuclear coordinate bond length
    theta.requires_grad = False
    r.requires_grad = True
    angle.requires_grad = False
    
    if (not r_optimized):
        _, r, _ = opt_r.step(cost_r, theta, r, angle, grad_fn=grad_r)
    
    # Optimize the nuclear coordinate bond angle
    theta.requires_grad = False
    r.requires_grad = False
    angle.requires_grad = True
    if (not angle_optimized):
        _, _, angle = opt_angle.step(cost_r, theta, r, angle, grad_fn=grad_angle)

I tried to convert it into code that can be multi-processed with external packages, rather than through PennyLane. In theory this should be possible, because the updates of r and the angle can happen independently of each other:

        with concurrent.futures.ProcessPoolExecutor() as executor:
            p1 = executor.submit(update_r, theta, r[0], angle[0])
            p2 = executor.submit(update_angle, theta, r[0], angle[0])

            r = p1.result()
            angle = p2.result()

where the two function calls are:

    import multiprocessing

    def update_r(theta, r, angle):
        # Optimize the nuclear coordinate bond length
        theta.requires_grad = False
        r.requires_grad = True
        angle.requires_grad = False

        temp_variable = np.array([0.0], requires_grad=True)

        if (not r_optimized):
            _, temp_variable, _ = opt_r.step(cost_r, theta, r, angle, grad_fn=grad_r)
        #return r
        return temp_variable

    def update_angle(theta, r, angle):
        # Optimize the nuclear coordinate bond angle
        theta.requires_grad = False
        r.requires_grad = False
        angle.requires_grad = True

        temp_variable = np.array([0.0], requires_grad=True)

        if (not angle_optimized):
            _, _, temp_variable = opt_angle.step(cost_r, theta, r, angle, grad_fn=grad_angle)
        #return angle
        return temp_variable

However, I keep getting this error:

    ---------------------------------------------------------------------------
    BrokenProcessPool                         Traceback (most recent call last)
    Cell In[16], line 98
         95         print(f"  {atom}    {x[3 * i]:.4f}   {x[3 * i + 1]:.4f}   {x[3 * i + 2]:.4f}")
         97 if __name__ == '__main__':
    ---> 98     main()

    Cell In[16], line 57, in main()
         54     p1 = executor.submit(update_r, theta, r[0], angle[0])
         55     p2 = executor.submit(update_angle, theta, r[0], angle[0])
    ---> 57     r = p1.result()
         58     angle = p2.result()
         60 grad_end= time.time()

    File ~/anaconda3/lib/python3.11/concurrent/futures/_base.py:456, in Future.result(self, timeout)
        454     raise CancelledError()
        455 elif self._state == FINISHED:
    --> 456     return self.__get_result()
        457 else:
        458     raise TimeoutError()

    File ~/anaconda3/lib/python3.11/concurrent/futures/_base.py:401, in Future.__get_result(self)
        399 if self._exception:
        400     try:
    --> 401         raise self._exception
        402     finally:
        403         # Break a reference cycle with the exception in self._exception
        404         self = None

    BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

Would anyone know how to fix this issue? Another issue that appears is this warning:

    os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
      pid, fd = os.forkpty()

Which I think is indicative of a deadlock?

If any PennyLane experts could weigh in on this and the issue above, I’d highly appreciate it. Alternative implementations using concurrent.futures or the multiprocessing library would be especially welcome. Thanks so much!

Hey @ImranNasrullah,

Can you attach your complete code so that I can copy-paste it and try to replicate what’s going on? In the meantime, this Stack Overflow thread might help: python - All example concurrent.futures code is failing with "BrokenProcessPool" - Stack Overflow
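
In case it helps while you put that together, here is a minimal sketch of the layout that usually avoids `BrokenProcessPool`. It rests on a few assumptions: the worker functions live at the top level of a `.py` script rather than in a notebook cell, the pool is created under an `if __name__ == "__main__":` guard, and the workers are started with the "spawn" method so that nothing is forked from a multithreaded parent process. The names `update_r`, `update_angle` and `parallel_step` below are hypothetical stand-ins that only do toy arithmetic; they are not your optimizer steps and do not call PennyLane.

```python
import concurrent.futures
import multiprocessing


def update_r(r, angle):
    """Hypothetical stand-in for one bond-length step (toy arithmetic only)."""
    return r - 0.1 * (r - 1.0)


def update_angle(r, angle):
    """Hypothetical stand-in for one bond-angle step (toy arithmetic only)."""
    return angle - 0.1 * (angle - 2.0)


def parallel_step(r, angle):
    """Run the two independent updates in separate worker processes."""
    # "spawn" starts fresh interpreters instead of fork()-ing the parent,
    # which sidesteps the os.fork() warning raised by multithreaded libraries.
    ctx = multiprocessing.get_context("spawn")
    with concurrent.futures.ProcessPoolExecutor(max_workers=2, mp_context=ctx) as executor:
        future_r = executor.submit(update_r, r, angle)
        future_angle = executor.submit(update_angle, r, angle)
        # Both workers receive the same inputs, so their results are independent.
        return future_r.result(), future_angle.result()


if __name__ == "__main__":
    # The guard keeps spawned workers from re-running this block on import.
    new_r, new_angle = parallel_step(1.5, 1.8)
    print(new_r, new_angle)
```

With "spawn", each worker re-imports the script, so anything the real update functions depend on (optimizers, cost functions) would have to be created at import time or inside the workers, and all arguments and return values have to be picklable.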

Hey! Thanks for the reply. I ended up getting it working on a server that uses a Linux environment.

Another quick question: I’m trying to do VQE on a water dimer that uses 10 orbitals in the STO-3G basis, so 20 qubits. Even with multi-processing, it’s still very slow. Would you know how I can speed it up? For example, would lightning.gpu run faster, or possibly other devices in your experience?

Thanks!

> Hey! Thanks for the reply. I ended up getting it working on a server that uses a Linux environment.

Nice! Glad you figured it out :+1:

> Another quick question …

Definitely check out our lightning suite! Whenever the qubit counts start to get large, that’s the time you want to check out lightning. As you mentioned, lightning.gpu could work. We have distributed statevector support for lightning.gpu as well :sunglasses: .

You can also try out Catalyst (see the Catalyst 0.6.0 documentation). This is our quantum JIT compiler :slight_smile:. It might be a bit more effort to set up, but when used with PennyLane Lightning it’s the speediest thing we offer :racing_car:. Let me know if you have any questions!

I made another post about this too, but I figured out that the part bottlenecking my program is adding the 2 Hamiltonians together. Would you know of a quick way to add 2 Hamiltonians together? Thanks

Hi @ImranNasrullah ,

Since you opened the new thread let’s continue the conversation there instead! Unless there’s something specific about your previous question that you want to discuss here.