Lightning.GPU Pickle error on Dask cluster

Hello! If applicable, put your complete code example down below. Make sure that your code:

  • is 100% self-contained — someone can copy-paste exactly what is here and run it to
    reproduce the behaviour you are observing
  • includes comments
# Put code here
import distributed
import pennylane as qml

# Problem size
wires = 4
layers = 1

# Device is created once at module scope and captured by the QNode below
dev = qml.device('lightning.gpu', wires=wires, shots=None)


@qml.qnode(dev)
def circuit(parameters):
    qml.StronglyEntanglingLayers(weights=parameters, wires=range(wires))
    return [qml.expval(qml.PauliZ(i)) for i in range(wires)]


def run_circuit():
    # Random weights with the shape expected by StronglyEntanglingLayers
    shape = qml.StronglyEntanglingLayers.shape(n_layers=layers, n_wires=wires)
    weights = qml.numpy.random.random(size=shape)
    val = circuit(weights)
    return val


if __name__ == "__main__":
    dask_client = distributed.Client()
    dask_client.scheduler_info()

    # Sanity check: a plain Python callable maps fine across the cluster
    print(dask_client.gather(dask_client.map(lambda a: a * a, range(10))))
    # This fails: the task references the QNode bound to the lightning.gpu
    # device, which cannot be pickled
    print(dask_client.gather(dask_client.map(lambda a: run_circuit(), range(10))))
    dask_client.close()

If you want help with diagnosing an error, please put the full error message below:

# Put full error message here
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/global/homes/p/prmantha/.conda/envs/myenv/lib/python3.10/site-packages/distributed/protocol/serialize.py", line 350, in serialize
    header, frames = dumps(x, context=context) if wants_context else dumps(x)
  File "/global/homes/p/prmantha/.conda/envs/myenv/lib/python3.10/site-packages/distributed/protocol/serialize.py", line 73, in pickle_dumps
    frames[0] = pickle.dumps(
  File "/global/homes/p/prmantha/.conda/envs/myenv/lib/python3.10/site-packages/distributed/protocol/pickle.py", line 81, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/global/homes/p/prmantha/.conda/envs/myenv/lib/python3.10/site-packages/cloudpickle/cloudpickle_fast.py", line 73, in dumps
    cp.dump(obj)
  File "/global/homes/p/prmantha/.conda/envs/myenv/lib/python3.10/site-packages/cloudpickle/cloudpickle_fast.py", line 632, in dump
    return Pickler.dump(self, obj)
TypeError: cannot pickle 'pennylane_lightning_gpu.lightning_gpu_qubit_ops.LightningGPU_C128' object

And, finally, make sure to include the versions of your packages. Specifically, show us the output of qml.about().

Name: PennyLane
Version: 0.31.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/PennyLaneAI/pennylane
Author: 
Author-email: 
License: Apache License 2.0
Location: /global/u1/p/prmantha/.conda/envs/myenv/lib/python3.10/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, requests, rustworkx, scipy, semantic-version, toml
Required-by: PennyLane-Lightning, PennyLane-Lightning-GPU

Platform info:           Linux-5.14.21-150400.24.46_12.0.73-cray_shasta_c-x86_64-with-glibc2.31
Python version:          3.10.11
Numpy version:           1.25.0
Scipy version:           1.10.0
Installed devices:
- default.gaussian (PennyLane-0.31.0)
- default.mixed (PennyLane-0.31.0)
- default.qubit (PennyLane-0.31.0)
- default.qubit.autograd (PennyLane-0.31.0)
- default.qubit.jax (PennyLane-0.31.0)
- default.qubit.tf (PennyLane-0.31.0)
- default.qubit.torch (PennyLane-0.31.0)
- default.qutrit (PennyLane-0.31.0)
- null.qubit (PennyLane-0.31.0)
- lightning.qubit (PennyLane-Lightning-0.31.0)
- lightning.gpu (PennyLane-Lightning-GPU-0.31.0)

Hi @QuantumMan

I think the main issue here is that the device is defined outside the Dask environment but called inside it. For that to work, the statevector would have to be copied to every spawned process, which also becomes a limitation as the number of qubits grows.

To express this in a way Dask likes, I suggest defining the device, QNode and weights all inside a Dask-friendly callable, and passing that to the scheduler. I have attempted a quick rewrite of your script using lightning.qubit instead of lightning.gpu, but the same approach should work for both:

import distributed
import pennylane as qml

layers = 2

def run_circuit(wires, layers):
    # Weights, device and QNode are all created inside the worker-side callable,
    # so nothing un-picklable has to cross the Dask boundary
    shape = qml.StronglyEntanglingLayers.shape(n_layers=layers, n_wires=wires)
    weights = qml.numpy.random.random(size=shape)
    dev = qml.device('lightning.qubit', wires=wires, shots=None)

    @qml.qnode(dev)
    def circuit(parameters):
        qml.StronglyEntanglingLayers(weights=parameters, wires=range(wires))
        return qml.math.hstack([qml.expval(qml.PauliZ(i)) for i in range(wires)])

    return circuit(weights)


if __name__ == "__main__":
    dask_client = distributed.Client()
    dask_client.scheduler_info()
    print(dask_client.gather(dask_client.map(lambda a: a * a, range(10))))
    print(dask_client.gather(dask_client.map(lambda w: run_circuit(w, layers), range(4,10))))
    dask_client.close()

In this situation I can pass the qubit count (and the layer count) as arguments to the function, which then creates the parameters and spins up the device. Since the parameters may not be bare NumPy arrays (they carry autograd-specific additions), it is usually best to keep them in the same function as the device, though you can try serialising them in different ways and see whether keeping them on the host helps, as in the sketch below.
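
If you do want to build the weights on the host and ship them to the workers, here is a minimal (untested) sketch of that variant. It assumes plain NumPy arrays, which pickle cleanly, and a hypothetical run_circuit_with_weights helper that still creates the device and QNode on the worker:

import numpy as np
import distributed
import pennylane as qml

wires = 4
layers = 2

def run_circuit_with_weights(weights, wires):
    # Device and QNode are still created worker-side, so nothing
    # un-picklable crosses the Dask boundary
    dev = qml.device('lightning.qubit', wires=wires, shots=None)

    @qml.qnode(dev)
    def circuit(parameters):
        qml.StronglyEntanglingLayers(weights=parameters, wires=range(wires))
        return qml.math.hstack([qml.expval(qml.PauliZ(i)) for i in range(wires)])

    return circuit(weights)

if __name__ == "__main__":
    shape = qml.StronglyEntanglingLayers.shape(n_layers=layers, n_wires=wires)
    # Plain NumPy weights created on the host; these serialise without issue
    host_weights = [np.random.random(size=shape) for _ in range(10)]

    dask_client = distributed.Client()
    print(dask_client.gather(
        dask_client.map(lambda w: run_circuit_with_weights(w, wires), host_weights)))
    dask_client.close()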

With the above, you should be able to spawn a Dask-CUDA cluster, have it call back and register with a distributed scheduler, and then let the lightning.gpu devices spin up on the available workers. Failing that, you can do the same with the default dask.distributed scheduler and use lightning.qubit or lightning.kokkos to farm the work out to CPU workers.
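
For the GPU case, a rough sketch (untested here, and assuming the dask_cuda package is installed and GPUs are visible) could look like the following. LocalCUDACluster starts one worker per GPU, and each worker builds its own lightning.gpu device inside the mapped function:

import distributed
from dask_cuda import LocalCUDACluster
import pennylane as qml

layers = 2

def run_circuit(wires, layers):
    # Same worker-side pattern as above, now targeting the GPU device
    shape = qml.StronglyEntanglingLayers.shape(n_layers=layers, n_wires=wires)
    weights = qml.numpy.random.random(size=shape)
    dev = qml.device('lightning.gpu', wires=wires, shots=None)

    @qml.qnode(dev)
    def circuit(parameters):
        qml.StronglyEntanglingLayers(weights=parameters, wires=range(wires))
        return qml.math.hstack([qml.expval(qml.PauliZ(i)) for i in range(wires)])

    return circuit(weights)

if __name__ == "__main__":
    cluster = LocalCUDACluster()              # one worker per visible GPU
    dask_client = distributed.Client(cluster)
    print(dask_client.gather(dask_client.map(lambda w: run_circuit(w, layers), range(4, 10))))
    dask_client.close()
    cluster.close()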

I hope this helps!
