Hello! If applicable, put your complete code example down below. Make sure that your code:
- is 100% self-contained: someone can copy-paste exactly what is here and run it to reproduce the behaviour you are observing
- includes comments
I am trying to run the script described in the blog post "Distributing quantum simulations using lightning.gpu with NVIDIA cuQuantum" (PennyLane Blog) on NERSC machines, but I am running into a problem when running it. I followed the steps there to install pennylane-lightning-gpu from source.
from mpi4py import MPI
import pennylane as qml
from pennylane import numpy as np
from timeit import default_timer as timer
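# Get the world communicator, then this process's rank and the total process count
comm = MPI.COMM_WORLD
rank = comm.Get_rank()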
size = comm.Get_size()
# Set number of runs for timing averaging
num_runs = 3
# Choose number of qubits (wires) and circuit layers
n_wires = 32
n_layers = 2
# Instantiate CPU (lightning.qubit) or GPU (lightning.gpu) device
# mpi=True to switch on distributed simulation
# batch_obs=True to reduce the device memory demand for adjoint backpropagation
dev = qml.device('lightning.gpu', wires=n_wires, mpi=True, batch_obs=True)
# Create QNode of device and circuit
@qml.qnode(dev, diff_method="adjoint")
def circuit_adj(weights):
    qml.StronglyEntanglingLayers(weights, wires=list(range(n_wires)))
    return qml.math.hstack([qml.expval(qml.PauliZ(i)) for i in range(n_wires)])
# Set trainable parameters for calculating circuit Jacobian at the rank=0 process
if rank == 0:
    params = np.random.random(qml.StronglyEntanglingLayers.shape(n_layers=n_layers, n_wires=n_wires))
else:
    params = None
# Broadcast the trainable parameters across MPI processes from rank=0 process
params = comm.bcast(params, root=0)
# Run, calculate the quantum circuit Jacobian and average the timing results
timing = []
for t in range(num_runs):
    start = timer()
    jac = qml.jacobian(circuit_adj)(params)
    end = timer()
    timing.append(end - start)
# MPI barrier to ensure all calculations are done
comm.Barrier()
if rank == 0:
    print("num_gpus: ", size, " wires: ", n_wires, " layers ", n_layers, " time: ", qml.numpy.mean(timing))
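For context on why the script sets mpi=True, here is a rough back-of-the-envelope estimate of the state-vector size at n_wires = 32, split across the 4 MPI processes used in the run below. This is only a sketch. It assumes double-precision complex amplitudes (16 bytes each) and an even split across processes, and it ignores any extra buffers that the simulator or MPI may allocate. The helper names are made up for illustration.

```python
def statevector_bytes(n_wires: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to store 2**n_wires amplitudes.

    bytes_per_amplitude=16 is an assumption (complex128).
    """
    return (2 ** n_wires) * bytes_per_amplitude


def per_process_gib(n_wires: int, n_processes: int, bytes_per_amplitude: int = 16) -> float:
    """State-vector share per MPI process in GiB, assuming an even split."""
    total = statevector_bytes(n_wires, bytes_per_amplitude)
    return total / n_processes / 2 ** 30


if __name__ == "__main__":
    print("total GiB:", statevector_bytes(32) / 2 ** 30)
    print("GiB per process (4 processes):", per_process_gib(32, 4))
```

Under those assumptions the 32-qubit state vector takes 64 GiB in total, or 16 GiB per process with 4 processes.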
If you want help with diagnosing an error, please put the full error message below:
mpirun -np 4 python test.py
*** The MPI_Comm_rank() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[nid008340:1176249] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** The MPI_Comm_rank() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[nid008340:1176248] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** The MPI_Comm_rank() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[nid008340:1176246] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** The MPI_Comm_rank() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[nid008340:1176247] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
prterun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [prterun-nid008340-1176242@1,0]
Exit code: 14
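One check that may help narrow this down is to see whether mpi4py initializes MPI on its own and which MPI library it was built against, independently of PennyLane. The sketch below is not a confirmed diagnosis of this error. It only uses mpi4py's own introspection calls, and the function name mpi_diagnostics is made up for illustration. A mismatch between the MPI library reported here and the one behind mpirun, or the one lightning.gpu was built with, is one possible but unconfirmed cause of errors like the one above.

```python
def mpi_diagnostics():
    """Collect basic facts about the MPI library that mpi4py is using."""
    info = {"mpi4py_available": False}
    try:
        # Importing mpi4py.MPI initializes MPI by default.
        from mpi4py import MPI
    except Exception as exc:  # mpi4py missing, or its MPI library failed to load
        info["error"] = repr(exc)
        return info

    info["mpi4py_available"] = True
    info["initialized"] = MPI.Is_initialized()
    info["vendor"] = MPI.get_vendor()
    info["library_version"] = MPI.Get_library_version().strip()
    info["rank"] = MPI.COMM_WORLD.Get_rank()
    info["size"] = MPI.COMM_WORLD.Get_size()
    return info


if __name__ == "__main__":
    for key, value in mpi_diagnostics().items():
        print(f"{key}: {value}")
```

Running it with the same launcher as the failing job, for example mpirun -np 4 python check_mpi.py (the file name is arbitrary), should print these fields once per rank.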
And, finally, make sure to include the versions of your packages. Specifically, show us the output of qml.about()
>>> qml.about()
Name: PennyLane
Version: 0.32.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/PennyLaneAI/pennylane
License: Apache License 2.0
Location: /global/u1/p/prmantha/.local/perlmutter/python-3.10/lib/python3.10/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, requests, rustworkx, scipy, semantic-version, toml, typing-extensions
Required-by: PennyLane-Lightning, PennyLane-Lightning-GPU
Platform info: Linux-5.14.21-150400.24.81_12.0.86-cray_shasta_c-x86_64-with-glibc2.31
Python version: 3.10.12
Numpy version: 1.23.5
Scipy version: 1.11.3
Installed devices:
- default.gaussian (PennyLane-0.32.0)
- default.mixed (PennyLane-0.32.0)
- default.qubit (PennyLane-0.32.0)
- default.qubit.autograd (PennyLane-0.32.0)
- default.qubit.jax (PennyLane-0.32.0)
- default.qubit.tf (PennyLane-0.32.0)
- default.qubit.torch (PennyLane-0.32.0)
- default.qutrit (PennyLane-0.32.0)
- null.qubit (PennyLane-0.32.0)
- lightning.qubit (PennyLane-Lightning-0.32.0)
- lightning.gpu (PennyLane-Lightning-GPU-0.33.0.dev0)
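The list above shows PennyLane and PennyLane-Lightning at 0.32.0 next to a 0.33.0.dev0 build of PennyLane-Lightning-GPU. Whether that difference matters for the MPI error is not established here, but it is easy to check programmatically. Below is a small sketch using only the standard library. The helper names are hypothetical, and the distribution names are taken from the output above.

```python
import re
from importlib import metadata


def release_series(version: str) -> tuple:
    """Return (major, minor) from a version string such as '0.33.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_series(a: str, b: str) -> bool:
    """True when both versions share the same major.minor series."""
    return release_series(a) == release_series(b)


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    names = ["PennyLane", "PennyLane-Lightning", "PennyLane-Lightning-GPU"]
    for name, version in installed_versions(names).items():
        print(name, version)
    print("same series:", same_series("0.32.0", "0.33.0.dev0"))
```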
CUDA details
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0