Lightning GPU 0.25 on Jetson Xavier NX

Hello:

I have installed PennyLane Lightning GPU 0.25 with the cuQuantum SDK
in a Python 3.10 conda environment on a Jetson Xavier NX Development
Kit with a Volta GPU (Ubuntu 20.04.4 LTS OS). When I run the following
code (as python test.py):

test.py contents:

from pennylane import numpy as np
import pennylane as qml
from pennylane.templates.layers import StronglyEntanglingLayers
from numpy.random import random

n_wires=22
dev = qml.device("lightning.gpu", wires=n_wires)

@qml.qnode(dev, diff_method="adjoint")
def circuit(weights):
    qml.StronglyEntanglingLayers(weights, wires=list(range(n_wires)))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_wires)]

param_shape = qml.StronglyEntanglingLayers.shape(n_layers=2, n_wires=22)
params = np.random.random(param_shape)
jac = qml.jacobian(circuit)(params)

I get the output:

pennylane_lightning_gpu/lightning_gpu.py:77: UserWarning: No module named 'pennylane_lightning_gpu.lightning_gpu_qubit_ops'
warn(str(e), UserWarning)
/home/arthurlobo/pennylane-lightning-gpu/pennylane_lightning_gpu/lightning_gpu.py:422: RuntimeWarning:
!!!#####################################################################################
!!!
!!! WARNING: INSUFFICIENT SUPPORT DETECTED FOR GPU DEVICE WITH lightning.gpu
!!! DEFAULTING TO CPU DEVICE lightning.qubit
!!!
!!!#####################################################################################

warn(

How can I get the code to run on the lightning.gpu device?
Where can I find the module: pennylane_lightning_gpu.lightning_gpu_qubit_ops?
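A quick way to check whether Python can locate the compiled extension at all is sketched below. It only asks the import system where the module would come from; the helper name extension_available is made up for this illustration. Note that looking up a dotted name imports the parent package, so the warning above may be printed again while it runs.

```python
import importlib.util


def extension_available(module_name: str) -> bool:
    """Return True if the import system can locate `module_name`.

    For a dotted name, find_spec imports the parent package but not the
    submodule itself.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised (as ModuleNotFoundError) when a parent package is missing,
        # or when importing the parent package fails.
        return False


if __name__ == "__main__":
    name = "pennylane_lightning_gpu.lightning_gpu_qubit_ops"
    print(name, "found" if extension_available(name) else "NOT found")
```

If this prints "NOT found", the Python package is installed without its compiled binary, which is consistent with the UserWarning in the output.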

Output of qml.about():

Name: PennyLane
Version: 0.24.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/XanaduAI/pennylane
Author:
Author-email:
License: Apache License 2.0
Location: /home/art/anaconda3/envs/qml_py310/lib/python3.10/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, retworkx, scipy, semantic-version, toml
Required-by: PennyLane-Lightning, PennyLane-Lightning-GPU

Platform info: Linux-5.10.65-tegra-aarch64-with-glibc2.31
Python version: 3.10.4
Numpy version: 1.22.3
Scipy version: 1.8.1
Installed devices:

  • lightning.gpu (PennyLane-Lightning-GPU-0.25.0.dev0)
  • lightning.qubit (PennyLane-Lightning-0.24.0)
  • default.gaussian (PennyLane-0.24.0)
  • default.mixed (PennyLane-0.24.0)
  • default.qubit (PennyLane-0.24.0)
  • default.qubit.autograd (PennyLane-0.24.0)
  • default.qubit.jax (PennyLane-0.24.0)
  • default.qubit.tf (PennyLane-0.24.0)
  • default.qubit.torch (PennyLane-0.24.0)

Hi @art, welcome to the forum!

I see that you’re using the development version of lightning.gpu. Do you get the same problem when using version 0.24.1? Also, what version of cuQuantum have you got installed?
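In case it helps with answering the version question, here is a small sketch that reports the installed versions of the relevant distributions. The distribution names are taken from the qml.about() output and the pip commands in this thread; a cuQuantum SDK unpacked from an archive rather than installed with pip will not show up here, so "not installed" for cuquantum is not conclusive.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for dist in distributions:
        try:
            versions[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            versions[dist] = None
    return versions


if __name__ == "__main__":
    # The cuQuantum distribution name is an assumption; it may differ
    # depending on how cuQuantum was installed.
    names = ["PennyLane", "PennyLane-Lightning", "PennyLane-Lightning-GPU", "cuquantum"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```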

Hi @art, my colleague Ali had 2 additional suggestions:

1 - Update the LD_LIBRARY_PATH environment variable as follows before running test.py:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/complete/path/to/cuquantum-sdk/lib/

Without updating LD_LIBRARY_PATH, lightning.gpu cannot locate the cuQuantum libraries, so it falls back to lightning.qubit instead.

2 - Double-check the commands used to build lightning.gpu. For example, building the library with python setup.py should look like this:

python setup.py build_ext --cuquantum=/path/to/cuquantum-sdk 
python setup.py install
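For the first suggestion, the following sketch checks whether any directory on LD_LIBRARY_PATH actually contains a cuStateVec library. The function name find_library is hypothetical, and the file prefix libcustatevec.so is the library name mentioned in this thread.

```python
import os
from pathlib import Path


def find_library(prefix, search_path):
    """Return every file whose name starts with `prefix` in the given path string."""
    hits = []
    for entry in search_path.split(os.pathsep):
        directory = Path(entry)
        # Skip empty entries and directories that do not exist.
        if entry and directory.is_dir():
            hits.extend(sorted(directory.glob(prefix + "*")))
    return hits


if __name__ == "__main__":
    matches = find_library("libcustatevec.so", os.environ.get("LD_LIBRARY_PATH", ""))
    print(matches or "libcustatevec.so not found on LD_LIBRARY_PATH")
```

An empty result suggests the export line did not take effect in the shell that runs test.py.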

Please let me know if this solves your problem!

@CatalinaAlbornoz My cuQuantum SDK version is 22.05, and LD_LIBRARY_PATH includes the cuQuantum SDK libraries (libcustatevec.so, libcutensornet.so, etc.). I built lightning_gpu 0.24.1 with the commands:

cmake -BBuild -DCUQUANTUM_SDK=path to sdk
cmake --build ./Build --verbose
python -m pip install wheel
python setup.py build_ext --cuquantum=path to sdk
python setup.py bdist_wheel

I then ran the test code and got the same warning:

python test.py
Name: PennyLane
Version: 0.24.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/XanaduAI/pennylane
Author:
Author-email:
License: Apache License 2.0
Location: /home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, retworkx, scipy, semantic-version, toml
Required-by: PennyLane-Lightning, PennyLane-Lightning-GPU

Platform info: Linux-5.10.65-tegra-aarch64-with-glibc2.31
Python version: 3.10.4
Numpy version: 1.22.3
Scipy version: 1.8.1
Installed devices:

  • lightning.gpu (PennyLane-Lightning-GPU-0.24.1)

  • default.gaussian (PennyLane-0.24.0)

  • default.mixed (PennyLane-0.24.0)

  • default.qubit (PennyLane-0.24.0)

  • default.qubit.autograd (PennyLane-0.24.0)

  • default.qubit.jax (PennyLane-0.24.0)

  • default.qubit.tf (PennyLane-0.24.0)

  • default.qubit.torch (PennyLane-0.24.0)

  • lightning.qubit (PennyLane-Lightning-0.24.0)
    /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/lightning_gpu.py:73: UserWarning: No module named 'pennylane_lightning_gpu.lightning_gpu_qubit_ops'
    warn(str(e), UserWarning)
    /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/lightning_gpu.py:413: RuntimeWarning:
    !!!#####################################################################################
    !!!
    !!! WARNING: INSUFFICIENT SUPPORT DETECTED FOR GPU DEVICE WITH lightning.gpu
    !!! DEFAULTING TO CPU DEVICE lightning.qubit
    !!!
    !!!#####################################################################################

    warn(

This time I used the Jetson AGX Xavier Developer Kit.

Output of make test-python:

make test-python
python3 -I -m pytest tests --tb=short
===================================================================== test session starts =====================================================================
platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0
rootdir: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1
plugins: mock-3.8.2, flaky-3.7.0
collected 564 items / 1 skipped

tests/test_adjoint_jacobian.py … [ 21%]
…sss… [ 27%]
tests/test_apply.py … [ 50%]
…s [ 69%]
tests/test_comparison.py … [ 77%]
tests/test_gates.py .s…s…ss…s…ss…s…s…s.s…s…ss…s…s…ss…s…ss…s…s…s.s…s…ss… [ 97%]
tests/test_sample.py … [ 98%]
tests/test_var.py … [100%]

====================================================================== warnings summary =======================================================================
tests/test_adjoint_jacobian.py::test_qchem_expvalcost_correct
/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/vqe/vqe.py:153: UserWarning: ExpvalCost is deprecated, use qml.expval() instead. For optimizing Hamiltonian measurements with measuring commuting terms in parallel, use the grouping_type keyword in qml.Hamiltonian.
warnings.warn(

tests/test_gates.py: 110 warnings
/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/ops/qubit/non_parametric_ops.py:1954: UserWarning: The control_wires keyword will be removed soon. Use wires = (control_wires, target_wire) instead. See the documentation for more information.
warnings.warn(

================================================== 532 passed, 33 skipped, 111 warnings in 71.06s (0:01:11) ===================================================
pl-device-test --device lightning.gpu --skip-ops --shots=20000
===================================================================== test session starts =====================================================================
platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests, configfile: pytest.ini
plugins: mock-3.8.2, flaky-3.7.0
collected 372 items

…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_compare_default_qubit.py sssssssss [ 2%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_gates.py … [ 11%]
… [ 52%]
… [ 60%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_gates_with_expval.py … [ 66%]
… [ 79%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_measurements.py …s…sss… [ 86%]
…sss.ssss…ss.ss…ss [ 95%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_properties.py …s.X… [ 97%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_tracker.py sss [ 98%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_wires.py … [100%]

====================================================================== warnings summary =======================================================================
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/ops/qubit/non_parametric_ops.py:1954
/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/ops/qubit/non_parametric_ops.py:1954: UserWarning: The control_wires keyword will be removed soon. Use wires = (control_wires, target_wire) instead. See the documentation for more information.
warnings.warn(

=================================================== 341 passed, 30 skipped, 1 xpassed, 1 warning in 14.90s ====================================================
pl-device-test --device lightning.gpu --shots=None --skip-ops
===================================================================== test session starts =====================================================================
platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests, configfile: pytest.ini
plugins: mock-3.8.2, flaky-3.7.0
collected 372 items

…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_compare_default_qubit.py sssss… [ 2%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_gates.py … [ 11%]
… [ 52%]
… [ 60%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_gates_with_expval.py … [ 66%]
… [ 79%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_measurements.py …sss… [ 86%]
…ssssssssssss.ss…ss [ 95%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_properties.py …s.X… [ 97%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_tracker.py sss [ 98%]
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/devices/tests/test_wires.py … [100%]

====================================================================== warnings summary =======================================================================
…/…/…/…/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/ops/qubit/non_parametric_ops.py:1954
/home/arthurlobo/.conda/envs/qml_py310/lib/python3.10/site-packages/pennylane/ops/qubit/non_parametric_ops.py:1954: UserWarning: The control_wires keyword will be removed soon. Use wires = (control_wires, target_wire) instead. See the documentation for more information.
warnings.warn(

=================================================== 343 passed, 28 skipped, 1 xpassed, 1 warning in 10.04s ====================================================

Output of deviceQuery command:

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: “Xavier”
CUDA Driver Version / Runtime Version 11.4 / 11.4
CUDA Capability Major/Minor version number: 7.2
Total amount of global memory: 14907 MBytes (15631454208 bytes)
(008) Multiprocessors, (064) CUDA Cores/MP: 512 CUDA Cores
GPU Max Clock rate: 1377 MHz (1.38 GHz)
Memory Clock rate: 1377 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total shared memory per multiprocessor: 98304 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: Yes
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.4, CUDA Runtime Version = 11.4, NumDevs = 1
Result = PASS

Hi @art,

It looks like an installation problem because the error says that it can’t find “lightning_gpu_qubit_ops”. Can you please try the following?

1 - Create a new virtual environment
2 - Install PennyLane v0.24 using python -m pip install pennylane
3 - Install cuQuantum v22.05 using python -m pip install cuquantum-python
4 - Install pennylane-lightning-gpu using python -m pip install pennylane-lightning[gpu]
5 - Try running the simplest code possible:

import pennylane as qml
dev = qml.device("lightning.gpu", wires=1)
@qml.qnode(dev)
def circuit():
    qml.PauliX(0)
    return qml.expval(qml.PauliZ(0))
circuit()

Please let me know if you still get the warning after this change.
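If it is easier than reading the console output, the fallback can also be detected programmatically. This is only a sketch: the helper names are made up, and the check simply looks for the banner text quoted earlier in this thread.

```python
import warnings


def warnings_from(factory):
    """Call `factory()` and return (result, list of warning messages it emitted)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no filter hides the warning
        result = factory()
    return result, [str(w.message) for w in caught]


def fell_back_to_cpu(messages):
    """True if any message contains the fallback banner quoted in this thread."""
    return any("DEFAULTING TO CPU DEVICE" in m for m in messages)


# Usage (requires PennyLane to be installed):
#   import pennylane as qml
#   dev, msgs = warnings_from(lambda: qml.device("lightning.gpu", wires=1))
#   print("CPU fallback" if fell_back_to_cpu(msgs) else "no fallback warning seen")
```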

@CatalinaAlbornoz I created a new conda environment and followed the three python -m pip install steps. I still get the same warning with the simple code.

I ran “make test-cpp” which created lightning_gpu_qubit_ops.cpython-310-aarch64-linux-gnu.so (~47 MB) in the BuildTests folder which could be the missing lightning_gpu_qubit_ops module.
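To see where compiled copies of the extension ended up, a recursive search such as the sketch below can be run from the repository root and from the environment's site-packages directory. The function name is hypothetical; the file stem is the module name from the warning.

```python
from pathlib import Path


def find_extension_modules(root, stem="lightning_gpu_qubit_ops"):
    """Recursively list compiled extension files named `<stem>*.so` under `root`."""
    return sorted(Path(root).rglob(stem + "*.so"))


if __name__ == "__main__":
    for path in find_extension_modules("."):
        print(path, f"({path.stat().st_size / 1e6:.1f} MB)")
```

Comparing the two results shows whether the binary exists only in a build folder or also inside the installed pennylane_lightning_gpu package.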

However, I got errors when linking the runner_gpu executable. I have included the messages from “make test-cpp”, from the point where lightning_gpu_qubit_ops is created to the runner_gpu link errors. The missing symbols appear to be custatevec library functions. I had also copied the custatevec and cutensornet libraries from the cuQuantum install directory to /usr/local/cuda-11.4/lib64.

make[3]: Entering directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
[ 74%] Building CXX object CMakeFiles/lightning_gpu_qubit_ops.dir/pennylane_lightning_gpu/src/bindings/Bindings.cpp.o
[ 75%] Linking CXX shared module lightning_gpu_qubit_ops.cpython-310-aarch64-linux-gnu.so
make[3]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
[ 75%] Built target lightning_gpu_qubit_ops
make[3]: Entering directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[3]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[3]: Entering directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
[ 76%] Building CXX object _deps/pennylane_lightning-build/CMakeFiles/lightning_qubit_ops.dir/pennylane_lightning/src/bindings/Bindings.cpp.o
[ 77%] Linking CXX shared module lightning_qubit_ops.cpython-310-aarch64-linux-gnu.so
make[3]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
[ 77%] Built target lightning_qubit_ops
make[3]: Entering directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[3]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[3]: Entering directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
[ 77%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/compile_time_tests.dir/compile_time_tests.cpp.o
[ 78%] Linking CXX executable compile_time_tests
make[3]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
[ 78%] Built target compile_time_tests
make[3]: Entering directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[3]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[3]: Entering directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
[ 79%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/CreateAllWires.cpp.o
[ 79%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_AdjDiff.cpp.o
[ 80%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_AlgUtil.cpp.o
[ 81%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_CompilerSupport.cpp.o
[ 81%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_DynamicDispatcher.cpp.o
[ 82%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_Error.cpp.o
[ 83%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_GateImplementations_CompareKernels.cpp.o
[ 83%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_GateImplementations_Generator.cpp.o
[ 84%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_GateImplementations_Inverse.cpp.o
[ 84%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_GateImplementations_Matrix.cpp.o
[ 85%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_GateImplementations_Nonparam.cpp.o
[ 86%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_GateImplementations_Param.cpp.o
[ 86%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_GateUtil.cpp.o
[ 87%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_Internal.cpp.o
[ 88%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_KernelMap.cpp.o
[ 88%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_Kokkos_Sparse.cpp.o
[ 89%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_LinearAlgebra.cpp.o
[ 89%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_Measures.cpp.o
[ 90%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_Measures_Sparse.cpp.o
[ 91%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_Observables.cpp.o
[ 91%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_OpToMemberFuncPtr.cpp.o
[ 92%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_RuntimeInfo.cpp.o
[ 93%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_StateVecAdjDiff.cpp.o
[ 93%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_StateVectorManagedCPU.cpp.o
[ 94%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_StateVectorRawCPU.cpp.o
[ 94%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/Test_Util.cpp.o
[ 95%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/tests/CMakeFiles/runner.dir/runner_main.cpp.o
[ 96%] Linking CXX executable runner
make[3]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
[ 96%] Built target runner
make[3]: Entering directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[3]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[3]: Entering directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
[ 96%] Building CXX object pennylane_lightning_gpu/src/tests/CMakeFiles/runner_gpu.dir/runner_main.cpp.o
[ 97%] Building CXX object pennylane_lightning_gpu/src/tests/CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o
[ 98%] Building CXX object pennylane_lightning_gpu/src/tests/CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_Param.cpp.o
[ 98%] Building CXX object pennylane_lightning_gpu/src/tests/CMakeFiles/runner_gpu.dir/Test_AdjointDiffGPU.cpp.o
[ 99%] Building CXX object pennylane_lightning_gpu/src/tests/CMakeFiles/runner_gpu.dir/Test_GateCache.cpp.o
[100%] Linking CXX executable runner_gpu
/usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged<float>::applyDeviceMatrixGate(float2 const*, std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<unsigned long, std::allocator<unsigned long> > const&, bool)': /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1075: undefined reference to custatevecApplyMatrixGetWorkspaceSize’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1095: undefined reference to custatevecApplyMatrix' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1095: undefined reference to custatevecApplyMatrix’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1075: undefined reference to custatevecApplyMatrixGetWorkspaceSize' /usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged::applyDeviceMatrixGate(double2 const*, std::vector<unsigned long, std::allocator > const&, std::vector<unsigned long, std::allocator > const&, bool)’:
/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1075: undefined reference to custatevecApplyMatrixGetWorkspaceSize' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1095: undefined reference to custatevecApplyMatrix’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1095: undefined reference to custatevecApplyMatrix' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1075: undefined reference to custatevecApplyMatrixGetWorkspaceSize’
/usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged<float>::~StateVectorCudaManaged()': /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:221: undefined reference to custatevecDestroy’
/usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged<float>::~StateVectorCudaManaged()': /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:221: undefined reference to custatevecDestroy’
/usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged<double>::~StateVectorCudaManaged()': /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:221: undefined reference to custatevecDestroy’
/usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged<double>::~StateVectorCudaManaged()': /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:221: undefined reference to custatevecDestroy’
/usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged<float>::applyParametricPauliGate(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<unsigned long, std::allocator<unsigned long> >, std::vector<unsigned long, std::allocator<unsigned long> >, float, bool)': /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1016: undefined reference to custatevecApplyPauliRotation’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1016: undefined reference to custatevecApplyPauliRotation' /usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged::applyParametricPauliGate(std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > const&, std::vector<unsigned long, std::allocator >, std::vector<unsigned long, std::allocator >, double, bool)’:
/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1016: undefined reference to custatevecApplyPauliRotation' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:1016: undefined reference to custatevecApplyPauliRotation’
/usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged<double>::StateVectorCudaManaged(unsigned long)': /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:200: undefined reference to custatevecCreate’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:200: undefined reference to custatevecCreate' /usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_NonParam.cpp.o: in function Pennylane::StateVectorCudaManaged::StateVectorCudaManaged(unsigned long)’:
/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:200: undefined reference to custatevecCreate' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:200: undefined reference to custatevecCreate’
/usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_Param.cpp.o: in function Pennylane::StateVectorCudaManaged<float>::generate_samples(unsigned long)': /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:893: undefined reference to custatevecSamplerCreate’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:903: undefined reference to custatevecSamplerPreprocess' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:907: undefined reference to custatevecSamplerSample’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:913: undefined reference to custatevecSamplerDestroy' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:893: undefined reference to custatevecSamplerCreate’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:913: undefined reference to custatevecSamplerDestroy' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:907: undefined reference to custatevecSamplerSample’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:903: undefined reference to custatevecSamplerPreprocess' /usr/bin/ld: CMakeFiles/runner_gpu.dir/Test_StateVectorCudaManaged_Param.cpp.o: in function Pennylane::StateVectorCudaManaged::generate_samples(unsigned long)’:
/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:893: undefined reference to custatevecSamplerCreate' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:903: undefined reference to custatevecSamplerPreprocess’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:907: undefined reference to custatevecSamplerSample' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:913: undefined reference to custatevecSamplerDestroy’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:893: undefined reference to custatevecSamplerCreate' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:913: undefined reference to custatevecSamplerDestroy’
/usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:907: undefined reference to custatevecSamplerSample' /usr/bin/ld: /media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp:903: undefined reference to custatevecSamplerPreprocess’
collect2: error: ld returned 1 exit status
make[3]: *** [pennylane_lightning_gpu/src/tests/CMakeFiles/runner_gpu.dir/build.make:176: pennylane_lightning_gpu/src/tests/runner_gpu] Error 1
make[3]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[2]: *** [CMakeFiles/Makefile2:1675: pennylane_lightning_gpu/src/tests/CMakeFiles/runner_gpu.dir/all] Error 2
make[2]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make[1]: *** [Makefile:146: all] Error 2
make[1]: Leaving directory ‘/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/BuildTests’
make: *** [Makefile:79: test-cpp] Error 2
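One way to narrow down undefined references like these is to check whether the libcustatevec.so on the machine actually exports the named functions. The sketch below uses ctypes for that. The library path in the usage comment is a placeholder, and the symbol names are copied from the linker errors above; if none are reported missing, the problem is more likely in how runner_gpu is linked than in the library itself.

```python
import ctypes


def missing_symbols(library, symbols):
    """Return the names from `symbols` that the shared library does not export.

    `library` is a path or soname passed to ctypes.CDLL; None means the
    symbols already loaded into the current process.
    """
    handle = ctypes.CDLL(library)
    # Attribute lookup on a CDLL handle resolves the symbol and raises
    # AttributeError if it is absent, so hasattr works as a presence check.
    return [name for name in symbols if not hasattr(handle, name)]


# Usage (placeholder path, adjust to the local cuQuantum install):
#   names = ["custatevecCreate", "custatevecDestroy", "custatevecApplyMatrix",
#            "custatevecApplyMatrixGetWorkspaceSize", "custatevecApplyPauliRotation",
#            "custatevecSamplerCreate", "custatevecSamplerPreprocess",
#            "custatevecSamplerSample", "custatevecSamplerDestroy"]
#   print(missing_symbols("/path/to/cuquantum-sdk/lib/libcustatevec.so", names))
```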

Hi @art, thanks for trying out lightning.gpu on a Jetson device. It is worth mentioning that the library is written mostly for V100 and A100 GPUs, and most workloads we focus on are FP64 rather than FP32, so performance on an embedded platform may be limited.
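To put the FP64 versus FP32 point in numbers for the 22-wire circuit in this thread, here is a small worked calculation of the memory one dense state vector needs. It counts only the state vector itself; any workspace or extra copies the simulator allocates are not included, and the function name is made up for this illustration.

```python
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory for one dense state vector of 2**n_qubits complex amplitudes.

    16 bytes per amplitude corresponds to complex128 (two FP64 values),
    8 bytes to complex64 (two FP32 values).
    """
    return (2 ** n_qubits) * bytes_per_amplitude


if __name__ == "__main__":
    for label, width in (("complex128", 16), ("complex64", 8)):
        mib = statevector_bytes(22, width) / 2**20
        print(f"22 qubits, {label}: {mib:.0f} MiB")
```

For 22 qubits this gives 64 MiB in double precision and 32 MiB in single precision, and each additional qubit doubles the figure.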

We have not attempted to use lightning.gpu on any ARM-based hardware just yet, and so we cannot make any guarantees yet about how to get it working.

We do have some ideas that we will explore first, and aim to offer a suggestion as soon as we can confirm if they will work or not. In the mean-time, it should be possible to install cuquantum directly into the building Python env with pip install cuquantum, and have it available during the compilation stage.

We will post here again once we have a potential solution.

Hi @art, we were not able to verify this locally, but I would suggest trying the following until we can offer ARM wheels for LightningGPU.


  • In the Lightning GPU repository, patch the Dockerfile builder from quay.io/pypa/manylinux2014_x86_64 to quay.io/pypa/manylinux2014_aarch64.
  • Install docker/podman locally, and attempt to build using these instructions.

If this works, please feel free to let us know. In the meantime, we will aim to support ARM wheels for LightningGPU.

@mlxd The following are the first few lines of the Dockerfile. I also commented out the cuda-rhel7.repo download, since it applies to x86_64.
The Jetson AGX already has cuda-11.4 from the JetPack 5.0.1 install done with the NVIDIA SDK Manager.

Dockerfile:
FROM quay.io/pypa/manylinux2014_aarch64

commented out - install missing packages
commented out - RUN yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo -y
commented out - && yum clean all
commented out - && yum -y install cuda cmake git openssh wget

RUN yum -y install cmake git openssh wget # removed cuda from here

However, I get “failed to find nvcc” errors when running the docker build.
nvcc is in /usr/local/cuda-11.4/bin, and setting the CMake CUDAToolkit_ROOT variable does not help.

running build_ext
░█░░░▀█▀░█▀▀░█░█░▀█▀░█▀█░▀█▀░█▀█░█▀▀░░░░█▀▀░█▀█░█░█
░█░░░░█░░█░█░█▀█░░█░░█░█░░█░░█░█░█░█░░░░█░█░█▀▀░█░█
░▀▀▀░▀▀▀░▀▀▀░▀░▀░░▀░░▀░▀░▀▀▀░▀░▀░▀▀▀░▀░░▀▀▀░▀░░░▀▀▀

-- The CXX compiler identification is GNU 10.2.1
-- The C compiler identification is GNU 10.2.1
CMake Error at /opt/_internal/pipx/venvs/cmake/lib/python3.9/site-packages/cmake/data/share/cmake-3.22/Modules/CMakeDetermineCUDACompiler.cmake:179 (message):
Failed to find nvcc.

Compiler requires the CUDA toolkit. Please set the CUDAToolkit_ROOT
variable.

Call Stack (most recent call first):
CMakeLists.txt:16 (project)

-- Configuring incomplete, errors occurred!
See also “/pennylane-lightning-gpu/build/temp.linux-aarch64-3.10/CMakeFiles/CMakeOutput.log”.
See also “/pennylane-lightning-gpu/build/temp.linux-aarch64-3.10/CMakeFiles/CMakeError.log”.
Traceback (most recent call last):
File “/pennylane-lightning-gpu/setup.py”, line 149, in <module>
setup(classifiers=classifiers, **(info))
File “/pennylane-lightning-gpu/pyenv3.10/lib/python3.10/site-packages/setuptools/__init__.py”, line 153, in setup
return distutils.core.setup(**attrs)
File “/opt/_internal/cpython-3.10.5/lib/python3.10/distutils/core.py”, line 148, in setup
dist.run_commands()
File “/opt/_internal/cpython-3.10.5/lib/python3.10/distutils/dist.py”, line 966, in run_commands
self.run_command(cmd)
File “/opt/_internal/cpython-3.10.5/lib/python3.10/distutils/dist.py”, line 985, in run_command
cmd_obj.run()
File “/pennylane-lightning-gpu/pyenv3.10/lib/python3.10/site-packages/setuptools/command/build_ext.py”, line 79, in run
_build_ext.run(self)
File “/opt/_internal/cpython-3.10.5/lib/python3.10/distutils/command/build_ext.py”, line 340, in run
self.build_extensions()
File “/opt/_internal/cpython-3.10.5/lib/python3.10/distutils/command/build_ext.py”, line 449, in build_extensions
self._build_extensions_serial()
File “/opt/_internal/cpython-3.10.5/lib/python3.10/distutils/command/build_ext.py”, line 474, in _build_extensions_serial
self.build_extension(ext)
File “/pennylane-lightning-gpu/setup.py”, line 87, in build_extension
subprocess.check_call(
File “/opt/_internal/cpython-3.10.5/lib/python3.10/subprocess.py”, line 369, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command ‘[‘cmake’, ‘/pennylane-lightning-gpu’, ‘-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/pennylane-lightning-gpu/build/lib.linux-aarch64-3.10/pennylane_lightning_gpu’, ‘-DPYTHON_EXECUTABLE=/pennylane-lightning-gpu/pyenv3.10/bin/python3’, ‘-DCMAKE_BUILD_TYPE=RelWithDebInfo’, ‘-GNinja’, ‘-DCMAKE_MAKE_PROGRAM=/pennylane-lightning-gpu/pyenv3.10/bin/ninja’, ‘-DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc’, ‘-DENABLE_CLANG_TIDY=0’]’ returned non-zero exit status 1.
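
The “Failed to find nvcc” error above can be narrowed down by checking, from inside the build environment itself, whether nvcc is actually visible there. Below is a minimal sketch of such a pre-flight check. The helper name find_nvcc and the default candidate directories are assumptions made for illustration; they are not part of the Lightning GPU build scripts.

```python
import os
import shutil
from typing import Iterable, Optional


def find_nvcc(
    extra_dirs: Iterable[str] = ("/usr/local/cuda/bin", "/usr/local/cuda-11.4/bin"),
) -> Optional[str]:
    """Return the path to nvcc if this environment can see it, else None.

    `extra_dirs` is a hypothetical list of fallback locations to probe.
    """
    # First look on PATH, which is where a plain `nvcc` call would be resolved.
    found = shutil.which("nvcc")
    if found:
        return found
    # Then probe the fallback directories for an executable named nvcc.
    for directory in extra_dirs:
        candidate = os.path.join(directory, "nvcc")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    path = find_nvcc()
    print(path if path else "nvcc is not visible in this environment")
```

Run inside the container (not on the host), a None result would suggest that the host's CUDA directory is simply not present in the container's filesystem.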

The CMake errors in the last post occurred because the CUDA install location was not visible inside the docker build container. After I copied the CUDA toolkit directory into the pennylane-lightning-gpu directory, the “COPY ./ /pennylane-lightning-gpu” step in the Dockerfile made the CUDA toolkit folders available to the docker build. The CUDAToolkit_ROOT, CMAKE_CUDA_COMPILER, CUDA_TOOLKIT_ROOT_DIR and CUQUANTUM_SDK variables also need to be set in setup.py relative to the container path /pennylane-lightning-gpu.
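
As a rough illustration of pointing those four variables at paths inside the container, here is a small sketch that builds the corresponding CMake definitions. It assumes the variables are passed to CMake as -D flags; the function name container_cmake_args and the subdirectory names used in the example call are hypothetical and would need to match whatever was actually copied into the repository directory.

```python
import posixpath
from typing import List


def container_cmake_args(
    container_root: str, cuda_subdir: str, cuquantum_subdir: str
) -> List[str]:
    """Build -D definitions that point CMake at toolkits copied into the container.

    The subdirectory names are placeholders chosen by the caller.
    """
    cuda_root = posixpath.join(container_root, cuda_subdir)
    cuquantum_root = posixpath.join(container_root, cuquantum_subdir)
    variables = {
        "CUDAToolkit_ROOT": cuda_root,
        "CUDA_TOOLKIT_ROOT_DIR": cuda_root,
        "CMAKE_CUDA_COMPILER": posixpath.join(cuda_root, "bin", "nvcc"),
        "CUQUANTUM_SDK": cuquantum_root,
    }
    return [f"-D{name}={value}" for name, value in variables.items()]


if __name__ == "__main__":
    # Hypothetical layout: toolkits copied next to the sources before the COPY step.
    for arg in container_cmake_args("/pennylane-lightning-gpu", "cuda-11.4", "cuquantum"):
        print(arg)
```

The point of deriving everything from one container root is that the paths stay consistent with the COPY destination, rather than with where the toolkit lives on the host.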

Lightning-gpu wheels for Python 3.7-3.10 are built.

Last few lines of the docker build:

---> Running in bda48adc4921
INFO:auditwheel.main_repair:Repairing PennyLane_Lightning_GPU-0.24.1-cp310-cp310-linux_aarch64.whl
INFO:auditwheel.wheeltools:Previous filename tags: linux_aarch64
INFO:auditwheel.wheeltools:New filename tags: manylinux_2_17_aarch64, manylinux2014_aarch64
INFO:auditwheel.wheeltools:Previous WHEEL info tags: cp310-cp310-linux_aarch64
INFO:auditwheel.wheeltools:New WHEEL info tags: cp310-cp310-manylinux_2_17_aarch64, cp310-cp310-manylinux2014_aarch64
INFO:auditwheel.main_repair:
Fixed-up wheel written to /wheelhouse/PennyLane_Lightning_GPU-0.24.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Excluding [‘libcudart.so.11.0’, ‘libcublasLt.so.11’, ‘libcublas.so.11’, ‘libcustatevec.so.1’]
INFO:auditwheel.main_repair:Repairing PennyLane_Lightning_GPU-0.24.1-cp37-cp37m-linux_aarch64.whl
INFO:auditwheel.wheeltools:Previous filename tags: linux_aarch64
INFO:auditwheel.wheeltools:New filename tags: manylinux_2_17_aarch64, manylinux2014_aarch64
INFO:auditwheel.wheeltools:Previous WHEEL info tags: cp37-cp37m-linux_aarch64
INFO:auditwheel.wheeltools:New WHEEL info tags: cp37-cp37m-manylinux_2_17_aarch64, cp37-cp37m-manylinux2014_aarch64
Excluding [‘libcudart.so.11.0’, ‘libcublasLt.so.11’, ‘libcublas.so.11’, ‘libcustatevec.so.1’]
INFO:auditwheel.main_repair:
Fixed-up wheel written to /wheelhouse/PennyLane_Lightning_GPU-0.24.1-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
INFO:auditwheel.main_repair:Repairing PennyLane_Lightning_GPU-0.24.1-cp38-cp38-linux_aarch64.whl
INFO:auditwheel.wheeltools:Previous filename tags: linux_aarch64
INFO:auditwheel.wheeltools:New filename tags: manylinux_2_17_aarch64, manylinux2014_aarch64
INFO:auditwheel.wheeltools:Previous WHEEL info tags: cp38-cp38-linux_aarch64
INFO:auditwheel.wheeltools:New WHEEL info tags: cp38-cp38-manylinux_2_17_aarch64, cp38-cp38-manylinux2014_aarch64
INFO:auditwheel.main_repair:
Fixed-up wheel written to /wheelhouse/PennyLane_Lightning_GPU-0.24.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Excluding [‘libcudart.so.11.0’, ‘libcublasLt.so.11’, ‘libcublas.so.11’, ‘libcustatevec.so.1’]
INFO:auditwheel.main_repair:Repairing PennyLane_Lightning_GPU-0.24.1-cp39-cp39-linux_aarch64.whl
INFO:auditwheel.wheeltools:Previous filename tags: linux_aarch64
INFO:auditwheel.wheeltools:New filename tags: manylinux_2_17_aarch64, manylinux2014_aarch64
INFO:auditwheel.wheeltools:Previous WHEEL info tags: cp39-cp39-linux_aarch64
INFO:auditwheel.wheeltools:New WHEEL info tags: cp39-cp39-manylinux_2_17_aarch64, cp39-cp39-manylinux2014_aarch64
INFO:auditwheel.main_repair:
Fixed-up wheel written to /wheelhouse/PennyLane_Lightning_GPU-0.24.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Excluding [‘libcudart.so.11.0’, ‘libcublasLt.so.11’, ‘libcublas.so.11’, ‘libcustatevec.so.1’]
Removing intermediate container bda48adc4921
---> fa8e127df52e
Successfully built fa8e127df52e
Successfully tagged lightning-gpu-wheels:latest

docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
lightning-gpu-wheels latest fa8e127df52e 29 minutes ago 9.16GB

ls -lt wheelhouse
-rw-r--r-- 1 root root 16506230 Jul 27 01:01 PennyLane_Lightning_GPU-0.24.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
-rw-r--r-- 1 root root 16255981 Jul 27 01:01 PennyLane_Lightning_GPU-0.24.1-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
-rw-r--r-- 1 root root 16454057 Jul 27 01:01 PennyLane_Lightning_GPU-0.24.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
-rw-r--r-- 1 root root 16536699 Jul 27 01:01 PennyLane_Lightning_GPU-0.24.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl

After installing the 3.10 wheel and running the script:
import pennylane as qml

dev = qml.device("lightning.gpu", wires=1)

@qml.qnode(dev)
def circuit():
    # Flip the qubit to |1>, so the expectation value of PauliZ should be -1.
    qml.PauliX(0)
    return qml.expval(qml.PauliZ(0))

circuit()

I get the same warning:

/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/lightning_gpu.py:73: UserWarning: No module named ‘pennylane_lightning_gpu.lightning_gpu_qubit_ops’
warn(str(e), UserWarning)
/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/lightning_gpu.py:413: RuntimeWarning:
!!!#####################################################################################
!!!
!!! WARNING: INSUFFICIENT SUPPORT DETECTED FOR GPU DEVICE WITH lightning.gpu
!!! DEFAULTING TO CPU DEVICE lightning.qubit
!!!
!!!#####################################################################################

warn(
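
The fallback warning above only reports that the compiled module could not be found. Importing that module directly can surface the underlying reason. The following is a minimal sketch; the helper name import_failure_reason is made up for illustration, while the module name is the one quoted in the warning.

```python
import importlib
from typing import Optional


def import_failure_reason(module_name: str) -> Optional[str]:
    """Return None if the module imports cleanly, otherwise the error text."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # ImportError covers a missing module as well as a compiled extension
        # that fails to load (for example, because of an unresolved symbol).
        return f"{type(exc).__name__}: {exc}"
    return None


if __name__ == "__main__":
    reason = import_failure_reason("pennylane_lightning_gpu.lightning_gpu_qubit_ops")
    print(reason or "compiled module imported successfully")
```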

make test-python
outputs an undefined symbol warning:

../../../../home/arthurlobo/.conda/envs/qml_py310_b/lib/python3.10/site-packages/pennylane_lightning_gpu/lightning_gpu.py:73
/home/arthurlobo/.conda/envs/qml_py310_b/lib/python3.10/site-packages/pennylane_lightning_gpu/lightning_gpu.py:73: UserWarning: /home/arthurlobo/.conda/envs/qml_py310_b/lib/python3.10/site-packages/pennylane_lightning_gpu/lightning_gpu_qubit_ops.cpython-310-aarch64-linux-gnu.so: undefined symbol: custatevecSamplerDestroy

Hi @art, this looks like it’s still related to CUDA because the error says undefined symbol: custatevecSamplerDestroy

It may be related to the ARM architecture.

@CatalinaAlbornoz the custatevecSamplerDestroy function is defined in the cuQuantum libcustatevec.so.1.0.0.41.so library:

From readelf -sW of the .so file (to display the symbol table):
Num: Value Size Type Bind Vis Ndx Name
.
.
.
20538: 00000000000a7308 536 FUNC GLOBAL DEFAULT 12 custatevecSamplerDestroy
.
.
.
but is not getting linked in the docker build.
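
The readelf check above can be scripted so the same test can be repeated on other libraries or symbols. This is only a sketch: the function names defined_symbols and library_defines are hypothetical, and the parser assumes the usual eight-column layout of readelf -sW output (Num, Value, Size, Type, Bind, Vis, Ndx, Name).

```python
import subprocess
from typing import Set


def defined_symbols(readelf_output: str) -> Set[str]:
    """Collect names of symbols that are defined (Ndx is not UND) in readelf -sW output."""
    names = set()
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows start with an index such as "20538:"; skip headers and blanks.
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue
        ndx, name = fields[6], fields[7]
        if ndx != "UND":
            # Drop any version suffix such as "@GLIBC_2.17".
            names.add(name.split("@")[0])
    return names


def library_defines(lib_path: str, symbol: str) -> bool:
    """Run readelf on a shared library and report whether it defines `symbol`."""
    result = subprocess.run(
        ["readelf", "-sW", lib_path], check=True, capture_output=True, text=True
    )
    return symbol in defined_symbols(result.stdout)
```

Running the same check against the built lightning_gpu_qubit_ops .so should list the symbol with Ndx UND, which is expected for a symbol that is resolved from a shared library at load time.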

Hi @art
When we build the wheels for PennyLane we explicitly strip out any CUDA-based libraries from the final product. We do this as it ensures users can install the libraries on their preferred machine, and the dynamic libs should be picked up at runtime (assuming they are on the user’s path).

It looks to me like the cuquantum library is not available on your system when trying to run the above example, or, if it is available, the binary cannot see it. With the latest cuQuantum release, the path to the libraries in the pip install cuquantum install method moved from lib to lib64. It is possible that the fix we added for this is not yet present in the version you have cloned for compilation.

Can you try explicitly setting the LD_LIBRARY_PATH environment variable to the location of the cuquantum libs on your system (as well as the other associated CUDA libs) and see if that makes a difference to the above issue at run-time?
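
One way to check what the dynamic loader is likely to see is sketched below: list the custatevec libraries found in the directories named by LD_LIBRARY_PATH, then try to load one and look up a symbol. The helper names find_on_library_path and exports_symbol are assumptions for illustration. Note that LD_LIBRARY_PATH generally has to be set before the Python process starts for the loader to honour it.

```python
import ctypes
import glob
import os
from typing import List, Optional


def find_on_library_path(pattern: str, env_var: str = "LD_LIBRARY_PATH") -> List[str]:
    """Return files matching `pattern` in each directory listed in `env_var`."""
    hits: List[str] = []
    for directory in os.environ.get(env_var, "").split(os.pathsep):
        if directory:
            hits.extend(sorted(glob.glob(os.path.join(directory, pattern))))
    return hits


def exports_symbol(lib_path: str, symbol: str) -> Optional[bool]:
    """True/False if the library loads and has/lacks `symbol`; None if it cannot be loaded."""
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        return None
    # ctypes resolves the symbol lazily on attribute access.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    for path in find_on_library_path("libcustatevec.so*"):
        print(path, exports_symbol(path, "custatevecSamplerDestroy"))
```

If nothing is printed, no libcustatevec file was found in any LD_LIBRARY_PATH directory. A None next to a path means the library is present but could not be loaded, which often points at one of its own CUDA dependencies being missing.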

@mlxd thanks for pointing out the .so dynamic vs .a static distinction. I knew it but phrased my question incorrectly. Checking on the LD_LIBRARY_PATH variable.

Definitely get back to us when you’ve checked!

@isaacdevlugt the CUDA 11.7 and cuQuantum 22.07 libraries (including libcustatevec.so) are in

~/.conda/envs/qml_py38/lib (LD_LIBRARY_PATH variable was set to this path)

and the pennylane lightning gpu libraries:
lightning_gpu_qubit_ops.cpython-38-aarch64-linux-gnu.so and lightning_qubit_ops.cpython-38-aarch64-linux-gnu.so

are in

~/.conda/envs/qml_py38/lib/python3.8/site-packages/pennylane_lightning_gpu

where qml_py38 is my Python 3.8 conda environment, created specifically to install cupy-cuda11x (pip install cupy-cuda11x -f https://pip.cupy.dev/aarch64). I ended up having to uninstall cupy-cuda11x, since it conflicts with the cupy 11.0 that was installed along with cuQuantum (conda install -c conda-forge cuquantum-python==22.07).

I get a “custatevec not initialized” error. The following is the output from the test script:

Name: PennyLane
Version: 0.24.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/XanaduAI/pennylane
Author:
Author-email:
License: Apache License 2.0
Location: /home/arthurlobo/.conda/envs/qml_py38/lib/python3.8/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, retworkx, scipy, semantic-version, toml
Required-by: PennyLane-Lightning, PennyLane-Lightning-GPU

Platform info: Linux-5.10.65-tegra-aarch64-with-glibc2.26
Python version: 3.8.13
Numpy version: 1.19.5
Scipy version: 1.9.0
Installed devices:

  • lightning.gpu (PennyLane-Lightning-GPU-0.24.1)
  • lightning.qubit (PennyLane-Lightning-0.24.0)
  • default.gaussian (PennyLane-0.24.0)
  • default.mixed (PennyLane-0.24.0)
  • default.qubit (PennyLane-0.24.0)
  • default.qubit.autograd (PennyLane-0.24.0)
  • default.qubit.jax (PennyLane-0.24.0)
  • default.qubit.tf (PennyLane-0.24.0)
  • default.qubit.torch (PennyLane-0.24.0)

Traceback (most recent call last):
File “test2.py”, line 3, in <module>
dev = qml.device(“lightning.gpu”, wires=1)
File “/home/arthurlobo/.conda/envs/qml_py38/lib/python3.8/site-packages/pennylane/__init__.py”, line 316, in device
dev = plugin_device_class(*args, **options)
File “/media/arthurlobo/QML/pennylane-lightning-gpu-0.24.1/pennylane_lightning_gpu/lightning_gpu.py”, line 109, in __init__
self._gpu_state = _gpu_dtype(self._state.dtype)(self._state)
pennylane_lightning_gpu.lightning_gpu_qubit_ops.PLException: [/pennylane-lightning-gpu/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp][Line:200][Method:StateVectorCudaManaged]: Error in PennyLane Lightning: custatevec not initialized

For the earlier python 3.10 environment
“pytest tests” from the cuQuantum/python directory runs.
(Ref: https://docs.nvidia.com/cuda/cuquantum/python/README.html#running)

test session starts =============================================================
platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0
rootdir: /media/arthurlobo/QML/cuQuantum/python
plugins: mock-3.8.2, flaky-3.7.0
collected 30031 items

tests/cuquantum_tests/custatevec_tests/test_custatevec.py …EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE [ 0%]
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE [ 0%]
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEsssss [ 1%]
sssssssssssEEEEssssssssEEEE…F [ 1%]
tests/cuquantum_tests/cutensornet_tests/test_circuit_converter.py ssssssss [ 1%]
tests/cuquantum_tests/cutensornet_tests/test_contract.py FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 1%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 1%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 2%]
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF [ 2%]
.
.
.
.
currently at 22%

Hey @art! Thanks for providing this information. I can confirm that our performance team (including @mlxd) is looking into this further. We will update you as soon as we can!

Hi @art
Just to jump back onto this: a leaked CUDA context caused the above errors, as no new GPU memory could be allocated. This will be fixed in the upcoming release of Lightning GPU, which goes live at the start of next week. I’ll tag you here once the PR is merged into the main branch, in case you wish to try it early.

@mlxd Nice! :muscle: