Problems with using lightning.gpu

Daniel_Wang · January 10, 2024, 1:27pm

Hello,

I tried to run a small test with lightning.gpu on a Colab T4 GPU. According to what I found online, its CUDA compute capability is 7.5. I have included the code below so you can easily reproduce the results.

! nvidia-smi
! nvcc -V
! pip install pennylane cuquantum-cu12 pennylane-lightning-gpu

import pennylane as qml
import torch
import pdb
import math
from sklearn.datasets import make_moons
import numpy as np
import torch.nn as nn

qml.about()

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

torch.set_default_dtype(torch.float32)
torch.set_default_tensor_type(torch.FloatTensor)

X, y = make_moons(n_samples=200, noise=0.1)

y = y[:, np.newaxis]

data_loader = torch.utils.data.DataLoader(
list(zip(X, y)), batch_size=2, shuffle=True, drop_last=True
)

class QuantumLayer(nn.Module):

    def __init__(self, n_qubits):
        super().__init__()
        self.n_qubits = n_qubits
        self.sim_dev = qml.device('lightning.gpu', wires=n_qubits)
        self.show_plot = True

        self.weights = nn.Parameter(torch.rand(2, 2, 3, device = device) * 2 * math.pi)

    def QNode(self, inputs, weights):

        @qml.qnode(self.sim_dev, interface = 'torch', diff_method = 'adjoint')
        def qnode(inputs, params):
            qml.templates.AngleEmbedding(inputs, wires=[0, 1])
            # Use the QNode argument (params), not the enclosing method's weights
            qml.templates.StronglyEntanglingLayers(params, wires=[0, 1])
            return [qml.expval(qml.PauliZ(i)) for i in range(2)]

        return qnode(inputs, weights)

    def forward(self, x):
        res = self.QNode(x, self.weights)
        if torch.numel(res[0]) == 1:
            q_out = torch.stack(res).reshape(self.n_qubits, -1).T.float()
        else:
            q_out = torch.cat(res).reshape(self.n_qubits, -1).T.float()
        return q_out

q_layer = QuantumLayer(2)
clayer_2 = torch.nn.Linear(2, 1)

layers = [q_layer, clayer_2]
model = torch.nn.Sequential(*layers)

model.to(device)

opt = torch.optim.SGD(model.parameters(), lr=0.2)
loss = torch.nn.L1Loss()

epochs = 6

for epoch in range(epochs):
    running_loss = 0
    for xs, ys in data_loader:

        xs, ys = xs.to(device), ys.to(device)

        opt.zero_grad()

        loss_evaluated = loss(model(xs), ys)

        loss_evaluated.backward()
        opt.step()

        running_loss += loss_evaluated
    print(running_loss)

The result is the following:

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

Collecting pennylane
Downloading PennyLane-0.34.0-py3-none-any.whl (1.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 14.1 MB/s eta 0:00:00
Collecting cuquantum-cu12
Downloading cuquantum_cu12-23.10.0-py3-none-manylinux2014_x86_64.whl (7.0 kB)
Collecting pennylane-lightning-gpu
Downloading PennyLane_Lightning_GPU-0.34.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.9/6.9 MB 33.2 MB/s eta 0:00:00
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from pennylane) (1.23.5)
Requirement already satisfied: scipy in /usr/local/lib/python3.10/dist-packages (from pennylane) (1.11.4)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from pennylane) (3.2.1)
Collecting rustworkx (from pennylane)
Downloading rustworkx-0.13.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 62.4 MB/s eta 0:00:00
Requirement already satisfied: autograd in /usr/local/lib/python3.10/dist-packages (from pennylane) (1.6.2)
Requirement already satisfied: toml in /usr/local/lib/python3.10/dist-packages (from pennylane) (0.10.2)
Requirement already satisfied: appdirs in /usr/local/lib/python3.10/dist-packages (from pennylane) (1.4.4)
Collecting semantic-version>=2.7 (from pennylane)
Downloading semantic_version-2.10.0-py2.py3-none-any.whl (15 kB)
Collecting autoray>=0.6.1 (from pennylane)
Downloading autoray-0.6.7-py3-none-any.whl (49 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.9/49.9 kB 5.8 MB/s eta 0:00:00
Requirement already satisfied: cachetools in /usr/local/lib/python3.10/dist-packages (from pennylane) (5.3.2)
Collecting pennylane-lightning>=0.34 (from pennylane)
Downloading PennyLane_Lightning-0.34.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.1/18.1 MB 62.2 MB/s eta 0:00:00
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from pennylane) (2.31.0)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from pennylane) (4.5.0)
Collecting custatevec-cu12==1.5.0 (from cuquantum-cu12)
Downloading custatevec_cu12-1.5.0-py3-none-manylinux2014_x86_64.whl (38.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 38.4/38.4 MB 5.3 MB/s eta 0:00:00
Collecting cutensornet-cu12==2.3.0 (from cuquantum-cu12)
Downloading cutensornet_cu12-2.3.0-py3-none-manylinux2014_x86_64.whl (2.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.2/2.2 MB 34.6 MB/s eta 0:00:00
Collecting cutensor-cu12<2,>=1.6.1 (from cutensornet-cu12==2.3.0->cuquantum-cu12)
Downloading cutensor_cu12-1.7.0-py3-none-manylinux2014_x86_64.whl (146.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 146.8/146.8 MB 5.1 MB/s eta 0:00:00
Requirement already satisfied: future>=0.15.2 in /usr/local/lib/python3.10/dist-packages (from autograd->pennylane) (0.18.3)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->pennylane) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->pennylane) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->pennylane) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->pennylane) (2023.11.17)
Installing collected packages: cutensor-cu12, custatevec-cu12, semantic-version, rustworkx, cutensornet-cu12, autoray, cuquantum-cu12, pennylane-lightning, pennylane, pennylane-lightning-gpu
Successfully installed autoray-0.6.7 cuquantum-cu12-23.10.0 custatevec-cu12-1.5.0 cutensor-cu12-1.7.0 cutensornet-cu12-2.3.0 pennylane-0.34.0 pennylane-lightning-0.34.0 pennylane-lightning-gpu-0.34.0 rustworkx-0.13.2 semantic-version-2.10.0
Name: PennyLane
Version: 0.34.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: GitHub - PennyLaneAI/pennylane: PennyLane is a cross-platform Python library for differentiable programming of quantum computers. Train a quantum computer the same way as a neural network.
Author:
Author-email:
License: Apache License 2.0
Location: /usr/local/lib/python3.10/dist-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, requests, rustworkx, scipy, semantic-version, toml, typing-extensions
Required-by: PennyLane-Lightning, PennyLane-Lightning-GPU

Platform info: Linux-6.1.58+-x86_64-with-glibc2.35
Python version: 3.10.12
Numpy version: 1.23.5
Scipy version: 1.11.4
Installed devices:

default.gaussian (PennyLane-0.34.0)
default.mixed (PennyLane-0.34.0)
default.qubit (PennyLane-0.34.0)
default.qubit.autograd (PennyLane-0.34.0)
default.qubit.jax (PennyLane-0.34.0)
default.qubit.legacy (PennyLane-0.34.0)
default.qubit.tf (PennyLane-0.34.0)
default.qubit.torch (PennyLane-0.34.0)
default.qutrit (PennyLane-0.34.0)
null.qubit (PennyLane-0.34.0)
lightning.gpu (PennyLane-Lightning-GPU-0.34.0)
lightning.qubit (PennyLane-Lightning-0.34.0)
cuda
/usr/local/lib/python3.10/dist-packages/torch/__init__.py:614: UserWarning: torch.set_default_tensor_type() is deprecated as of PyTorch 2.1, please use torch.set_default_dtype() and torch.set_default_device() as alternatives. (Triggered internally at …/torch/csrc/tensor/python_tensor.cpp:451.)
_C._set_default_tensor_type(t)
/usr/local/lib/python3.10/dist-packages/pennylane_lightning/lightning_gpu/lightning_gpu.py:74: UserWarning: libcublas.so.11: cannot open shared object file: No such file or directory
warn(str(e), UserWarning)
/usr/local/lib/python3.10/dist-packages/pennylane_lightning/lightning_gpu/lightning_gpu.py:995: UserWarning:
"Pre-compiled binaries for lightning.gpu are not available. Falling back to "
"using the Python-based default.qubit implementation. To manually compile from "
"source, follow the instructions at "
"https://pennylane-lightning.readthedocs.io/en/latest/installation.html.",

warn(

My question is: why is it not using lightning.gpu? I see that other users have posted similar questions, but I could not see how the answers there would help in this case. Since this runs on Colab, it should be easy to reproduce. Thank you for your help!

mlxd · January 10, 2024, 7:46pm

Hi @Daniel_Wang

Can you rerun this with cuquantum-cu11 or custatevec-cu11? LightningGPU does not currently support CUDA 12.

mlxd · January 10, 2024, 7:48pm

Note that if you have the CUDA 12 SDK installed, you will also need to install the CUDA 11 runtime libraries to support LightningGPU's operation. You can get all of the required CUDA 11 libraries with:

pip install custatevec_cu11 nvidia-cuda-runtime-cu11 nvidia-cusparse-cu11 nvidia-cublas-cu11
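
To confirm that the CUDA 11 libraries actually landed in the environment after the install, a quick check like the following may help. It is only a sketch: the helper name find_shared_libs is hypothetical, and the directory layouts it searches (site-packages/nvidia/<package>/lib and site-packages/<package>/lib) are an assumption about how these pip wheels are laid out, not something LightningGPU documents. The library file names are taken from the libcublas.so.11 warning above and extended by analogy.

```python
import glob
import os
import sys


def find_shared_libs(lib_name, roots=None):
    """Return files under `roots` whose name starts with `lib_name`.

    Assumption: pip-installed CUDA wheels place their shared objects in
    <root>/nvidia/<package>/lib or <root>/<package>/lib.
    """
    roots = sys.path if roots is None else roots
    hits = []
    for root in roots:
        if not os.path.isdir(root):
            continue
        for pattern in (("*", "lib"), ("nvidia", "*", "lib")):
            hits.extend(glob.glob(os.path.join(root, *pattern, lib_name + "*")))
    return sorted(set(hits))


# Library names assumed from the warning text and the pip packages above.
for name in ("libcublas.so.11", "libcusparse.so.11",
             "libcudart.so.11", "libcustatevec.so.1"):
    found = find_shared_libs(name)
    print(name, "->", found if found else "not found")
```

If libcublas.so.11 shows up as "not found", the fallback warning from the original post is the likely outcome when the device is created.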

Daniel_Wang · January 10, 2024, 8:21pm

Thanks! That solved the problem. However, it is slower than lightning.qubit, which I guess is due to the overhead of transferring data between the CPU and the GPU?

isaacdevlugt · January 11, 2024, 5:37pm

Hey @Daniel_Wang, lightning.qubit might outperform lightning.gpu for some problem sizes due to overheads (which is what I think you're saying at the end of your last post). As you increase the number of qubits, things should eventually trade off in favour of lightning.gpu (this also depends on the calibre of the GPU and CPU that you're simulating on… @mlxd would be the expert there).
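
To give a rough feel for why a 2-qubit circuit is too small to benefit from a GPU, here is a minimal sketch of how the state vector grows with qubit count. It assumes 16 bytes per amplitude (complex128); the function name statevector_bytes is made up for illustration, and the numbers say nothing about where the actual crossover point lies on a given machine.

```python
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory for a full state vector: 2**n_qubits complex amplitudes.

    Assumption: each amplitude is complex128 (16 bytes).
    """
    return (2 ** n_qubits) * bytes_per_amplitude


for n in (2, 10, 20, 26, 30):
    size = statevector_bytes(n)
    print(f"{n:2d} qubits: {size:>14,d} bytes ({size / 2**20:,.4f} MiB)")
```

At 2 qubits the whole state is 4 amplitudes (64 bytes), so fixed costs such as kernel launches and host-device transfers are likely to dominate; the state doubles with every added qubit, which is where GPU bandwidth starts to matter.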

Hope this helps!

Daniel_Wang · January 12, 2024, 10:39am

Yes, that is what I am thinking about! Thanks

isaacdevlugt · January 12, 2024, 2:20pm

Let us know if you have any other issues!
