Hi @roysuman088
Unfortunately, `lightning.gpu` does not support the NVIDIA Tesla M40 cards, as these are not compatible with NVIDIA cuQuantum, which we rely on for the GPU calls. The minimum supported GPU generation for cuQuantum is SM 7.0, which means V100s and newer. The M40 series is SM 5.2. See the thread "WARNING: INSUFFICIENT SUPPORT DETECTED FOR GPU DEVICE WITH `lightning.gpu`" (post #7 by mlxd) for another case where this is discussed.
I suspect this may be partly the reason that the runtime is taking a long time, as it is likely falling back to CPU-only operation.
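If it helps to confirm which side of that cutoff a card is on, here is a minimal sketch that compares a compute capability against the SM 7.0 minimum mentioned above. The helper names are hypothetical, and the `nvidia-smi` query assumes a driver recent enough to expose the `compute_cap` field, which may not be the case on older installations:

```python
import subprocess

# Minimum compute capability for cuQuantum as stated above (V100 and newer).
MIN_CUQUANTUM_SM = (7, 0)


def parse_compute_cap(text):
    """Turn a string such as '5.2' into a (major, minor) tuple of ints."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def supports_cuquantum(cap):
    """Return True if the (major, minor) capability meets the SM 7.0 minimum."""
    return cap >= MIN_CUQUANTUM_SM


def query_compute_caps():
    """Ask nvidia-smi for each GPU's compute capability.

    Assumption: the installed nvidia-smi supports the `compute_cap` query field.
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_compute_cap(line) for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    try:
        for cap in query_compute_caps():
            print(f"SM {cap[0]}.{cap[1]} supported by cuQuantum: {supports_cuquantum(cap)}")
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError) as exc:
        print(f"Could not query the GPU via nvidia-smi: {exc}")
```

For the M40 (SM 5.2) the check returns `False`, while a V100 (SM 7.0) returns `True`.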
Do you see a warning message when instantiating a `lightning.gpu` device on that machine? It should let you know that a compatible GPU device was not found, and that it will therefore fall back to CPU-only operation.
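In case the warning scrolls past unnoticed, here is a rough sketch for capturing whatever warnings are raised while the device is created. `collect_warnings` is a hypothetical helper built on Python's standard `warnings` module, and I am assuming the message is emitted through that module rather than printed directly:

```python
import warnings


def collect_warnings(make_device):
    """Call make_device() and return (device, list of warning messages raised)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones that would normally be shown only once.
        warnings.simplefilter("always")
        device = make_device()
    return device, [str(w.message) for w in caught]


if __name__ == "__main__":
    try:
        import pennylane as qml

        dev, messages = collect_warnings(lambda: qml.device("lightning.gpu", wires=2))
        print("Device:", dev)
        for message in messages:
            print("Warning:", message)
    except Exception as exc:  # PennyLane or the plugin may not be installed
        print(f"Could not create the device: {exc}")
```

If the list of messages is empty on your machine, that would suggest the fallback notice is reported some other way, so it is worth checking the console output as well.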
A potentially good workaround is to try our `lightning.kokkos` research device. This device may support the older GPU generation and allow you to get better performance. To try it out, follow the steps below, though you may first need to ensure that the CUDA binaries and libraries are on your `PATH` and `LD_LIBRARY_PATH`. I will assume you are using Ubuntu, and add the paths as they tend to be found on that OS:
git clone https://github.com/PennyLaneAI/pennylane-lightning-kokkos
cd pennylane-lightning-kokkos && git checkout v0.29.1
export PATH=$PATH:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib:/usr/local/cuda/lib64
python -m venv pyenv && source ./pyenv/bin/activate
python -m pip install pennylane
BACKEND="CUDA" python -m pip install -e .
The above process creates a local Python virtual environment, installs PennyLane, and tries to build Lightning Kokkos for your GPU architecture. If the GPU generation is too old, the build may not support it.
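Once the build finishes, a quick way to check that the device loads is to run a small circuit on it. The following is only a sketch, with hypothetical helper names, and it assumes the plugin registers itself under the device name `lightning.kokkos`:

```python
import importlib.util


def pennylane_available() -> bool:
    """Return True if PennyLane can be imported in the current environment."""
    return importlib.util.find_spec("pennylane") is not None


def bell_probs(device_name: str = "lightning.kokkos"):
    """Prepare a two-qubit Bell state on the given device and return its probabilities."""
    import pennylane as qml

    dev = qml.device(device_name, wires=2)

    @qml.qnode(dev)
    def circuit():
        qml.Hadamard(wires=0)
        qml.CNOT(wires=[0, 1])
        return qml.probs(wires=[0, 1])

    return circuit()


if __name__ == "__main__":
    if not pennylane_available():
        print("PennyLane is not installed in this environment.")
    else:
        try:
            print(bell_probs())
        except Exception as exc:  # e.g. the plugin failed to build or load
            print(f"Could not run on lightning.kokkos: {exc}")
```

For a Bell state the probabilities should come out close to 0.5 for `00` and `11`, and 0 for the other two outcomes. If device creation fails instead, the error message should help narrow down whether the build or the CUDA paths are the issue.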