Lightning-GPU for POWER9

I am unable to install Lightning-GPU 0.27 on an IBM HPC POWER9/RHEL system with access to V100 GPUs.

Conda environment python3.8
PyTorch 1.10.2_cuda11.2_py38_1
cuQuantum 22.07.1.14
gcc version 8.4.1 20200928 (Red Hat 8.4.1-1) (GCC)
cmake version 3.22.1

I get the output shown in the following log after running "python -m pip install -e ." from the pennylane-lightning-gpu folder.

Any suggestions?

Log:

(qml_py38_test) python
Python 3.8.13 (default, Mar 28 2022, 11:00:56)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

import pennylane as qml
qml.about()
Name: PennyLane
Version: 0.26.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/PennyLaneAI/pennylane
Author:
Author-email:
License: Apache License 2.0
Location: /p/home/aplobo/miniconda3/envs/qml_py38_test/lib/python3.8/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, retworkx, scipy, semantic-version, toml
Required-by: PennyLane-Lightning

Platform info: Linux-4.18.0-305.65.1.el8_4.ppc64le-ppc64le-with-glibc2.17
Python version: 3.8.13
Numpy version: 1.22.3
Scipy version: 1.7.3
Installed devices:

  • lightning.gpu (PennyLane-Lightning-GPU-0.27.0.dev0)
  • default.gaussian (PennyLane-0.26.0)
  • default.mixed (PennyLane-0.26.0)
  • default.qubit (PennyLane-0.26.0)
  • default.qubit.autograd (PennyLane-0.26.0)
  • default.qubit.jax (PennyLane-0.26.0)
  • default.qubit.tf (PennyLane-0.26.0)
  • default.qubit.torch (PennyLane-0.26.0)
  • default.qutrit (PennyLane-0.26.0)
  • lightning.qubit (PennyLane-Lightning-0.26.0)

Hi @art, I see that you’re using the development version of lightning-gpu. Do you get the same error if you use PennyLane-Lightning-GPU-0.26.2? This is the stable version.
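For reference, the standalone GPU plugin is published on PyPI as pennylane-lightning-gpu, so pinning that release would look something like:

pip install pennylane-lightning-gpu==0.26.2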

@CatalinaAlbornoz 0.26.2 is not available:

pip install pennylane-lightning[gpu]==0.26.2
ERROR: Could not find a version that satisfies the requirement pennylane-lightning[gpu]==0.26.2 (from versions: 0.11.0, 0.12.0, 0.14.0, 0.15.0, 0.15.1, 0.17.0, 0.18.0, 0.19.0, 0.20.0, 0.20.1, 0.20.2, 0.21.0, 0.22.0, 0.22.1, 0.23.0, 0.24.0, 0.25.0, 0.25.1, 0.26.0, 0.26.1)
ERROR: No matching distribution found for pennylane-lightning[gpu]==0.26.2

0.26.1 gives similar errors.

However, qml.about() shows:

qml.about()
Name: PennyLane
Version: 0.26.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/PennyLaneAI/pennylane
Author:
Author-email:
License: Apache License 2.0
Location: /p/home/aplobo/miniconda3/envs/qml_py38_test/lib/python3.8/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, retworkx, scipy, semantic-version, toml
Required-by: PennyLane-Lightning

Platform info: Linux-4.18.0-305.65.1.el8_4.ppc64le-ppc64le-with-glibc2.17
Python version: 3.8.13
Numpy version: 1.22.3
Scipy version: 1.7.3
Installed devices:

  • lightning.qubit (PennyLane-Lightning-0.26.1)
  • default.gaussian (PennyLane-0.26.0)
  • default.mixed (PennyLane-0.26.0)
  • default.qubit (PennyLane-0.26.0)
  • default.qubit.autograd (PennyLane-0.26.0)
  • default.qubit.jax (PennyLane-0.26.0)
  • default.qubit.tf (PennyLane-0.26.0)
  • default.qubit.torch (PennyLane-0.26.0)
  • default.qutrit (PennyLane-0.26.0)

test code (test.py):
import pennylane as qml
dev = qml.device("lightning.gpu", wires=1)
@qml.qnode(dev)
def circuit():
    qml.PauliX(0)
    return qml.expval(qml.PauliZ(0))
circuit()

output:
python test.py
Traceback (most recent call last):
  File "test.py", line 2, in <module>
    dev = qml.device("lightning.gpu", wires=1)
  File "/p/home/aplobo/miniconda3/envs/qml_py38_test/lib/python3.8/site-packages/pennylane/__init__.py", line 332, in device
    raise DeviceError("Device does not exist. Make sure the required plugin is installed.")
pennylane._device.DeviceError: Device does not exist. Make sure the required plugin is installed.

Hi @art, this is very strange because our stable version is v0.26.2. Have you tried creating a clean environment and trying again?
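For example, a clean environment could be set up along these lines (the environment name here is just a placeholder):

conda create -n pennylane-clean python=3.8
conda activate pennylane-clean
python -m pip install pennylane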

@CatalinaAlbornoz 0.26.2 is not available. See my earlier post.

While doing the build using the following lines from the build instructions:

# To build a wheel from the package sources using the direct SDK path:

cmake -BBuild -DENABLE_CLANG_TIDY=on -DCUQUANTUM_SDK=<path to cuQuantum SDK>
cmake --build ./Build --verbose
python -m pip install wheel
python setup.py build_ext --cuquantum=<path to cuQuantum SDK>
python setup.py bdist_wheel

This line goes to 83% and spits out an error:
cmake --build ./Build --verbose

Here is the relevant output:

[ 83%] Building CXX object _deps/pennylane_lightning-build/pennylane_lightning/src/gates/CMakeFiles/lightning_gates_register_kernels_default.dir/RegisterKernels_Default.cpp.o
cd /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/pennylane_lightning-build/pennylane_lightning/src/gates && /p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/compilers/bin/nvc++ -DKOKKOS_DEPENDENCE -I/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/pennylane_lightning-src/pennylane_lightning/src/gates -I/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkos-build -I/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkos-build/core/src -I/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkos-src/core/src -I/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkos-build/containers/src -I/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkos-src/containers/src -I/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkos-build/algorithms/src -I/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkos-src/algorithms/src -I/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/pennylane_lightning-src/pennylane_lightning/src/util -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-build/src -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/impl -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-build/src/impl -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/impl/tpls -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/blas -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/blas/impl -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/sparse -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/sparse/impl -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/graph -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/graph/impl -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/batched -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/batched/dense -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/batched/dense/impl -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/batched/sparse -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/batched/sparse/impl -isystem /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/kokkoskernels-src/src/common -O2 -gopt -fPIC -mp -Wall -Wextra -Werror -D_ENABLE_KOKKOS=1 -std=gnu++20 -MD -MT _deps/pennylane_lightning-build/pennylane_lightning/src/gates/CMakeFiles/lightning_gates_register_kernels_default.dir/RegisterKernels_Default.cpp.o -MF CMakeFiles/lightning_gates_register_kernels_default.dir/RegisterKernels_Default.cpp.o.d -o CMakeFiles/lightning_gates_register_kernels_default.dir/RegisterKernels_Default.cpp.o -c /p/home/aplobo/pennylane-lightning-gpu/Build/_deps/pennylane_lightning-src/pennylane_lightning/src/gates/RegisterKernels_Default.cpp
"/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/pennylane_lightning-src/pennylane_lightning/src/util/ConstantUtil.hpp", line 25: catastrophic error: cannot open source file "compare"
  #include <compare>
           ^

1 catastrophic error detected in the compilation of "/p/home/aplobo/pennylane-lightning-gpu/Build/_deps/pennylane_lightning-src/pennylane_lightning/src/gates/RegisterKernels_Default.cpp".

gcc version used in the conda environment is 12.2.0.

Has lightning-gpu been built for a powerpc64le/RHEL system before? If so, could you share the versions of the build tools used, along with a list of build-from-source commands that specify those versions?

Hi @art

I have some comments and suggestions that may help here:

  • Just a quick comment: lightning.gpu is currently only released to PyPI for x86_64. For ARM/PPC64LE you need to build from scratch (as you are likely aware). We will change this in the future, but for now building from source is the preferred method of delivery on PowerPC, due to the lack of available hardware for validation.
  • To build lightning.gpu we also require lightning.qubit as a dependency, as they share some overlapping utilities. One requirement we have for building on such platforms is that warnings are enabled as errors, so that we can always catch issues early and often. It may be possible for the build to proceed by disabling this behaviour with the CMake arguments -DENABLE_CLANG_TIDY=off -DENABLE_WARNINGS=off (a combined invocation is sketched after this list). I’d suggest first trying this and letting us know.
  • Due to some problems we have with GCC 12 (mostly on x86_64, similar to https://github.com/pytorch/pytorch/issues/77939), we recommend only building with GCC 11.x currently. However, we are not sure how well the PowerPC target is supported in GCC 11.x for C++20 features, so your mileage here may vary.
  • It also looks like you are building using the nvc++ compiler, rather than GCC. We do not directly test nor explicitly support that compiler, as it tends to lag behind both GCC and Clang for C++17 and C++20 features. I’d recommend explicitly setting the compiler to GCC with CXX=g++-11 cmake -BBuild -DENABLE_CLANG_TIDY=on -DCUQUANTUM_SDK=... -DCMAKE_CXX_COMPILER=g++-11 and continuing the build process.
  • If you cannot switch to GCC directly and must use nvc++, I’d suggest upgrading the compiler version. It appears a required C++20 header (the #include <compare> in your log) cannot be found with the version you are using, and this header is needed to build lightning.qubit.
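Putting the CMake flags and the compiler override from the points above together, a configure-and-build invocation might look like the following sketch (the cuQuantum SDK path is a placeholder for your local install):

CXX=g++-11 cmake -BBuild -DENABLE_CLANG_TIDY=off -DENABLE_WARNINGS=off -DCMAKE_CXX_COMPILER=g++-11 -DCUQUANTUM_SDK=<path to cuQuantum SDK>
cmake --build ./Build --verbose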

Feel free to try the above and let us know how things go.


@mlxd gcc 11.3 gives argument (-mtune and -march) errors while running configure:

C=powerpc64le-conda-linux-gnu-gcc CXX=powerpc64le-conda-linux-gnu-g++ cmake -BBuild -DENABLE_CLANG_TIDY=off -DENABLE_WARNINGS=off -DCMAKE_CXX_COMPILER=powerpc64le-conda-linux-gnu-g++ -DCMAKE_C_COMPILER=powerpc64le-conda-linux-gnu-gcc -DCUQUANTUM_SDK=/p/home/aplobo/cuquantum-linux-ppc64le-22.07.1.14-archive
░█░░░▀█▀░█▀▀░█░█░▀█▀░█▀█░▀█▀░█▀█░█▀▀░░░░█▀▀░█▀█░█░█
░█░░░░█░░█░█░█▀█░░█░░█░█░░█░░█░█░█░█░░░░█░█░█▀▀░█░█
░▀▀▀░▀▀▀░▀▀▀░▀░▀░░▀░░▀░▀░▀▀▀░▀░▀░▀▀▀░▀░░▀▀▀░▀░░░▀▀▀

– The CXX compiler identification is GNU 11.3.0
– The C compiler identification is GNU 11.3.0
– The CUDA compiler identification is NVIDIA 11.2.152
– Detecting CXX compiler ABI info
– Detecting CXX compiler ABI info - done
– Check for working CXX compiler: /p/home/aplobo/miniconda3/envs/py38/bin/powerpc64le-conda-linux-gnu-g++ - skipped
– Detecting CXX compile features
– Detecting CXX compile features - done
– Detecting C compiler ABI info
– Detecting C compiler ABI info - failed
– Check for working C compiler: /p/home/aplobo/miniconda3/envs/py38/bin/powerpc64le-conda-linux-gnu-gcc
– Check for working C compiler: /p/home/aplobo/miniconda3/envs/py38/bin/powerpc64le-conda-linux-gnu-gcc - broken
CMake Error at /p/home/aplobo/.local/lib/python3.6/site-packages/cmake/data/share/cmake-3.24/Modules/CMakeTestCCompiler.cmake:69 (message):
The C compiler

"/p/home/aplobo/miniconda3/envs/py38/bin/powerpc64le-conda-linux-gnu-gcc"

is not able to compile a simple test program.

It fails with the following output:

Change Dir: /p/home/aplobo/pennylane-lightning-gpu/Build/CMakeFiles/CMakeTmp

Run Build Command(s):/usr/bin/gmake -f Makefile cmTC_92e58/fast && /usr/bin/gmake  -f CMakeFiles/cmTC_92e58.dir/build.make CMakeFiles/cmTC_92e58.dir/build
gmake[1]: Entering directory '/p/home/aplobo/pennylane-lightning-gpu/Build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_92e58.dir/testCCompiler.c.o
/p/home/aplobo/miniconda3/envs/py38/bin/powerpc64le-conda-linux-gnu-gcc   -mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -pipe -isystem /p/home/aplobo/miniconda3/envs/py38/include -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /p/home/aplobo/miniconda3/envs/py38/include -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /p/home/aplobo/miniconda3/envs/py38/include  -o CMakeFiles/cmTC_92e58.dir/testCCompiler.c.o -c /p/home/aplobo/pennylane-lightning-gpu/Build/CMakeFiles/CMakeTmp/testCCompiler.c
powerpc64le-conda-linux-gnu-gcc: error: unrecognized argument in option '-mtune=haswell'
powerpc64le-conda-linux-gnu-gcc: note: valid arguments to '-mtune=' are: 401 403 405 405fp 440 440fp 464 464fp 476 476fp 505 601 602 603 603e 604 604e 620 630 740 7400 7450 750 801 821 823 8540 8548 860 970 G3 G4 G5 a2 cell e300c2 e300c3 e500mc e500mc64 e5500 e6500 ec603e native power10 power3 power4 power5 power5+ power6 power6x power7 power8 power9 powerpc powerpc64 powerpc64le rs64 titan
powerpc64le-conda-linux-gnu-gcc: error: unrecognized argument in option '-mtune=haswell'
powerpc64le-conda-linux-gnu-gcc: note: valid arguments to '-mtune=' are: 401 403 405 405fp 440 440fp 464 464fp 476 476fp 505 601 602 603 603e 604 604e 620 630 740 7400 7450 750 801 821 823 8540 8548 860 970 G3 G4 G5 a2 cell e300c2 e300c3 e500mc e500mc64 e5500 e6500 ec603e native power10 power3 power4 power5 power5+ power6 power6x power7 power8 power9 powerpc powerpc64 powerpc64le rs64 titan
powerpc64le-conda-linux-gnu-gcc: error: unrecognized command-line option '-march=nocona'
powerpc64le-conda-linux-gnu-gcc: error: unrecognized command-line option '-march=nocona'
gmake[1]: *** [CMakeFiles/cmTC_92e58.dir/build.make:78: CMakeFiles/cmTC_92e58.dir/testCCompiler.c.o] Error 1
gmake[1]: Leaving directory '/p/home/aplobo/pennylane-lightning-gpu/Build/CMakeFiles/CMakeTmp'
gmake: *** [Makefile:127: cmTC_92e58/fast] Error 2

CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
CMakeLists.txt:12 (project)

– Configuring incomplete, errors occurred!
See also “/p/home/aplobo/pennylane-lightning-gpu/Build/CMakeFiles/CMakeOutput.log”.
See also “/p/home/aplobo/pennylane-lightning-gpu/Build/CMakeFiles/CMakeError.log”.


powerpc64le-conda-linux-gnu-g++ --version
powerpc64le-conda-linux-gnu-g++ (conda-forge gcc 11.3.0-19) 11.3.0
Copyright © 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

(py38) powerpc64le-conda-linux-gnu-gcc --version
powerpc64le-conda-linux-gnu-gcc (conda-forge gcc 11.3.0-19) 11.3.0
Copyright © 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Hi @art

I think there may be a few things going on here:

  • It looks as though somewhere in your command there are targets being set for x86_64 CPUs, which clearly aren’t supported on PowerPC. Was there a configuration script run on an x86_64 processor prior to aiming to compile for PowerPC? -mtune=haswell will attempt to optimize the build for older Intel architecture processors. We do not set any such strict requirements in our build files, so it may be worth deleting all CMake caches and trying again (see the cleanup sketch after this list). To avoid populating the source directory directly with CMake-generated files, I suggest using the -BBuild argument to CMake, which will create a Build directory and place everything in there. The compilation can then happen with cmake --build ./Build.

  • In my experience, conda-supplied C and C++ compilers can be notorious for interfering with system libraries (supplying their own libc, for example) and options (potentially providing additional flags through their compiler wrappers). I suspect the compilers you have specified may be wrappers around conda-supplied gcc/g++, and may be causing problems. Is it possible to use a compiler provided by the system instead? Since you are on a Red Hat system, it should be possible to run the following:

yum install centos-release-scl-rh -y
yum install devtoolset-11-gcc-c++ -y
source /opt/rh/devtoolset-11/enable

The first 2 lines require sudo privileges, so if you are not able to run them, I suggest asking the sysadmins to enable support for the above toolchains. The last line will enable GCC 11 on your terminal, and allow you to build natively with the system-provided compiler.
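For the cache cleanup mentioned in the first bullet, something along these lines should give a fresh configure (a sketch only; run it from the pennylane-lightning-gpu checkout, with the cuQuantum path as a placeholder for your local install):

rm -rf ./Build CMakeCache.txt CMakeFiles/
CC=gcc CXX=g++ cmake -BBuild -DENABLE_CLANG_TIDY=off -DENABLE_WARNINGS=off -DCUQUANTUM_SDK=<path to cuQuantum SDK>
cmake --build ./Build --verbose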

Feel free to let us know your progress on this.

@mlxd the sysadmin provided access to the system gcc/g++ 11.2 compilers.
Now there is an error:
"unsupported GNU version! gcc versions later than 10 are not supported!"

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/rh/gcc-toolset-11/root/usr/libexec/gcc/ppc64le-redhat-linux/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: ppc64le-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/gcc-toolset-11/root/usr --mandir=/opt/rh/gcc-toolset-11/root/usr/share/man --infodir=/opt/rh/gcc-toolset-11/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-targets=powerpcle-linux --disable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-11.2.1-20210728/obj-ppc64le-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-secureplt --with-long-double-128 --with-cpu-32=power8 --with-tune-32=power8 --with-cpu-64=power8 --with-tune-64=power8 --build=ppc64le-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.2.1 20210728 (Red Hat 11.2.1-1) (GCC)

g++ --version
g++ (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
Copyright © 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

nvcc --version
nvcc: NVIDIA ® Cuda compiler driver
Copyright © 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:13:23_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

C=gcc CXX=g++ cmake -BBuild -DENABLE_CLANG_TIDY=off -DENABLE_WARNINGS=off -DCMAKE_CXX_COMPILER=g++ -DCMAKE_C_COMPILER=gcc -DCUQUANTUM_SDK=/p/home/aplobo/cuquantum-linux-ppc64le-22.07.1.14-archive
░█░░░▀█▀░█▀▀░█░█░▀█▀░█▀█░▀█▀░█▀█░█▀▀░░░░█▀▀░█▀█░█░█
░█░░░░█░░█░█░█▀█░░█░░█░█░░█░░█░█░█░█░░░░█░█░█▀▀░█░█
░▀▀▀░▀▀▀░▀▀▀░▀░▀░░▀░░▀░▀░▀▀▀░▀░▀░▀▀▀░▀░░▀▀▀░▀░░░▀▀▀

– The CXX compiler identification is GNU 11.2.1
– The C compiler identification is GNU 11.2.1
CMake Error at /p/home/aplobo/.local/lib/python3.6/site-packages/cmake/data/share/cmake-3.24/Modules/CMakeDetermineCompilerId.cmake:739 (message):
Compiling the CUDA compiler identification source file
“CMakeCUDACompilerId.cu” failed.

Compiler: /usr/local/cuda/bin/nvcc

Build flags:

Id flags: --keep;--keep-dir;tmp -v

The output was:

1

#$ NVVM_BRANCH=nvvm

#$ SPACE=

#$ CUDART=cudart

#$ HERE=/usr/local/cuda/bin

#$ THERE=/usr/local/cuda/bin

#$ TARGET_SIZE=

#$ TARGET_DIR=

#$ TARGET_DIR=targets/ppc64le-linux

#$ TOP=/usr/local/cuda/bin/..

#$ NVVMIR_LIBRARY_DIR=/usr/local/cuda/bin/../nvvm/libdevice

#$ LD_LIBRARY_PATH=/usr/local/cuda/bin/../lib:/opt/rh/gcc-toolset-11/root/usr/lib64:/opt/rh/gcc-toolset-11/root/usr/lib/:/opt/rh/gcc-toolset-11/root/usr/mkl/lib/:/usr/local/cuda/lib64:/p/app/openmpi/ppc64le/el8/nvhpc/22.2/openmpi-4.1.2/lib:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/comm_libs/nvshmem/lib:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/comm_libs/nccl/lib:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/comm_libs/openmpi4/openmpi-4.0.5/lib:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/math_libs/lib64:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/compilers/lib:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/cuda/lib64:/opt/ibm/lsf/10.1/linux3.10-glibc2.17-ppc64le/lib

#$ PATH=/usr/local/cuda/bin/../nvvm/bin:/usr/local/cuda/bin:/opt/rh/gcc-toolset-11/root/usr/bin/:/p/home/aplobo/miniconda3/envs/py38/bin:/p/home/aplobo/miniconda3/condabin:/p/home/aplobo/.local/bin:/p/home/aplobo/bin:/opt/ibm/lsf/10.1/linux3.10-glibc2.17-ppc64le/etc:/opt/ibm/lsf/10.1/linux3.10-glibc2.17-ppc64le/bin:/usr/local/cuda/bin:/p/app/openmpi/ppc64le/el8/nvhpc/22.2/openmpi-4.1.2/bin:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/comm_libs/nvshmem/bin:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/comm_libs/nccl/bin:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/comm_libs/openmpi4/openmpi-4.0.5/bin:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/profilers/bin:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/compilers/bin:/p/app/compiler/ppc64le/el8/nvidia/hpc_sdk/Linux_ppc64le/22.2/cuda/bin:/usr/cta/unsupported/BC:/usr/share/Modules/bin:/usr/brl/bin:/usr/krb5/bin:/usr/brl/bin:/usr/sbin:/sbin:/usr/bin:/bin:/usr/local/bin:/usr/local/sbin

#$ INCLUDES="-I/usr/local/cuda/bin/../targets/ppc64le-linux/include"

#$ LIBRARIES= "-L/usr/local/cuda/bin/../targets/ppc64le-linux/lib/stubs" "-L/usr/local/cuda/bin/../targets/ppc64le-linux/lib"

#$ CUDAFE_FLAGS=

#$ PTXAS_FLAGS=

#$ rm tmp/a_dlink.reg.c

#$ gcc -D__CUDA_ARCH__=520 -E -x c++ -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__ "-I/usr/local/cuda/bin/../targets/ppc64le-linux/include" -D__CUDACC_VER_MAJOR__=11 -D__CUDACC_VER_MINOR__=2 -D__CUDACC_VER_BUILD__=152 -D__CUDA_API_VER_MAJOR__=11 -D__CUDA_API_VER_MINOR__=2 -include "cuda_runtime.h" "CMakeCUDACompilerId.cu" -o "tmp/CMakeCUDACompilerId.cpp1.ii"

In file included from /usr/local/cuda/bin/../targets/ppc64le-linux/include/cuda_runtime.h:83,
                 from <command-line>:
/usr/local/cuda/bin/../targets/ppc64le-linux/include/crt/host_config.h:139:2: error: #error -- unsupported GNU version! gcc versions later than 10 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.

  139 | #error -- unsupported GNU version! gcc versions later than 10 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
    |  ^~~~~

--error 0x1 --

Call Stack (most recent call first):
/p/home/aplobo/.local/lib/python3.6/site-packages/cmake/data/share/cmake-3.24/Modules/CMakeDetermineCompilerId.cmake: 6 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
/p/home/aplobo/.local/lib/python3.6/site-packages/cmake/data/share/cmake-3.24/Modules/CMakeDetermineCompilerId.cmake: 48 (__determine_compiler_id_test)
/p/home/aplobo/.local/lib/python3.6/site-packages/cmake/data/share/cmake-3.24/Modules/CMakeDetermineCUDACompiler.cmake:307 (CMAKE_DETERMINE_COMPILER_ID)
CMakeLists.txt:12 (project)

– Configuring incomplete, errors occurred!
See also “/p/home/aplobo/pennylane-lightning-gpu/Build/CMakeFiles/CMakeOutput.log”.
See also “/p/home/aplobo/pennylane-lightning-gpu/Build/CMakeFiles/CMakeError.log”.

Hi @art

PowerPC systems can always be somewhat of a challenge. It looks like the installed version of the CUDA runtime (and potentially the driver) may be too old to support this work. Your installed CUDA 11.2 does not appear to support newer versions of GCC as the host compiler.

In this case, there are two options:

  • See if you can request a lower version of GCC to be installed, potentially with yum install devtoolset-10-gcc-c++ -y && source /opt/rh/devtoolset-10/enable
  • OR, the preferred method: update the CUDA installation to a more recent version, as supported by the cuQuantum docs. We perform most R&D with CUDA 11.7, and can guarantee it works well with newer versions of GCC. If a system-wide install isn’t allowed/supported, it should be possible to do a local installation of a more recent CUDA runtime and build against that, though I’d recommend asking a sysadmin to aid with this in a controlled manner if possible. (A quick driver check is sketched after this list.)
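As a quick sanity check, the "CUDA Version" field in the header printed by nvidia-smi reports the newest CUDA runtime the installed driver supports, which is worth confirming before requesting an upgrade:

nvidia-smi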

Let me know if you require further help with this.

I am using the lightning.qubit device since lightning.gpu was not compiling.
When will you test/finalize the PPC install of lightning.gpu? I noticed the star next to PPC on slide 10 (Large scale Hybrid Quantum Workflows with PennyLane - YouTube). For hybrid quantum-classical DNN training (clayer, clayer, qlayer, clayer) on multiple (6x, 12x) V100 GPUs using the PyTorch Distributed Data Parallel library, with 4 qubits for data and 4 ancilla qubits for measurement, what is the speed improvement of lightning.gpu vs lightning.qubit? The qlayer is 10 layers deep (repeated): H gates, RY gates for data encoding, parametric RY and RZ gates, CNOT gates for entangling, and Pauli-Z measurements.

Hi @art, I’m sorry we haven’t responded in so long! Lee or I will be back soon with an answer.

Hi @art
It will not be possible to give an estimate of the improvement for this workload with lightning.gpu vs lightning.qubit. Right now we can confirm that lightning.gpu builds and runs on Power machines, but this must be done manually due to the general unavailability of these machines in CI/CD infrastructure. Building lightning.qubit for Power requires running an emulated PPC64LE environment, which is unfortunately too slow to allow us to build lightning.gpu that way right now. We are trying to identify a way to better support this, but currently the manual build process is the best way to get up and running.

Taking a guess, however: you shouldn’t see much benefit in using lightning.gpu below 20 qubits. This is largely due to the overheads in setting up the GPU devices on the CUDA side, which is something we cannot really mitigate. For single-digit qubit counts, lightning.qubit should be faster, due to the lower overall overhead needed to get it running.

We have a guide for the SOSCIP system Mist (V100 + Power9) that should work here also, assuming the CUDA version and compilers are sufficiently modern on your system:

  1. Load the required modules for running on the cluster:
module load anaconda3
  2. Create a local conda environment, and ensure you install a modern Python version. Next, load that environment:
conda create -p conda_pl python=3.10
conda activate ./conda_pl/
  3. Install numpy and scipy built for Power systems through conda; this is required because https://pypi.org/project/scipy/ does not release wheels for the Power platform:
conda install numpy scipy
  4. Install PennyLane and PennyLane Lightning as normal:
python -m pip install pennylane

The next steps may require some adjustment, depending on your system compilers and the CUDA version available. Note that it is necessary to use at least GCC 10, and preferably CUDA 11.5 or above.

  5. Ensure the CUDA and GCC 10 toolkits are loaded:
module load cuda/11.6.2 gcc/10.3.0
  6. We require the NVIDIA cuQuantum library to support the build, which is installed via:
conda install -c conda-forge cuquantum
  7. Next, clone the pennylane-lightning-gpu repository and check out the latest release version, which will be built explicitly with the given CUDA and GCC versions:
git clone https://github.com/PennyLaneAI/pennylane-lightning-gpu
cd pennylane-lightning-gpu
git checkout v0.27.0
  8. Next, we need to set environment variables to expose the cuQuantum libraries, and build the package:
export CUQUANTUM_SDK=<path to your conda env>/conda_pl/
python -m pip install -e ./
  9. The above should ideally pull down all required packages and compile lightning.gpu. Once completed, verify no errors are generated by running the following on a login node (or compute node if you are working in your project directory):
python -c "import pennylane as qml; qml.device(\"lightning.qubit\", wires=1); qml.device(\"lightning.gpu\", wires=1)"

Feel free to let me know if you have further questions. Also, can you provide us with the errors you observed when compiling? We may be able to offer some suggestions.

@mlxd the highest CUDA version available is 11.2, and gcc 11.2.1 is available.
The sysadmins have not provided a timeframe for upgrades. Is any tweak to the PennyLane Lightning-GPU build procedure possible, preferably with a Python 3.8 conda environment? I have Distributed Data Parallel code (for accelerating the classical side of the hybrid networks) running with the lightning.qubit device, and another with strawberryfields.fock (for qumode circuits), in the above environment with few wires (<10), but I am planning for >20.
Sysadmins have not provided a timeframe for upgrades. Any tweak to the pennylane lightning-gpu build procedure possible preferably with python 3.8 conda environment? I have Distributed Data Parallel code (for accelerating the classical side of the hybrid networks) running with lightning.qubit and another one with strawberryfields.fock (for qumode circuits) devices in the above environment with few wires (<10) but am planning for >20.