Default.qubit.tf

Hi All,

I am trying to run the first block of the example code on the default_qubit_tf doc page. If I turn off CUDA with
os.environ["CUDA_VISIBLE_DEVICES"] = "-1", the code runs fine.
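For readers following along, here is a minimal sketch of that CPU-only workaround. The helper name hide_gpus is made up for illustration; the only real piece is the CUDA_VISIBLE_DEVICES environment variable, which has to be set before TensorFlow is imported for the setting to take effect.

```python
import os


def hide_gpus():
    """Hide all CUDA devices from libraries that are imported afterwards."""
    # Hypothetical helper name. The variable name uses underscores,
    # and "-1" matches no device index, so no GPU is visible.
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


hide_gpus()

# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
# print(tf.config.list_physical_devices("GPU"))
```

If the variable is set after TensorFlow has already initialized the GPU, it will most likely have no effect, so the call belongs at the very top of the script.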

However, if I don’t turn off CUDA, I run into problems. I looked through the documentation and the forum, but it is not clear to me whether default.qubit.tf can be used with a GPU. Does anyone know?

Hi @sss441803! GPU support is something we are striving to improve, so any feedback with regards to GPU use is greatly appreciated :slightly_smiling_face:

"However, if I don’t turn off CUDA, I run into problems"

Could you post the code you are executing, as well as the full traceback you are receiving?

The code is copied from here:

import pennylane as qml
import tensorflow as tf
 
dev = qml.device("default.qubit.tf", wires=1)
     
@qml.qnode(dev, interface="tf", diff_method="backprop")
def circuit(x):
    qml.RX(x[1], wires=0)
    qml.Rot(x[0], x[1], x[2], wires=0)
    return qml.expval(qml.PauliZ(0))
     
weights = tf.Variable([0.2, 0.5, 0.1])
with tf.GradientTape() as tape:
    res = circuit(weights)
     
print(tape.gradient(res, weights))

I have tensorflow==2.2.0, PennyLane==0.16.0, cudatoolkit 11.3.1 and NVIDIA GeForce 930M. With CPU only, it works perfectly fine. With the GPU enabled, I get the following output and error:

(hybrid) D:\Cloud\pennylane>C:/Users/sss44/anaconda3/envs/hybrid/python.exe d:/Cloud/pennylane/qlstm/test.py
2021-08-05 07:55:02.069300: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-08-05 07:55:08.795924: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2021-08-05 07:55:09.313342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce 930M computeCapability: 5.0
coreClock: 0.941GHz coreCount: 3 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 13.41GiB/s
2021-08-05 07:55:09.341061: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-08-05 07:55:09.409922: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-08-05 07:55:09.470460: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2021-08-05 07:55:09.494312: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2021-08-05 07:55:09.545484: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2021-08-05 07:55:09.583136: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2021-08-05 07:55:09.692277: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-08-05 07:55:09.864209: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-08-05 07:55:09.881092: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-08-05 07:55:09.940469: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x228aa4d7aa0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-08-05 07:55:09.954351: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-08-05 07:55:09.971339: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce 930M computeCapability: 5.0
coreClock: 0.941GHz coreCount: 3 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 13.41GiB/s
2021-08-05 07:55:09.994472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-08-05 07:55:10.006000: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-08-05 07:55:10.016734: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2021-08-05 07:55:10.026879: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2021-08-05 07:55:10.040650: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2021-08-05 07:55:10.054665: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2021-08-05 07:55:10.067391: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-08-05 07:55:10.084270: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-08-05 07:55:19.152138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-05 07:55:19.163469: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0
2021-08-05 07:55:19.169765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2021-08-05 07:55:19.184526: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1374 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce 930M, pci bus id: 0000:01:00.0, compute capability: 5.0)
2021-08-05 07:55:19.230140: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x228c61ad3d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-08-05 07:55:19.244410: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce 930M, Compute Capability 5.0
2021-08-05 07:55:20.731979: E tensorflow/stream_executor/cuda/cuda_driver.cc:939] could not synchronize on CUDA context: CUDA_ERROR_ILLEGAL_INSTRUCTION: an illegal instruction was encountered :: 0x00007FFAAB908805   tensorflow::CurrentStackTrace
0x00007FFAAB639E3E      tensorflow::ConfigProto::HasBitSetters::graph_options
0x00007FFAAB64056E      stream_executor::StreamExecutor::EnablePeerAccessTo
0x00007FFA9842B8C8      tensorflow::StepStats::internal_default_instance
0x00007FFA9843C9F4      google::protobuf::RepeatedPtrField<tensorflow::InterconnectLink>::Add
0x00007FFA981770A2      std::vector<tensorflow::DtypeAndPartialTensorShape,std::allocator<tensorflow::DtypeAndPartialTensorShape> >::operator=
0x00007FFA981670FF      absl::inlined_vector_internal::DefaultValueAdapter<std::allocator<tensorflow::TensorValue> >::ConstructNext
0x00007FFA981892EC      tensorflow::EagerExecutor::~EagerExecutor
0x00007FFA981665D5      absl::inlined_vector_internal::DefaultValueAdapter<std::allocator<tensorflow::TensorValue> >::ConstructNext
0x00007FFA98161097      absl::inlined_vector_internal::DefaultValueAdapter<std::allocator<tensorflow::TensorValue> >::ConstructNext
0x00007FFA98160628      absl::inlined_vector_internal::DefaultValueAdapter<std::allocator<tensorflow::TensorValue> >::ConstructNext
0x00007FFA98167CE2      absl::inlined_vector_internal::DefaultValueAdapter<std::allocator<tensorflow::TensorValue> >::ConstructNext
0x00007FFA9816409E      absl::inlined_vector_internal::DefaultValueAdapter<std::allocator<tensorflow::TensorValue> >::ConstructNext
0x00007FFA98161639      absl::inlined_vector_internal::DefaultValueAdapter<std::allocator<tensorflow::TensorValue> >::ConstructNext
0x00007FFA92F50C58      tensorflow::gtl::FlatMap<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::shared_ptr<tensorflow::FunctionLibraryDefinition::FunctionDefAndOpRegistration>,tensorflow::hash<std::basic_string<char,std::char_traits<
0x00007FFA92F60F2A      TFE_Execute
0x00007FFA92CBE929      TFE_Py_FastPathExecute_C
0x00007FFB263B546F      (unknown)
0x00007FFB2639A356      (unknown)
0x00007FFB263C6246      (unknown)
0x00007FFB682B6FE0      PyMethodDef_RawFastCallKeywords
0x00007FFB682B5FA6      PyObject_MakeTpCall
0x00007FFB683935B8      PyEval_GetFuncDesc
0x00007FFB68390131      PyEval_EvalFrameDefault
0x00007FFB683921D9      PyEval_EvalCodeWithName
0x00007FFB682B676F      PyFunction_Vectorcall
0x00007FFB68393593      PyEval_GetFuncDesc
0x00007FFB68390131      PyEval_EvalFrameDefault
0x00007FFB682B6315      PyObject_Call
0x00007FFB682B6698      PyFunction_Vectorcall
0x00007FFB682B60EE      PyVectorcall_Call
0x00007FFB68393851      PyEval_GetFuncDesc
0x00007FFB68390271      PyEval_EvalFrameDefault
0x00007FFB682B6315      PyObject_Call
0x00007FFB682B6698      PyFunction_Vectorcall
0x00007FFB68393593      PyEval_GetFuncDesc
0x00007FFB6839014A      PyEval_EvalFrameDefault
0x00007FFB682B6315      PyObject_Call
0x00007FFB682B6698      PyFunction_Vectorcall
0x00007FFB68393593      PyEval_GetFuncDesc
0x00007FFB6839014A      PyEval_EvalFrameDefault
0x00007FFB683921D9      PyEval_EvalCodeWithName
0x00007FFB682B676F      PyFunction_Vectorcall
0x00007FFB682B9129      PyCell_Set
0x00007FFB682B93EA      PyMethod_Self
0x00007FFB682B60EE      PyVectorcall_Call
0x00007FFB68393851      PyEval_GetFuncDesc
0x00007FFB68390271      PyEval_EvalFrameDefault
0x00007FFB683921D9      PyEval_EvalCodeWithName
0x00007FFB682B676F      PyFunction_Vectorcall
0x00007FFB68393593      PyEval_GetFuncDesc
0x00007FFB6839014A      PyEval_EvalFrameDefault
0x00007FFB683921D9      PyEval_EvalCodeWithName
0x00007FFB682B676F      PyFunction_Vectorcall
0x00007FFB682B92E9      PyMethod_Self
0x00007FFB68393593      PyEval_GetFuncDesc
0x00007FFB68390171      PyEval_EvalFrameDefault
0x00007FFB683921D9      PyEval_EvalCodeWithName
0x00007FFB682B676F      PyFunction_Vectorcall
0x00007FFB682B92E9      PyMethod_Self
0x00007FFB68393593      PyEval_GetFuncDesc
0x00007FFB68390171      PyEval_EvalFrameDefault
0x00007FFB683921D9      PyEval_EvalCodeWithName
0x00007FFB682B676F      PyFunction_Vectorcall

Traceback (most recent call last):
  File "d:/Cloud/pennylane/qlstm/test.py", line 34, in <module>
    res = circuit(weights)
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\pennylane\qnode.py", line 598, in __call__
    res = self.qtape.execute(device=self.device)
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\pennylane\tape\tape.py", line 1323, in execute
    return self._execute(params, device=device)
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\pennylane\tape\tape.py", line 1354, in execute_device
    res = device.execute(self)
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\pennylane\_qubit_device.py", line 192, in execute
    self.apply(circuit.operations, rotations=circuit.diagonalizing_gates, **kwargs)
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\pennylane\devices\default_qubit.py", line 200, in apply
    self._state = self._apply_operation(self._state, operation)
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\pennylane\devices\default_qubit.py", line 225, in _apply_operation
    matrix = self._get_unitary_matrix(operation)
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\pennylane\devices\default_qubit_tf.py", line 230, in _get_unitary_matrix
    mat = self.parametric_ops[op_name](*unitary.parameters)
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\pennylane\devices\tf_ops.py", line 78, in RX
    return tf.cos(theta / 2) * I + 1j * tf.sin(-theta / 2) * X
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 8890, in sin
    _ops.raise_from_not_ok_status(e, name)
  File "C:\Users\sss44\anaconda3\envs\hybrid\lib\site-packages\tensorflow\python\framework\ops.py", line 6653, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:GPU:0 to /job:localhost/replica:0/task:0/device:CPU:0 in order to run Sin: GPU sync failed [Op:Sin]

Hi @sss441803,

Are you able to successfully run other tensorflow code on your GPU independently of PennyLane with this setup? I’ve tested the code snippet locally and it successfully runs, so I think that it may be an issue with configuration. (For reference I am in an environment using CUDA 11.2, cuDNN 8.2.1, tensorflow 2.5, python 3.9, all on a GeForce GTX 1060 with NVIDIA driver 460.91.03 on Linux).
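One way to run such an independent check is sketched below. It evaluates tf.sin on the GPU without involving PennyLane, since Sin is the op named in the error message above. The function name gpu_sin_check and the input value are illustrative assumptions, and the function returns None instead of failing when TensorFlow or a GPU is not available.

```python
import math


def gpu_sin_check(value=0.25):
    """Run tf.sin on the first GPU; return the result, or None if it cannot be tried."""
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment.")
        return None

    if not tf.config.list_physical_devices("GPU"):
        print("TensorFlow does not see a GPU.")
        return None

    # Pin the op to the first GPU so a broken CUDA setup fails here,
    # independently of any PennyLane code.
    with tf.device("/GPU:0"):
        result = tf.sin(tf.constant(value))
    return float(result.numpy())


result = gpu_sin_check()
if result is not None:
    print("GPU result:", result, "reference:", math.sin(0.25))
```

If this small check already raises an error like the one in the traceback, the problem is more likely in the TensorFlow and CUDA installation than in PennyLane.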

My first thought is that there might be a mismatch between the versions of tensorflow and CUDA that you have installed. Tensorflow 2.2 seems to require CUDA 10.1, while versions 2.4 and newer use CUDA 11.x. That said, looking at your error messages, it looks like the libraries for CUDA 10.1 are what actually get loaded (do you have multiple versions installed?).
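To see which CUDA libraries actually get loaded, the startup log can be scanned for the "Successfully opened dynamic library" lines. The sketch below does this for three lines copied from the log above. The helper names are made up, and the mapping from the file name cudart64_101.dll to version 10.1 relies on an assumed Windows naming convention (last digit is the minor version), which may not hold for every release.

```python
import re

# Three lines copied from the log posted earlier in this thread.
LOG = """\
2021-08-05 07:55:02.069300: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-08-05 07:55:09.409922: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-08-05 07:55:09.692277: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
"""


def loaded_libraries(log_text):
    """Return the library file names TensorFlow reports opening, without duplicates."""
    names = re.findall(r"Successfully opened dynamic library (\S+)", log_text)
    return sorted(set(names))


def cudart_version(log_text):
    """Guess the CUDA runtime version from a cudart64_<digits>.dll file name."""
    match = re.search(r"cudart64_(\d+)\.dll", log_text)
    if match is None:
        return None
    digits = match.group(1)
    # Assumed convention: the last digit is the minor version ("101" -> "10.1").
    return f"{digits[:-1]}.{digits[-1]}"


print(loaded_libraries(LOG))
print(cudart_version(LOG))
```

For the log in this thread the guess is 10.1, which differs from the cudatoolkit 11.3.1 mentioned in the post, so more than one CUDA version appears to be present on the machine.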

Could you please try upgrading your tensorflow to v2.4 or newer and see if that fixes the issue?

EDIT: After doing so, please post the output of running just import tensorflow as tf.

Hi @glassnotes,

I am able to successfully run other TensorFlow code on the GPU, yet PennyLane still fails. I am glad that some configurations work (like yours). That is enough for me to justify moving my work from my laptop to a workstation at my workplace, and to tell my boss that it is possible to run everything on the GPU. Thank you very much for your help!

No problem @sss441803 ! As Josh mentioned, we’re definitely looking to improve our GPU support, so please feel free to reach out if you have questions or let us know if you run into any issues along the way :slight_smile: