Keras hybrid model: lightning.gpu with adjoint differentiation

Hi everyone,

I have a question regarding lightning.gpu when building hybrid models.
I’m using Keras to build a Sequential model with a quantum layer, e.g. something like:

qlayer = qml.qnnKerasLayer(circuit, weight_shapes, output_dim=16)

clayer = tf.keras.Conv2D(10, 3, strides=2, padding='valid', activation='relu')
flatten = tf.keras.Flatten()
dense = tf.keras.layers.Dense(81)
reshape = tf.keras.layers.Reshape((9,9,1))
out = tf.keras.layers.Conv2D(1, 2, strides=1, padding='same', activation='sigmoid')

model = tf.keras.models.Sequential([clayer, flatten, qlayer, dense, reshape, out])

model.compile(opt, loss="bce)
model.fit(x_train, x_train, epochs=5)

Here, circuit is some quantum circuit.
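For concreteness, the device and QNode are set up roughly like this (a sketch; the real circuit is longer):

import pennylane as qml

N_QUBITS = 16  # matches output_dim above

dev = qml.device("lightning.gpu", wires=N_QUBITS)

# "adjoint" here is the diff_method that triggers the error below;
# swapping in "parameter-shift" works fine on the same device
@qml.qnode(dev, diff_method="adjoint")
def circuit(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(N_QUBITS))
    qml.BasicEntanglerLayers(weights, wires=range(N_QUBITS))
    return [qml.expval(qml.PauliZ(i)) for i in range(N_QUBITS)]

weight_shapes = {"weights": (3, N_QUBITS)}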
When I use adjoint differentiation for the circuit with lightning.qubit everything works fine.
When I use parameter-shift with lightning.gpu it works as well.
But when I try to use adjoint differentiation with lightning.gpu I get the following error:

Error Trace

PLException Traceback (most recent call last)
/tmp/ipykernel_176419/2056065315.py in <module>
1 es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=2, min_delta=0.0001)
----> 2 fitting = model.fit(x_train_small, x_train_small, epochs=20, batch_size=50, steps_per_epoch=50, validation_data=(x_test_small, x_test_small))

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/qnn/keras.py in call(self, inputs)
300 reconstructor = []
301 for x in tf.unstack(inputs):
---> 302 reconstructor.append(self.call(x))
303 return tf.stack(reconstructor)
304

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/qnn/keras.py in call(self, inputs)
303 return tf.stack(reconstructor)
304
---> 305 return self._evaluate_qnode(inputs)
306
307 def _evaluate_qnode(self, x):

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/qnn/keras.py in _evaluate_qnode(self, x)
318 **{k: 1.0 * w for k, w in self.qnode_weights.items()},
319 }
---> 320 return self.qnode(**kwargs)
321
322 def compute_output_shape(self, input_shape):

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/qnode.py in __call__(self, *args, **kwargs)
665 gradient_kwargs=self.gradient_kwargs,
666 override_shots=override_shots,
---> 667 **self.execute_kwargs,
668 )
669

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/interfaces/execution.py in execute(tapes, device, gradient_fn, interface, mode, gradient_kwargs, cache, cachesize, max_diff, override_shots, expand_fn, max_expansion, device_batch_transform)
442
443 res = _execute(
---> 444 tapes, device, execute_fn, gradient_fn, gradient_kwargs, _n=1, max_diff=max_diff, mode=_mode
445 )
446

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/interfaces/tensorflow.py in execute(tapes, device, execute_fn, gradient_fn, gradient_kwargs, _n, max_diff, mode)
87 with qml.tape.Unwrap(*tapes):
88 # Forward pass: execute the tapes
---> 89 res, jacs = execute_fn(tapes, **gradient_kwargs)
90
91 for i, tape in enumerate(tapes):

~/miniconda3/envs/tfqf/lib/python3.7/contextlib.py in inner(*args, **kwds)
72 def inner(*args, **kwds):
73 with self._recreate_cm():
---> 74 return func(*args, **kwds)
75 return inner
76

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/_device.py in execute_and_gradients(self, circuits, method, **kwargs)
552 # gradient computation (if applicable).
553 res.append(self.batch_execute([circuit])[0])
---> 554 jacs.append(gradient_method(circuit, **kwargs))
555
556 return res, jacs

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane_lightning_gpu/lightning_gpu.py in adjoint_jacobian(self, tape, starting_state, use_device_state)
299 tp_shift = [i - 1 for i in tp_shift]
300
---> 301 jac = adj.adjoint_jacobian(self._gpu_state, obs_serialized, ops_serialized, tp_shift)
302 jac = np.array(jac) # only for parameters differentiable with the adjoint method
303 jac = jac.reshape(-1, len(tp_shift))

PLException: Exception encountered when calling layer "keras_layer" (type KerasLayer).

[/pennylane-lightning-gpu/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp][Line:200][Method:StateVectorCudaManaged]: Error in PennyLane Lightning: custatevec not initialized

Call arguments received:
• inputs=tf.Tensor(shape=(50, 81), dtype=float64)

Training the same circuit with adjoint differentiation on lightning.gpu, but without embedding it in a Keras hybrid model, works as well, so I suppose the problem lies not in the circuit but in the Keras integration.
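By "without embedding it in a Keras hybrid model" I mean calling the QNode directly under a GradientTape, roughly along these lines (a sketch with a small illustrative circuit, not my actual training code):

import pennylane as qml
import tensorflow as tf

N_QUBITS = 4  # small illustrative size

dev = qml.device("lightning.gpu", wires=N_QUBITS)

@qml.qnode(dev, interface="tf", diff_method="adjoint")
def circuit(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(N_QUBITS))
    qml.BasicEntanglerLayers(weights, wires=range(N_QUBITS))
    return [qml.expval(qml.PauliZ(i)) for i in range(N_QUBITS)]

weights = tf.Variable(tf.random.uniform((3, N_QUBITS), dtype=tf.float64))
inputs = tf.constant([0.1, 0.2, 0.3, 0.4], dtype=tf.float64)

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(circuit(inputs, weights))
grads = tape.gradient(loss, weights)  # adjoint Jacobian on the GPU, no KerasLayer involved

This kind of direct training runs without error.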

Any ideas would be greatly appreciated!

Greetings
Tom

Hi @ToMago!

I see that you’re trying to use lightning.gpu. Please make sure that you have a compatible device and that you have the cuQuantum SDK installed. You can learn more in the docs.
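A quick smoke test along these lines (a sketch) will fail immediately if the device cannot be constructed:

import pennylane as qml

# constructing the device fails right away if the cuQuantum libraries
# are missing or no compatible GPU is visible
dev = qml.device("lightning.gpu", wires=2)

@qml.qnode(dev)
def probe():
    qml.Hadamard(wires=0)
    return qml.expval(qml.PauliZ(0))

print(probe())  # should print a value close to 0.0
qml.about()     # prints versions and installed devices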

Otherwise, I’m finding a few errors.

  1. When you call qml.qnnKerasLayer, it actually requires a dot, so it's qml.qnn.KerasLayer.
  2. In the model.compile(opt, loss="bce") line you're missing the closing quotation mark.
  3. You haven't shared your full code, so I'm assuming you're using the same code as in our qnn demo. If that's not the case, then please share a minimal working example of your code.
  4. tf.keras.Conv2D should actually be tf.keras.layers.Conv2D.
  5. tf.keras.Flatten should be tf.keras.layers.Flatten.

Other than this, it’s likely that you will have a dimension or shape mismatch depending on what data you’re using. Please share this data too so that I can try to reproduce your problem.
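One quick way to surface shape mismatches before training is to build the model on a known input shape and print a summary, for example (a sketch with stand-in layers and a hypothetical input shape):

import tensorflow as tf

# stand-in layers; substitute your real classical and quantum layers
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(10, 3, strides=2, padding='valid', activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(81),
])
model.build(input_shape=(None, 28, 28, 1))  # hypothetical input shape
model.summary()  # per-layer output shapes make mismatches easy to spot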

Hopefully this can help you get going.

Hi @CatalinaAlbornoz and thanks for your answer!

As I mentioned, my code runs fine when using lightning.gpu with parameter-shift differentiation, and also when training circuits without qml.qnn.KerasLayer.
Therefore I assume that cuQuantum is configured correctly.

Furthermore, the code also works with lightning.qubit, so I think the dimensions are fine.

The code for the circuit and my data is a bit lengthy but of course I can provide a minimal working example with MNIST:

import pennylane as qml
import tensorflow as tf

tf.keras.backend.set_floatx('float64')

DATA_QBITS = 16
INPUT_SIZE = 12

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train/255.0, x_test/255.0
x_train = x_train.reshape((x_train.shape[0], x_train.shape[1], x_train.shape[2], 1))
x_test = x_test.reshape((x_test.shape[0], x_test.shape[1], x_test.shape[2], 1))
x_train_small = tf.image.resize(x_train[:100], (INPUT_SIZE, INPUT_SIZE)).numpy()
x_test_small = tf.image.resize(x_test[:100], (INPUT_SIZE, INPUT_SIZE)).numpy()

dev1 = qml.device('lightning.gpu', wires=DATA_QBITS)

@qml.qnode(dev1, diff_method="adjoint")
def circuit(inputs, weights):

    qml.AngleEmbedding(inputs, wires=range(DATA_QBITS))
    qml.BasicEntanglerLayers(weights, wires=range(DATA_QBITS))

    return [qml.expval(qml.PauliZ(i)) for i in range(DATA_QBITS)]

weight_shapes = {"weights": (3,DATA_QBITS)}
qlayer = qml.qnn.KerasLayer(circuit, weight_shapes, output_dim=DATA_QBITS)

inputs = tf.keras.layers.Input(shape=(12,12,1))
clayer_1 = tf.keras.layers.Conv2D(10, 2, strides=3, padding='valid', activation="relu")
clayer_2 = tf.keras.layers.Conv2D(1, 2, strides=1, padding='same', activation="relu")
dress1 = tf.keras.layers.Flatten()

dress2 = tf.keras.layers.Dense(16)
re = tf.keras.layers.Reshape((4,4,1))
clayer_3 = tf.keras.layers.Conv2DTranspose(10, 3, strides=3, padding='valid', activation="relu")
clayer_4 = tf.keras.layers.Conv2DTranspose(10, 2, strides=1, padding='same', activation="relu")
out_layer = tf.keras.layers.Conv2D(1, 2, strides=1, padding='same', activation="sigmoid")

model = tf.keras.models.Sequential([inputs, clayer_1, clayer_2, dress1, qlayer, dress2, re, clayer_3, clayer_4, out_layer])

opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(opt, loss="bce")

hist = model.fit(x_train_small, x_train_small, epochs=5, batch_size=20, validation_data=(x_test_small, x_test_small))

This code throws the "Error in PennyLane Lightning: custatevec not initialized" error mentioned in the original post:

Error trace

PLException Traceback (most recent call last)
/tmp/ipykernel_192120/3504964397.py in <module>
44 model.compile(opt, loss="bce")
45
---> 46 hist = model.fit(x_train_small, x_train_small, epochs=5, batch_size=20, validation_data=(x_test_small, x_test_small))

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/qnn/keras.py in call(self, inputs)
300 reconstructor = []
301 for x in tf.unstack(inputs):
---> 302 reconstructor.append(self.call(x))
303 return tf.stack(reconstructor)
304

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/qnn/keras.py in call(self, inputs)
303 return tf.stack(reconstructor)
304
---> 305 return self._evaluate_qnode(inputs)
306
307 def _evaluate_qnode(self, x):

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/qnn/keras.py in _evaluate_qnode(self, x)
318 **{k: 1.0 * w for k, w in self.qnode_weights.items()},
319 }
---> 320 return self.qnode(**kwargs)
321
322 def compute_output_shape(self, input_shape):

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/qnode.py in __call__(self, *args, **kwargs)
665 gradient_kwargs=self.gradient_kwargs,
666 override_shots=override_shots,
---> 667 **self.execute_kwargs,
668 )
669

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/interfaces/execution.py in execute(tapes, device, gradient_fn, interface, mode, gradient_kwargs, cache, cachesize, max_diff, override_shots, expand_fn, max_expansion, device_batch_transform)
442
443 res = _execute(
---> 444 tapes, device, execute_fn, gradient_fn, gradient_kwargs, _n=1, max_diff=max_diff, mode=_mode
445 )
446

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/interfaces/tensorflow.py in execute(tapes, device, execute_fn, gradient_fn, gradient_kwargs, _n, max_diff, mode)
87 with qml.tape.Unwrap(*tapes):
88 # Forward pass: execute the tapes
---> 89 res, jacs = execute_fn(tapes, **gradient_kwargs)
90
91 for i, tape in enumerate(tapes):

~/miniconda3/envs/tfqf/lib/python3.7/contextlib.py in inner(*args, **kwds)
72 def inner(*args, **kwds):
73 with self._recreate_cm():
---> 74 return func(*args, **kwds)
75 return inner
76

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane/_device.py in execute_and_gradients(self, circuits, method, **kwargs)
552 # gradient computation (if applicable).
553 res.append(self.batch_execute([circuit])[0])
---> 554 jacs.append(gradient_method(circuit, **kwargs))
555
556 return res, jacs

~/miniconda3/envs/tfqf/lib/python3.7/site-packages/pennylane_lightning_gpu/lightning_gpu.py in adjoint_jacobian(self, tape, starting_state, use_device_state)
299 tp_shift = [i - 1 for i in tp_shift]
300
---> 301 jac = adj.adjoint_jacobian(self._gpu_state, obs_serialized, ops_serialized, tp_shift)
302 jac = np.array(jac) # only for parameters differentiable with the adjoint method
303 jac = jac.reshape(-1, len(tp_shift))

PLException: Exception encountered when calling layer "keras_layer" (type KerasLayer).

[/pennylane-lightning-gpu/pennylane_lightning_gpu/src/simulator/StateVectorCudaManaged.hpp][Line:200][Method:StateVectorCudaManaged]: Error in PennyLane Lightning: custatevec not initialized

Call arguments received:
• inputs=tf.Tensor(shape=(20, 16), dtype=float64)

Note again that, for me, this works with lightning.qubit and also with lightning.gpu with parameter-shift as the diff_method.

Thanks!
Tom

Thanks for the clarification @ToMago!

I will get back to you tomorrow with an official answer but it does look like you’ve run into an unsupported case.

Hi @ToMago, can you please post the output of qml.about()? Adjoint with lightning.gpu is in fact supported in the latest version of PennyLane.

Hi,
glad to hear it's supported!

Here is the output of qml.about()

Name: PennyLane
Version: 0.25.1
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/XanaduAI/pennylane
Author: 
Author-email: 
License: Apache License 2.0
Location: /home/tom/miniconda3/envs/tfqf/lib/python3.7/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, retworkx, scipy, semantic-version, toml
Required-by: PennyLane-Lightning, PennyLane-Lightning-GPU

Platform info:           Linux-5.18.0-1-amd64-x86_64-with-debian-bookworm-sid
Python version:          3.7.13
Numpy version:           1.21.6
Scipy version:           1.7.3
Installed devices:
- lightning.qubit (PennyLane-Lightning-0.25.1)
- default.gaussian (PennyLane-0.25.1)
- default.mixed (PennyLane-0.25.1)
- default.qubit (PennyLane-0.25.1)
- default.qubit.autograd (PennyLane-0.25.1)
- default.qubit.jax (PennyLane-0.25.1)
- default.qubit.tf (PennyLane-0.25.1)
- default.qubit.torch (PennyLane-0.25.1)
- default.qutrit (PennyLane-0.25.1)
- lightning.gpu (PennyLane-Lightning-GPU-0.25.0)

Hi @ToMago

Thanks for posting the above info. I have attempted to run your earlier script using adjoint with lightning.gpu v0.25.0 and was able to run it to completion.

The GPU used for this was a 40GB A100, which is the general target GPU for our cuQuantum-backed workloads, along with other Tesla-grade cards such as the V100.

The error you reported is likely due to there being insufficient GPU memory available to initialize the CUDA contexts necessary for cuQuantum, and so initialization fails. Can you indicate which GPU you are running this on? That may help us track down the problem.

Also, while running the script, it may be worth examining the GPU memory use through nvidia-smi to help identify if this is the root cause. I prefer to poll this every second using watch -n 1 nvidia-smi to keep track over the script's execution time.
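If it helps, the same information can also be logged from inside training with a small Keras callback (a sketch; it assumes nvidia-smi is on the PATH):

import subprocess
import tensorflow as tf

class GpuMemLogger(tf.keras.callbacks.Callback):
    # log GPU memory use via nvidia-smi at the end of each epoch
    def on_epoch_end(self, epoch, logs=None):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.used,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True,
        )
        print(f"[epoch {epoch}] GPU memory (used, total): {result.stdout.strip()}")

# usage: model.fit(..., callbacks=[GpuMemLogger()])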


Hi @mlxd,

You are indeed right: the error is due to a lack of GPU memory.

Thanks for uncovering this!

Greetings
Tom