Differentiation method and Amplitude embedding


I was trying diff_method="adjoint" with AmplitudeEmbedding, but it does not seem to work (error screenshot attached). It works fine with AngleEmbedding, however, and backprop works with both.

Below is my code:

import tensorflow as tf
from tensorflow import keras
import pennylane as qml

n_train = 10000
n_test = 3000

mnist_dataset = keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist_dataset.load_data()
x_train = x_train[:n_train]
y_train = y_train[:n_train]
x_test = x_test[:n_test]
y_test = y_test[:n_test]
x_train, x_test = (x_train / 255.0), (x_test / 255.0)

x_train = x_train.reshape(-1, 784) # 784 = 28x28
x_test = x_test.reshape(-1, 784)

n_qubits = 1
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="tf", diff_method="adjoint")
def qnode(inputs, weights):
    qml.templates.AmplitudeEmbedding(inputs, wires=range(n_qubits), pad_with=0., normalize=True)
#     qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.templates.BasicEntanglerLayers(weights, wires=range(n_qubits), rotation=qml.RY)
#     qml.templates.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(wires=i)) for i in range(n_qubits)]

n_layers = 1
# weight_shapes = {"weights": (n_layers, n_qubits, 3)}

weight_shapes = {"weights": (n_layers, n_qubits)}
# re-define the layers

qlayer_1 = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits)
qlayer_2 = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits)
qlayer_3 = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits)
qlayer_4 = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits)

inputs = tf.keras.Input(shape=(784,))

x = tf.keras.layers.Dense(4)(inputs)

x_1, x_2, x_3, x_4 = tf.split(x, 4, axis=1)
x_1 = qlayer_1(x_1)
x_2 = qlayer_2(x_2)
x_3 = qlayer_3(x_3)
x_4 = qlayer_4(x_4)
x = tf.concat([x_1, x_2, x_3, x_4 ], axis=1)

outputs = tf.keras.layers.Dense(10, activation="softmax")(x)

model = tf.keras.Model(inputs=inputs, outputs=outputs)

from tensorflow.keras.optimizers import SGD, Adam
opt = Adam(learning_rate=0.01)
model.compile(opt, loss="sparse_categorical_crossentropy", metrics=["accuracy"])
history = model.fit(x_train, y_train, epochs=100, batch_size=16, validation_data=(x_test, y_test))

Does the model above compute quantum gradients as well and optimize the loss? I cannot see any significant difference in performance compared with its classical counterpart, except that the convergence time of the hybrid model (above) is significantly higher than that of the classical one.
Moreover, if we increase the dataset size (currently MNIST (10k, 3k)), should the percent increase in convergence time of the hybrid model be less than that of the classical model (which is not the case with the model above)? Isn't that what quantum computation promises, faster computation?

Is there any way I can run this same model (above) on a real IBM quantum device to better compare the performance and convergence time? I am curious whether the quantum processor would recognize Keras commands like compile, fit, etc.

Any help would be appreciated.

Hi @Muhammad_Kashif,

Thank you for posting your questions here in the forum. Quantum computing has the potential to provide significant speed-ups in certain cases, although it's very much still an open research question whether quantum machine learning can perform better than its classical counterpart and, if so, where it excels. You shouldn't expect any improvements from running quantum simulations like this one; it's expected that a classical simulation will optimize faster.

Unfortunately, there’s no quantum hardware that would be able to use Keras intrinsically, although you could attempt to run the circuit on hardware while still using Keras to do the optimizations classically. If you want to use the IBM quantum device with PennyLane, you can read more about it here.

Regarding the issue when using the adjoint method, I’m not sure why that is. I had to set the floating point precision for Keras to float64 (with tf.keras.backend.set_floatx('float64')) to get it to work with backprop, though. Thanks for bringing it to our attention. :slight_smile:
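For reference, a minimal sketch of that precision fix. The key detail is that the float64 setting must run before any Keras layers are built:

```python
import tensorflow as tf

# default.qubit simulates in float64/complex128, while Keras defaults to
# float32; mismatched dtypes can break the QNode's gradient computation.
# Setting this BEFORE constructing any layers keeps both sides in float64.
tf.keras.backend.set_floatx("float64")

print(tf.keras.backend.floatx())  # -> float64
```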

Let me know if you have any further questions!

Hi @theodor,

Thanks for answering.

I tried running the same model (previous message in this thread) following the tutorial you directed me to. After loading my account token etc., I added the following line; all the rest of the code is the same.

dev = qml.device("qiskit.ibmq", wires=1, backend="ibmq_manila")

and tried running the model using model.fit, but it has been stuck (screenshot below) for around 24 hours. I am not sure whether I am making a mistake, am in a queue, or something else. Does the above line mean the quantum circuit runs on the quantum processor?
I am really curious whether there is any speed improvement if we run a hybrid network on a quantum computer.
Screenshot of the stuck run on the real quantum device:

Regarding the diff_method: backprop works with both amplitude and angle embedding, but adjoint does not work with amplitude embedding. Please let me know if there is any progress on this issue.

Thanks for the help.

Hi @Muhammad_Kashif. It’s likely that the job is stuck in a queue waiting for access to the quantum device. Either way, it’ll run much slower on quantum hardware devices since there will need to be a lot of slow device calls for hardware-evaluated gradients. Unfortunately, I wouldn’t expect any real improvements running on current hardware devices, even though it’s always exciting to experiment with it.

Regarding using the adjoint method, I believe it might actually have to do with AmplitudeEmbedding not being differentiable, and thus complaining when attempting gradient-based optimization methods. I’m not sure why it doesn’t complain when using the “backprop” differentiation method. A work-around could be using the Möttönen state preparation method instead, which can be seen as a decomposition of an amplitude embedding.

Let me know if this works out!