Hello! Previously I assumed that if I used a gradient tape on a mixed quantum/classical network, then the gradients I calculated w.r.t. the inputs (not the parameters) would propagate through the entire network, both quantum and classical. Recently, however, I wrapped a quantum layer between two simple input/output Keras layers with no activation, to test an essentially pure quantum network (no classical nonlinearity), and noticed that the second-order derivatives were coming back as None. This seems to be telling me that the network is linear in its inputs, which would mean only the classical layers are being differentiated (the quantum layer is definitely not linear and has nonzero second-order derivatives). If I add a tanh activation to either the input or output Keras layer, I do get second-order derivatives back, which makes sense, since the classical part is then nonlinear. The first-order derivatives that print out are clearly not constant, so is this a bug?
So why are the second-order derivatives None here? I should add that if I use PennyLane directly (not wrapped in TensorFlow) and calculate tape gradients on the quantum network, I get second-order derivatives just fine. It is only a problem when I wrap the QNode in a TF model.
Thank you!
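For reference, here is a minimal sketch of the kind of bare-QNode check I mean. I am assuming default.qubit with diff_method="backprop" for this standalone check, and the names dev_check/circuit are just for illustration:

import numpy as np
import tensorflow as tf
import pennylane as qml

dev_check = qml.device("default.qubit", wires=3)

@qml.qnode(dev_check, interface="tf", diff_method="backprop")
def circuit(x, w):
    # Same structure as the wrapped circuit below, but a single expectation value
    qml.AngleEmbedding(x, wires=range(3))
    qml.StronglyEntanglingLayers(w, wires=range(3))
    return qml.expval(qml.PauliZ(0))

x = tf.Variable([0.1, 0.2, 0.3], dtype=tf.float64)
w = tf.Variable(tf.random.uniform((3, 3, 3), 0, 2 * np.pi, dtype=tf.float64))

with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        out = circuit(x, w)
    g = t1.gradient(out, x)  # first derivative w.r.t. inputs
h = t2.gradient(g, x)        # second derivative: comes back non-None here
print(h)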
Here is my full code:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import metrics
import matplotlib.pyplot as plt
from scipy.stats import qmc
import pennylane as qml
from pennylane import numpy as np
tf.keras.backend.set_floatx("float64")
tf.keras.backend.clear_session()
n_qubits = 3
n_layers = 3 # number of StronglyEntanglingLayers blocks (renamed to avoid shadowing the keras "layers" import)
data_dimension = 1 # output
param = {'num_epochs': 400}
xmin = 0
xmax = 1
lr = 0.1
opt = tf.keras.optimizers.Adam(learning_rate=lr)
initializer = tf.keras.initializers.RandomUniform(minval=0, maxval=2*np.pi, seed=200) # for quantum parameter initialization (note: not currently passed to the KerasLayer)
solution_scale = 1
def fun(x, y):
    return solution_scale * (1/4) * (x*x + y*y + x*y + 1)
Lx = 1
Ly = 1
nc = 11
xx = np.linspace(0,Lx,nc)
yy = np.linspace(0,Ly,nc)
x_inf, y_inf = np.meshgrid(xx, yy)
y_c = fun(x_inf,y_inf)
x1_c, x2_c = x_inf.reshape(-1, 1), y_inf.reshape(-1, 1)
##################################################
##################################################
dev = qml.device("lightning.qubit", wires=n_qubits)
@qml.qnode(dev, diff_method='adjoint')
def qnode(inputs, weights):
    # print("qnode inputs: ", inputs) # why do these change? Is it because I have a layer before? Is it optimizing w.r.t. these?
    qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.templates.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)] # why isn't this: return qml.expval(qml.PauliZ(0))?
    # return qml.expval(qml.PauliZ(0))

weight_shapes = {"weights": (n_layers, n_qubits, 3)}
##################################################
##################################################
# This will not give back second derivatives unless I add a tanh activation
# (which adds nonlinearity), suggesting the spatial dependence of the quantum
# network is not being included.
input_layer = tf.keras.layers.Input(shape=(2,))
hidden0 = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits)(input_layer)
output_layer = tf.keras.layers.Dense(1, activation=None)(hidden0)
model = tf.keras.Model(input_layer, output_layer)
model.summary()
# This works fine
# neuron_per_layer = 1
# input_layer = tf.keras.layers.Input(shape=(2,))
# hidden0 = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits)(input_layer)
# hidden1 = tf.keras.layers.Dense(neuron_per_layer, activation="tanh")(hidden0)
# output_layer = tf.keras.layers.Dense(1, activation=None)(hidden1)
# #output_layer = tf.keras.layers.Dense(1, activation=None)(hidden0)
# model = tf.keras.Model(input_layer, output_layer)
# model.summary()
##################################################
##################################################
p_c = tf.Variable(tf.concat([x1_c, x2_c], axis=1), dtype=tf.float64)
print("p_c: ",p_c)
with tf.GradientTape(persistent=True) as tape2:
    with tf.GradientTape(persistent=True) as tape1:
        yc = model(p_c)
        print("yc: ",yc)
    df = tape1.gradient(yc,p_c) # first-order derivatives w.r.t. inputs
    u_x = df[:,0]
    u_y = df[:,1]
    print("df: ",df)
    print("u_x: ",u_x)
    print("u_y: ",u_y)
du2 = tape2.gradient(df,p_c) # second-order derivatives w.r.t. inputs
u_xx = tape2.gradient(u_x,p_c)[:,0]
u_yy = tape2.gradient(u_y,p_c)[:,1]
print("du2: ",du2)
print("u_xx: ",u_xx)
print("u_yy: ",u_yy)
TypeError                                 Traceback (most recent call last)
Cell In[3], line 83
     81 print("u_y: ",u_y)
     82 du2 = tape2.gradient(df,p_c)
---> 83 u_xx = tape2.gradient(u_x,p_c)[:,0]
     84 u_yy = tape2.gradient(u_y,p_c)[:,1]
     85 print("du2: ",du2)

TypeError: 'NoneType' object is not subscriptable
*** This happens because the second-order gradient comes back as None: TensorFlow records no dependence of the first derivative on the inputs, i.e. it treats the second-order derivative as identically zero.
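For what it's worth, one way to see that TensorFlow is treating the first derivative as a constant is to ask the tape for zeros instead of None for unconnected targets (a diagnostic sketch, not a fix):

# Returns zeros (rather than raising on None) when u_x is not connected to p_c on tape2
u_xx_zeros = tape2.gradient(
    u_x, p_c, unconnected_gradients=tf.UnconnectedGradients.ZERO
)[:,0]
print("u_xx (zeros if unconnected): ", u_xx_zeros)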
qml.about()
Name: PennyLane
Version: 0.34.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/PennyLaneAI/pennylane
Author:
Author-email:
License: Apache License 2.0
Location: /Users/corey/Library/Python/3.10/lib/python/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, requests, rustworkx, scipy, semantic-version, toml, typing-extensions
Required-by: PennyLane-Cirq, PennyLane-Lightning, PennyLane-qiskit
Platform info: macOS-12.7.2-x86_64-i386-64bit
Python version: 3.10.13
Numpy version: 1.26.3
Scipy version: 1.11.4
Installed devices:
- cirq.mixedsimulator (PennyLane-Cirq-0.34.0)
- cirq.pasqal (PennyLane-Cirq-0.34.0)
- cirq.qsim (PennyLane-Cirq-0.34.0)
- cirq.qsimh (PennyLane-Cirq-0.34.0)
- cirq.simulator (PennyLane-Cirq-0.34.0)
- qiskit.aer (PennyLane-qiskit-0.34.0)
- qiskit.basicaer (PennyLane-qiskit-0.34.0)
- qiskit.ibmq (PennyLane-qiskit-0.34.0)
- qiskit.ibmq.circuit_runner (PennyLane-qiskit-0.34.0)
- qiskit.ibmq.sampler (PennyLane-qiskit-0.34.0)
- qiskit.remote (PennyLane-qiskit-0.34.0)
- lightning.qubit (PennyLane-Lightning-0.34.0)
- default.gaussian (PennyLane-0.34.0)
- default.mixed (PennyLane-0.34.0)
- default.qubit (PennyLane-0.34.0)
- default.qubit.autograd (PennyLane-0.34.0)
- default.qubit.jax (PennyLane-0.34.0)
- default.qubit.legacy (PennyLane-0.34.0)
- default.qubit.tf (PennyLane-0.34.0)
- default.qubit.torch (PennyLane-0.34.0)
- default.qutrit (PennyLane-0.34.0)
- null.qubit (PennyLane-0.34.0)