Incorrect gradients when using TensorFlow gradient tape

Hello,

I got some results from my algorithms that seemed incorrect when I changed the number of qubits. It turns out the cause is some unexpected behaviour of tape.jacobian from TensorFlow. I've included an MWE below showing this incorrect behaviour. The output of the model is a scalar and we differentiate with respect to it, so tape.jacobian and tape.gradient should return the same values. This works perfectly fine unless the number of qubits is 7; when there are 7 qubits the two results differ.
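To be explicit about the identity I'm relying on, here is a minimal self-contained pure-TF sketch (a toy scalar function, nothing to do with the circuit or the MWE below) where tape.jacobian and tape.gradient do agree:

import numpy as np
import tensorflow as tf

w = tf.Variable([0.1, 0.2, 0.3], dtype=tf.float64)

with tf.GradientTape(persistent=True) as tape:
    r = tf.reduce_sum(tf.sin(w))               # scalar output
jac = tape.jacobian(r, w)                      # Jacobian of a scalar w.r.t. w, shape (3,)
grad = tape.gradient(r, w)                     # ordinary gradient, shape (3,)
print(np.allclose(jac.numpy(), grad.numpy()))  # expect True
del tape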

!python -m pip install pennylane 
!pip install custatevec-cu12
!pip install tensorflow==2.15
!pip install silence-tensorflow

from silence_tensorflow import silence_tensorflow
silence_tensorflow()

import numpy as np
import keras
from keras import layers,initializers
from scipy.stats import unitary_group
import pennylane as qml
import tensorflow as tf

tf.keras.backend.set_floatx('float64')



def model_creation(n_wires):
  xlength = 1
  dev = qml.device("default.qubit", wires=n_wires)

  @qml.qnode(dev, interface='tf')
  def circuit(inputs, weights):
      inputs = tf.transpose(inputs)  # leftover: inputs is not actually used in the circuit (see edit below)

      for i in range(n_wires):
          qml.RX(10 * weights[0, i, 0:xlength], wires=i)
          qml.CNOT(wires=[i, (i + 1) % n_wires])

      return [qml.expval(qml.sum(*[qml.PauliZ(i) for i in range(n_wires)]))]

  weight_shapes = {"weights": (1, n_wires, xlength)}
  qlayer = qml.qnn.KerasLayer(circuit, weight_shapes, output_dim=1)

  def modelx():
    inputs = keras.Input(shape=(1,))
    x = qlayer(inputs)
    model = keras.Model(inputs=inputs, outputs=x)
    return model

  return modelx()

model = model_creation(n_wires=7)  # change this argument; the bug appears when n_wires = 7
points = 1
x = tf.convert_to_tensor(np.asarray([0.5]))
points = tf.convert_to_tensor(points, dtype=tf.int32)  # unused in this MWE

def grad_distributions1(model):
  with tf.GradientTape() as tape:
    r = tf.squeeze(model(x))
  grads = tape.jacobian(r, model.trainable_variables)  # gradients via tape.jacobian
  del tape
  return grads

grad = grad_distributions1(model)
grad = np.array(grad).flatten()
print(grad)

def grad_distributions2(model):
  with tf.GradientTape() as tape:
    r = tf.squeeze(model(x))
  grads = tape.gradient(r, model.trainable_variables)  # gradients via tape.gradient
  del tape
  return grads

grad = grad_distributions2(model)
grad = np.array(grad).flatten()
print(grad)
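For convenience, here is a direct side-by-side comparison of the two results above (the names g_jac and g_grad are just mine; this part isn't needed to trigger the bug):

g_jac = np.array(grad_distributions1(model)).flatten()
g_grad = np.array(grad_distributions2(model)).flatten()
print(np.allclose(g_jac, g_grad))  # should be True; comes out False when n_wires = 7
print(np.abs(g_jac - g_grad))      # per-parameter discrepancy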

If you remove "qml.CNOT(wires=[i, (i + 1) % n_wires])" then the bug no longer happens; note that the bug still happens even if the arguments of CNOT are two fixed qubits. This was a very hard bug to find, and in producing an MWE some of the strange behaviour goes away. For my full models I think the strange behaviour appears for other numbers of qubits and architectures as well, and occasionally the gradients from tape.jacobian are clearly incorrect, returning values like -2.34569401e+241.
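In case it helps with diagnosing which of the two results is correct, here is a sketch of a central-difference check (the helper fd_grad and the step size eps are mine, purely for illustration); I'd expect the correct gradient to match this up to numerical error:

def fd_grad(model, x, eps=1e-6):
    w = model.trainable_variables[0]
    orig = w.numpy()                     # keep a copy so the weights can be restored
    flat = orig.flatten()
    g = np.zeros_like(flat)
    for k in range(flat.size):
        shifted = flat.copy()
        shifted[k] += eps
        w.assign(shifted.reshape(orig.shape))
        f_plus = float(tf.squeeze(model(x)))
        shifted[k] -= 2 * eps
        w.assign(shifted.reshape(orig.shape))
        f_minus = float(tf.squeeze(model(x)))
        g[k] = (f_plus - f_minus) / (2 * eps)
    w.assign(orig)                       # restore the original weights
    return g

print(fd_grad(model, x))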

I first hit this bug on one of my own local machines, but the code above reproduces the issue on Google Colab, so that should be easier for you.

Thanks

edit: I forgot to remove the unused inputs argument from this MWE.

Hi @Bnesh,

Thank you for your question. We're experiencing a high volume of questions at the moment; I'll respond as soon as I can.