QNN Model Not Learning

Hello! The code below trains a QNN model on a binarized MNIST dataset (even vs. odd digits), using 1000 randomly selected training images. However, when I run it I get the same validation accuracy in every epoch, and even the training accuracy looks essentially random.
I have tried different quantum circuits, including the built-in templates, but without success. Is there anything I am doing wrong here? Any pointers would be appreciated. Thanks

PS: the same setup works well when there is no quantum layer.

import numpy as np
import tensorflow as tf
import pennylane as qml
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Preprocess the data
train_images = train_images.reshape((60000, 784))
test_images = test_images.reshape((10000, 784))

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0

# Convert labels to binary: 0 for even digits, 1 for odd digits
train_labels = train_labels % 2
test_labels = test_labels % 2

# Create a subset of 1000 training images randomly
subset_indices = np.random.choice(len(train_images), 1000, replace=False)
train_images_subset = train_images[subset_indices]
train_labels_subset = train_labels[subset_indices]
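
As a quick sanity check on dimensions (assuming the shapes above): 10 qubits give 2**10 = 1024 amplitudes, so the AmplitudeEmbedding used below zero-pads the 784 pixel values:

# Sanity check: 784 pixels fit into 2**10 = 1024 amplitudes on 10 qubits;
# pad_with=0. in AmplitudeEmbedding fills the remaining 240 entries.
assert train_images_subset.shape == (1000, 784)
print(2 ** 10 - 784)  # 240 zero amplitudes appended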




n_qubits = 10 
layers = 4
dev = qml.device("lightning.qubit", wires=n_qubits)
@qml.qnode(dev, interface="tf", diff_method='best')
def qnode(inputs, weights):
    qml.templates.AmplitudeEmbedding([a for a in inputs], wires=range(n_qubits), pad_with=0., normalize=True)
    for j in range(layers):
        for i in range(n_qubits):
            qml.RX(weights[i], wires=i)
            qml.RY(weights[i], wires=i)

        for k in range(n_qubits-1):
            qml.CZ(wires=[k,k+1])
#     return [qml.expval(qml.PauliZ(wires=[i])) for i in range(1)]  
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))  # expectation value lies in [-1, 1]


weight_shapes = {"weights": (n_qubits,)}  # one trainable angle per qubit, shared across layers and across RX/RY


weights = np.random.normal(0, np.pi, size=(n_qubits))  # note: unused; the KerasLayer initializes its own weights


tf.keras.backend.set_floatx('float64')
inputs = tf.keras.Input(shape=(784,))
x = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=1)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=x)

# Compile the model
opt = tf.keras.optimizers.legacy.Adam(learning_rate=0.01)
model.compile(opt,
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(train_images_subset, train_labels_subset, epochs=25, validation_split=0.2)
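
For reference, evaluating the trained model on the held-out test set would then be (a sketch, using the preprocessed arrays above):

test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_acc)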

The output of qml.about():
Platform info: macOS-13.5.2-arm64-arm-64bit
Python version: 3.11.4
Numpy version: 1.23.5
Scipy version: 1.10.1
Installed devices:

  • default.gaussian (PennyLane-0.32.0)
  • default.mixed (PennyLane-0.32.0)
  • default.qubit (PennyLane-0.32.0)
  • default.qubit.autograd (PennyLane-0.32.0)
  • default.qubit.jax (PennyLane-0.32.0)
  • default.qubit.tf (PennyLane-0.32.0)
  • default.qubit.torch (PennyLane-0.32.0)
  • default.qutrit (PennyLane-0.32.0)
  • null.qubit (PennyLane-0.32.0)
  • lightning.qubit (PennyLane-Lightning-0.32.0)
  • qiskit.aer (PennyLane-qiskit-0.31.0)
  • qiskit.basicaer (PennyLane-qiskit-0.31.0)
  • qiskit.ibmq (PennyLane-qiskit-0.31.0)
  • qiskit.ibmq.circuit_runner (PennyLane-qiskit-0.31.0)
  • qiskit.ibmq.sampler (PennyLane-qiskit-0.31.0)
  • qiskit.remote (PennyLane-qiskit-0.31.0)

Hello @Muhammad_Kashif,

It looks like you’re doing two types of embedding: 1) AmplitudeEmbedding, and 2) angle embedding in the form of the RX and RY layers, whereas commonly only one strategy is used. In my runs this model tended to give fairly stochastic results from run to run; however, both the classical and quantum accuracies were improvable simultaneously, with the quantum layer typically reaching a lower loss (see Nos. 2 and 4 in the update under reference b). Best of success.

References:
a) Quantum Embedding | QML, by Amit Nikhade, MLearning.ai (Medium)
b) ChemicalQDevice, April 2023 R&D

Hi @kevinkawchak,

Thanks for your response. I believe I am encoding the inputs in the amplitudes, whereas the RX and RY gates carry the trainable weights, which are randomly initialized. It seems to have something to do with the observable the QNode returns, because the following change to the QNode's return statement gives reasonable performance. Why, I don't know; if someone could look into it, that would be great:
H = np.zeros((2 ** 1, 2 ** 1))
H[0, 0] = 1  # H is the projector |0><0| on a single qubit
wirelist = [i for i in range(1)]  # i.e. [0]
return qml.expval(qml.Hermitian(H, wirelist))
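
One observation: the Hermitian above is the projector |0><0|, so its expectation value lies in [0, 1], the range binary_crossentropy expects for predictions, whereas qml.expval(qml.PauliZ(0) @ qml.PauliZ(1)) from the original return statement ranges over [-1, 1]. A minimal sketch contrasting the two (the device and circuits here are purely illustrative):

import numpy as np
import pennylane as qml

dev_check = qml.device("default.qubit", wires=2)

@qml.qnode(dev_check)
def z_parity():
    qml.PauliX(wires=0)  # prepare |10> as an extreme case
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

@qml.qnode(dev_check)
def projector():
    qml.PauliX(wires=0)  # same state
    H = np.zeros((2, 2))
    H[0, 0] = 1  # |0><0| on wire 0
    return qml.expval(qml.Hermitian(H, wires=0))

print(z_parity())   # -1.0, outside the [0, 1] range binary_crossentropy expects
print(projector())  #  0.0, always within [0, 1]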

Hi @Muhammad_Kashif,

The way you’re using AmplitudeEmbedding is a bit strange. You would normally run qml.AmplitudeEmbedding(inputs, wires=range(n_qubits)) instead of passing [a for a in inputs].

@kevinkawchak, what @Muhammad_Kashif is doing is using AmplitudeEmbedding to encode the inputs, and RX and RY to create the layers that contain the weights (variational parameters). This is okay.

Is there a reason why you need to use a legacy optimizer? Have you tried changing it to the Adam optimizer for example?
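
For example (a sketch; everything else stays the same as in your code, only the optimizer line changes):

opt = tf.keras.optimizers.Adam(learning_rate=0.01)  # non-legacy Adam
model.compile(opt, loss='binary_crossentropy', metrics=['accuracy'])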

Let me know if changing this fixes your issue.
Otherwise, the examples here and here, as well as the info on the TensorFlow interface, may help you. This demo is more complicated, but it might help you too.

Hi @CatalinaAlbornoz,

Thanks for looking into this. Indeed, amplitude embedding is normally used the way you mention; however, if I don't use [a for a in inputs] I get the following error:

ValueError: Exception encountered when calling layer 'keras_layer_1' (type KerasLayer).

AmplitudeEmbedding does not support batched Tensorflow features.

Call arguments received by layer 'keras_layer_1' (type KerasLayer):
  • inputs=tf.Tensor(shape=(32, 1024), dtype=float64)

About the legacy optimizer: I was constantly being warned to use tf.keras.optimizers.legacy.Adam instead of tf.keras.optimizers.Adam on M1/M2 Macs, as it is faster there, so I used that. Nevertheless, the issue persists either way.
I'm still wondering why this is the case. There should be at least some change in performance in the initial iterations, and then we could analyze whether overfitting or underfitting is happening.
Thanks

Hey @Muhammad_Kashif!

Sorry for the delayed response here. I think the issue is that you’re trying to pass a batch of inputs to AmplitudeEmbedding, or you’re accidentally passing an array whose shape has an extra leading dimension. According to the documentation, the features argument (what you’re calling inputs) of AmplitudeEmbedding must be:

input tensor of dimension (2^len(wires),), or less if pad_with is specified

In other words, it’s an array without a leading dimension that can be broadcast over. I would print out what inputs is inside the qnode, e.g. print([a for a in inputs]), and see if that coincides with what’s in the documentation 🙂. It could be that you’re not trying to pass in a batch of inputs at all, and you just need to process inputs differently to make it the correct shape.
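
For instance, a quick debugging sketch along those lines (the shapes in the comments are hypothetical and depend on your batch size):

import pennylane as qml
import tensorflow as tf

n_qubits = 10
dev = qml.device("lightning.qubit", wires=n_qubits)

@qml.qnode(dev, interface="tf", diff_method="best")
def qnode_debug(inputs, weights):
    # Show exactly what the KerasLayer hands to the QNode
    print("inputs shape:", qml.math.shape(inputs))  # e.g. (32, 1024) if batched
    print([a for a in inputs])                      # per-sample tensors if batched
    qml.templates.AmplitudeEmbedding(
        [a for a in inputs], wires=range(n_qubits), pad_with=0., normalize=True
    )
    return qml.expval(qml.PauliZ(0))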