Circuit not optimizing parameters

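import pennylane as qml
import tensorflow as tf
from pennylane import numpy as np  # autograd-aware NumPy (supports requires_grad)
from pennylane.optimize import AdamOptimizer
from pennylane.templates import AmplitudeEmbedding
from tensorflow.keras.datasets import mnist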

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Filter images and labels for digits 0 to 7
train_mask = y_train <= 7
test_mask = y_test <= 7

x_train = x_train[train_mask]
y_train = y_train[train_mask]
x_test = x_test[test_mask]
y_test = y_test[test_mask]
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
# Normalize pixel values to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0

x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)

# Resize images to 16x16 using tf.image.resize
x_train_resized = tf.image.resize(x_train, size=(16, 16))
x_test_resized = tf.image.resize(x_test, size=(16, 16))

# Convert to numpy arrays
x_train_resized = x_train_resized.numpy()
x_test_resized = x_test_resized.numpy()

# Flatten the images
x_train_flat = x_train_resized.reshape(x_train_resized.shape[0], -1)
x_test_flat = x_test_resized.reshape(x_test_resized.shape[0], -1)

# Convert class labels to one-hot encoded vectors
y_train = tf.keras.utils.to_categorical(y_train, num_classes=8)  # 8 classes now
y_test = tf.keras.utils.to_categorical(y_test, num_classes=8)

# Convert 0s to -1s in the labels and cast to int8
y_train[y_train == 0] = -1
y_train = y_train.astype(np.int8)

y_test[y_test == 0] = -1
y_test = y_test.astype(np.int8)

# Keep only the first 1000 flattened training samples
x_train = x_train_flat[0:1000]
y_train = y_train[0:1000]
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)

import time
start = time.time()

num_qubits = 8

dev = qml.device('default.qubit', wires = num_qubits)

@qml.qnode(dev)
def circuit(parameters, data):
    #for i in range(num_qubits):
    #    qml.Hadamard(wires = i)
  
    AmplitudeEmbedding(features = data, wires = range(num_qubits), normalize=True)
        
    qml.BasicEntanglerLayers(weights = parameters, wires = range(num_qubits))
    
    return [qml.expval(qml.PauliZ(i)) for i in range(8)]

def variational_classifier(weights, bias, x):
    return circuit(weights, x) + bias


def square_loss(y_true, y_pred):
    return np.mean(np.square(y_true - y_pred))

# Define custom accuracy metric
def accuracy(y_true, y_pred):
    # Convert predicted probabilities to labels (-1 or 1)
    y_pred_labels = np.sign(y_pred)
    
    # Count correct predictions
    correct_predictions = np.sum(y_true == y_pred_labels)
    
    # Calculate accuracy
    accuracy = correct_predictions / y_true.size
    
    return accuracy



def cost(weights, bias, X, Y):
    predictions = [variational_classifier(weights, bias, x) for x in X]
    return square_loss(Y, predictions)

#basic
num_layers = 2
shape = qml.BasicEntanglerLayers.shape(n_layers=num_layers, n_wires=8)
weights_init  = np.random.random(size=shape)
bias_init = np.array(0.0, requires_grad=True)
weights_init

opt = AdamOptimizer(stepsize=0.1, beta1=0.9, beta2=0.99)
#opt = AdamOptimizer()
batch_size = 128

wbest = 0
bbest = 0
abest = 0
weights = weights_init
bias = bias_init

for it in range(10):

    # weights update by one optimizer step

    batch_index = np.random.randint(0, len(x_train), (batch_size,))
    X_batch = x_train[batch_index]
    Y_batch = y_train[batch_index]
    weights, bias, _, _ = opt.step(cost, weights, bias, X_batch, Y_batch)

    # Compute the accuracy
    predictions = [variational_classifier(weights, bias, x) for x in x_train]
    
    if accuracy(y_train, predictions) > abest:
        wbest = weights
        bbest = bias
        abest = accuracy(y_train, predictions)
        print('New best')

    acc = accuracy(y_train, predictions)
  
    print(
        "Iter: {:5d} | Cost: {:0.7f} | Accuracy: {:0.7f} ".format(
            it + 1, cost(weights, bias, x_train, y_train), acc
        )
    )

I tried this circuit for 8 classes, but the loss and accuracy remain constant. I increased the number of layers, but still nothing changes. Is there anything I am missing? Can you please check?

Hi @Amandeep,
The first time you run this you will notice a warning saying that the “Output seems independent of input.”

This is an indication that your program is struggling with something that is non-differentiable.

Take a look at this section of the docs to see if some of your operations aren’t supported.

I would recommend one of two courses of action. If you don’t care too much about using NumPy and TensorFlow, you could try using Torch instead; it’s less likely that you’ll see these issues. We have demos on using Torch which can help you.

If you want to keep your current setup, then the best approach is to start from a minimal example. Make the tiniest version of your code which works, for example using 2 classes instead of 8. Then you can add complexity step by step until you reach the error. This will help you see where the error lies. If you’re still struggling you can send us your minimal example and we can take another look to try to uncover the issue.

Let us know if you have any further questions!

Hi @CatalinaAlbornoz, thank you for your response. I wrote this code for binary classification and it works perfectly.
But when I extend it to multiclass, the loss and accuracy remain constant when taking measurements from 10 qubits for 10 classes.

Hi @Amandeep, can you please send me the minimal code that you made? And what it is that you changed when it stopped working? I can try to find a workaround.
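
On the constant accuracy reported for the multiclass case, one thing that may be worth ruling out is the metric itself. With one-hot labels mapped to -1/+1, the `accuracy` function in the original post compares every entry of the label vector, so for 8 classes a model that outputs a negative value everywhere already scores 7/8. The sketch below uses plain NumPy and made-up labels and predictions to show this, next to a per-sample argmax accuracy as one possible alternative; the function names `elementwise_accuracy` and `argmax_accuracy` are hypothetical.

```python
import numpy as np

num_classes = 8
labels = np.array([0, 3, 7, 1, 5])  # made-up class labels

# One-hot targets mapped from {0, 1} to {-1, +1}, as in the original post
y_true = 2.0 * np.eye(num_classes)[labels] - 1.0

# A model that has learned nothing: the same small negative output everywhere
y_pred = np.full_like(y_true, -0.1)

def elementwise_accuracy(y_true, y_pred):
    # Mirrors the metric in the post: compare sign(prediction) entry by entry
    return np.mean(np.sign(y_pred) == y_true)

def argmax_accuracy(y_true, y_pred):
    # Alternative: one decision per sample, taken from the largest output
    return np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1))

elem_acc = elementwise_accuracy(y_true, y_pred)
arg_acc = argmax_accuracy(y_true, y_pred)
print(elem_acc)  # 0.875
print(arg_acc)   # 0.2
```

An element-wise accuracy stuck near 0.875 for 8 classes (or 0.9 for 10) would therefore be consistent with a model that is not learning at all, so it can help to track the per-sample accuracy alongside the loss.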