AngleEmbedding with a 1-layer tensor network: the weights don't change during training

I'm trying to adapt the code from the variational classifier demo (https://pennylane.ai/qml/demos/tutorial_variational_classifier.html) to a different dataset with 13 features and a binary label. I rescale the features with MinMaxScaler().fit_transform(), then use AngleEmbedding(rotation='Y') to encode the 13 features on 13 wires. For the variational part I use the basic tensor-network circuit (an RX on the first wire, an RY on the second, then a CNOT) from https://pennylane.ai/qml/demos/tutorial_tn_circuits.html, which gives a [12, 2] tensor as the weights. I followed the variational classifier demo to optimize the weights and bias, but as the iterations go on, the weights don't change when I print them to two decimal places. What's the most probable cause of this behavior?

import pennylane as qml
from pennylane import numpy as np
from pennylane.optimize import NesterovMomentumOptimizer
from sklearn.preprocessing import MinMaxScaler

with open('heart.csv') as f:
    temp = np.loadtxt(f, delimiter=',', skiprows=1)

X = temp[:,:-1]
y = temp[:,-1]
X = MinMaxScaler().fit_transform(X)

num_data = len(y)
num_train = int(0.75 * num_data)
index = np.random.permutation(range(num_data))
feats_train = X[index[:num_train]]
Y_train = y[index[:num_train]]
feats_val = X[index[num_train:]]
Y_val = y[index[num_train:]]


num_wires = 13
dev = qml.device("default.qubit", wires=num_wires)

def block(weights, wires):
    # Two-qubit building block for qml.MPS: one rotation per wire, then a CNOT
    qml.RX(weights[0], wires=wires[0])
    qml.RY(weights[1], wires=wires[1])
    qml.CNOT(wires=wires)
@qml.qnode(dev)
def circuit(weights, feature_vector, layer_num=1):
    for i in range(layer_num):
        qml.AngleEmbedding(features=feature_vector, wires=range(num_wires), rotation='Y')
        qml.MPS(
            wires=range(num_wires),
            n_block_wires=2,
            block=block,
            n_params_block=2,
            template_weights=weights,
        )
    return qml.expval(qml.PauliZ(wires=num_wires-1))

weights = np.random.random(size=[12, 2], requires_grad=True)  # 12 MPS blocks, 2 params each
bias = np.array(0.0, requires_grad=True)


def variational_classifier(weights, bias, feature_vector):
    return circuit(weights, feature_vector) + bias

def square_loss(labels, predictions):
    loss = 0
    for l, p in zip(labels, predictions):
        loss += (l - p) ** 2
    return loss / len(labels)

def cost(weights, bias, features, labels):
    predictions = [variational_classifier(weights, bias, f) for f in features]
    return square_loss(labels, predictions)

def accuracy(labels, predictions):
    success = 0
    for l, p in zip(labels, predictions):
        if abs(l - p) < 1e-1:  # I even changed this to 1e-1
            success += 1
    return success / len(labels)

opt = NesterovMomentumOptimizer()  # note: default stepsize is 0.01 (the demo uses 0.5)
# train the variational classifier
for it in range(60):
    # Update the weights by one optimizer step
    batch_index = np.random.randint(0, num_train, (20,))
    feats_train_batch = feats_train[batch_index]
    Y_train_batch = Y_train[batch_index]
    weights, bias, _, _ = opt.step(cost, weights, bias, feats_train_batch, Y_train_batch)

    if it % 5 == 0:
        # Compute predictions on train and validation set
        predictions_train = [np.sign(variational_classifier(weights, bias, f)) for f in feats_train]
        predictions_val = [np.sign(variational_classifier(weights, bias, f)) for f in feats_val]
        # Compute accuracy on train and validation set
        acc_train = accuracy(Y_train, predictions_train)
        acc_val = accuracy(Y_val, predictions_val)
        print(
            "Iter: {:5d} | Cost: {:0.7f} | Acc train: {:0.7f} | Acc validation: {:0.7f} ".format(
                it + 1, cost(weights, bias, X, y), acc_train, acc_val
            )
            + str(["%.2f" % elem for elem in weights.flatten().tolist()])
            + str("%.2f" % bias)
        )

Hi @Jiakai_Wang, welcome to the forum!

It's hard to identify what is happening without seeing your data, but I have a few suggestions:

  1. Make a smaller version of your problem (maybe just 2 features) and create synthetic data where you know the classification task is possible. If everything goes well, increase the dimension back to your original problem and test again with fake data where you know classification is possible. This will help you understand whether the problem is in your dataset or somewhere else (see the sketch after this list).
  2. Test with a different optimizer.
  3. Test a different embedding and a different ansatz. Unfortunately there's no 'one-size-fits-all' solution, so this may be what is causing your problems.
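
To make suggestion 1 concrete, here is a minimal sketch, assuming a synthetic, linearly separable 2-feature dataset with labels already in {-1, +1} (all names here, like toy_circuit and X_toy, are illustrative, not from your code). It also swaps in AdamOptimizer for suggestion 2 and uses qml.grad to check that the gradient is non-zero before training:

import pennylane as qml
from pennylane import numpy as np
from pennylane.optimize import AdamOptimizer

# Synthetic, linearly separable toy data: 2 features, labels in {-1, +1}
np.random.seed(42)
X_toy = np.random.uniform(0, np.pi, size=(50, 2), requires_grad=False)
y_toy = np.where(X_toy[:, 0] > X_toy[:, 1], 1.0, -1.0)

dev_toy = qml.device("default.qubit", wires=2)

@qml.qnode(dev_toy)
def toy_circuit(weights, features):
    # Same ingredients as the original circuit, shrunk to 2 wires
    qml.AngleEmbedding(features=features, wires=range(2), rotation="Y")
    qml.RX(weights[0], wires=0)
    qml.RY(weights[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(1))

def toy_cost(weights, features, labels):
    loss = 0.0
    for f, l in zip(features, labels):
        loss = loss + (l - toy_circuit(weights, f)) ** 2
    return loss / len(labels)

weights_toy = np.random.random(size=2, requires_grad=True)

# Sanity check: if this prints all zeros, training cannot move the weights
grad_fn = qml.grad(toy_cost, argnum=0)
print("initial gradient:", grad_fn(weights_toy, X_toy, y_toy))

opt_toy = AdamOptimizer(stepsize=0.1)
for it in range(30):
    weights_toy = opt_toy.step(lambda w: toy_cost(w, X_toy, y_toy), weights_toy)
    if it % 10 == 0:
        print(it, toy_cost(weights_toy, X_toy, y_toy), weights_toy)

If the weights move here but stay frozen on your real data, the issue is more likely in the dataset or the labels than in the circuit itself.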

I hope this helps you start looking deeper into solutions. Feel free to post your data here, or any insights that you gain. Also, make sure that you're using the latest version of PennyLane by checking the output of qml.about().

Please let me know if you figure out the problem or if you have additional questions!

Thanks, I've made progress by specifying the circuit structure from arXiv:1905.10876 gate by gate and using a 1-D array of weights. I also found that np.sign() returns +1 and -1 while my labels are 0 and 1, which is why my accuracy() wasn't changing. Using AmplitudeEmbedding for the 13 features on 4 wires leads to an accuracy of 80%. Although I still don't know how to initialize the weights for 2 layers of the tensor network, I think I'm in good shape and don't need further help.
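
For anyone hitting the same thing, here is a minimal sketch of both fixes (y_example, amp_circuit, and the single placeholder RY are illustrative assumptions, not my actual code):

import pennylane as qml
from pennylane import numpy as np

# Fix 1: np.sign() returns +1/-1, so remap the 0/1 labels to -1/+1 once
# (example labels, not the heart.csv data; one could remap predictions instead)
y_example = np.array([0.0, 1.0, 1.0, 0.0])
y_pm = 2 * y_example - 1  # 0 -> -1, 1 -> +1

# Fix 2: AmplitudeEmbedding of 13 features on 4 wires needs 2**4 = 16
# amplitudes; pad_with fills the missing entries, normalize=True rescales
dev4 = qml.device("default.qubit", wires=4)

@qml.qnode(dev4)
def amp_circuit(weights, feature_vector):
    qml.AmplitudeEmbedding(
        features=feature_vector, wires=range(4), pad_with=0.0, normalize=True
    )
    qml.RY(weights[0], wires=0)  # placeholder for the arXiv:1905.10876 ansatz
    return qml.expval(qml.PauliZ(3))

weights4 = np.random.random(size=1, requires_grad=True)
print(amp_circuit(weights4, np.random.random(size=13)))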

This is awesome @Jiakai_Wang!

Thank you for posting your insights here. They are great finds.

Also, feel free to open a topic to ask about the initialization of the weights.

Enjoy using PennyLane! :smiley: