QNGOptimizer with Variational Classifiers

I am trying to implement the example in https://pennylane.ai/qml/demos/tutorial_data_reuploading_classifier.html with QNGOptimizer because I learned that QNGOptimizer is faster. Then I got this error: "The QNG optimizer supports single QNodes or ExpvalCost objects as objective functions. Alternatively, the metric tensor can directly be provided to the step() method of the optimizer, using the metric_tensor_fn argument."

By searching the docs and forum, I also learned that QNGOptimizer cannot work with hybrid cost functions.

The cost function in the tutorial is

def cost(params, x, y, state_labels=None):
    """Cost function to be minimized.

    Args:
        params (array[float]): array of parameters
        x (array[float]): 2-d array of input vectors
        y (array[float]): 1-d array of targets
        state_labels (array[float]): array of state representations for labels

    Returns:
        float: loss value to be minimized
    """
    # Compute prediction for each input in data batch
    loss = 0.0
    dm_labels = [density_matrix(s) for s in state_labels]
    for i in range(len(x)):
        f = qcircuit(params, x[i], dm_labels[y[i]])
        loss = loss + (1 - f) ** 2
    return loss / len(x)

Given this cost function, does it mean there are two QNodes, considering the two types of observables?

Also, could you shed light on combining the data_reuploading_classifier with QNGOptimizer?

Thank you in advance.
Ban

One idea is that I should change the circuit from

@qml.qnode(dev, diff_method='adjoint')
def qcircuit(params, x, y):
    """A variational quantum circuit representing the Universal classifier.

    Args:
        params (array[float]): array of parameters
        x (array[float]): single input vector
        y (array[float]): single output state density matrix

    Returns:
        float: fidelity between output state and input
    """
    for p in params:
        qml.Rot(*x, wires=0)
        qml.Rot(*p, wires=0)
    return qml.expval(qml.Hermitian(y, wires=[0]))

to

@qml.qnode(dev, diff_method='adjoint')
def qcircuit(params, x, y):
    """A variational quantum circuit representing the Universal classifier.

    Args:
        params (array[float]): array of parameters
        x (array[float]): single input vector
        y (array[float]): single output state density matrix

    Returns:
        float: fidelity between output state and input
    """
    for p in params:
        qml.Rot(*x, wires=0)
        qml.Rot(*p, wires=0)
    return qml.expval(qml.PauliZ(wires=[0]))

And then redefine the cost function.
Do you think it is feasible?

Hi @Ban_Wong, welcome to the Forum!

It’s a strong claim to say that the QNG Optimizer will always be faster. This is probably not the case. However, it’s interesting to explore whether or not it’s faster for this particular application.

In the case of the data reuploading classifier demo there’s a single QNode, so this is not the source of the problem.

If you change the return to use a PauliZ instead of using the Hermitian of the density matrix y, you will notice that your code won’t train.

The real issue is that the return of your QNode changes with each iteration because it’s tied to y. This is why you get the error message about needing a single QNode.

If you go to the documentation for the QNG Optimizer you will find that the suggestion is to use qml.metric_tensor (note that ExpvalCost is deprecated). However, you may need to do a lot of workarounds to get this to work.

It probably won’t be an easy fix, so maybe instead of using the QNG Optimizer you can keep using the Adam Optimizer.
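For example, one possible workaround (just a sketch, not the demo’s official fix) is to return qml.probs instead of the label-dependent Hermitian observable. Since the state labels in that demo are the basis states |0⟩ and |1⟩, the fidelity with a label state equals the probability of measuring that basis state, so the QNode’s return no longer depends on y:

import pennylane as qml
from pennylane import numpy as pnp

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def qcircuit(params, x):
    # Fixed return type: the QNode no longer depends on the label y
    for p in params:
        qml.Rot(*x, wires=0)
        qml.Rot(*p, wires=0)
    return qml.probs(wires=0)

def cost(params, x, y):
    # For basis-state labels, the fidelity with |y_i> equals probs[y_i]
    loss = 0.0
    for i in range(len(x)):
        f = qcircuit(params, x[i])[y[i]]
        loss = loss + (1 - f) ** 2
    return loss / len(x)

Note that the cost is still a classical function wrapping the QNode, so you would still need to provide a metric_tensor_fn as mentioned above.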

I hope this will be helpful for you.

@CatalinaAlbornoz I am also working with the QNG optimizer and a variational circuit, but some of the diff_method options do not work. Could you please tell me which device and diff_method combination can compute results quickly with the QNG optimizer?

Hi @Amandeep ,

Optimization problems are problem-specific, meaning that it’s very hard to tell which optimizer will give better or faster results than another. There’s no one-size-fits-all.

That being said, I can give you some pointers.

I made a QNGO test script that you can see below. For that script the table below answers the question: “Does this diff_method and device combination work?”

diff_method      default.qubit  lightning.qubit
backprop         Yes            No
adjoint          No             No
parameter-shift  Yes            Yes
finite-diff      Yes            Yes
hadamard         Yes            Yes
device           No             No
spsa             Yes            Yes
best             Yes            Yes

Not all combinations will work equally fast. Generally default.qubit with backprop will be fastest. However, when you have 15 qubits or more, lightning.qubit might be faster, as seen on our performance page. Note that parameter-shift can be very slow, so you’ll have to test yourself what works better for your particular use case.

import pennylane as qml
from pennylane import numpy as pnp

# Data
data = pnp.array([0.,1.],requires_grad=False)

# Device
n_qubits=2
# We create a device with one extra wire because we need an auxiliary wire when using QNGO
dev = qml.device('default.qubit', wires=n_qubits+1)

# QNode
diff_method='backprop'

@qml.qnode(dev,diff_method=diff_method)
def circuit(params):
  # Data embedding
  qml.RX(data[0],wires=0)
  qml.RX(data[1],wires=1)

  # Parametrized layer
  qml.Rot(params[0],params[1],params[2],wires=0)
  qml.Rot(params[0],params[1],params[2],wires=1)
  qml.Hadamard(wires=0)
  qml.CNOT(wires=[0,1])

  # Measurement
  return qml.expval(qml.Z(0))

# Initial value of the parameters
params = pnp.array([1.,2.,3.],requires_grad=True)

# Initial value of the circuit
print(circuit(params))

# Cost function
def cost_f(params):
  return pnp.abs(circuit(params))

# Optimizer
opt = qml.QNGOptimizer()

# If we're using QNGO we need to define a metric tensor function
mt_fn = qml.metric_tensor(circuit)
print(mt_fn(params))

# Optimization loop
for it in range(10):
  params = opt.step(cost_f,params,metric_tensor_fn=mt_fn)
  print(params)
  print('Cost: ', cost_f(params))

Note that the non-trainable data here is passed directly inside the function and not as an argument; otherwise things start breaking.
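If you did want to keep the data as a QNode argument, one pattern you could try (a sketch, not guaranteed to work across versions) is to close over the data with lambdas, so that the optimizer and the metric tensor function only ever see the trainable parameters:

# Hypothetical variant of the script above: data passed as a
# non-trainable argument and captured via closures
@qml.qnode(dev, diff_method=diff_method)
def circuit2(params, data):
    qml.RX(data[0], wires=0)
    qml.RX(data[1], wires=1)
    qml.Rot(params[0], params[1], params[2], wires=0)
    qml.Rot(params[0], params[1], params[2], wires=1)
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.Z(0))

cost_f2 = lambda p: pnp.abs(circuit2(p, data))
mt_fn2 = lambda p: qml.metric_tensor(circuit2)(p, data)

for it in range(10):
    params = opt.step(cost_f2, params, metric_tensor_fn=mt_fn2)

I hope this helps!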

@CatalinaAlbornoz Thank you so much. It is really helpful.

I’m glad to hear @Amandeep !

@CatalinaAlbornoz I tried to train on some data by passing it into a circuit and using the metric tensor, but it did not work out due to a shape error.

import pennylane as qml
from pennylane import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import train_test_split

num_samples = 100
num_features = 16
inputs = np.random.random((num_samples, num_features))
labels = np.random.randint(0, 4, num_samples)


encoder = OneHotEncoder(categories='auto')
labels_onehot = encoder.fit_transform(labels.reshape(-1, 1)).toarray()


n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, diff_method="backprop")
def circuit(weights, inputs):
    qml.AmplitudeEmbedding(inputs, wires=range(n_qubits), normalize=True)
    qml.BasicEntanglerLayers(weights=weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]


np.random.seed(42)
num_layers = 3 
weights = np.random.random((num_layers, n_qubits))


def cost(weights, inputs, labels):
    predictions = np.array([circuit(weights, x) for x in inputs])
    loss = np.mean((predictions - labels) ** 2)
    return loss

train_inputs, val_inputs, train_labels, val_labels = train_test_split(inputs, labels_onehot, test_size=0.2, random_state=42)

# Optimize the circuit parameters
opt = qml.QNGOptimizer(stepsize=0.01)
epochs = 10
#mt_fn = qml.metric_tensor(circuit)

for epoch in range(epochs):
    cost_fn = lambda w: cost(w, train_inputs, train_labels)
        
    mt_fn = qml.metric_tensor(circuit)
    # Perform optimization step
    weights = opt.step_and_cost(cost_fn, weights,train_inputs, metric_tensor_fn=mt_fn)
        
    #weights = opt.step(cost,weights,metric_tensor_fn=mt_fn)
    
    loss = cost(weights, train_inputs, train_labels)
    print(f"Epoch {epoch + 1}/{epochs} - Training loss: {loss:.4f}")

# Evaluate the trained model on validation data
predictions_val = np.array([circuit(weights, x) for x in val_inputs])
accuracy = np.mean(np.argmax(predictions_val, axis=1) == np.argmax(val_labels, axis=1))
print(f"Validation accuracy: {accuracy * 100:.2f}%")

Hi @Amandeep , can you please post the error that you get?

Adapting the code to your data is an important part of the process. You need to understand the shape of your data and how to modify the code and the data to make them match.

My guess is that your inputs don’t have the right shape required by AmplitudeEmbedding. Maybe you need to add pad_with=0. or reduce the size of your inputs in case it’s larger than 2^n, where n is the number of qubits. The documentation can help you see coded examples of how to use these.
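As a small illustration (a sketch with made-up numbers, not your data), pad_with lets AmplitudeEmbedding accept fewer than 2^n features:

import pennylane as qml
from pennylane import numpy as pnp

n_qubits = 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def embed(features):
    # 5 features are padded with zeros up to 2**3 = 8 amplitudes, then normalized
    qml.AmplitudeEmbedding(features, wires=range(n_qubits), pad_with=0., normalize=True)
    return qml.probs(wires=range(n_qubits))

print(embed(pnp.array([0.1, 0.2, 0.3, 0.4, 0.5])))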

I hope this helps!

@CatalinaAlbornoz I am using the QNGD and Adam optimizers from PennyLane. Could you please tell me what the default batch size of each is? Usually, in a TensorFlow optimizer it’s 32. When I do not set a batch size for either optimizer, Adam is better than QNGD. And when I set a batch size of 32 for both optimizers, Adam shows much more variation in accuracy than QNGD.

Hi @Amandeep ,

Are you sure you’re referring to batch size? Or are you maybe confusing it with step size? These optimizers don’t have a batch size that you can set. The batch size is defined by you and how you process your data. You can see examples in the demos here. However, this can complicate things for you so I’d recommend going for simpler workflows without batches until you get everything working as you expect it.
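For reference, manual batching might look something like this sketch (assuming the cost, weights, train_inputs, and train_labels from your code above):

import pennylane as qml
from pennylane import numpy as pnp

opt = qml.AdamOptimizer(stepsize=0.01)  # for QNGD you'd also pass metric_tensor_fn to step()
batch_size = 32  # chosen by you, not by the optimizer
epochs = 10

for epoch in range(epochs):
    # Shuffle the samples and process them batch by batch
    idx = pnp.random.permutation(len(train_inputs))
    for b in range(0, len(idx), batch_size):
        batch = idx[b:b + batch_size]
        weights = opt.step(lambda w: cost(w, train_inputs[batch], train_labels[batch]), weights)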

I mean the batch size that I am setting myself. If I don’t process the data batch-wise with QNGD, it doesn’t perform well, but Adam does.

I need to compare the performance of QNGD, Adam, and QNSPSA.

But when I process the data batch-wise, Adam shows fluctuations in the results while QNGD works okay. And when I process the data without batches, everything is fast, but Adam works very well and QNGD struggles and takes many more epochs to match Adam’s performance.

What is the best way to match their performance and compare them?

One more thing: when I work with QNSPSA, the code below, which was given in a previous discussion, works well for one data sample.

import pennylane as qml
from pennylane import numpy as pnp

dev = qml.device("default.qubit", wires=3)

@qml.qnode(dev)
def circuit(params, data):
    qml.AngleEmbedding(data, wires=[0, 1, 2])
    qml.StronglyEntanglingLayers(params, wires=[0, 1, 2])
    return qml.expval(qml.PauliZ(2))

data = pnp.random.random([3], requires_grad=False)
params = pnp.random.random(qml.StronglyEntanglingLayers.shape(3, 3), requires_grad=True)

opt = qml.QNSPSAOptimizer()

for it in range(10):
    [params, data], loss = opt.step_and_cost(circuit, params, data) # modified

    print(f"Epoch: {it} | Loss: {loss} |")

When I have multiple samples and loop over them, it gives a shape error. How can it work and optimize for more than one sample?

Hi @Amandeep ,

Since this code works on its own, it’s hard for me to figure out what’s going wrong.

For each of these situations where you’re getting an error or unexpected result, can you please share the following information?

  1. The output of qml.about()

  2. A minimal (but self-contained) working example
    This is the simplest version of the code that reproduces the problem. It should include all necessary imports, data, functions, etc., so that we can copy-paste the code and reproduce the problem. However, it shouldn’t contain anything unnecessary, such as gates or functions that can be removed to simplify the code.

  3. The full error traceback (or the result you get in case there’s no error).

If you’re not sure what these mean then make sure to check out this video.

I hope this will help us find the solution!

For anyone looking into using QNGO, PennyLane now supports multi-argument functions with QNGOptimizer!

If multiple arguments are trainable, the metric tensor and gradients are processed individually per parameter.

See PR #5926 for more details.
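A minimal sketch of what this enables (a hypothetical circuit, assuming a PennyLane version that includes this PR):

import pennylane as qml
from pennylane import numpy as pnp

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(a, b):
    # Two separate trainable arguments
    qml.RX(a, wires=0)
    qml.RY(b, wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.Z(1))

a = pnp.array(0.3, requires_grad=True)
b = pnp.array(0.7, requires_grad=True)

# The objective is a single QNode, so the metric tensor is computed
# automatically, handled per trainable argument
opt = qml.QNGOptimizer(stepsize=0.1)
for it in range(5):
    (a, b), cost = opt.step_and_cost(circuit, a, b)
    print(f"Step {it}: cost = {cost}")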