Quantum Machine Learning in Feature Hilbert Spaces

I am trying to implement the variational quantum classifier (the explicit approach, Figs. 4 and 5) described in the paper using PennyLane. The circuit architecture is provided in the supplementary information.

I have coded everything and it looks correct to me. However, it doesn’t work: I get the UserWarning: Output seems independent of input.

Could there be an error in how I encode the inputs and circuit parameters?
@Maria_Schuld

Code snippet:

import pennylane as qml
from pennylane import numpy as np
from pennylane import expval
from pennylane import AdamOptimizer
import sklearn.datasets

dev_fock = qml.device("strawberryfields.fock", wires=2, cutoff_dim=3)

def layer(W):
    qml.Beamsplitter(W[0,0], W[1,0], wires=[0,1])
    qml.Displacement(1.0,W[0,1],wires=0)
    qml.Displacement(1.0,W[1,1],wires=1)
    qml.QuadraticPhase(W[0,2],wires=0)
    qml.QuadraticPhase(W[1,2],wires=1)
    qml.CubicPhase(W[0,3],wires=0)
    qml.CubicPhase(W[1,3],wires=1)

@qml.qnode(dev_fock)
def circuit_p0(weights, x):
    qml.templates.embeddings.SqueezingEmbedding(x,wires=[0,1],method='phase',c=1.5)
    for W in weights:
        layer(W)
    return expval(qml.FockStateProjector(np.array([2,0]), wires=[0,1]))

@qml.qnode(dev_fock)
def circuit_p1(weights, x):
    qml.templates.embeddings.SqueezingEmbedding(x,wires=[0,1],method='phase',c=1.5)
    for W in weights:
        layer(W)
    return expval(qml.FockStateProjector(np.array([0,2]), wires=[0,1]))

def variational_classifier(var, x):
    weights=var[0]
    bias=var[1]
    p0= circuit_p0(weights,x)+bias
    p1= circuit_p1(weights,x)+bias
    prob_y0= p0/(p0+p1)
    prob_y1= p1/(p0+p1)
    if prob_y0> prob_y1:
        ans=-1
    else:
        ans=1
    return ans

def square_loss(labels, predictions):
    loss = 0
    for l, p in zip(labels, predictions):
        loss = loss + (l - p) ** 2
    loss = loss / len(labels)
    return loss

def accuracy(labels, predictions):
    acc = 0
    for l, p in zip(labels, predictions):
        if abs(l - p) < 1e-5:
            acc = acc + 1
    acc = acc / len(labels)
    return acc

def cost(var, X, Y):
    predictions = [variational_classifier(var, x=x) for x in X]
    return square_loss(Y, predictions)

X, Y= sklearn.datasets.make_moons(n_samples=200, shuffle=True, noise=0.1, random_state=None)
Y = Y * 2 - np.ones(len(Y))  # shift label from {0, 1} to {-1, 1}

np.random.seed(0)
num_qubits = 2
num_layers = 4
var_init = (np.random.randn(num_layers, num_qubits,4),0.0)

batch_size = 5
opt = AdamOptimizer(0.01, beta1=0.9, beta2=0.99)
var = var_init
sqloss=np.zeros(50)
for it in range(50):
    batch_index = np.random.randint(0, len(X), (batch_size,))
    X_batch = X[batch_index]
    Y_batch = Y[batch_index]
    var = opt.step(lambda v: cost(v, X_batch, Y_batch), var)
    predictions = [variational_classifier(var, x) for x in X]
    acc = accuracy(Y, predictions)
    sqloss[it]= square_loss(Y,predictions)
    print("Iter: {:5d} | Cost: {:0.7f} | Accuracy: {:0.7f} ".format(it + 1, cost(var, X, Y), acc))

Dear Vineesha,

Welcome to the forum and thanks for your question!

When I run your code I get a different error from the one you describe.

Could you please shorten your code to a minimal working example that contains only the lines necessary to reproduce the error? That would make it a lot easier for me to try to help :slight_smile:

1 Like

Dear Maria Schuld,

Thank you for your reply.
I was able to fix the errors.
However, the training is not very good. I get the results shown below for 50 training examples generated using X, Y = sklearn.datasets.make_circles(n_samples=50, shuffle=True, noise=0.1, random_state=None).

I use the Adam optimizer with a learning rate of 0.005, a square-loss cost function, and a batch size of 5.

How can I improve my training performance?

That is the central research question when doing machine learning :slight_smile:

You will have to experiment with different optimization settings and models, and if none of them works you might want to think about theoretical limitations that prevent your model from training. It’s hard to give a general answer here…

1 Like

Hi Vineesha,

I am trying to implement the classifier described in the cited article and I’m getting the same warning you mentioned: UserWarning: Output seems independent of input.
My code is actually similar to yours… I can’t really understand what I’m doing wrong and how this warning affects my results.
How did you manage to solve this issue?

Thanks in advance!

Hi @valeria. Thanks for the question. Could you please run qml.about() and tell me the output and also give me the full list of your imports?

Hi @sjahangiri,

Thanks for your answer! After running qml.about(), I get the following output:
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip. Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue. To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.

Name: PennyLane
Version: 0.8.1
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/XanaduAI/pennylane
Author: None
Author-email: None
License: Apache License 2.0
Location: c:\users\valeria\anaconda3\lib\site-packages
Requires: scipy, networkx, appdirs, autograd, numpy, toml, semantic-version
Required-by: PennyLane-SF, PennyLane-qiskit, PennyLane-Cirq
Platform info:           Windows-8.1-6.3.9600-SP0
Python version:          3.7.3
Numpy version:           1.18.2
Scipy version:           1.2.1
Installed devices:
- default.gaussian (PennyLane-0.8.1)
- default.qubit (PennyLane-0.8.1)
- default.tensor (PennyLane-0.8.1)
- default.tensor.tf (PennyLane-0.8.1)
- strawberryfields.fock (PennyLane-SF-0.8.0)
- strawberryfields.gaussian (PennyLane-SF-0.8.0)
- qiskit.aer (PennyLane-qiskit-0.8.2)
- qiskit.basicaer (PennyLane-qiskit-0.8.2)
- qiskit.ibmq (PennyLane-qiskit-0.8.2)
- cirq.simulator (PennyLane-Cirq-0.8.0)

Moreover, these are the packages I imported in my notebook:
from pennylane import numpy as np
import pennylane as qml
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
%matplotlib inline

Thanks in advance for the help :slight_smile:

Thanks @valeria. I get a different error when I run the code copied above. Could you please copy a minimum working example of your code that gives that warning?

Hi @sjahangiri. The following is the minimal code needed to reproduce the warning I mentioned.

from pennylane import numpy as np
import pennylane as qml
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
%matplotlib inline

np.random.seed(0)
eta = 0.01
batch_size = 5
n_steps = 500
n_layers = 4
Theta = (0.05*np.random.rand(n_layers,2,4),0.01)
fraction_train = 0.75
dev = qml.device("strawberryfields.fock", wires=2, cutoff_dim=3, analytic=False)

def layer(l,params):
    qml.Beamsplitter(params[l,0,0],params[l,1,0],wires=[0,1])
    qml.Displacement(params[l,0,1],0.0,wires=0)
    qml.Displacement(params[l,1,1],0.0,wires=1)
    qml.QuadraticPhase(params[l,0,2],wires=0)
    qml.QuadraticPhase(params[l,1,2],wires=1)
    qml.CubicPhase(params[l,0,3],wires=0)
    qml.CubicPhase(params[l,1,3],wires=1)

@qml.qnode(dev)
def circuit_o0(params, x):
    qml.templates.embeddings.SqueezingEmbedding(x,wires=[0,1],method='phase',c=1.5)
    for l in range(n_layers):
        layer(l,params)
    return qml.expval(qml.FockStateProjector(np.array([2,0]), wires=[0,1]))

@qml.qnode(dev)
def circuit_o1(params, x):
    qml.templates.embeddings.SqueezingEmbedding(x,wires=[0,1],method='phase',c=1.5)
    for l in range(n_layers):
        layer(l,params)
    return qml.expval(qml.FockStateProjector(np.array([0,2]), wires=[0,1]))

def variational_classifier(Theta, x):
    params = Theta[0]
    bias = Theta[1]
    o0 = circuit_o0(params,x) + bias
    o1 = circuit_o1(params,x) + bias
    p0 = o0/(o0+o1)
    p1 = o1/(o0+o1)
    if p0 > p1:
        label = 0
    else:
        label = 1
    return label

def square_loss(labels, predictions):
    loss = 0
    for l, p in zip(labels, predictions):
        loss = loss + (l - p) ** 2
    loss = loss / len(labels)
    return loss

def accuracy(labels, predictions):
    acc = 0
    for l, p in zip(labels, predictions):
        if abs(l - p) < 1e-5:
            acc = acc + 1
    acc = acc / len(labels)
    return acc

def cost(Theta, X, Y):
    predictions = np.array([variational_classifier(Theta, x=x) for x in X])
    return square_loss(Y, predictions)

X, Y = make_moons(n_samples=200, shuffle=True, noise=0.1, random_state=None)
num_data = len(Y)
num_train = int(fraction_train * num_data)
index = np.random.permutation(range(num_data))
feats_train = X[index[:num_train]]
Y_train = Y[index[:num_train]]
feats_val = X[index[num_train:]]
Y_val = Y[index[num_train:]]
X_train = X[index[:num_train]]
X_val = X[index[num_train:]]
opt = qml.AdamOptimizer()

var = Theta
a = []
c = []
l = []
cnt = 0

for it in range(n_steps):
    batch_index = np.random.randint(0, num_train, (batch_size,))
    feats_train_batch = feats_train[batch_index]
    Y_train_batch = Y_train[batch_index]
    var = opt.step(lambda v: cost(v, feats_train_batch, Y_train_batch), var)
    predictions_train = np.array([variational_classifier(var, x) for x in feats_train])
    predictions_val = np.array([variational_classifier(var, x) for x in feats_val])
    acc_train = accuracy(Y_train, predictions_train)
    acc_val = accuracy(Y_val, predictions_val)
    c.append(cost(var, X, Y))
    a.append(acc_val)
    l.append(square_loss(Y_val, predictions_val))
    if it>0 and abs(a[it] - a[it-1])<1e-3:
        cnt = cnt+1
    else:
        cnt = 0
    if cnt >= 20 and it>=100:
        break
    print(
        "Iter: {:5d} | Loss: {:0.7f} | Cost: {:0.7f} | Acc train: {:0.7f} | Acc validation: {:0.7f} "
        "".format(it + 1, square_loss(Y_val, predictions_val), cost(var, X, Y), acc_train, acc_val)
    )
1 Like

Hi @valeria,

The automatic differentiation framework (in this case, the autograd.numpy module) needs to see every step in the computation as differentiable. The conditional check and integer assignment in variational_classifier mean that this particular step is non-differentiable. To avoid the warning, you can train on the label probabilities rather than the labels, by modifying variational_classifier as follows:

def variational_classifier(Theta, x):
    params = Theta[0]
    bias = Theta[1]
    o0 = circuit_o0(params,x) + bias
    o1 = circuit_o1(params,x) + bias
    p0 = o0/(o0+o1)
    return p0, 1-p0

In this case, square_loss and cost should be updated as well:

def square_loss(labels, prediction_probs):
    loss = 0
    for l, p in zip(labels, prediction_probs):
        prediction = 0.0 * p[0] + 1.0 * p[1]  # 0*p[0] actually unnecessary, but kept for clarity
        loss = loss + (l - prediction) ** 2
    loss = loss / len(labels)
    return loss

def cost(Theta, X, Y):
    prediction_probs = np.array([variational_classifier(Theta, x=x) for x in X])  # only name changed here, to make more clear
    return square_loss(Y, prediction_probs)

There’s an illustrative related example here of making the training loss a function of the label probabilities (though that example uses logistic loss instead of square loss).
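For reference, a probability-based logistic loss would look something like the sketch below (this is only an illustration with labels in {0, 1}, not the code from the linked example):

def log_loss(labels, prediction_probs):
    # labels are 0/1; p[1] is the predicted probability of label 1
    # (in practice you may want to clip the probabilities away from 0 before taking the log)
    loss = 0
    for l, p in zip(labels, prediction_probs):
        loss = loss - (l * np.log(p[1]) + (1 - l) * np.log(p[0]))
    return loss / len(labels)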

Please let us know if you have any further questions.

2 Likes

Thank you so much @sjahangiri! Your explanation was very clear and the warning disappeared - your help was so precious :smile:
Cheers,
Valeria

1 Like

Hi @valeria, I made a similar correction to my variational_classifier function: I return only the normalized p0 and adjust the square-loss function accordingly:

def variational_classifier(weights, x=None):
    p0 = circuit_p0(weights, x=x)
    p1 = circuit_p1(weights, x=x)
    normalization = p0 + p1 + 1e-10
    output = p0 / normalization
    return output
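The adjusted square loss can then be written along these lines (a sketch, assuming the labels are remapped from {-1, 1} to {0, 1} and the output is treated as the predicted probability of class 0):

def square_loss(labels, outputs):
    # labels in {0, 1}; out is the normalized probability p0 of class 0
    loss = 0
    for l, out in zip(labels, outputs):
        loss = loss + ((1 - l) - out) ** 2
    return loss / len(labels)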
1 Like

Hi @valeria @Vineesha,

I am using the same code but getting a different error:

Cannot differentiate with respect to parameter(s) {44}

I tried @sjahangiri’s recommendation, but it doesn’t work. How did you manage this? My code is below:

import pennylane as qml
from pennylane import expval
from pennylane import numpy as np
from pennylane import AdamOptimizer
import sklearn.datasets
from sklearn.datasets import make_moons
%matplotlib inline    

dev = qml.device("strawberryfields.fock", wires=2, cutoff_dim=3, analytic=False)

np.random.seed(0)
batch_size = 5
n_steps = 500
n_layers = 4
Theta = 0.05*np.random.rand(n_layers,2,4)
fraction_train = 0.75
lambda_reg = 0.1

def layer(params,l): # The variational circuit layer
    qml.Beamsplitter(params[l,0,0],params[l,1,0],wires=[0,1])
    qml.Displacement(params[l,0,1],0.0,wires=0)
    qml.Displacement(params[l,1,1],0.0,wires=1)
    qml.QuadraticPhase(params[l,0,2],wires=0)
    qml.QuadraticPhase(params[l,1,2],wires=1)
    qml.CubicPhase(params[l,0,3],wires=0)
    qml.CubicPhase(params[l,1,3],wires=1)

@qml.qnode(dev)
def circuit_o0(Theta, x): # Circuit to calculate p(2,0)
    qml.templates.embeddings.SqueezingEmbedding(x,wires=[0,1],method='phase',c=1.5)
    for l in range(n_layers):
        layer(Theta,l=l)
    return qml.expval(op=qml.FockStateProjector(np.array([2,0]), wires=[0,1]))

@qml.qnode(dev)
def circuit_o1(params,x): # Circuit to calculate p(0,2)
    qml.templates.embeddings.SqueezingEmbedding(x,wires=[0,1],method='phase',c=1.5)
    for l in range(n_layers):
        layer(params,l=l)
    return qml.expval(op=qml.FockStateProjector(np.array([0,2]), wires=[0,1]))

def variational_classifier(Theta, x): # Determine the probability: p(2,0)/(p(0,2)+p(2,0))
    #params = Theta[0]
    #bias = Theta[1]
    o0 = circuit_o0(Theta,x=x) + 0.01
    o1 = circuit_o1(Theta,x=x) + 0.01
    p0 = o0/(o0+o1)
    return p0, 1-p0

def square_loss(labels,prediction_probs): # Calculate square loss
    loss = 0
    print(prediction_probs)
    for l, p in zip(labels, prediction_probs):
        prediction = 0.0 * p[0] + 1.0 * p[1]
        loss = loss - (l - p) ** 2
    loss = loss / len(labels)
    print(loss)
    return loss

def accuracy(labels, predictions): # Calculate accuracy
    acc = 0
    for l, p in zip(labels, predictions):
        if abs(l - p) < 1e-5:
            acc = acc + 1
    acc = acc / len(labels)
    return acc

def cost(Theta, X=None, Y=None): # Calculate cost using square loss
    prediction_probs = np.array([variational_classifier(Theta, x=x) for x in X])
    return square_loss(Y,prediction_probs)


# Extract dataset
X, Y = make_moons(n_samples=200, shuffle=True, noise=0.1, random_state=None)

num_data = len(Y) # Number of data points

num_train = int(fraction_train * num_data) # Number of training samples

index = np.random.permutation(range(num_data)) # Permutation for shuffling
feats_train = X[index[:num_train]] # Features for training
Y_train = Y[index[:num_train]] # Labels of training features
feats_val = X[index[num_train:]] # Test features
Y_val = Y[index[num_train:]] # Labels of test features

opt = qml.AdamOptimizer() # I will use the Adam optimizer

var = Theta
a = []
c = []
l = []
cnt = 0

for it in range(n_steps):

    # Randomly select the batches
    batch_index = np.random.randint(0, num_train, (batch_size,))
    feats_train_batch = feats_train[batch_index]
    Y_train_batch = Y_train[batch_index]

    # Optimize the theta
    var = opt.step(lambda v: cost(v, X=feats_train_batch, Y=Y_train_batch), var)

    # Predict train and test samples
    predictions_train = np.array([variational_classifier(var, x) for x in feats_train])
    predictions_val = np.array([variational_classifier(var, x) for x in feats_val])

    # Calculate accuracies
    acc_train = accuracy(Y_train, predictions_train)
    acc_val = accuracy(Y_val, predictions_val)

    # Store cost, accuracy and square loss values
    c.append(cost(var, X, Y))
    a.append(acc_val)
    l.append(square_loss(Y_val, predictions_val))

    # Break the loop after reaching convergence
    if it>0 and abs(a[it] - a[it-1])<1e-3:
        cnt = cnt+1
    else:
        cnt = 0
    if cnt >= 20 and it>=100:
        break

    # Print results
    print(
        "Iter: {:5d} | Loss: {:0.7f} | Cost: {:0.7f} | Acc train: {:0.7f} | Acc validation: {:0.7f} "
        "".format(it + 1, square_loss(Y_val, predictions_val), cost(var, X, Y), acc_train, acc_val)
    )

Hi @hilarlia,

Welcome to the community :slight_smile:

Your question was directed at other users, so I’ll leave space for them to provide their thoughts, but I was looking into your code and was unable to reproduce the error you specified (I ended up with a different error).

Both PennyLane and Strawberry Fields are updated regularly with potentially breaking changes, and this thread is several months old. To help with solving any issues you are seeing, could you please post the output of qml.about() here?

Hi @nathan,
I just noticed that the code I posted here is slightly different from what I actually ran. The line in the square_loss function should be:
loss = loss + (l - prediction) ** 2
With that line I get the error mentioned above.
Name: PennyLane
Version: 0.14.1
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/XanaduAI/pennylane
Author: None
Author-email: None
License: Apache License 2.0
Location: c:\users\ufkug\.conda\envs\env\lib\site-packages
Requires: toml, numpy, semantic-version, scipy, networkx, autograd, appdirs
Required-by: PennyLane-SF
Platform info:           Windows-10-10.0.19041-SP0
Python version:          3.8.3
Numpy version:           1.18.5
Scipy version:           1.5.1
Installed devices:
- default.gaussian (PennyLane-0.14.1)
- default.mixed (PennyLane-0.14.1)
- default.qubit (PennyLane-0.14.1)
- default.qubit.autograd (PennyLane-0.14.1)
- default.qubit.jax (PennyLane-0.14.1)
- default.qubit.tf (PennyLane-0.14.1)
- default.tensor (PennyLane-0.14.1)
- default.tensor.tf (PennyLane-0.14.1)
- strawberryfields.fock (PennyLane-SF-0.14.0)
- strawberryfields.gaussian (PennyLane-SF-0.14.0)
- strawberryfields.gbs (PennyLane-SF-0.14.0)
- strawberryfields.remote (PennyLane-SF-0.14.0)
- strawberryfields.tf (PennyLane-SF-0.14.0)

Hi @hilarlia,

Thanks for the clarification. I was able to determine that the cause of the error message was the usage of FockStateProjector in your model. Specifically, this measurement operation takes an array parameter to indicate which Fock states to compute the expectation value of. By default, PennyLane assumes all array parameters should be differentiated, unless directly stated otherwise (this is similar to what TensorFlow does; in contrast PyTorch requires you to explicitly declare every trainable parameter manually).

In this case, you actually don’t want to differentiate those specific arrays, since doing so doesn’t make sense.

You will be able to get your code to run with the following modification:

projector20 = np.array([2, 0], requires_grad=False)
projector02 = np.array([0, 2], requires_grad=False)

along with updating your circuits to instead end with:

return qml.expval(op=qml.FockStateProjector(projector20, wires=[0,1]))

and

return qml.expval(op=qml.FockStateProjector(projector02, wires=[0,1]))

as applicable.
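
Putting those pieces together, the first circuit would then look roughly like this (a sketch based on the snippets above; circuit_o1 is analogous, using projector02):

projector20 = np.array([2, 0], requires_grad=False)
projector02 = np.array([0, 2], requires_grad=False)

@qml.qnode(dev)
def circuit_o0(Theta, x):
    qml.templates.embeddings.SqueezingEmbedding(x, wires=[0, 1], method='phase', c=1.5)
    for l in range(n_layers):
        layer(Theta, l=l)
    # the projector array is passed in as a non-trainable constant
    return qml.expval(op=qml.FockStateProjector(projector20, wires=[0, 1]))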

Note that the reason the previous posters did not have this issue is that they were using an older version of PennyLane at the time.

Hi @Maria_Schuld @Vineesha, I am a beginner in quantum machine learning and was going through the paper on QML in feature Hilbert space. At one point I am a little confused; it would be great if either of you could help clarify.

In the paper, on page 7, the statement reads: “Since this probability depends on the displacement and squeezing intensity, it is better to define two probabilities, say p(n1 = 2, n2 = 0) and p(n1 = 0, n2 = 2), as a one-hot encoded output vector (o0, o1).”
My doubt is why we are considering the value 2 here. Does it represent a probability or a photon count? I am a bit confused; could you kindly give a brief explanation?

Thanks and Regards

Hi @Satanik_Mitra, welcome to the forum!

Since it seems your question is independent of the above discussion, could you please create a new topic specifically for it? This will make it easier for future readers to find it :slight_smile:

1 Like

Thank you @nathan, that was really helpful: I don’t get the error anymore! Sorry for the late response, I forgot to reply at the time.

Hi @Maria_Schuld,

I was trying to reproduce the results in your paper and I have a couple of questions, since my accuracy always oscillates in the interval 0.6–0.8. First, how do you implement L2 regularization in binary classification? Is it adding the squared Frobenius norm of theta, multiplied by a penalty coefficient, to the square loss? If so, could I ask what penalty coefficient you set? The second question is how you implemented the optimizer. Is it TensorFlow’s stochastic gradient descent, and how did you set the adaptive learning rate?
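
In other words, I have something like the sketch below in mind (just to illustrate my question; lambda_reg is a hypothetical penalty coefficient and Theta the array of circuit parameters):

def regularized_cost(Theta, X, Y, lambda_reg=0.1):
    prediction_probs = np.array([variational_classifier(Theta, x=x) for x in X])
    # square loss plus the squared Frobenius norm of the parameters as an L2 penalty
    return square_loss(Y, prediction_probs) + lambda_reg * np.sum(Theta ** 2)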

By the way, sorry to bother you with these questions. If answering them would violate some kind of confidentiality, feel free to ignore my post.