QML Algorithm doesn't learn

Hello,

I’m using a quantum machine learning program that uses this game:
https://gym.openai.com/envs/CartPole-v1/
to try to keep the brown stick up as long as possible. But the Pennylane qml algorithm doesn’t learn:

[Figure: episode durations for the QML agent]
The blue line is the actual duration of each attempt/episode and the orange line is the average of the last 100 episodes. At the beginning, the high duration could be due to random choices and doesn’t necessarily mean the algorithm is getting worse. To give a sense of how it should look, here is a result of the program with only classical machine learning:

[Figure: dqn50000, episode durations for the classical DQN]

The Ansatz that was used is below:

import numpy as np
import torch
import torch.nn as nn
from torch.nn.functional import relu
import pennylane as qml

out_dim = 2  # output dimension of model
wires = 1  # this is the width of the quantum element
n_quantum_layers = 2  # this is the depth of the quantum element


def layer(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
    qml.templates.SqueezingEmbedding(inputs, wires=range(wires))
    qml.templates.CVNeuralNetLayers(w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10,
                                    wires=range(wires))
    return [qml.expval(qml.X(wires=i)) for i in range(wires)]


class DQN(nn.Module):

    def __init__(self, img_height, img_width):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(in_features=img_height * img_width * 3, out_features=12)
        self.fc2 = nn.Linear(in_features=12, out_features=8)
       # self.fc3 = nn.Linear(in_features=10, out_features=8)
        self.clayer_in = torch.nn.Linear(in_features=8, out_features=wires)
        self.clayer_out = torch.nn.Linear(wires, out_dim)

        dev = qml.device('strawberryfields.fock', wires=wires, cutoff_dim=3)
        self.layer_qnode = qml.QNode(layer, dev)

        weights = qml.init.cvqnn_layers_all(n_quantum_layers, wires)
        weight_shapes = {"w{}".format(i): w.shape for i, w in enumerate(weights)}
        
        self.qlayer = qml.qnn.TorchLayer(self.layer_qnode, weight_shapes)

    def forward(self, t):
        t = self.flatten(t)
        t = self.fc1(t)
        t = self.fc2(t)
       # t = self.fc3(t)
        t = self.clayer_in(t)
        t = self.qlayer(t)
        t = self.clayer_out(t)
        t = t.sigmoid()
        return t

Does anyone have an idea why the algorithm is not learning?

Hi @Shawn,

One thing that always comes to mind when people are working with CV layers is that if the cutoff is too small, it can be very easy to obtain inaccurate or confusing answers. The reason is that certain gates (squeezing, displacement, and cubic phase) add energy to the system. The more of these gates you have, the more likely they are to raise the energy of the CV state, which then requires a higher cutoff to capture accurately.

The cutoff in your code (3) is very likely too small and subject to the problem I described above. I would recommend verifying whether the system has a trace equal to (or close to) 1 at the end of your quantum layer. If not, you’ll need to bump up the cutoff dimension (with the tradeoff of increased resource usage that comes with that).
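To make the cutoff argument concrete, here is a minimal sketch in plain Python (no PennyLane needed). It assumes the simplest possible case, a single displacement acting on the vacuum, where the photon-number distribution is Poissonian with mean |alpha|^2. The function name `coherent_trace` is made up for this illustration, and a real circuit with several gates will behave differently in detail.

```python
import math


def coherent_trace(alpha: float, cutoff_dim: int) -> float:
    """Probability mass of a coherent state |alpha> kept below the Fock cutoff.

    Assumption: ideal coherent state, photon numbers 0 .. cutoff_dim - 1 retained.
    """
    mean_photons = abs(alpha) ** 2
    return sum(
        math.exp(-mean_photons) * mean_photons ** n / math.factorial(n)
        for n in range(cutoff_dim)
    )


# The same state looks very different depending on the cutoff.
for cutoff_dim in (3, 10, 30):
    print(cutoff_dim, round(coherent_trace(2.0, cutoff_dim), 4))
```

With a displacement of 2, a cutoff of 3 keeps only a small fraction of the state, while a cutoff of 10 keeps almost all of it. A trace well below 1 is the symptom to look for.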

Hi @nathan many thanks for the insight. Are you referring to self.qlayer() or at the end of the forward() function?

Yes. More specifically, in the device used to compute that layer, as specified here:

My apologies @nathan, but I just tried print(dev) and only get

Strawberry Fields Fock PennyLane plugin
Short name: strawberryfields.fock
Package: pennylane_sf
Plugin version: 0.9.0
Author: Josh Izaac
Wires: 1
Shots: 1000

and just running >>>dev gives

<StrawberryFieldsFock device (wires=1, shots=1000) at 0x7ff63e370bb0>

I ran this code:

import numpy as np
import torch
import torch.nn as nn
from torch.nn.functional import relu
import pennylane as qml

out_dim = 2  # output dimension of model
wires = 1  # this is the width of the quantum element
n_quantum_layers = 2  # this is the depth of the quantum element


def layer(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
    qml.templates.SqueezingEmbedding(inputs, wires=range(wires))
    qml.templates.CVNeuralNetLayers(w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10,wires=range(wires))
    return [qml.expval(qml.X(wires=i)) for i in range(wires)]


flatten = nn.Flatten()
fc1 = nn.Linear(in_features=10 * 10 * 3, out_features=12)
fc2 = nn.Linear(in_features=12, out_features=8)
clayer_in = torch.nn.Linear(in_features=8, out_features=wires)
clayer_out = torch.nn.Linear(wires, out_dim)
dev = qml.device('strawberryfields.fock', wires=wires, cutoff_dim=3)
layer_qnode = qml.QNode(layer, dev)
weights = qml.init.cvqnn_layers_all(n_quantum_layers, wires)
weight_shapes = {"w{}".format(i): w.shape for i, w in enumerate(weights)}
qlayer = qml.qnn.TorchLayer(layer_qnode, weight_shapes)

Hi @Shawn,

As mentioned, you’ll need to change the cutoff dimension in your device.
You can do this by changing the line
dev = qml.device('strawberryfields.fock', wires=wires, cutoff_dim=3)
to
dev = qml.device('strawberryfields.fock', wires=wires, cutoff_dim=N)
where N is a cutoff value higher than 3.

Hi @nathan yea that is obvious. Was just wondering how to get the output of the trace i.e. the matrix so I could see if it is near 1 to find a good number.

Hi Shawn,

The trace of the system, \text{Tr}(\rho), can also be written as \text{Tr}(\rho I)=\langle I\rangle, so we can equivalently think of it as the expectation of the identity operator.

This allows you to construct a QNode that returns the trace like so:

@qml.qnode(dev)
def layer(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
    qml.templates.SqueezingEmbedding(inputs, wires=range(wires))
    qml.templates.CVNeuralNetLayers(w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10,wires=range(wires))
    return qml.expval(qml.Identity(wires=range(wires)))

Alternatively, without modifying your existing QNode, you can inspect the device after QNode evaluation to find the trace:

@qml.qnode(dev)
def layer(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
    qml.templates.SqueezingEmbedding(inputs, wires=range(wires))
    qml.templates.CVNeuralNetLayers(w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10,wires=range(wires))
    return [qml.expval(qml.X(wires=i)) for i in range(wires)]

# evaluate the QNode (here `inputs` stands for a dict of the QNode's arguments)
result = layer(**inputs)

# Check the device trace
dev.state.trace()
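The identity Tr(ρ) = Tr(ρI) = ⟨I⟩ can be checked by hand on a small truncated state. The following is only a sketch in plain Python with list-of-lists matrices; the helper names are invented for this example, and the state is a coherent state with real amplitude, so no complex conjugation is needed.

```python
import math


def truncated_coherent_state(alpha, cutoff_dim):
    """Fock amplitudes of |alpha> (real alpha), cut off and NOT renormalized."""
    return [
        math.exp(-alpha ** 2 / 2) * alpha ** n / math.sqrt(math.factorial(n))
        for n in range(cutoff_dim)
    ]


def density_matrix(amplitudes):
    """rho = |psi><psi| for real amplitudes."""
    return [[a * b for b in amplitudes] for a in amplitudes]


def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))


def expectation(rho, operator):
    """Tr(rho @ operator) for square list-of-lists matrices."""
    dim = len(rho)
    return sum(rho[i][k] * operator[k][i] for i in range(dim) for k in range(dim))


cutoff_dim = 5
rho = density_matrix(truncated_coherent_state(1.0, cutoff_dim))
identity = [[1.0 if i == j else 0.0 for j in range(cutoff_dim)]
            for i in range(cutoff_dim)]

# Both numbers agree, and both fall slightly short of 1 because of the cutoff.
print(trace(rho), expectation(rho, identity))
```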

Hi @josh thanks for the insight. @nathan I have cutoff_dim set to 10 and played with other variables (also removed some linear layers) and still see no learning from the algorithm. Here is the DQN algorithm (similar to the one in my initial post but some things changed):

out_dim = 4  # output dimension of model
wires = 1  # this is the width of the quantum element
n_quantum_layers = 2  # this is the depth of the quantum element


def layer(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
    qml.templates.SqueezingEmbedding(inputs, wires=range(wires))
    qml.templates.CVNeuralNetLayers(w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10,
                                    wires=range(wires))
    return [qml.expval(qml.X(wires=i)) for i in range(wires)]


class DQN(nn.Module):

    def __init__(self, img_height, img_width):
        super().__init__()
        self.flatten = nn.Flatten()
        self.clayer_in = torch.nn.Linear(in_features=img_height * img_width * 3, out_features=wires)
        self.clayer_out = torch.nn.Linear(wires, out_dim)

        dev = qml.device('strawberryfields.fock', wires=wires, cutoff_dim=10)
        self.layer_qnode = qml.QNode(layer, dev)

        weights = qml.init.cvqnn_layers_all(n_quantum_layers, wires)
        weight_shapes = {"w{}".format(i): w.shape for i, w in enumerate(weights)}

        self.qlayer = qml.qnn.TorchLayer(self.layer_qnode, weight_shapes)

    def forward(self, t):
        t = self.flatten(t)
        t = self.clayer_in(t)
        t = self.qlayer(t)
        t = self.clayer_out(t)
        t = t.sigmoid()
        return t

Any other ideas on how I can improve this?

Hi Shawn,
Were you able to verify that the trace remained close to one (using the methods @josh mentioned) in your updated model?

No, I kept getting errors so I just opted to try different cutoff dimensions. I tried up to 30 and didn’t see a difference. The problem with @josh’s code is that I don’t have a decorator on my layer function. The first error I get is:

Traceback (most recent call last):
  File "test_qdqn.py", line 19, in <module>
    result = layer(**inputs)
NameError: name 'inputs' is not defined

@Josh do you have any recommendations on how to solve this?

Hey @Shawn, if I may come in here briefly to clarify your last questions:

  1. The decorator Josh used is just a shorthand for what you are doing. Your code
def layer(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
    ...
    return ...

dev = qml.device('strawberryfields.fock', wires=wires, cutoff_dim=3)
layer_qnode = qml.QNode(layer, dev)

would be the same as writing

dev = qml.device('strawberryfields.fock', wires=wires, cutoff_dim=3)

@qml.qnode(dev)
def layer_qnode(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
    ...
    return ...

since the decorator turns the quantum function directly into a qnode. In other words, both code pieces produce the same object layer_qnode.

  2. To check the trace of your qnode you have to evaluate it with specific inputs. I think Josh just used the name inputs for a generic collection of arguments to the qnode, which may be confusing because it is not the same object as the inputs argument in def layer(inputs, ...).

So in your case, you need to define specific values for inputs, w0, w1, ... at which you want to check the trace, and feed them to the qnode:

print(layer_qnode(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10))
# Hopefully the result is 1

The double star notation is just a handy way to unpack dictionaries, while a single star would unpack a list.
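Both points can be illustrated without any quantum library. In the sketch below, `Node` and `node` are hypothetical stand-ins for a QNode class and its decorator (they are not PennyLane objects), and the toy `layer` function just does arithmetic. It shows that wrapping explicitly and decorating give equivalent callables, and how `*` and `**` unpack a list and a dictionary into arguments.

```python
class Node:
    """Hypothetical stand-in for a QNode: binds a function to a device."""

    def __init__(self, func, device):
        self.func = func
        self.device = device

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


def node(device):
    """Decorator form: @node(device) is equivalent to Node(func, device)."""
    def wrap(func):
        return Node(func, device)
    return wrap


def layer(inputs, w0, w1):
    return inputs * w0 + w1


# Explicit wrapping ...
explicit = Node(layer, "toy-device")


# ... and the decorator shorthand for the same thing.
@node("toy-device")
def decorated(inputs, w0, w1):
    return inputs * w0 + w1


positional = [0.5, 2.0, 1.0]                     # unpacked with a single star
keyword = {"inputs": 0.5, "w0": 2.0, "w1": 1.0}  # unpacked with a double star

print(explicit(*positional))
print(decorated(**keyword))
```

The `NameError` earlier in the thread comes from the second pattern: the dictionary has to be defined with concrete values before it can be unpacked.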

Hi @Maria_Schuld thanks for clearing up the confusion there. I have a couple follow-up questions if that is okay:

  1. I read up on https://pennylane.readthedocs.io/en/ising/_modules/pennylane/qnode.html but how are the variables from layer() put into qml.QNode(layer, dev)? The line res = self.func(*variables, **kwarg_variables) surely helps with this, but the “inputs” and w_i’s from the layer() function have to originate somewhere – I’m having trouble finding out how/where to find these values.

  2. I’m confused about the dimension of the variables w_i that is explained here: https://pennylane.readthedocs.io/en/stable/code/api/pennylane.templates.layers.CVNeuralNetLayers.html :
    “The layers act on the M modes given in wires”. Since my example uses wires=1, does that mean M is equal to 1? That wouldn’t make sense because K would then be zero.

  3. Doesn’t it matter what values I put into the layer() function? I would assume that would alter the trace, right? So putting in arbitrary values would give me wrong insight to the trace.

I’m hoping to see real values for the “inputs” and “w_i’s” from the layer() function so I can then run the function with those values and see what the value of the trace is. Seems more complicated than I thought! :smiley:

Of course.

  1. Maybe the “Creating a quantum node” section of the intro in the documentation can clarify this? From a user perspective, you first create a qnode which you assign to a variable, and then you use that variable as if it was your function.
layer_qnode = qml.QNode(layer, dev)

layer_qnode(<...parameters that you want to feed into layer...>)
  2. Would you mind reminding me, what is K? Yes, if you have a single wire then M should be 1.

  3. The parameters may very well alter the trace. Essentially, if your cutoff_dim is too small, the quantum simulation is not exact/correct. The level of correctness could change with the inputs. Especially for Displacement and Squeezing gates, the higher your parameters, the higher the energies in your circuit, and the more dimensions you need to simulate it correctly. So the rule of thumb is: small cutoff_dim → keep parameters that influence energy small.
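As a rough illustration of that rule of thumb, the sketch below estimates how large the cutoff has to be for a single squeezing gate on the vacuum. It assumes the textbook photon-number distribution of a squeezed vacuum state (only even photon numbers, mean photon number sinh²(r)). The function names and the 0.99 target are choices made for this example, and a full circuit with displacement and Kerr gates will generally need more.

```python
import math


def squeezed_vacuum_trace(r, cutoff_dim):
    """Probability mass of a squeezed vacuum (squeezing r) below the Fock cutoff."""
    total = 0.0
    # Only even photon numbers 0, 2, 4, ... below cutoff_dim are populated.
    for n in range((cutoff_dim + 1) // 2):
        pairs = math.factorial(2 * n) / (4 ** n * math.factorial(n) ** 2)
        total += pairs * math.tanh(r) ** (2 * n) / math.cosh(r)
    return total


def smallest_cutoff(r, target=0.99, max_cutoff=200):
    """Smallest cutoff_dim whose retained probability reaches the target."""
    for cutoff_dim in range(1, max_cutoff + 1):
        if squeezed_vacuum_trace(r, cutoff_dim) >= target:
            return cutoff_dim
    return None  # target not reached within max_cutoff


# Larger squeezing parameter -> more energy -> larger cutoff required.
for r in (0.0, 0.5, 1.0, 1.5):
    print(r, smallest_cutoff(r))
```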

And yes, CV quantum computing is a bit more advanced than qubit-based :slight_smile:

Hi @Maria_Schuld thanks for the info!

  1. Unfortunately that link didn’t help. I’m just trying to find out what is actually going into the layer() function (i.e. the arrays for inputs and the w0-w10 parameters). Once I have some reasonable values, I can then see if the trace is near one or not.
  2. From the link I provided: https://pennylane.readthedocs.io/en/stable/code/api/pennylane.templates.layers.CVNeuralNetLayers.html, K is the number of beamsplitters. So since K = 0, what is the dimension of some of the parameters shown in the link above? Some of them have the dimension (L,K).
  3. What do you mean by “the higher your parameters”? Meaning the values of the parameters are bigger?

Thanks again!

Hi @Shawn!

The layer function that you’ve defined will take in the parameters inputs and w0, w1, etc. and so will the layer_qnode = qml.QNode(layer, dev). What you decide to input into the QNode is up to you and your code, with some restrictions (see SqueezingEmbedding and CVNeuralNetLayers for specifics).

To get the trace of a state it’s probably easiest to evaluate the QNode, i.e. input some parameters into the QNode as layer_qnode("params-of-your-choice"), and then check the trace of the state with dev.state.trace() as per the suggestion above.

Regarding K=0, it’s true that for a single wire the parameters with shape (L, K) will be empty, since the beamsplitters can be seen as rotations between two wires and thus won’t be applied at all on a single wire.
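To see which weights end up empty, here is a small sketch that tabulates the expected shapes. It assumes K = M(M-1)/2 beamsplitters per interferometer and the argument order given in the CVNeuralNetLayers documentation (two interferometers, each with two beamsplitter angle sets and one rotation set, plus squeezing, displacement and Kerr parameters). The helper `cvqnn_weight_shapes` is invented for this example and is not part of PennyLane.

```python
def cvqnn_weight_shapes(n_layers, n_wires):
    """Expected shapes of w0..w10 for L layers acting on M wires (assumed order)."""
    n_beamsplitters = n_wires * (n_wires - 1) // 2  # K = M(M-1)/2
    per_beamsplitter = (n_layers, n_beamsplitters)  # shape (L, K)
    per_wire = (n_layers, n_wires)                  # shape (L, M)
    shapes = [
        per_beamsplitter, per_beamsplitter, per_wire,  # first interferometer
        per_wire, per_wire,                            # squeezing amplitude and phase
        per_beamsplitter, per_beamsplitter, per_wire,  # second interferometer
        per_wire, per_wire,                            # displacement amplitude and phase
        per_wire,                                      # Kerr parameter
    ]
    return {"w{}".format(i): shape for i, shape in enumerate(shapes)}


# Single wire, two layers: every (L, K) entry has a zero-length second axis.
for name, shape in cvqnn_weight_shapes(n_layers=2, n_wires=1).items():
    print(name, shape)
```

For a single wire the four beamsplitter entries come out with shape (2, 0), which corresponds to passing [[], []] for those arguments.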

I hope this addresses your questions. :sun_with_face:

Hi @theodor thanks for the reply! Yes, I understood all of that but my code doesn’t provide the values for the layer function – that is where I am confused. My code provides states, actions and q-values but the w0, w1, etc. comes from (or should I say is due to) the quantum layer. As Maria stated, the values of the parameters going into the layer() may alter the trace – so without knowing what usual values of the parameters are (are they negative? are they bounded within a set of real numbers? etc.), I am kind of swinging in the dark. Does that make sense?

But, of course I did try and I am confused on how the weights should look. Perhaps someone could give me some insight?

I ran:

import pennylane as qml
import tensorflow as tf


out_dim = 8 
wires = 1 
n_quantum_layers = 2 


dev = qml.device("strawberryfields.fock", wires=wires, cutoff_dim=10)

@qml.qnode(dev)
def layer(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
    qml.templates.DisplacementEmbedding(inputs, wires=range(wires))
    qml.templates.CVNeuralNetLayers(w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10, wires=range(wires))
    return qml.Identity(wires=range(wires))
    
print(layer([1],[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11]))

And I am getting the error:

ValueError: wrong shape of weight input(s) detected

I’d appreciate any guidance on the shape of the parameters.

Hi @Shawn,

The parameters are mostly between 0 and 2\pi for the different angles for the beamsplitters and rotations in the interferometer part of the network (the details can be found in the documentation). There are no specific values that you should use here. What you could instead do is train the network, optimizing over these parameters, and print the trace in-between each step to see if the trace keeps close to 1 (if it does not, then you should use a higher cutoff).

The shape of your inputs should follow the ones in the CVNeuralNetLayers; see the list of shapes under Parameters at the bottom of the documentation page (e.g. they should be 2-dimensional arrays of floats with shape (L, K) or (L, M), with some being, as noted earlier, empty: [[],[]]).

Just as an example (this should work):

import pennylane as qml
import tensorflow as tf

wires = 1 

dev = qml.device("strawberryfields.fock", wires=wires, cutoff_dim=10)

@qml.qnode(dev)
def layer(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
    qml.templates.SqueezingEmbedding(inputs, wires=range(wires))
    qml.templates.CVNeuralNetLayers(w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10,wires=range(wires))
    return [qml.expval(qml.X(wires=i)) for i in range(wires)]

and then printing the trace after evaluating the above QNode:

inputs = [0.5]
a = [[], []]  # shape: (L, K) = (2, 0)
b = [[0.5], [0.5]]  # shape: (L, M) = (2, 1)

results = layer(inputs, a, a, b, b, b, a, a, b, b, b, b)

dev.state.trace()

Thank you @theodor and everyone! :grinning:


Hi everyone, I’d like to restart this topic. Over the last week I’ve been programming and running several games with classical and quantum algorithms and would like to discuss some of my findings.

I made a new, very simplified reinforcement learning deep Q-network (DQN) algorithm using the game Frozenlake-v0 and have been running both classical ML and QML programs over the last 5 days. The cartpole game that I used at the beginning of this post ended up being too time-consuming in terms of finding an optimal policy when using QML, so I switched games. Frozenlake is as simple as it gets.

The neural network for the DQN (classical ML) is as follows:

n_actions = env.action_space.n
input_dim = env.observation_space.n
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(32, input_dim = input_dim , activation = 'relu'))
model.add(tf.keras.layers.Dense(16, activation = 'relu'))
model.add(tf.keras.layers.Dense(n_actions, activation = 'linear'))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 0.00012), loss = 'mse')

And after testing an assortment of different hyperparameters here are the best results I got with a learning rate of 0.00012:

[Figure: fig_basic_froz_dqn_batch60_mem50000_lr00012]

What you see here: the total reward is 0 if the agent does not get to the goal and 1 if it does (click on the Frozenlake hyperlink above to get a visualization of the game: H = hole, F = Frozen, G = Goal and S = Start). After a lot of exploration, the agent learns heavily around the 700 mark, and after 1000 episodes it is almost certain to reach the goal every time.

For the QML version I used:

out_dim = 4  
wires = 1
n_quantum_layers = 2

dev = qml.device("strawberryfields.fock", wires=wires, cutoff_dim=30)

@qml.qnode(dev)
def layer(inputs, w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10):
   qml.templates.DisplacementEmbedding(inputs, wires=range(wires))
   qml.templates.CVNeuralNetLayers(w0, w1, w2, w3, w4, w5, w6, w7, w8, w9, w10, wires=range(wires))
   return [qml.expval(qml.X(wires=i)) for i in range(wires)]


weights = qml.init.cvqnn_layers_all(n_quantum_layers, wires)#, seed=0)
weight_shapes = {"w{}".format(i): w.shape for i, w in enumerate(weights)}
qlayer = qml.qnn.KerasLayer(layer, weight_shapes, output_dim=wires)
clayer_in = tf.keras.layers.Dense(wires)  # we will sandwich the quantum circuit between two classical layers
clayer_out = tf.keras.layers.Dense(out_dim)
model = tf.keras.models.Sequential([clayer_in, qlayer, clayer_out])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 0.00012), loss = 'mse')

I used cutoff_dim = 30 as it gave roughly .99 from the trace. Two of the best results were:

with learning rate 0.00012:

[Figure: forz_qdqn_lr00012_1000]

and with learning rate 0.0012:

[Figure: forz_qdqn_lr0012_1000]

So it’s a bummer it is not learning and I would like to get some input as to why this could be or what else I can try out that could potentially improve the results.

Some more remarks and questions:

  1. The QML program takes much longer than the classical. Classical takes about 10-20 minutes to finish and the QML 1-2 days. What could be the reason(s) behind this? Is there a way to see where the bottlenecks are or what is actually taking so long on the pennylane side?

  2. Also, this is of course reinforcement learning, something that is a bit different from un/supervised learning. Could it be that Pennylane isn’t prepared to do RL, i.e. sequential decision making, just yet? (I believe that if Pennylane has a neural network, it should work regardless of what area of ML it is being used in.)

  3. Could it be a bad Ansatz? In this paper they provide an “equivalent” (CVNN) to the classical neural network and also explain that we just don’t know which Ansätze will be good (just like in classical machine learning). Could it be that we need to just run through the guessing game and try things out to see what works?

The findings in the HuHu papers:

use various algorithms from RL and he was able to get his programs to work quite quickly and successfully in terms of episodes.

I contacted him about his findings but he hasn’t been responsive unfortunately.

Any insight is greatly appreciated.