Weights are zero when using a Pipeline for training

Hello everyone, this is my first topic here, so feel free to tell me if this is the right place for it. :sweat_smile:

I’m doing an exploratory study on QML applied to high-energy physics, so I’m building a pipeline that lets me train multiple circuits with different optimizers, data, etc. - I give it some hyperparameters specifying the training, and it should do the rest. :sunglasses:

> The Problem
When I call the step method on my optimizer (or compute the gradients and apply them manually), the gradients of my model weights/parameters are always zero, while the gradient of the model bias is nonzero. I even tried an extreme stepsize of 1e4 on my NesterovMomentum optimizer, but nothing changed.
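One sanity check that is independent of any autodiff machinery is a finite-difference probe of the cost. The helper below is a hypothetical sketch in plain NumPy (not part of my pipeline): if it also returns (near-)zeros for the weight entries of your real cost, the cost genuinely does not depend on the weights and the optimizer is not at fault.

```python
import numpy as np

def finite_diff_grad(cost, params, eps=1e-6):
    """Central-difference gradient of a scalar cost w.r.t. a flat parameter array."""
    params = np.asarray(params, dtype=float)
    grad = np.zeros_like(params)
    for i in range(params.size):
        shift = np.zeros_like(params)
        shift[i] = eps
        # probe the cost slightly above and below the current value
        grad[i] = (cost(params + shift) - cost(params - shift)) / (2 * eps)
    return grad

# toy stand-in cost; substitute your own cost(weights) here
cost = lambda w: float(np.sum(w ** 2))
print(finite_diff_grad(cost, [1.0, -2.0]))  # ≈ [ 2., -4.]
```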

> What I Think Is Happening
If you look at my code, you can see that the circuit is initialized by

self.circuit_ = qml.QNode(globals()[self.circuit_name], self.dev)

Essentially, the specified circuit (given by the user as a string) is looked up by name and loaded from another file called “circuits”, where the circuit itself is defined.
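As a side note, a slightly safer variant of this lookup is to fetch the function from the imported module with getattr rather than from globals(). This is only a sketch - the real circuits file isn’t shown here, so a stand-in namespace mimics it:

```python
import types

def circuit1(weights, x):  # stand-in for a real QNode-compatible function
    return sum(w * xi for w, xi in zip(weights, x))

# mimics `import circuits`; in the real code this would be the circuits module
circuits = types.SimpleNamespace(circuit1=circuit1)

circuit_name = "circuit1"
circuit_fn = getattr(circuits, circuit_name)  # instead of globals()[circuit_name]
# self.circuit_ = qml.QNode(circuit_fn, self.dev) would follow here
print(circuit_fn([1.0, 2.0], [3.0, 4.0]))  # 1*3 + 2*4 = 11.0
```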

This self.circuit_ is used in my circuit method, where I just pass the current weights and input vector to my parametrized circuit.

def circuit(self, weights, x):
    return self.circuit_(weights, self.embedding, self.n_layers, self.n_qubits, x)

This circuit method is in turn called by my classifier method, where I add the bias to each prediction; this is the method my cost function calls when computing the gradients.

def classifier(self, weights, bias, x):
    return self.circuit(weights, x) + bias

My theory is that PennyLane can only see the gradients of the variables defined in the current class and can’t go further - for example, it can’t peek inside the circuit that is defined in another file. This would explain why only the bias has a nonzero gradient.

Honestly, I have no idea for sure what is happening. If somebody has any idea of the problem, I would be forever thankful!

Have a nice day and thanks in advance! :slight_smile:

> Additional Information

A somewhat simplified version of my code here

As well as the circuits file here.

A picture of the gradient values:
The 0 index represents the weights gradient and index 1, the bias gradient.

Hi @Miguel_Cacador_Peixo,

Welcome to the forum, thanks for posting! :slightly_smiling_face:

Hmm, I’m not exactly sure what the issue could be here after a skim through. The definitions of circuit1 and circuit2 can certainly live in another file - they are just Python functions, so as long as they are imported correctly, keeping them elsewhere shouldn’t cause a problem.

It would be useful to simplify the full-fledged example a bit further: drop the use of globals and the separate file, and make it a stand-alone script (some parts seem to be missing from the first snippet - e.g., it raises ModuleNotFoundError: No module named 'helper'). Apart from that, one thing that stands out is that self.weights is not being updated on line 113, although the comment below claims that an alternative way, through which self.weights is updated, was also tried.

Could you post an executable version of the code that still behaves strangely? :slightly_smiling_face: That should help us get closer to what could be going wrong there.

1 Like

Hello @antalszava!

First of all, thank you so much for your kind reply.

I’ve managed to simplify this version further, put everything into one file, and even adapted it to use publicly available data for training - the code will automatically download “data_banknote_authentication.txt” for you (check the BankData class for more info).

The code is available here.

This is the output I get - the same problem as before: the weights are not updated:

Also, I’ve noticed a new warning:

“Output seems independent of the input.”

I don’t know why it thinks that :sweat_smile:

Hopefully, everything is clear now - Thanks in advance! :smile:

Hi @Miguel_Cacador_Peixo, I managed to reproduce your problem.

I tested changing the initial weights and bias and, in fact, the cost never changes, so I suggest you review your cost function to make sure that a change in the weights really does produce a change in the cost.

My suggestion would be to simplify your circuit and cost function as much as possible and slowly add the complexity you need until you find exactly what is causing the problem.

1 Like

Thank you so much for your kind answer @CatalinaAlbornoz

I’m using the basic mean squared error loss function, so I don’t think I can simplify my loss any further :sweat_smile:

About the circuit: I’m using a circuit identical to BasicEntanglerLayers (circuit1), and then returning:


This will then return a value ∈ [-1, 1], correct? (My labels are -1 or 1.)
Can I simplify the circuit any further?

Hi @Miguel_Cacador_Peixo,

I dug deeper into this, and I think the problem is that you’re updating global variables within your cost function.

When using Autograd with PennyLane, cost functions must be pure - they cannot have side effects (such as updating external variables); otherwise:

  • You will see ArrayBox objects rather than NumPy arrays, and
  • The values stored via the side-effect will no longer be differentiable.

I hope this helps you modify your code so that you can perform all the updates you need without mutating external variables.

Please let me know if this helps.