Hello everyone, this is my first topic here, so feel free to tell me if this isn't the right place for it.
I'm doing an exploratory study on QML applied to high-energy physics, so I'm building a pipeline that lets me train multiple circuits with different optimizers, data, etc. I give it some hyperparameters specifying the training, and it should do the rest.
> The Problem
When I call the step method on my optimizer (i.e. calculate the gradients and apply them), the gradients of my model weights/parameters are always zero, while the gradient of the model bias is nonzero. I even tried an insane stepsize of 1e4 on my NesterovMomentum optimizer, but nothing changed.
> What I Think Is Happening
If you look at my code, you can see that the circuit is initialized by
self.circuit_ = qml.QNode(globals()[self.circuit_name], self.dev)
Essentially, the circuit specified by the user (as a string) is looked up by name and loaded from another file called "circuits", where the circuit itself is defined.
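For reference, the circuits file looks more or less like this. This is just a minimal stand-in: the function name basic_circuit and the gates are placeholders I'm writing here for illustration, but the argument order (weights, embedding, n_layers, n_qubits, x) is the one my class passes in.

```python
# circuits.py -- minimal stand-in for my real circuits file (name and gates are placeholders)
import pennylane as qml

def basic_circuit(weights, embedding, n_layers, n_qubits, x):
    # embed the classical input with whatever embedding the pipeline selected
    embedding(x, wires=range(n_qubits))
    # trainable single-qubit rotations followed by a linear entangling chain
    for layer in range(n_layers):
        for q in range(n_qubits):
            qml.RY(weights[layer, q], wires=q)
        for q in range(n_qubits - 1):
            qml.CNOT(wires=[q, q + 1])
    return qml.expval(qml.PauliZ(0))
```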
This self.circuit_ is used in my circuit method, where I just pass the current weights and the input vector to my parametrized circuit.
def circuit(self, weights, x):
    return self.circuit_(weights, self.embedding, self.n_layers, self.n_qubits, x)
This circuit method is itself called by my classifier method, where I add the bias term to the circuit output; this is the method my cost function calls when calculating the gradients.
def classifier(self, weights, bias, x):
    return self.circuit(weights, x) + bias
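To give an idea of how this gets called during training, here is a simplified sketch of one optimization step. The square-loss cost is only illustrative (the real one is configurable in the pipeline), and model, X_batch and Y_batch are placeholder names; weights and bias are trainable pennylane.numpy arrays.

```python
import pennylane as qml
from pennylane import numpy as np

opt = qml.NesterovMomentumOptimizer(stepsize=0.01)

def cost(weights, bias, X, Y):
    # plain square loss over a batch, built from the classifier predictions
    # (model = an instance of the class described above -- placeholder name)
    predictions = [model.classifier(weights, bias, x) for x in X]
    return sum((p - y) ** 2 for p, y in zip(predictions, Y)) / len(Y)

# one optimization step: both weights and bias should be updated here,
# but only bias actually changes for me
weights, bias = opt.step(cost, weights, bias, X=X_batch, Y=Y_batch)
```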
My theory is that PennyLane can only see the gradients of the arrays handled directly in the current class and can't go further; for example, it can't peek inside the circuit that is defined in another file. This would explain why only the bias has a nonzero gradient.
Honestly, I'm not sure what is happening. If somebody has any idea what the problem is, I would be forever thankful!
Have a nice day and thanks in advance!
> Additional Information
A somewhat simplified version of my code is here, as well as the circuits file here.
A picture of the gradient values:
The 0 index represents the weights gradient and index 1, the bias gradient.
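In code, those two entries come from something along these lines (my best guess at reproducing what the optimizer computes internally; the exact call in my pipeline differs slightly):

```python
# inspect the gradients of the cost with respect to weights (argnum 0) and bias (argnum 1)
grad_fn = qml.grad(cost, argnum=[0, 1])
grads = grad_fn(weights, bias, X_batch, Y_batch)
print(grads[0])  # index 0: weights gradient -- all zeros for me
print(grads[1])  # index 1: bias gradient -- nonzero
```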