Issue with gradient descent optimisation on CV Gaussian system

Hi,

I’m trying to optimise the classical Fisher information of a Gaussian system simulated with the default.gaussian backend. I measure the mean and variance, use them to construct the Gaussian probability distribution classically, and then calculate the Fisher information from that distribution.
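
For readers following along, here is a minimal sketch of this kind of calculation. It assumes a single parameter and uses the standard closed form for a univariate Gaussian, F = (dμ/dθ)²/σ² + (dσ²/dθ)²/(2σ⁴). The names `mean_fn` and `var_fn` are hypothetical placeholders for whatever returns the measured mean and variance, and finite differences stand in for the real gradient method. It is not necessarily how the original circuit computed it.

```python
def gaussian_fisher_information(mean_fn, var_fn, theta, h=1e-6):
    """Classical Fisher information of a univariate Gaussian N(mean(theta), var(theta)).

    mean_fn and var_fn are hypothetical callables returning the mean and
    variance for a given parameter value. Derivatives are estimated with
    central finite differences (an assumption made for this sketch).
    """
    d_mean = (mean_fn(theta + h) - mean_fn(theta - h)) / (2 * h)
    d_var = (var_fn(theta + h) - var_fn(theta - h)) / (2 * h)
    var = var_fn(theta)
    # Closed form for a Gaussian: (mu')^2 / var + (var')^2 / (2 var^2)
    return d_mean**2 / var + d_var**2 / (2 * var**2)


# Toy check: mean = 2 * theta with constant unit variance
fisher = gaussian_fisher_information(lambda t: 2.0 * t, lambda t: 1.0, theta=0.3)
print(fisher)
```

In this toy case the variance does not depend on the parameter, so only the first term contributes.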

The problem arises when I try to optimise the circuit using the Fisher information matrix as the cost function. qml.GradientDescentOptimizer just returns zero as the cost and doesn’t update the parameters. It doesn’t raise an error or give any indication that the objective function is not differentiable.
I also tried changing the initial values and the learning rate.

What could be the problem here?

Thanks in advance :slight_smile:

-Kannan

Hi @Kannan! Welcome to the Xanadu forum! :wave:

It’s difficult for me to pin down the issue without taking a look at your code. Could you please share it so we can better assist you?

As a general piece of advice, the Fisher information matrix is a notably difficult cost function to work with because it can’t easily be expressed as an expectation value of a simple observable. As you noted, it requires taking gradients of individual outcome probabilities. It’s probably wise to be extra careful in this case.

In the meantime, you may find this tutorial on quantum metrology helpful. It is described for qubit circuits, but it also makes use of the Fisher information matrix as the cost function.

Best,

Juan Miguel

Hi @jmarrazola,

Thank you for the suggestion. I figured out the issue, and I thought I’d share it here in case it helps others as well.

The problem was a simple floating-point precision issue. The gradient of the function was extremely small (of the order of 10^-30) compared to the initial parameter value, which was 0.1, and the learning rate was just 0.01.

So the step of the gradient descent optimization was going like:
0.1 - (0.01*1e-30)
which is computed to be 0.1 itself.
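
A short sketch of the same effect in plain Python, using the numbers quoted above (the variable names are illustrative only). Double-precision floats carry roughly 16 significant digits, so an update of about 1e-32 is lost when subtracted from 0.1.

```python
import sys

theta = 0.1   # initial parameter value from the post
lr = 0.01     # learning rate from the post
grad = 1e-30  # order of magnitude of the gradient from the post

updated = theta - lr * grad

# The update is far below the spacing between representable floats
# near 0.1, so the subtraction leaves the parameter unchanged.
print(updated == theta)        # True
print(sys.float_info.epsilon)  # relative precision of a float, about 2.2e-16
```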

So I think this could be worked around by using an astronomically large learning rate like 1e29, so that the step is comparable to the initial value.

-Kannan

Hi @kannan_v,

I’m glad you identified that the gradient is extremely small. However, I’m not sure that using an extremely large learning rate is the way to go. Vanishing gradients, also known as barren plateaus, are a common problem in the optimization of quantum circuits. Are you observing that the model trains properly when setting such a large learning rate?

Ideally, your model should be such that the gradient is not vanishing. You may also want to experiment with changing the gates in your circuit, using different initial parameters, or even using gradient-free optimizers.
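
As a rough illustration of the gradient-free option, here is a sketch of a simple compass (pattern) search. It only compares cost values, so it can still make progress when the cost and its gradient are tiny in absolute terms. The function `toy_cost` is a hypothetical stand-in, not the Fisher-information cost from this thread, and the search itself is a generic textbook method rather than a PennyLane optimizer.

```python
def toy_cost(x):
    # Hypothetical tiny-scale cost with its minimum at x = 0.7.
    # Its gradient is of order 1e-30, like the one reported above.
    return 1e-30 * (x - 0.7) ** 2


def compass_search(cost, x0, step=0.1, tol=1e-6, max_iter=1000):
    """One-dimensional compass search: try x + step and x - step,
    move if the cost decreases, otherwise halve the step."""
    x, fx = x0, cost(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for candidate in (x + step, x - step):
            fc = cost(candidate)
            if fc < fx:
                x, fx = candidate, fc
                improved = True
                break
        if not improved:
            step *= 0.5
    return x


x_min = compass_search(toy_cost, x0=0.1)
print(x_min)
```

Because only the ordering of cost values matters, the overall scale of the cost does not affect the path the search takes. Rescaling the cost function is another way to get a similar effect with a gradient-based optimizer.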

Best,

Juan Miguel