Why does Rotosolve not work with the QAOA MaxCut example?

I was curious about this example https://pennylane.ai/qml/demos/tutorial_qaoa_maxcut.html. Changing only the optimiser from Adagrad to Rotosolve caused the optimisation to fail completely, with all of the parameters immediately going to -pi/2. I cannot figure out why this is the case or why Rotosolve breaks down here. Looking at the original paper, I don’t think I’m violating any of the assumptions.

Code: https://pastebin.com/KynB6Hj6

Hi! Have you tried using different initial conditions for the optimization?

Hey Nic,

The initial conditions are randomized (const * np.random.rand(2,2)), and changing the value of the constant makes no difference. It is also bizarre that Rotosolve, no matter the number of layers, converges to -pi/2 for every parameter and never finds the actual optimum.

From what I can see in your example, const = 0.01. The fact that all the values go to -pi/2 seems to suggest that the local landscape of your cost function is very flat in that region, since the optimal values are found via

[image: formula for the optimal parameter value]

(cf. https://pennylane.ai/qml/demos/tutorial_rotoselect.html)

Maybe trying to initialize the parameters in a region away from zero can help.

Nicolas

Hey Nic,

There’s no value of the const scaling or of the initial parameters, whether cycled through or set by hand, that does not converge to -pi/2 immediately, in a single iteration. I 100% agree that coordinate descent is extremely sensitive to initialisation, but in the >100 runs I’ve done, varying everything, it always converges to every param being -pi/2. I don’t think this is an initialisation issue.

Edit: By the way, I can also change the size and connectivity of the graph and that makes no difference, despite dramatically affecting the cost landscape.

One more thing that you could try is to start the Rotosolve optimization from a (possibly local) minimum obtained with Adagrad.

@Nicolas_Quesada, I tried doing that and running with those parameters: exact same behaviour. It’s definitely not an initialisation problem, especially given that 3.9977 is basically the global optimum of 4. My guess is that it might have something to do with the fact that the objective function calls the circuit multiple times and the Rotosolve optimiser doesn’t handle storing that information, or maybe that I am actually violating an assumption in the original paper.

Any other ideas?

p=2
Objective before step     0:  3.9977946
Params are  [[ 1.56087298 -0.77972525]
 [ 0.40580708 -0.80786168]]
Objective before step     1:  2.0000000
Params are  [[ 3.14159265  2.94419709]
 [-1.57079633 -1.57079633]]
Optimized (gamma, beta) vectors:
[[-1.57079633 -1.57079633]
 [-1.57079633 -1.57079633]]
Most frequently sampled bit string is: 0001

New code is:

Hi @milanleonard. I looked into the issue that you’re having and it seems to have to do with the way that Rotosolve works. To get the next value for a parameter it temporarily replaces parameter \theta_d with 0 and \pm\pi/2, freezing all other parameters, and then calculates the new parameter value using the following equation:

\theta_d^* = -\frac{\pi}{2} - \text{arctan2}\left(2\left<H\right>_{\theta_d=0} - \left<H\right>_{\theta_d=\pi/2} - \left<H\right>_{\theta_d=-\pi/2}, \left<H\right>_{\theta_d=\pi/2} - \left<H\right>_{\theta_d=-\pi/2}\right)

In the cost function that you’re using, and the one that is used by the MaxCut demo, \left<H\right>_{\theta_d} seems to always be -2 when the first parameter \theta_0 (gammas[0] in your code) is 0 or \pm\pi/2, causing \theta_0 to evaluate to -\pi/2 during the first iteration, in turn causing all following parameters to also evaluate to -\pi/2.
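To make the step from "the three evaluations are equal" to -\pi/2 concrete, here is a minimal sketch of the update equation above in plain NumPy. The names rotosolve_update and toy_cost are hypothetical, this is not PennyLane's own implementation, and the wrap into (-\pi, \pi] is an assumption about how the result is normalised.

```python
import numpy as np


def rotosolve_update(h_zero, h_plus, h_minus):
    """Closed-form update from the equation above (hypothetical helper).

    h_zero, h_plus, h_minus are the expectation values with the parameter
    set to 0, +pi/2 and -pi/2, all other parameters frozen.
    """
    a = np.arctan2(2.0 * h_zero - h_plus - h_minus, h_plus - h_minus)
    theta = -np.pi / 2 - a
    # Assumption: keep the result in (-pi, pi].
    if theta <= -np.pi:
        theta += 2 * np.pi
    return theta


def toy_cost(theta, amplitude=1.5, phase=0.7, offset=-2.0):
    """Hypothetical single-frequency cost, the shape Rotosolve assumes."""
    return amplitude * np.sin(theta + phase) + offset


# Case from this thread: the same value at all three evaluation points.
theta_flat = rotosolve_update(-2.0, -2.0, -2.0)

# Sanity check on a true sinusoid: the update lands on its minimum.
theta_sin = rotosolve_update(
    toy_cost(0.0), toy_cost(np.pi / 2), toy_cost(-np.pi / 2)
)

print(theta_flat, theta_sin, toy_cost(theta_sin))
```

When the three values coincide, both arguments of arctan2 are zero, np.arctan2(0.0, 0.0) returns 0.0, and the update is -\pi/2 regardless of what the common value is. That would match the behaviour reported above, but only if the three evaluations really are equal.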

Unfortunately, it seems like this particular cost-function doesn’t work well with the Rotosolve optimizer as it is currently implemented.

Hey @theodor, thanks so much for looking into this for me. Shame that Rotosolve doesn’t play well in this case.
Do you know if this is a bug / if I could define the cost function differently to avoid this problem? Did you manage to have a deeper look into what causes this?
Thanks so much.

Hi @milanleonard. It’s not really a bug as far as I can tell. You could possibly update the cost function so that it doesn’t give the same output when the first parameter is equal to 0 or \pm\pi/2 (so that \theta_d^* doesn’t always evaluate to -\pi/2).

I haven’t looked into what is causing the cost function to provide the specific value of -2 for these cases, though.
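One possible way to look into this is to check whether the cost is actually a single sinusoid in the parameter being updated, which is the shape the update equation relies on. The sketch below is a generic NumPy diagnostic, not something taken from the MaxCut demo: all function names are hypothetical, and shared_param_cost is a made-up toy that happens to equal -2 at 0 and \pm\pi/2 without being constant. Whether the real QAOA cost behaves like this has not been verified here.

```python
import numpy as np


def sinusoid_from_three_points(h_zero, h_plus, h_minus):
    """Rebuild offset + c*cos(theta) + s*sin(theta) from three evaluations."""
    offset = 0.5 * (h_plus + h_minus)
    cos_coeff = h_zero - offset
    sin_coeff = 0.5 * (h_plus - h_minus)

    def model(theta):
        return offset + cos_coeff * np.cos(theta) + sin_coeff * np.sin(theta)

    return model


def max_deviation(cost, n_probe=50):
    """Largest gap between the cost and the sinusoid fitted at 0, +-pi/2."""
    model = sinusoid_from_three_points(
        cost(0.0), cost(np.pi / 2), cost(-np.pi / 2)
    )
    thetas = np.linspace(-np.pi, np.pi, n_probe)
    return max(abs(cost(t) - model(t)) for t in thetas)


def single_gate_cost(theta):
    """Hypothetical cost with one frequency: satisfies the assumption."""
    return 1.5 * np.sin(theta + 0.7) - 2.0


def shared_param_cost(theta):
    """Hypothetical cost with a doubled frequency: -2 at 0 and +-pi/2."""
    return -2.0 + 0.5 * np.sin(2 * theta)


dev_single = max_deviation(single_gate_cost)
dev_shared = max_deviation(shared_param_cost)
print(dev_single, dev_shared)
```

To try this on the real problem, cost would be replaced by a one-argument wrapper around the objective that varies a single entry of params and keeps the rest fixed. A deviation well above numerical noise would suggest the three-point formula is being applied to a function it was not derived for.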

@theodor.

I’m actually not finding the cost function to be exactly -2 when the first parameter is 0. For example, with params = np.array([[ 0, 0.99], [ 0.0005,1.621312]]) I get the output below, although this is very close to -2. As such, my guess is that the problem lies within the Rotosolve update, but I could very much be wrong.

Objective before step     0:  1.8158789
Params are  [[0.000000e+00 9.900000e-01]
             [5.000000e-04 1.621312e+00]]

I might try to look a little bit further into this in a couple of weeks. Let me know if you decide to dive any deeper or have any ideas. Thanks so much for taking a look into it for me.