The docs say the equation to compute the gradients is

0.5 * (circuit_plus - circuit_minus)

where circuit_plus and circuit_minus are the same circuit with the parameter shifted by +pi/2 and -pi/2, respectively. Trying to recreate this, I’ve come across a weird scaling for the sigma_x parameters.

Q: Is there a factor of 2 in the shift for Pauli-X gates, i.e. is the shift ±pi instead of ±pi/2? If not, I can post some code that demonstrates the issue.

Update: it turns out I was misunderstanding how the gradient computations work.

I’m working with QAOA, where each phase of the circuit implements many gates that all share the same parameter. I was shifting all of those gates at the same time. However, the gradient computation actually follows the product rule: the parameter has to be shifted in one gate at a time, and the contributions from each gate are then summed. This also explains some of the scaling problem.
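For anyone who hits the same thing, here is a minimal NumPy sketch of the difference. It is not the library's implementation, just a toy single-qubit simulation I put together for illustration: the helper names (rx, expval, grad_shift_all, grad_per_gate) are hypothetical, and it assumes the gate convention RX(theta) = exp(-i * theta * X / 2), for which the pi/2 shift with the 0.5 prefactor applies. The circuit applies the same parameter in two RX gates and measures Z, so the exact expectation value is cos(2 * theta).

```python
import numpy as np

# Assumed convention: RX(theta) = exp(-i * theta * X / 2)
def rx(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

Z = np.diag([1.0, -1.0])

def expval(thetas):
    """Apply one RX gate per entry of `thetas` to |0> and return <Z>."""
    state = np.array([1.0, 0.0], dtype=complex)
    for t in thetas:
        state = rx(t) @ state
    return float(np.real(np.conj(state) @ Z @ state))

def grad_shift_all(theta, n_gates, shift=np.pi / 2):
    """Incorrect for shared parameters: shifts every gate at once."""
    plus = expval([theta + shift] * n_gates)
    minus = expval([theta - shift] * n_gates)
    return 0.5 * (plus - minus)

def grad_per_gate(theta, n_gates, shift=np.pi / 2):
    """Product rule: shift one gate at a time and sum the contributions."""
    total = 0.0
    for k in range(n_gates):
        plus = [theta] * n_gates
        minus = [theta] * n_gates
        plus[k] += shift
        minus[k] -= shift
        total += 0.5 * (expval(plus) - expval(minus))
    return total

theta = 0.3
n_gates = 2
# Two RX(theta) gates compose to RX(2 * theta), so <Z> = cos(2 * theta)
exact = -2 * np.sin(2 * theta)

print("exact derivative :", exact)
print("per-gate shifts  :", grad_per_gate(theta, n_gates))
print("all gates shifted:", grad_shift_all(theta, n_gates))
```

In this toy case the per-gate sum matches the analytic derivative, while shifting both gates together does not. Note that if a gate is defined as exp(-i * theta * X) without the factor of 1/2, the appropriate shift and prefactor change accordingly, which may account for the rest of the scaling I was seeing.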