When does autograd need help?

I am new to PennyLane, so please forgive the naivete of my question. You seem to be using autograd, but you also use this nice trick of evaluating derivatives of Rx, Ry, Rz at theta with an explicit formula. I’ve never used autograd before. I assume it complains when you use a NumPy function that is unsupported. So why the explicit formula? Does autograd complain about something that the explicit formula circumvents? When does autograd need help in taking the derivative of a PennyLane Variable?

You seem to create your own DAG, which exists at the same time as the DAG that autograd creates. Is that correct? When you do backprop to calculate the gradients, do you use your DAG or autograd’s?

Hi @rrtucci

PennyLane was designed from the outset to be

  1. quantum hardware focused
  2. hardware (and, by extension, framework) agnostic

So, while we could have simply used autograd’s NumPy functions to simulate the QNode (and then have autograd work out the gradient itself), this would not work with most Python frameworks (we would have to monkey-patch their import of NumPy, or they might be using a C extension), non-Python based frameworks, as well as physical quantum devices.

In this approach, we completely decouple the automatic quantum gradient calculation from autograd.

Autograd essentially sees the QNode as a black box — when backpropagation reaches the QNode, PennyLane takes over, and uses the analytic gradient formulas to query the quantum device directly to determine the gradient of that part of the computational graph (as well as more advanced behaviours, such as parameter fanout, taking care of the chain rule, etc., hence the use of the internal QNode DAG). This will work with both external quantum simulators and hardware.
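A minimal sketch of the black-box idea described above, in plain Python rather than PennyLane or autograd. The names `device_expval`, `device_gradient`, `cost` and `cost_gradient` are hypothetical and chosen only for illustration. The "device" is a stand-in that returns the expectation value of Z after RX(theta) on |0>, which is cos(theta). The surrounding code never differentiates through it; it only evaluates it at shifted parameters and then applies the chain rule to the classical parts.

```python
import math

def device_expval(theta):
    """Stand-in for a quantum device (assumption for this sketch):
    <Z> after RX(theta) applied to |0>, which equals cos(theta).
    The caller treats this function as a black box."""
    return math.cos(theta)

def device_gradient(theta, shift=math.pi / 2):
    """Gradient of the black box from two shifted evaluations only."""
    return 0.5 * (device_expval(theta + shift) - device_expval(theta - shift))

def cost(x):
    """Hybrid cost: classical pre-processing, device, classical post-processing."""
    theta = 2.0 * x              # classical pre-processing
    e = device_expval(theta)     # black-box quantum node
    return e ** 2                # classical post-processing

def cost_gradient(x):
    """Chain rule written out by hand; an autodiff library would do this
    bookkeeping and call device_gradient when it reaches the black box."""
    theta = 2.0 * x
    e = device_expval(theta)
    d_post = 2.0 * e                     # d(e^2)/de
    d_device = device_gradient(theta)    # de/dtheta, queried from the device
    d_pre = 2.0                          # dtheta/dx
    return d_post * d_device * d_pre

x = 0.3
closed_form = -4.0 * math.cos(2 * x) * math.sin(2 * x)  # d/dx cos^2(2x)
print(cost_gradient(x), closed_form)
```

Because the device is only ever called, never inspected, the same pattern would apply whether `device_expval` is a simulator or a request to hardware.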

So, rather than use a machine-learning library to bring auto-differentiation to quantum simulations, we have instead brought quantum devices to the machine-learning library (i.e. we have made autograd ‘quantum aware’).

Thanks. Wow! That is amazing. I think you guys are very smart, savvy programmers. I am learning a lot from reading your code. What you are doing is very sophisticated, ambitious and general! My program Qubiter is much less sophisticated.

Thank you :slight_smile:

And as of the latest release (PennyLane v0.3, released minutes ago), we now also support QNode integration with both TensorFlow and PyTorch.

Josh, this might be helpful someday, just in case you are not aware of it. Look at the answer with only 4 votes, a formula by Higham. Your explicit formula only applies to Rx, Ry and Rz, but not to a general rotation R if you vary each of the 3 degrees of freedom of R separately. Higham’s formula, although not exact like the one you use, does apply to a general R. Of course you can always decompose an R into a product RxRyRz a la Euler, but that is more wasteful.

I am ashamed to say that I spoke too quickly. Higham’s formula, as presented in that StackExchange comment, does not work in our case. However, I figured out how to modify Higham’s formula so that it does work in our case. I wrote a brief blog post about it:

Hi @rrtucci, are you referring to the following arbitrary rotation operation?

R(\phi,\theta,\omega) = RZ(\omega)RY(\theta)RZ(\phi)

This is available in PennyLane as the qml.Rot() operator, and in fact, since it can be written in the form

R(\phi,\theta,\omega) = e^{-i\omega ~\hat{\mathbf{n}}(\theta, \phi)\cdot\mathbf{\sigma}/2}

and has two distinct eigenvalues, it satisfies the requirements for the analytic parameter-shift formula for the gradient:

\frac{\partial}{\partial \phi}R(\phi,\theta,\omega)=\frac{1}{2}\left[R(\phi+\pi/2,\theta,\omega)-R(\phi-\pi/2,\theta,\omega)\right]

(and ditto for angle parameters \theta and \omega as well).

This is currently implemented in PennyLane, and the unit tests verify the analytic gradient formula against finite differences :slight_smile:
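As a rough illustration of that kind of check, here is a self-contained sketch in plain Python. It is not PennyLane's actual test code. It assumes the shift rule is read as a statement about an expectation value of a circuit containing the gate, and it picks an arbitrary setup for the example: the gate RZ(omega) RY(theta) RZ(phi) acts on the state |+> and the observable is Pauli X. All function names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rz(a):
    return [[cmath.exp(-0.5j * a), 0], [0, cmath.exp(0.5j * a)]]

def ry(a):
    c, s = math.cos(a / 2), math.sin(a / 2)
    return [[c, -s], [s, c]]

def rot(phi, theta, omega):
    """R(phi, theta, omega) = RZ(omega) RY(theta) RZ(phi)."""
    return matmul(rz(omega), matmul(ry(theta), rz(phi)))

def expval(params):
    """<X> after applying rot(*params) to |+> (setup assumed for this sketch)."""
    u = rot(*params)
    amp = 1 / math.sqrt(2)
    psi0 = (u[0][0] + u[0][1]) * amp
    psi1 = (u[1][0] + u[1][1]) * amp
    return 2 * (psi0.conjugate() * psi1).real

def shifted(params, i, delta):
    p = list(params)
    p[i] += delta
    return p

def parameter_shift(params, i):
    """Analytic gradient with respect to parameter i from two shifted evaluations."""
    s = math.pi / 2
    return 0.5 * (expval(shifted(params, i, s)) - expval(shifted(params, i, -s)))

def central_difference(params, i, h=1e-6):
    """Numerical gradient used as the reference."""
    return (expval(shifted(params, i, h)) - expval(shifted(params, i, -h))) / (2 * h)

params = [0.4, 1.1, -0.7]  # phi, theta, omega
for i, name in enumerate(("phi", "theta", "omega")):
    print(name, parameter_shift(params, i), central_difference(params, i))
```

The two columns should agree up to the finite-difference error, since each angle enters through a single Pauli rotation.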

For more details on the analytic gradient formula, see our arXiv paper: arXiv:1811.11184

Yes. Thanks. Nevertheless, if someday you are handling a 2-qubit matrix like, for example, those exotic swaps that they use in ion traps, your exact formulas might not work. The symmetric finite difference formulas are approximate but completely general. The symmetric difference f(x+h) - f(x-h) causes the h^2 terms to vanish, so it is good to order h^3 (and the derivative estimate, after dividing by 2h, to order h^2). Also, exponentials are really smooth. Their Taylor series converges very quickly. So the coefficient for that h^3 error is probably fairly small too.
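A quick numerical sketch of that scaling, using cos as a stand-in for a smooth expectation value (an assumption made only for this example). In the symmetric difference f(x+h) - f(x-h) the h^2 terms cancel and the leading remainder is of order h^3, so after dividing by 2h the derivative estimate has an error of order h^2. The one-sided difference has an error of order h.

```python
import math

def forward_diff(f, x, h):
    """One-sided estimate of f'(x); error shrinks like h."""
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    """Symmetric estimate of f'(x); error shrinks like h**2."""
    return (f(x + h) - f(x - h)) / (2 * h)

f = math.cos
x = 0.7
exact = -math.sin(x)

errors = {}
for h in (1e-1, 1e-2, 1e-3):
    fwd_err = abs(forward_diff(f, x, h) - exact)
    cen_err = abs(central_diff(f, x, h) - exact)
    errors[h] = (fwd_err, cen_err)
    print(h, fwd_err, cen_err)
```

Shrinking h by a factor of 10 should cut the symmetric error by roughly 100 and the one-sided error by roughly 10.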

Actually, on second thought, I think your exact formulas will always work, but you will always have to first reduce the gate to a sequence of single-qubit rotations along the x, y, z axes with multiple controls, and then take the derivative of each of those single-qubit rotations using your exact formulas. Still, the symmetric finite difference might someday help you avoid doing that decomposition first.
