Hi @karolishp — a very well-founded question!
Everything you note is correct:
- `backprop` definitely adds overhead on the forward pass, since every intermediate stage of the computation is stored in memory, to be accumulated later during the backwards pass.
- The `adjoint` method is a version of backprop designed for unitary/reversible computation. As a result, we can remove the memory-caching requirement and replace it with some additional computation. In effect, we trade a reduction in memory for an increase in computational time, which lets adjoint scale beyond standard backprop, to ~30 qubits or more. However, as you mention, we are still working on adding support for more operations.
- `parameter-shift` will have the fastest forward execution time (since there is no 'bookkeeping' to be done on the forward pass), but requires 2P separate circuit evaluations on the backwards pass for a circuit with P parameters. It is therefore useful for small circuits with few parameters, but rapidly becomes unscalable as the number of parameters/qubits increases.
> Following Backpropagation with Pytorch, I tried `lightning.qubit` with `adjoint`; however, there are some operations I use that are not supported by adjoint, so that's not a valid option for my use case.
Which operations/measurements do you currently need that aren't yet supported with adjoint? This will help us build up adjoint support to ensure feature parity.
> P.S. I also noticed that using `default.qubit` with `backprop` on PyTorch (as well as `default.qubit.torch`) gives the following warning message:
Would you be able to post:
- A small QNode example that generates this warning?
- Your Torch, PennyLane, and NumPy versions?
This will help us track down the issue.