Backpropagation with Pytorch

James_Ellis · February 4, 2021, 1:35am

Hi,

Is it possible to use backpropagation with default.qubit on Pytorch? Or only tensorflow?

Also, does lightning.qubit support backpropagation?

Thanks

James

josh · February 4, 2021, 5:50am

Hi @James_Ellis!

Is it possible to use backpropagation with default.qubit on Pytorch? Or only tensorflow?

Currently, standard backpropagation is supported using both TensorFlow and the default Autograd:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev, diff_method="backprop")
def circuit(weights):
    qml.RX(weights[0], wires=0)
    qml.RY(weights[1], wires=0)
    return qml.expval(qml.PauliZ(0))

weights = np.array([0.1, 0.2], requires_grad=True)

# compute the gradient via backprop
print(qml.grad(circuit)(weights))

Unfortunately, we cannot support PyTorch for standard backpropagation until PyTorch has full support for complex numbers.

Having said that, PennyLane v0.14, released just this week, adds a new feature: adjoint backpropagation. This is a form of backpropagation that takes advantage of the fact that quantum computing is unitary/reversible, and thus has a lower memory and time overhead than standard backpropagation.

Adjoint backpropagation is implemented directly in PennyLane, so will support any interface, including PyTorch:

import pennylane as qml
import torch

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev, diff_method="adjoint", interface="torch")
def circuit(weights):
    qml.RX(weights[0], wires=0)
    qml.RY(weights[1], wires=0)
    return qml.expval(qml.PauliZ(0))

weights = torch.tensor([0.1, 0.2], requires_grad=True)
loss = circuit(weights)
loss.backward()
print(weights.grad)

The latest version of lightning.qubit (v0.14) is also compatible with the new adjoint differentiation method.

Note: As this is a new feature, if you notice any bugs or issues, please let us know via a GitHub issue!
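
For readers curious about what the adjoint method does, here is a minimal pure-Python sketch for the same two-gate, single-qubit circuit used above. It is only an illustration of the general idea (one forward pass, then a reverse sweep that un-applies each gate), not PennyLane's actual implementation; all function names here (rx, d_rx, ry, d_ry, apply, dagger, inner, adjoint_gradient) are hypothetical helpers written for this example.

```python
import math

def rx(theta):
    # RX(theta) = exp(-i * theta * X / 2)
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def d_rx(theta):
    # Elementwise derivative of RX(theta) with respect to theta
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[-0.5 * s, -0.5j * c], [-0.5j * c, -0.5 * s]]

def ry(theta):
    # RY(theta) = exp(-i * theta * Y / 2)
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def d_ry(theta):
    # Elementwise derivative of RY(theta) with respect to theta
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[-0.5 * s, -0.5 * c], [0.5 * c, -0.5 * s]]

PAULI_Z = [[1, 0], [0, -1]]

def apply(matrix, state):
    # 2x2 matrix times a 2-component state vector
    return [sum(matrix[r][k] * state[k] for k in range(2)) for r in range(2)]

def dagger(matrix):
    # Conjugate transpose of a 2x2 matrix
    return [[matrix[c][r].conjugate() for c in range(2)] for r in range(2)]

def inner(bra, ket):
    # <bra|ket>
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def adjoint_gradient(weights):
    gates = [(rx, d_rx, weights[0]), (ry, d_ry, weights[1])]

    # Forward pass: only the final state is kept, no intermediate states.
    state = [1.0, 0.0]
    for gate, _, theta in gates:
        state = apply(gate(theta), state)

    lam = apply(PAULI_Z, state)          # lam = O |psi_final>
    expval = inner(state, lam).real

    # Reverse sweep: un-apply each gate using its inverse (the adjoint).
    grads = [0.0] * len(gates)
    for i in reversed(range(len(gates))):
        gate, d_gate, theta = gates[i]
        u_dag = dagger(gate(theta))
        state = apply(u_dag, state)      # state just before gate i
        grads[i] = 2 * inner(lam, apply(d_gate(theta), state)).real
        lam = apply(u_dag, lam)
    return expval, grads

expval, grads = adjoint_gradient([0.1, 0.2])
print(expval, grads)
```

For this particular circuit the expectation value works out to cos(w0) * cos(w1), so the result can be checked against the closed-form gradient. The memory saving comes from the reverse sweep: because each gate is unitary, earlier states are recovered by applying the inverse gate rather than being stored.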

James_Ellis · February 4, 2021, 1:16pm

Wow… using the transfer learning demo (Quantum transfer learning — PennyLane) I am seeing more than 3x improvement in training speed.

In your opinion, do you think more speed up could be witnessed using backprop on Tensorflow vs adjoint on Pytorch?

Thanks for your help!

josh · February 4, 2021, 1:25pm

That’s awesome to hear! What interface, device, and differentiation method are you using?

In your opinion, do you think more speed up could be witnessed using backprop on Tensorflow vs adjoint on Pytorch?

This is a good question, I’m not sure! If you’re interested in benchmarking, I would be very interested to see your results.

James_Ellis · February 4, 2021, 1:53pm

Using the torch interface I tried:

1.) ‘lightning.qubit’ with finite-difference differentiation method (I assume this is the default?). Time: 20s

2.) ‘lightning.qubit’ with adjoint backpropagation. Time: 6s

The code I used is exactly the same as the demo.

I will try and convert the demo into Tensorflow so I can benchmark, but I am not very familiar with Tensorflow so it might take me a while.

Thanks again for your help

James_Ellis · February 4, 2021, 1:55pm

Oh and just for reference,

‘default.qubit’ with finite-difference had a time of 40s for that demo on my computer.

jmarrazola · February 4, 2021, 2:24pm

This is fantastic James! Excellent work!

The PennyLane dev team has put in a lot of work recently in improving speed and performance. It’s great to see actual numbers for user runtimes!

glassnotes · February 5, 2021, 4:15pm

2 posts were split to a new topic: Quantum circuit between classical layers with Tensorflow

James_Ellis · February 7, 2021, 8:15pm

I have run a similar experiment on Tensorflow. I basically followed a tutorial (Transfer learning and fine-tuning | TensorFlow Core) because I am not familiar with TF. I modified it to use ResNet50.

(For each I give the time for ‘lightning.qubit’ and ‘default.qubit’ respectively)

Using ‘adjoint’ as the differentiation method, I observed 11s and 12s.

Using ‘backprop’, I observed 35s and 34s.

As a reference, using ‘finite-diff’ I observed 17s and 26s.

It is interesting to see backprop being slower than finite-difference in this case. Is there any reason for that?

I have attached my code in case any results are a mistake on my behalf.

The Python file was run using a GTX 1080. Attachment: tf_exp.py (3.7 KB)

antalszava · February 8, 2021, 9:06pm

Hi @James_Ellis,

Thanks for the results!

It does seem odd that you saw poorer performance using backprop than with finite-diff on default.qubit. Locally, backprop seems to be faster, though this might mean that the difference arises when run on a GPU.

As lightning.qubit doesn’t support backprop (unlike adjoint), we’re actually falling back to default.qubit in that case (hence the comparable results there).

antalszava · February 9, 2021, 6:02pm

Hi @James_Ellis,

Have two things to add after a chat with @josh:

1.) For circuits with only a few parameters, using finite differences or the parameter-shift rule can actually be more efficient than backpropagation. This is due to lower overheads with those techniques. Backpropagation performs particularly well for a large number of parameters.

2.) I’ve previously mentioned that we are falling back to default.qubit.autograd from lightning.qubit. This is actually due to a bug we just uncovered thanks to the discussion, and we’re working on disallowing it (so in the future you might see an error when specifying backprop with lightning.qubit).
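
To make the cost comparison in the first point a bit more concrete, here is a small pure-Python sketch of finite differences and the parameter-shift rule. As an assumption for illustration, the circuit is replaced by its closed-form expectation value cos(w0) * cos(w1) (which is what the RX/RY example earlier in the thread evaluates to), and the helper names (expval, parameter_shift, finite_difference) are hypothetical. The counter shows how many circuit evaluations each method needs.

```python
import math

calls = {"count": 0}

def expval(weights):
    # Stand-in for a circuit evaluation (assumed closed form for the
    # RX/RY + PauliZ example); counts how often it is called.
    calls["count"] += 1
    return math.cos(weights[0]) * math.cos(weights[1])

def parameter_shift(f, weights, shift=math.pi / 2):
    # Exact for gates of the form exp(-i * theta * P / 2), P a Pauli operator.
    # Needs two evaluations per parameter.
    grads = []
    for i in range(len(weights)):
        plus, minus = list(weights), list(weights)
        plus[i] += shift
        minus[i] -= shift
        grads.append((f(plus) - f(minus)) / (2 * math.sin(shift)))
    return grads

def finite_difference(f, weights, h=1e-7):
    # Forward difference: approximate, one extra evaluation per parameter.
    base = f(weights)
    grads = []
    for i in range(len(weights)):
        shifted = list(weights)
        shifted[i] += h
        grads.append((f(shifted) - base) / h)
    return grads

weights = [0.1, 0.2]

calls["count"] = 0
ps_grads = parameter_shift(expval, weights)
ps_calls = calls["count"]

calls["count"] = 0
fd_grads = finite_difference(expval, weights)
fd_calls = calls["count"]

print(ps_grads, ps_calls)
print(fd_grads, fd_calls)
```

With n parameters, the parameter-shift rule uses 2n evaluations and forward finite differences use n + 1, so both are cheap for a handful of parameters but grow linearly with n. Backpropagation computes all gradients from a single forward and backward pass, which is why it tends to pay off once the number of parameters is large.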
