Transfer learning error


When I try to run the transfer learning demo, I get: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking arugment for argument mat2 in method wrapper_mm)

This did not happen on my previous computer, which had no GPU and no CUDA.

Hi @_risto,

This is an interesting one we haven’t seen before. Would you be able to share a minimal code example that generates this error message for you, as well as the output of qml.about()? That will help us diagnose the problem. Thanks!

When running this line:
model_hybrid = train_model(
model_hybrid, criterion, optimizer_hybrid, exp_lr_scheduler, num_epochs=num_epochs
)

I get:
Training started:

C:\Users\risto\anaconda3\lib\site-packages\torch\nn\ UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at …\c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)

RuntimeError Traceback (most recent call last)
----> 1 model_hybrid = train_model(
2 model_hybrid, criterion, optimizer_hybrid, exp_lr_scheduler, num_epochs=num_epochs
3 )

in train_model(model, criterion, optimizer, scheduler, num_epochs)
37 loss = criterion(outputs, labels)
38 if phase == "train":
---> 39 loss.backward()
40 optimizer.step()

~\anaconda3\lib\site-packages\ in backward(self, gradient, retain_graph, create_graph, inputs)
253 create_graph=create_graph,
254 inputs=inputs)
--> 255 torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
257 def register_hook(self, hook):

~\anaconda3\lib\site-packages\torch\ in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
145 retain_graph = create_graph
--> 147 Variable._execution_engine.run_backward(
148 tensors, grad_tensors_, retain_graph, create_graph, inputs,
149 allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag

~\anaconda3\lib\site-packages\torch\autograd\ in apply(self, *args)
85 def apply(self, *args):
86 # _forward_cls is defined by derived class
---> 87 return self._forward_cls.backward(self, *args) # type: ignore[attr-defined]

~\anaconda3\lib\site-packages\pennylane\interfaces\ in backward(ctx, dy)
173 """Implements the backwards pass QNode vector-Jacobian product"""
174 ctx.dy = dy
--> 175 vjp = dy.view(1, -1) @ ctx.jacobian.apply(ctx, *ctx.saved_tensors)
176 vjp = torch.unbind(vjp.view(-1))
177 return (None,) + tuple(vjp)

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking arugment for argument mat2 in method wrapper_mm)

I have also tried to run a random network from GitHub, and I notice an error occurring when running the following:

device_name = "cuda:0:" if torch.cuda.is_available() else "cpu"
device = torch.device(device_name)

RuntimeError: Invalid device string: 'cuda:0:'

If your priority is just getting the code working, you could try

device_name = "cpu"

This won’t have a potential GPU speedup, but it should at least work.
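As a side note on the second error: the string "cuda:0:" has a trailing colon, which torch.device rejects, while "cuda:0" is accepted. A minimal sketch of device selection with a CPU fallback follows; the tensor x is only there for illustration and is not part of the demo.

```python
import torch

# "cuda:0" is a valid device string; "cuda:0:" (trailing colon) is not.
device_name = "cuda:0" if torch.cuda.is_available() else "cpu"
device = torch.device(device_name)

# Illustrative tensor (hypothetical, not from the demo): create it on the
# CPU, then move it to the selected device.
x = torch.ones(2, 2).to(device)
print(device, x.device)
```

The model and every input batch need to be moved to the same device in the same way, otherwise PyTorch raises the "Expected all tensors to be on the same device" error.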

Let me know if that helps :slight_smile:

Hi @christina

Yes, I get the same error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking arugment for argument mat2 in method wrapper_mm).

I also get the same error when running: device = torch.device("cuda:0")

Is it because I use an RTX 3060 and it is not compatible with PennyLane?
I would like to scale the code to larger data and a more complex ResNet, so I would like to use it with a GPU if possible.

Looks like there’s a bug in computing the jacobian with a GPU. Thanks so much for bringing this problem up. :+1:

A bugfix is currently in the works, so stay tuned.
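To illustrate the kind of mismatch the traceback points at, here is a small sketch. It is not the actual PennyLane fix; the tensors dy and jacobian are stand-ins (assumptions) for the upstream gradient and the QNode Jacobian.

```python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Stand-in for the upstream gradient, which lives on the model's device.
dy = torch.ones(3, device=device)

# Stand-in for a Jacobian that was created on the CPU (the default device).
jacobian = torch.eye(3)

# On a GPU machine, `dy.view(1, -1) @ jacobian` would raise the
# "Expected all tensors to be on the same device" RuntimeError.
# Moving the CPU tensor to dy's device first avoids the mismatch.
vjp = dy.view(1, -1) @ jacobian.to(dy.device)
print(vjp.device)
```

On a CPU-only machine both tensors already share a device, which would explain why the error did not show up without a GPU.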

Happy to be of any help. This company & group is doing really amazing things.


Hi @christina

How is the bugfix going?

Hi @_risto,

The bug fix has now been merged into the master branch of PennyLane - try installing directly from there, and let us know how it goes. Just a heads up, depending on your setup, you might need to use the most recent stable version of torch, which is 1.9 (those of us on the team who tested the fix both have a GTX 1060, and had some card-related issues with 1.8.x that are resolved with 1.9).

What do you mean by that?
I was using the stable 1.9 version of torch all along when I got the error.

Hi @_risto,

The bug was in PennyLane, and the fix has now been merged into the master branch of PennyLane (so you’d need to install it from there, which can be done by executing pip install git+ or by downloading/cloning the repository and installing it directly from the downloaded folder using pip install .).

It’s recommended that you use torch v1.9 as well, since that version also resolved some issues that some members of the team had. :slight_smile:

Hi @theodor

I have installed PennyLane as described. I have been using torch 1.9 all along. I am still getting the error.

Hi @_risto,

Could you please post your error output? Is it exactly the same as before, or is the traceback pointing to a different part of the code now?

Also, could you please post the output of running qml.about()?


Hi @glassnotes

So, first Powershell tells me pennylane 17 is installed

But then when I run the demo I get that I am using pennylane 16

And then this error

I don’t understand why it says that pennylane 17 is installed, but then it shows 16 while running the code. :face_with_raised_eyebrow:
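One possible explanation (an assumption, not confirmed in this thread) is that pip installed the new version into a different Python environment than the one the notebook kernel uses. A minimal sketch to check which interpreter is running and which version it sees, using only the standard library; the helper name installed_version is made up for this example.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the version of `package` visible to this interpreter, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Run this both in the terminal's Python and inside the notebook, then compare.
print("Interpreter:", sys.executable)
print("PennyLane version seen here:", installed_version("pennylane"))
```

If the two runs print different interpreter paths, the package was installed for one environment and imported from another.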

When I try to install it via conda install git+ I get:

Hi @glassnotes

I have solved the issue. I uninstalled Anaconda altogether, then installed and ran Jupyter Notebook from pip. But I still get:

What does that mean?

In addition, are the following parameters from the demo set for optimal training performance?
n_qubits = 4 # Number of qubits
step = 0.0004 # Learning rate
batch_size = 4 # Number of samples for each training step
num_epochs = 30 # Number of training epochs
q_depth = 6 # Depth of the quantum circuit (number of variational layers)
gamma_lr_scheduler = 0.1 # Learning rate reduction applied every 10 epochs.
q_delta = 0.01 # Initial spread of random quantum weights
start_time = time.time() # Start of the computation timer

Hi @_risto,

The warning seems to be something that Torch raises internally when using functions like max_pool2d. The Torch team has been made aware of this and has fixed it in their master version. See their message here from a week ago. According to the Torch team, the fix will be in the next release. The warning should be safe to ignore until then.

As for the hyper-parameters, these were reported in the original paper to provide high enough accuracy for transfer learning. Would you be interested in optimal training performance in terms of accuracy or execution time?

Hi @antalszava

Thank you for your answer.
Yes, I would be interested in a setup for optimal performance. I am experimenting with the parameters, but I don’t know how practical that is.
I am also trying to run some other models from torch (e.g., resnext101_32x8d).

But then it says Training started, but nothing is happening. However, my GPU usage is at 100%.


What does that mean?

Also, I have managed to run the algorithm on resnet152, both as a quantum and as a classical model, with the same parameters (batch size, epochs, etc.) and on the same data. The classical one has better test/validation accuracy as well as better speed (the quantum model barely uses my GPU). So what would be an advantage of using the hybrid quantum model compared to the classical one?

Hi @_risto,

The parameters in the paper present a scenario where the training seemed to work well, but there may be other configurations that work just as well or even better. It’s worth exploring, but there are few trivial approaches, as hyperparameter optimization is an established problem in itself. Some techniques are worth exploring (e.g., a grid search over the parameters).
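A grid search could be sketched roughly as follows. The search space values are arbitrary choices around the demo defaults, and evaluate is a placeholder with a made-up toy score so that the sketch runs; in practice it would train the model with the given configuration and return the validation accuracy.

```python
from itertools import product

# Hypothetical search space around the demo's default values.
search_space = {
    "step": [0.0002, 0.0004, 0.0008],  # learning rate
    "batch_size": [4, 8],
    "q_depth": [4, 6],
}


def grid(space):
    """Yield every combination of the values in `space` as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def evaluate(config):
    """Placeholder score. Replace with real training plus validation."""
    # Toy formula (an assumption for illustration only, not a real result).
    return -abs(config["step"] - 0.0004) - 0.001 * abs(config["q_depth"] - 6)


# Pick the configuration with the highest score.
best = max(grid(search_space), key=evaluate)
print(best)
```

The number of configurations grows multiplicatively with each added parameter (12 here), so with training runs that take minutes each, a coarse grid or a random subset is usually more practical.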

Was that something that came up specifically for resnext101_32x8d? Naively, it would be an indication that the computation is still ongoing. Did that not occur when using only CPUs?

So what would be an advantage of using the hybrid quantum model compared to the classical one?

The advantage of using quantum machine learning algorithms compared to similar classical machine learning algorithms is an open research question. The paper on transfer learning presents proof of concept ideas and verifies that they produce accurate results on quantum hardware too.

Hi @antalszava

I have run resnet152 both as a classical model and modified as in this quantum demo, with the same parameters and the same data. The result: the classical model has better validation/test accuracy and is much faster (it took the classical model 2 min, whereas the quantum one took 20 min). Too bad, I was hoping for an advantage from the quantum one.