Batching in TorchLayer

Hello guys,

I’m using a TorchLayer to take advantage of the new support for native backpropagation with PyTorch. However, the TorchLayer trains much more slowly than a simple Torch fully connected layer on the same problem. I was wondering whether batching of inputs happens under the hood and, if not, what I can do to speed up training?

Best regards.

Hey @Andre_Sequeira!

Although the TorchLayer accepts batched inputs, no batch-level optimization is going on under the hood. You can check out how things work in the forward method of TorchLayer.
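Conceptually, the forward pass boils down to the following (a simplified, framework-free sketch, not PennyLane's actual implementation; `qnode` here is a stand-in stub for a real quantum circuit evaluation):

```python
# Simplified sketch of what TorchLayer.forward does with a batched input.
# `qnode` is a hypothetical stub standing in for a PennyLane QNode call.

def qnode(sample, weights):
    # Stub: pretend this simulates a circuit and returns expectation values.
    return [w * x for w, x in zip(weights, sample)]

def forward(inputs, weights):
    # Each row of the batch triggers a separate circuit evaluation --
    # there is no batch-level (vectorized) optimization.
    return [qnode(sample, weights) for sample in inputs]

batch = [[1.0, 2.0], [3.0, 4.0]]
weights = [0.5, 0.5]
print(forward(batch, weights))  # two independent circuit evaluations
```

The real method stacks the per-sample results back into a single output tensor, but the key point is the per-sample loop.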

There might be a couple of reasons why the hybrid model you are using is taking longer to train than a simple fully connected classical layer. From a fundamental perspective, we do expect the training times to increase exponentially on a simulator as we scale the number of qubits. This is what provides the nice motivation to construct the quantum hardware.

On the other hand, for a small number of qubits we can still try a couple of things to extract more performance. One approach is to optimize the way we differentiate the circuit. In older versions of PennyLane, diff_method="parameter-shift" was used for the Torch interface; you can check out more details here. Luckily, in the new version of PennyLane released a few days ago, we added support for backpropagation in the Torch interface. This simulator-only approach can provide a big speedup! In fact, I just tried running this tutorial: it took 8 seconds to train with the latest version of PennyLane and 44 seconds with an older version :rocket:
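For context on why the difference is so large: the parameter-shift rule estimates each gradient by evaluating the circuit at two shifted parameter values, so it costs extra executions per trainable parameter, whereas backpropagation reuses intermediate values from a single simulation. A toy stdlib-only illustration using f(x) = sin(x), an expectation value for which the shift rule is exact:

```python
import math

# Parameter-shift rule for a sin-like expectation value:
# df/dx = (f(x + pi/2) - f(x - pi/2)) / 2, which equals cos(x) exactly.
def f(x):
    return math.sin(x)

def parameter_shift_grad(x, shift=math.pi / 2):
    # Two extra "circuit evaluations" per parameter per gradient step.
    return (f(x + shift) - f(x - shift)) / 2

x = 0.3
print(parameter_shift_grad(x))  # matches the analytic gradient cos(0.3)
```

With 24 parameters, every gradient step under parameter-shift means 48 additional circuit evaluations, which backprop on a simulator avoids.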

So in summary, although there are some fundamental reasons why we might expect training to be tough on quantum simulators, you could try upgrading your PennyLane version and you might get a speedup without having to change any code!

Hey @Tom_Bromley,

thank you for your support.

> There might be a couple of reasons why the hybrid model you are using is taking longer to train than a simple fully connected classical layer. From a fundamental perspective, we do expect the training times to increase exponentially on a simulator as we scale the number of qubits. This is what provides the nice motivation to construct the quantum hardware.

The problem that I’m working on is relatively small though: only 4 qubits and 3 layers, each with 8 parameters to train, so 24 trainable parameters. The backpropagation from the new version of PennyLane did indeed provide a massive speedup; however, I have a big batch of data to feed into the quantum neural network. This is where I see the quantum model taking longer to train.

> Although the TorchLayer accepts batched inputs, no batch-level optimization is going on under the hood.

Is there anything that I can do with respect to the data in order to speed things up a little? Is it possible to do batch-level optimization?

Hi @Andre_Sequeira,

I’m glad the PennyLane update provided some speedup! Unfortunately there is not a lot that we can do in terms of optimizing iteration over a batch dimension in incoming tensors. This is not a feature we have prioritized so far, partly due to the limitations of quantum hardware.
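To make the cost concrete, here is a small stub showing why training time grows with batch size: each sample in the batch triggers its own simulation, so the number of circuit evaluations per forward pass is linear in the batch size (`simulate_circuit` is a hypothetical stand-in, not PennyLane's API):

```python
# Count how many circuit simulations a batched forward pass triggers.
# `simulate_circuit` is a hypothetical stand-in for a QNode evaluation.

calls = {"count": 0}

def simulate_circuit(sample):
    calls["count"] += 1
    return sum(sample)  # placeholder "expectation value"

def forward(batch):
    # One simulation per sample: no batch-level vectorization.
    return [simulate_circuit(sample) for sample in batch]

batch = [[0.1, 0.2]] * 64
forward(batch)
print(calls["count"])  # 64 -- evaluations scale linearly with batch size
```

So for now the main levers are reducing batch size per step or reducing the cost of each individual simulation.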

However, it’s useful feedback to know that you’re interested in more efficient tensor batching. We have recently been working on a batch_transform decorator, which is helpful for things like supporting differentiability and submitting multiple circuit executions to hardware as one job. This functionality may eventually help us allow batching over a tensor dimension on supported devices.

Thanks,
Tom

Hey @Tom_Bromley,

Ok, that is unfortunate, but keep up the good work!

Thank you for your help.
