Behaviour of PennyLane Torch layer with batched inputs

Hello! I’m trying to develop a quantum neural network using PyTorch and PennyLane. I’m experimenting with state-preparation algorithms and am currently testing this encoding routine:

# Assumed imports (module level): pi and binary_repr from NumPy, Tensor from PyTorch
from numpy import pi, binary_repr
from torch import Tensor
import pennylane as qml

def Ry_EncodingFormula(self, K: float, samples: Tensor):
    # Map pixel values in [0, K] to rotation angles in [0, pi/2]
    scalar = (pi / 2) / K
    return scalar * samples

def statePreparation(self, inputs: Tensor):
    # Put the positional register into a uniform superposition
    for i in self.positionalWires:
        qml.Hadamard(wires=i)

    encodingThetas = self.Ry_EncodingFormula(K=float(2**self.nBitEncoding) - 1, samples=inputs)

    # One controlled-rotation block per pixel (i runs over the feature axis)
    for i in range(encodingThetas.shape[1]):
        # Binary representation of the pixel index, one digit per positional wire
        currentState = list(binary_repr(i, width=len(self.positionalWires)))

        # Flip the positional wires whose control digit is "0"
        XControlWires = [
            self.positionalWires[index]
            for index, digit in enumerate(currentState)
            if digit == "0"
        ]
        for x in XControlWires:
            qml.PauliX(wires=x)

        # encodingThetas[:, i] is the batch of angles for pixel i
        for controller in range(len(self.positionalWires)):
            qml.CRY(encodingThetas[:, i], wires=[self.positionalWires[controller], self.colorEncodingWire])
            if controller < len(self.positionalWires) - 1:
                qml.CNOT(wires=[self.positionalWires[controller], self.positionalWires[controller + 1]])
                qml.CRY(-encodingThetas[:, i], wires=[self.positionalWires[controller + 1], self.colorEncodingWire])
                qml.CNOT(wires=[self.positionalWires[controller], self.positionalWires[controller + 1]])

The functions are methods of a “wrapper” class.
Basically, I want to encode the color and positional information of each pixel of a grayscale image using an appropriate number of positional wires and one color wire. Each pixel is encoded in a rotation using the formula \theta_k = \frac{\pi}{2K} \times g_k, where g_k is the k-th pixel in the image and K is the maximum pixel value. Once the rotation angles are computed, a series of gates is applied using them, following the logic in the code.
self.positionalWires is a range of wire indices (e.g. [0, 1, 2, 3]).
Images are 8-bit encoded, so the maximum pixel value K is 255.
The inputs are, of course, the images: a tensor of flattened images with shape [batch_size, width*height]. I am currently using the MedMNIST dataset, which has both grayscale and RGB 28x28 images, so my inputs have shape [batch_size, 784].
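
As a quick sanity check of the formula in plain PyTorch (no PennyLane needed):

import torch
from math import pi

K = 255.0                                      # maximum 8-bit pixel value
pixels = torch.tensor([[0.0, 128.0, 255.0]])   # toy batch: one "image" with 3 pixels
thetas = (pi / 2) / K * pixels
print(thetas)   # tensor([[0.0000, 0.7885, 1.5708]]): 0 maps to 0, 255 maps to pi/2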
Consider the function statePreparation as a step in a circuit function:

def circuit(self, inputs, weights):

    self.statePreparation(inputs)
    self.anotherManipulation(weights)
    ...

    return [qml.expval(qml.PauliZ(self.colorEncodingWire))]

the circuit is then used in a QNode:

self.dev = qml.device(name=backend, wires=self.circuitWires)
self.qnn = qml.qnn.TorchLayer(qml.QNode(self.circuit, self.dev, interface="torch", diff_method=diffMethod), weight_shapes=weightShapes)
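
For context, weightShapes is the usual TorchLayer dictionary mapping each trainable argument of circuit to its shape; the values below are placeholders, not my real ones:

nLayers, nWires = 2, 5                         # placeholder values
weightShapes = {"weights": (nLayers, nWires)}  # the real shape depends on anotherManipulation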

I omit the PyTorch code as I do not have problems with it.
The questions are the following:

  1. Is it correct to broadcast values for batched inputs using encodingThetas[:, i]? I have seen this notation in a past discussion (I’ll try to recover the link to it) and in the release notes, but I want to be sure that I am not collapsing all the encoding parameters into a single circuit, and that I am retaining the “batched” behaviour of classical neural networks (see the minimal check after this list).
  2. Is there a way to avoid the for loop over each tensor’s features? Maybe by defining a small ansatz for the routine applied to each pixel and then broadcasting that ansatz over the pixels? Execution times are pretty high and I would like to optimize wherever I can. I have already tried different backends and diff_methods using insights from other discussions, but so far my best-performing setup is “default.qubit” as backend and “backprop” as diff_method.
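
To make question 1 concrete, here is the minimal standalone check I have in mind, with a single RY playing the role of one CRY column (a sketch, not my actual circuit):

import pennylane as qml
import torch

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev, interface="torch")
def toy(thetas):
    # thetas plays the role of one column encodingThetas[:, i]
    qml.RY(thetas, wires=0)
    return qml.expval(qml.PauliZ(0))

angles = torch.rand(4)                           # 4 samples -> 4 angles
batched = toy(angles)                            # one broadcast call, output shape [4]
looped = torch.stack([toy(a) for a in angles])   # explicit per-sample loop
print(torch.allclose(batched, looped))           # True -> per-sample behaviour is kept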

I hope I have explained my issue clearly enough.
Thank you in advance for your help! Hope to hear from you soon.

PennyLane version:

Name: PennyLane
Version: 0.38.0
Summary: PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.
Home-page: https://github.com/PennyLaneAI/pennylane
Author: 
Author-email: 
License: Apache License 2.0
Location: /home/lollo/miniconda3/envs/pennylane/lib/python3.9/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, packaging, pennylane-lightning, requests, rustworkx, scipy, toml, typing-extensions
Required-by: PennyLane_Lightning, PennyLane_Lightning_GPU

Platform info:           Linux-6.9.3-76060903-generic-x86_64-with-glibc2.35
Python version:          3.9.20
Numpy version:           1.23.5
Scipy version:           1.13.1
Installed devices:
- default.clifford (PennyLane-0.38.0)
- default.gaussian (PennyLane-0.38.0)
- default.mixed (PennyLane-0.38.0)
- default.qubit (PennyLane-0.38.0)
- default.qubit.autograd (PennyLane-0.38.0)
- default.qubit.jax (PennyLane-0.38.0)
- default.qubit.legacy (PennyLane-0.38.0)
- default.qubit.tf (PennyLane-0.38.0)
- default.qubit.torch (PennyLane-0.38.0)
- default.qutrit (PennyLane-0.38.0)
- default.qutrit.mixed (PennyLane-0.38.0)
- default.tensor (PennyLane-0.38.0)
- null.qubit (PennyLane-0.38.0)
- lightning.qubit (PennyLane-Lightning-0.38.0)
- lightning.gpu (PennyLane-Lightning-GPU-0.38.0)

Hi @LM98, welcome to the Forum!

Your code is quite complex, so I can’t give definitive answers to your questions.
As a general guideline, things that make circuits run slow are:

  • Using a lot of qubits: anything higher than 15 qubits might be slow, and higher than 20 qubits may not even run on your laptop.
  • Using very deep circuits: it’s probably not your issue in this case.
  • Using nested for loops: in your case you have an if within a for loop within a for loop.

For the last point, I don’t know if there’s a way to change the logic, or how you would do it. That really lies within the research and development part of working in quantum machine learning.

My recommendation would be to simplify the problem as much as possible and then add complexity once you have a code that runs the way you want it to.

I hope this helps, and I’m sorry I couldn’t give a more concrete answer.

Hello @CatalinaAlbornoz, thanks for replying. Circuit depth is indeed a problem: this implementation becomes quite cluttered even for trivial images like the 28x28 ones in MedMNIST, so in the near future I will investigate more efficient data encodings. However, I don’t think I was clear enough when writing the “batch” question.
Let’s consider a batch from a classic dataloader (e.g. the PyTorch one): a tensor of data with shape [batch_size, num_features] together with the corresponding labels.

I was wondering what I should write in my PennyLane code (aside from the optimization part, which as you said depends on a lot of factors) to pass each sample to my circuit. I would like to avoid looping over the samples, i.e. implementations like this:

data, labels = batch
for sample in data:
    result = circuit(sample, weights)

Reading the forum and the docs, I stumbled upon PennyLane’s parameter-broadcasting feature, and I thought that maybe I could use it to avoid for loops as much as possible.
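In other words, this is roughly what I would like to write instead (pseudocode mirroring the loop above, assuming broadcasting behaves the way I hope):

data, labels = batch               # data: [batch_size, num_features]
results = circuit(data, weights)   # a single call, broadcast over the batch axis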
What I was asking was: does my code actually implement this behaviour, or am I wrong?

Basically, I want to retain PyTorch’s behaviour: pass the batch of samples to the circuit/QNN, evaluate the forward pass for each sample in parallel, and get the outputs as a tensor of shape [batch_size, ...], just like a classical Torch layer.
Or am I forced to use for loops?
I still haven’t managed to retrieve the forum topic where the snippet I originally posted was suggested as a solution to a “batch” problem like mine; my bad, I’ll search for it over the weekend. I hope I have explained the problem clearly anyway.
Thanks again for your help and have a great weekend!

Hi @LM98,

Yes, you can definitely run things in batches. The code I’m adding below comes from our demo on turning quantum nodes into Torch layers. Note that here we create batches of 5 input points, where each sample has size 2.

import torch
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_moons
import pennylane as qml

# Set random seeds
torch.manual_seed(42)
np.random.seed(42)

# Create your data
X, y = make_moons(n_samples=200, noise=0.1)
y_ = torch.unsqueeze(torch.tensor(y), 1)  # used for one-hot encoded labels
y_hot = torch.scatter(torch.zeros((200, 2)), 1, y_, 1)

# Plot your data
c = ["#1f77b4" if label == 0 else "#ff7f0e" for label in y]  # colours for each class
plt.axis("off")
plt.scatter(X[:, 0], X[:, 1], c=c)
plt.show()

# Create a QNode
n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qnode(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(wires=i)) for i in range(n_qubits)]

# Set the number of layers
n_layers = 6
weight_shapes = {"weights": (n_layers, n_qubits)}

# Optionally draw the circuit
qml.draw_mpl(qnode, level="device")(X[0, :], torch.rand(n_layers, n_qubits))

# Create your Hybrid model with two classical layers, a quantum layer, and a softmax
qlayer = qml.qnn.TorchLayer(qnode, weight_shapes)

clayer_1 = torch.nn.Linear(2, 2)
clayer_2 = torch.nn.Linear(2, 2)
softmax = torch.nn.Softmax(dim=1)
layers = [clayer_1, qlayer, clayer_2, softmax]
model = torch.nn.Sequential(*layers)

# Choose an optimizer and a loss function
opt = torch.optim.SGD(model.parameters(), lr=0.2)
loss = torch.nn.L1Loss()

# Set your data to something trainable
X = torch.tensor(X, requires_grad=True).float()
y_hot = y_hot.float()

# Get your data ready for training in batches
batch_size = 5
batches = 200 // batch_size

data_loader = torch.utils.data.DataLoader(
    list(zip(X, y_hot)), batch_size=batch_size, shuffle=True, drop_last=True
)

# Choose your epochs and start your training
epochs = 6

for epoch in range(epochs):

    running_loss = 0

    for xs, ys in data_loader:
        opt.zero_grad()

        loss_evaluated = loss(model(xs), ys)
        loss_evaluated.backward()

        opt.step()

        running_loss += loss_evaluated

    avg_loss = running_loss / batches
    print("Average loss over epoch {}: {:.4f}".format(epoch + 1, avg_loss))

# Calculate your accuracy
y_pred = model(X)
predictions = torch.argmax(y_pred, axis=1).detach().numpy()

correct = [1 if p == p_true else 0 for p, p_true in zip(predictions, y)]
accuracy = sum(correct) / len(correct)
print(f"Accuracy: {accuracy * 100}%")

You could expand this to datasets where the inputs have a different shape, but you’d need to make sure that all shapes match. For example, if you wanted data with 4 features you’d need 4 qubits, and you’d need to modify your first classical layer so that it has 4 inputs and 4 outputs, as well as the second classical layer so that it has 4 inputs (see the sketch below).
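
For instance, a sketch of the pieces that would change for 4-feature inputs (keeping the rest of the code above as-is) could look like this:

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def qnode(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(wires=i)) for i in range(n_qubits)]

qlayer = qml.qnn.TorchLayer(qnode, {"weights": (n_layers, n_qubits)})
clayer_1 = torch.nn.Linear(4, 4)   # first classical layer: 4 inputs and 4 outputs
clayer_2 = torch.nn.Linear(4, 2)   # second classical layer: 4 inputs, 2 classes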

I hope this helps!