Hybrid neural network speed suggestions

James_Ellis · March 5, 2020, 1:18pm

Hi,

I am currently testing out some hybrid neural networks with the MNIST dataset. The problem is that it is very slow.

5 epochs take around 4 to 5 hours.

I have tried Forest pyQVM because it was shown to be the fastest in the Speeding up grad computation post, but for me it was a lot slower.

Any suggestions would be appreciated.

Thanks for your help!

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import pennylane as qml
from pennylane import numpy as np

# Params
q_depth = 2
n_qubits = 4
q_delta = 1
input_size = 784
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

dev = qml.device("default.qubit", wires=4)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

def RY_layer(w):
    """Layer of parametrized qubit rotations around the y axis.
    """
    for idx, element in enumerate(w):
        qml.RY(element, wires=idx)

@qml.qnode(dev, interface="torch")
def q_net(q_in, q_weights_flat):
    # Reshape weights
    q_weights = q_weights_flat.reshape(q_depth, n_qubits)

    # Embed features in the quantum node
    RY_layer(q_in)

    for i in range(n_qubits - 1):
        qml.CNOT(wires=[i, i+1])

    for i in range(n_qubits):
        qml.RX(q_weights[0][i], wires = i)

    for i in range(n_qubits):
        qml.RY(q_weights[1][i], wires = i)

    return tuple(qml.expval(qml.PauliZ(wires=i)) for i in range(n_qubits))

# MNIST dataset
train_dataset = torchvision.datasets.MNIST(root='../data',
                                           train=True,
                                           transform=transforms.ToTensor(),
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='../data',
                                          train=False,
                                          transform=transforms.ToTensor())

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)

class NeuralNet(nn.Module):
    def __init__(self, input_size, num_classes):
        super(NeuralNet, self).__init__()
        self.relu = nn.ReLU()
        self.pre_net = nn.Linear(input_size, n_qubits)
        self.q_params = nn.Parameter(q_delta * torch.randn(q_depth * n_qubits))
        self.post_net = nn.Linear(n_qubits, 10)

    def forward(self, x):
        q_in = self.pre_net(x)
        q_in = self.relu(q_in)
        q_out = torch.Tensor(0, n_qubits).to(device)
        for elem in q_in:
            q_out_elem = q_net(elem, self.q_params).float().unsqueeze(0)
            q_out = torch.cat((q_out, q_out_elem))
        q_out = self.relu(q_out)
        out = self.post_net(q_out)
        return out

model = NeuralNet(input_size, num_classes).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Train the model
total_step = len(train_loader)
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Move tensors to the configured device
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print(i)

        if (i+1) % 100 == 0:
            print ('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                   .format(epoch+1, num_epochs, i+1, total_step, loss.item()))

# Test
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28*28).to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the 10000 test images: {} %'.format(100 * correct / total))

# Save
torch.save(model.state_dict(), 'model1.ckpt')

josh · March 5, 2020, 3:10pm

Hi @James_Ellis!

Since that post, we have invested some time in improving the built-in default.qubit simulator, and managed to get a speed improvement of approximately two orders of magnitude. So default.qubit will now be significantly faster than pyQVM.

With respect to the training time, the largest factor is the number of parameters in the quantum circuit. PennyLane uses the parameter-shift rule to differentiate quantum nodes in a hardware-friendly manner, so computing the gradient with respect to all $p$ parameters requires $2p$ quantum evaluations, and the time taken scales as $2p\,\Delta t$, where $\Delta t$ is the time taken for one forward pass/quantum simulation.
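To make the 2p scaling concrete, here is a minimal sketch of the parameter-shift rule in plain Python. It does not use PennyLane. The `CountingCircuit` class is a hypothetical stand-in for a simulator: it assumes one RY rotation per qubit applied to |0>, followed by a measurement of Z on every qubit, so the expectation value is simply the product of the cosines of the angles.

```python
import math


class CountingCircuit:
    """Toy stand-in for a simulator (hypothetical, not a PennyLane class).

    Models RY(theta_i) on qubit i, starting from |0>, and returns the
    expectation value of Z on all qubits, which equals prod(cos(theta_i)).
    It also counts how many times it has been evaluated.
    """

    def __init__(self):
        self.evaluations = 0

    def __call__(self, thetas):
        self.evaluations += 1
        return math.prod(math.cos(t) for t in thetas)


def parameter_shift_gradient(circuit, thetas, shift=math.pi / 2):
    """Gradient via two shifted circuit evaluations per parameter."""
    grad = []
    for i in range(len(thetas)):
        plus = list(thetas)
        minus = list(thetas)
        plus[i] += shift
        minus[i] -= shift
        grad.append((circuit(plus) - circuit(minus)) / 2)
    return grad


thetas = [0.1, 0.5, 0.9, 1.3]
circuit = CountingCircuit()
grad = parameter_shift_gradient(circuit, thetas)
print(grad)
print("circuit evaluations:", circuit.evaluations)  # 2 per parameter
```

Each parameter costs two extra circuit runs, so with four parameters the counter ends at eight.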

Some suggestions for improving the speed of training:

1. PennyLane always treats positional QNode arguments as differentiable, and keyword arguments as non-differentiable. You may see some speed improvement if you change q_in to be a keyword argument:
```
@qml.qnode(dev, interface="torch")
def q_net(q_weights_flat, q_in=None):
```
2. You could try a high-performance simulator, such as Qulacs. However, the PennyLane-Qulacs plugin is experimental, and needs more work to ensure its accuracy.
3. Finally, a new experimental feature in the latest version of PennyLane is the PassthruQNode. Instead of using the parameter-shift rule, the PassthruQNode is simply a white box, passing tensors to a compatible simulator where classical backpropagation occurs.
   - Backpropagation adds only a constant overhead to the forward pass, in contrast to the parameter-shift rule, but it is not hardware compatible.
   - The PassthruQNode currently only works with the default.tensor.tf simulator, coded in TensorFlow, so it must be used with the TensorFlow interface.

See this post for an example of the PassthruQNode being used.

James_Ellis · March 5, 2020, 3:33pm

Thank you again for your replies, they are very helpful.

For 1) I am getting an error

TypeError: q_net() got multiple values for argument 'q_in'

The code works fine when I run as
def q_net(q_weights_flat, q_in):

Do you have any ideas why it might be doing that?

For 3) what is actually happening in PassthruQNode that means it doesn’t have to use the parameter-shift rule?

Thanks

josh · March 6, 2020, 2:11am

> For 1) I am getting an error
>
> TypeError: q_net() got multiple values for argument 'q_in'
>
> The code works fine when I run as
> def q_net(q_weights_flat, q_in):
>
> Do you have any ideas why it might be doing that?

When you define a keyword argument in a QNode, you must then always pass that argument by keyword when you call the QNode. For example, if you have

def q_net(q_weights_flat, q_in=None):

Then, when you call it inside your layer, you need to call it like this:

q_out_elem = q_net(self.q_params, q_in=elem).float().unsqueeze(0)
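As a rough back-of-the-envelope sketch (an estimate, not a measurement), the effect of making q_in a keyword argument can be counted using the 2p rule. The helper `circuit_evaluations_per_epoch` is hypothetical, and it assumes one forward evaluation plus 2p shifted evaluations per training sample, with no caching or batching inside the simulator.

```python
def circuit_evaluations_per_epoch(n_samples, n_differentiable_params):
    """Rough count: one forward run plus 2p shifted runs per sample."""
    per_sample = 1 + 2 * n_differentiable_params
    return n_samples * per_sample


n_train = 60_000   # size of the MNIST training set
n_weights = 2 * 4  # q_depth * n_qubits trainable circuit weights
n_inputs = 4       # one embedded feature per qubit (q_in)

# q_in passed positionally: inputs and weights are both differentiated.
positional = circuit_evaluations_per_epoch(n_train, n_weights + n_inputs)

# q_in passed by keyword: only the weights are differentiated.
keyword = circuit_evaluations_per_epoch(n_train, n_weights)

print(positional)  # 1500000
print(keyword)     # 1020000
```

Under these assumptions the keyword version needs about a third fewer circuit evaluations per epoch.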

> what is actually happening in PassthruQNode that means it doesn’t have to use the parameter-shift rule?

It’s more a question of what ‘doesn’t’ happen inside a PassthruQNode.

In a standard QNode, the QNode is a black box: PyTorch has no access to the internal workings of the QNode. It simply asks the QNode for the gradient, the QNode performs the hardware-friendly parameter-shift rule to get the gradient, and returns the gradient to PyTorch.

The PassthruQNode, however, is a ‘white box’: TensorFlow/PyTorch do not even see the QNode; they control the entire computation all the way down to the individual quantum operations. This allows classical backpropagation to determine the gradient, at the expense that it won’t work with hardware.
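The difference can be illustrated with a toy single-qubit example in plain Python. This is only a sketch, not how PennyLane implements either QNode. It assumes a circuit consisting of one RY(theta) gate applied to |0>, measured in Z. The black-box version may only call the simulation and look at its output. The white-box version sees every intermediate step and applies the chain rule by hand, which is what a framework's backpropagation would do automatically.

```python
import math


def forward(theta):
    """Simulate RY(theta)|0> and return <Z> plus the intermediate amplitudes."""
    a = math.cos(theta / 2)  # amplitude of |0>
    b = math.sin(theta / 2)  # amplitude of |1>
    expval = a * a - b * b   # <Z> = |a|^2 - |b|^2
    return expval, (a, b)


def black_box_gradient(theta):
    """Parameter-shift rule: two extra simulations, output values only."""
    plus, _ = forward(theta + math.pi / 2)
    minus, _ = forward(theta - math.pi / 2)
    return (plus - minus) / 2


def white_box_gradient(theta):
    """Chain rule through the simulation steps: one forward pass only."""
    _, (a, b) = forward(theta)
    d_expval_da = 2 * a
    d_expval_db = -2 * b
    da_dtheta = -b / 2  # d/dtheta cos(theta/2) = -sin(theta/2) / 2
    db_dtheta = a / 2   # d/dtheta sin(theta/2) =  cos(theta/2) / 2
    return d_expval_da * da_dtheta + d_expval_db * db_dtheta


theta = 0.7
print(black_box_gradient(theta))
print(white_box_gradient(theta))
print(-math.sin(theta))  # analytic gradient of <Z> = cos(theta)
```

Both approaches give the same gradient here. The white-box route reuses a single forward pass, but it needs access to the amplitudes, which real quantum hardware does not expose.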

James_Ellis · March 6, 2020, 10:15pm

Thanks! The qulacs simulator and keyword argument are making a big difference!

For PassthruQNode, how do you apply this to PyTorch? Is there a quick example you could provide?

Thanks!

nathan · March 6, 2020, 10:28pm

Hi @James_Ellis,

Unfortunately, until PyTorch has native support for complex numbers, we can’t make a compatible PassthruQNode.
