Lightning.kokkos cannot import StatePrep in PennyLane 0.30

Hi! I am trying to run code for a trainable quantum convolution using the simulator lightning.kokkos. My code works well with lightning.qubit and lightning.gpu (too slow, but it works). I wanted to try lightning.kokkos to see if it goes faster, but I get an import error saying that it cannot import the name 'StatePrep' from 'pennylane'. I have installed PennyLane 0.30 and Lightning 0.30, and I am working with Python 3.10.6. I updated to PennyLane 0.32 and the error disappeared, but my code cannot run on 0.32 with any simulator due to an issue with the size of the input tensor. I tried to fix my code to run on version 0.32, but I cannot find the error. Here is my full code:

class QuanvLayer1D(nn.Module):
    def __init__(self, sim_dev="lightning.kokkos", in_channels=1, out_channels=3, kernel_size=2, stride=1, padding=1, n_layers=1, seed=0):
        super(QuanvLayer1D, self).__init__()
        # init device
        self.wires = out_channels  # We use n qubits to obtain n out_channels
        self.dev = qml.device(sim_dev, wires=self.wires)
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.n_layers = n_layers

        if seed is None:
            seed = np.random.randint(low=0, high=10e6)
        print("Initializing Circuit with random seed", seed)

        # random circuits
        @qml.qnode(device=self.dev, interface="torch", diff_method="adjoint")  # device= "default.qubit" or "lightning.qubit"
        def circuit(inputs, weights):
            for j in range(self.out_channels):
                qml.RY(np.pi * inputs[j], wires=j)
            RandomLayers(weights, wires=list(range(self.wires)), seed=seed)
            # Measurement producing out_channels classical output values
            return [qml.expval(qml.PauliZ(j)) for j in range(self.out_channels)]

        #weight_shapes = {"weights": [n_layers, out_channels]} # n_rotations = out_channels
        weights = {"weights": (torch.randn((n_layers, out_channels)).to(device),)}

        self.circuit = qml.qnn.TorchLayer(circuit, weights)

    def forward(self, vector):
        batch_size, in_channels, height, width = vector.size()

        output_height = height - self.kernel_size + 1
        output_width = width - self.kernel_size + 1
        x_unfolded = F.unfold(vector, self.kernel_size)
        device = next(self.circuit.parameters()).device
        print('device in forward', device)
        x_unfolded = x_unfolded.to(device)
        # we separate the dimension of the input channels
        x_unfolded = x_unfolded.view(batch_size, in_channels, self.kernel_size * self.kernel_size, output_height * output_width)

        # we permute because we want the output_height*output_width in dimension 1
        x_unfolded = x_unfolded.permute(0, 3, 1, 2)

        x_unfolded = x_unfolded.view(batch_size, output_height * output_width, in_channels * self.kernel_size * self.kernel_size)
        conv = self.circuit(inputs=torch.Tensor(x_unfolded))
        conv = conv.permute(0, 2, 1)
        conv = conv.contiguous()

        # and now we separate the last dimension
        conv = conv.view(batch_size, self.out_channels, output_height, output_width)
        convp = conv.detach().cpu().numpy()
        plt.imshow(convp[0, 1, :, :], cmap="gray")
        plt.show()
        return conv

Hi @Sandra_Juarez, welcome to the Forum!

Thank you for asking your question here. Could you please post the output of qml.about()?

This can help us identify the problem.

Hi! Thank you for replying. Here is the output of qml.about():

Name: PennyLane
Version: 0.30.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/PennyLaneAI/pennylane
Author:
Author-email:
License: Apache License 2.0
Location: /home/sandra/.local/lib/python3.10/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, requests, rustworkx, scipy, semantic-version, toml
Required-by: pennylane-catalyst, PennyLane-Lightning, PennyLane-Lightning-GPU, PennyLane-Lightning-Kokkos

Platform info: Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python version: 3.10.6
Numpy version: 1.23.0
Scipy version: 1.8.0
Installed devices:

  • default.gaussian (PennyLane-0.30.0)
  • default.mixed (PennyLane-0.30.0)
  • default.qubit (PennyLane-0.30.0)
  • default.qubit.autograd (PennyLane-0.30.0)
  • default.qubit.jax (PennyLane-0.30.0)
  • default.qubit.tf (PennyLane-0.30.0)
  • default.qubit.torch (PennyLane-0.30.0)
  • default.qutrit (PennyLane-0.30.0)
  • null.qubit (PennyLane-0.30.0)
  • lightning.qubit (PennyLane-Lightning-0.30.0)
  • lightning.gpu (PennyLane-Lightning-GPU-0.30.0)
  • lightning.kokkos (PennyLane-Lightning-Kokkos-0.33.0.dev0)

Hi @Sandra_Juarez ,

In v0.32 of PennyLane there were some changes to StatePrep, so I think the issue is caused by running Lightning-Kokkos v0.33 with an older version of PennyLane. It might be easier to help you fix the issue with the size of the input tensor and update everything to the latest version. Can you please post the full error message that you get when you run everything with the latest version?
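
As a quick sanity check after upgrading, you can verify what PennyLane itself reports (a minimal sketch; qml.about() is the same command as before):

import pennylane as qml

print(qml.version())   # should print 0.32.x after the upgrade
qml.about()            # also lists the installed Lightning plugins and their versions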

Hi! Thanks for your response @CatalinaAlbornoz . I've already updated to version 0.32 and ran my code. Here is my code again; I have tried to comment it more clearly. What I am trying to do is something similar to this tutorial, Quanvolutional Neural Networks | PennyLane Demos, but in a trainable version and with the loops replaced by tensor operations. This code worked in PennyLane 0.30: I printed the image after the quantum convolution and it did exactly what I expected. It is also the same code I used to perform a classical convolution from scratch with PyTorch. Here is my code:

class QuanvLayer1D(nn.Module):
    def __init__(self, sim_dev="lightning.kokkos", in_channels=1, out_channels=3, kernel_size=2, stride=1, padding=1, n_layers=1, seed=0):
        super(QuanvLayer1D, self).__init__()
        # init device
        self.wires = out_channels  # We use n qubits to obtain n out_channels
        self.dev = qml.device(sim_dev, wires=self.wires)
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.n_layers = n_layers

        if seed is None:
            seed = np.random.randint(low=0, high=10e6)

        print("Initializing Circuit with random seed", seed)

        @qml.qnode(device=self.dev, interface="torch", diff_method="adjoint")  # device= "default.qubit" or "lightning.qubit"
        def circuit(inputs, weights):
            for j in range(self.out_channels):
                qml.RY(np.pi * inputs[j], wires=j)
            # Random quantum circuit
            RandomLayers(weights, wires=list(range(self.wires)), seed=seed)

            # Measurement producing out_channels classical output values
            return [qml.expval(qml.PauliZ(j)) for j in range(self.out_channels)]

        #weight_shapes = {"weights": [n_layers, out_channels]} # n_rotations = out_channels
        weights = {"weights": (torch.randn((n_layers, out_channels)).to(device),)}

        self.circuit = qml.qnn.TorchLayer(circuit, weights)

    def forward(self, vector):
        batch_size, in_channels, height, width = vector.size()

        output_height = height - self.kernel_size + 1
        output_width = width - self.kernel_size + 1

        ###### 1  #########
        # with the following line we will have strips of size kernel*kernel
        # for example, for MNIST, a batch of 64 and a kernel of 2 we will have [64, 4, 729]
        x_unfolded = F.unfold(vector, self.kernel_size)
        print('1. Shape after F.unfold(vector,kernel_size)', x_unfolded.shape)

        #device = next(self.circuit.parameters()).device
        #print('device in forward', device)
        x_unfolded = x_unfolded.to(device)

        ### 2 ######
        # now we separate the dimension of the input channels
        x_unfolded = x_unfolded.view(batch_size, in_channels, self.kernel_size * self.kernel_size, output_height * output_width)
        print('2. Shape after x.view(batch, in_channels, kernel *kernel , output_height * output_width)', x_unfolded.shape)

        ### 3 #########
        # And we permute, because we want output_height*output_width in dimension 1
        x_unfolded = x_unfolded.permute(0, 3, 1, 2)
        print('3. Shape after x.permute', x_unfolded.shape)

        #### 4 ########
        # And the last dimension now combines the previous dimensions 2 and 3.
        # This is because, instead of the convolution operation, we will perform a dot product between the kernel and the corresponding pixels.
        # It should have a size of in_channels * kernel * kernel because, remember, when we convolve an input with multiple channels
        # we also sum over the channels.
        x_unfolded = x_unfolded.view(batch_size, output_height * output_width, in_channels * self.kernel_size * self.kernel_size)
        print('4. Shape before entering to circuit', x_unfolded.shape)

        #print('size of vector slice', vector_slice.size())
        #with torch.autograd.profiler.profile(use_cuda=True) as prof:

        conv = self.circuit(inputs=torch.Tensor(x_unfolded))
        print('shape of conv', conv.shape)
        conv = conv.permute(0, 2, 1)
        conv = conv.contiguous()
        conv = conv.view(batch_size, self.out_channels, output_height, output_width)
        convp = conv.detach().cpu().numpy()
        plt.imshow(convp[0, 1, :, :], cmap="gray")
        plt.show()

        print('we finish quanvolution')
        return conv

And here is the output of the prints showing the dimensions of the input tensor after each of the 4 transformations applied before it enters the quantum circuit:

  1. Shape after F.unfold(vector,kernel_size) torch.Size([2, 4, 729])
  2. Shape after x.view(batch, in_channels, kernel *kernel , output_height * output_width) torch.Size([2, 1, 4, 729])
  3. Shape after x.permute torch.Size([2, 729, 1, 4])
  4. Shape before entering to circuit torch.Size([2, 729, 4])
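
For reference, the same four reshaping steps can be reproduced on a dummy batch outside the network (a minimal sketch, assuming batch_size=2, in_channels=1, 28x28 FashionMNIST-sized images and kernel_size=2):

import torch
import torch.nn.functional as F

x = torch.randn(2, 1, 28, 28)               # dummy batch of two single-channel images
k = 2
out_h = out_w = 28 - k + 1                  # 27

u = F.unfold(x, k)                          # 1. torch.Size([2, 4, 729])
u = u.view(2, 1, k * k, out_h * out_w)      # 2. torch.Size([2, 1, 4, 729])
u = u.permute(0, 3, 1, 2)                   # 3. torch.Size([2, 729, 1, 4])
u = u.reshape(2, out_h * out_w, 1 * k * k)  # 4. torch.Size([2, 729, 4])
print(u.shape)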

And here is the error message:
STAGE:2023-09-25 17:33:23 7589:7589 ActivityProfilerController.cpp:311] Completed Stage: Warm Up
STAGE:2023-09-25 17:33:23 7589:7589 ActivityProfilerController.cpp:317] Completed Stage: Collection
STAGE:2023-09-25 17:33:23 7589:7589 ActivityProfilerController.cpp:321] Completed Stage: Post Processing

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_7589/1713626098.py in <module>
     10 # Calcular las salidas y la pérdida
     11 with torch.autograd.profiler.profile(use_cuda=True) as prof:
---> 12     outputs = net(inputs)
     13 print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
     14 loss = criterion(outputs, labels)

~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/tmp/ipykernel_7589/1755277684.py in forward(self, x)
     11
     12     def forward(self, x):
---> 13         x = self.quanv_layer(x)
     14         x = self.conv_layer(x)
     15         x = self.relu(x)

~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/tmp/ipykernel_7589/593562690.py in forward(self, vector)
     73         #with torch.autograd.profiler.profile(use_cuda=True) as prof:
     74
---> 75         conv = self.circuit(inputs=torch.Tensor(x_unfolded))
     76         print('el shape de conv',conv.shape)
     77         conv =conv.permute(0, 2, 1)

~/.local/lib/python3.10/site-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

~/.local/lib/python3.10/site-packages/pennylane/qnn/torch.py in forward(self, inputs)
    406         else:
    407             # calculate the forward pass as usual
--> 408             results = self._evaluate_qnode(inputs)
    409
    410         # reshape to the correct number of batch dims

~/.local/lib/python3.10/site-packages/pennylane/qnn/torch.py in _evaluate_qnode(self, x)
    433
    434         if len(x.shape) > 1:
--> 435             res = [torch.reshape(r, (x.shape[0], -1)) for r in res]
    436
    437         return torch.hstack(res).type(x.dtype)

~/.local/lib/python3.10/site-packages/pennylane/qnn/torch.py in <listcomp>(.0)
    433
    434         if len(x.shape) > 1:
--> 435             res = [torch.reshape(r, (x.shape[0], -1)) for r in res]
    436
    437         return torch.hstack(res).type(x.dtype)

RuntimeError: shape '[1458, -1]' is invalid for input of size 4

Hi @Sandra_Juarez, the error you get shows a dimension mismatch. This usually occurs because the inputs or weights have the wrong dimensions, or because you're using batching in an unsupported way. If you post a self-contained (but minimal) version of your code I can try to replicate your error. The code that you shared is unfortunately not complete, so I cannot run it. Make sure to include any imports and a small sample dataset.
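
For instance, a stripped-down version along these lines is usually enough for us to run (just a sketch with default.qubit, random input of the same shape you print before calling the circuit, and placeholder sizes, not your exact model):

import torch
import pennylane as qml
from pennylane.templates import RandomLayers

n_qubits = 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch", diff_method="adjoint")
def circuit(inputs, weights):
    for j in range(n_qubits):
        qml.RY(torch.pi * inputs[j], wires=j)
    RandomLayers(weights, wires=list(range(n_qubits)), seed=0)
    return [qml.expval(qml.PauliZ(j)) for j in range(n_qubits)]

layer = qml.qnn.TorchLayer(circuit, {"weights": (1, n_qubits)})
x = torch.rand(2, 729, 4)  # small sample input with the same shape as your step 4
print(layer(x).shape)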

I would also recommend changing lightning.kokkos to lightning.qubit and to default.qubit to see if this fixes your issue. It can help us understand whether the issue is specific to Kokkos.
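
For example (a minimal sketch; only the device string changes and the rest of your QuanvLayer1D stays the same):

layer = QuanvLayer1D(sim_dev="default.qubit")   # or sim_dev="lightning.qubit" instead of "lightning.kokkos"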

Finally, if you repost your code, make sure to format it correctly so that it displays properly here. You can use three backticks (```) before and after your code so that it formats correctly.

Let me know if you have any questions about this.

Hi @CatalinaAlbornoz . I tried your suggestion of using lightning.qubit and default.qubit, but the error was the same. Here is my code, or if you prefer, I have also attached it as a file:
quantolution_simplified.py (6.8 KB)

import torch
import torchvision
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision.transforms import transforms
import torchvision.datasets as datasets
from torch.utils.data import DataLoader
from torch.utils.data import Dataset
import pennylane as qml
from pennylane import numpy as np
from pennylane.templates import RandomLayers
np.random.seed(0)           # Seed for NumPy random number generator
torch.manual_seed(0)        # Seed for Pytorch random number generator

from torch.utils.data import Subset
# Load the FashionMNIST dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

train_dataset = datasets.FashionMNIST('data', train=True, download=True, transform=transform)
test_dataset = datasets.FashionMNIST('data', train=False, transform=transform)

# Reduce the size of the training dataset
n = 256
train_indices = list(range(n))
train_dataset = Subset(train_dataset, train_indices)

# Reduce the size of the test dataset
n_test = 50
test_indices = list(range(n_test))
test_dataset = Subset(test_dataset, test_indices)

# Define the batch size and create the data loaders
batch_size = 2
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class QuanvLayer1D(nn.Module):
    def __init__(self, sim_dev="lightning.kokkos",in_channels=1, out_channels = 3, kernel_size=2, stride=1, padding=1, n_layers=1, seed=0):
        super(QuanvLayer1D, self).__init__()
        # init device
        self.wires = out_channels # We use n qubits to obtain n out_channels
        self.dev = qml.device(sim_dev, wires=self.wires)
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.in_channels=in_channels
        self.out_channels = out_channels
        self.n_layers=n_layers

        if seed is None:
            seed = np.random.randint(low=0, high=10e6)

        print("Initializing Circuit with random seed", seed)

        
        @qml.qnode(device=self.dev, interface="torch", diff_method="adjoint") #  device= "default.qubit" or "lightning.qubit"
        def circuit(inputs, weights):
            for j in range(self.out_channels):
                qml.RY(np.pi * inputs[j], wires=j)
            # Random quantum circuit
            RandomLayers(weights, wires=list(range(self.wires)), seed=seed)
        
            # Measurement producing out_channels classical output values
            return [qml.expval(qml.PauliZ(j)) for j in range(self.out_channels)]


        #weight_shapes = {"weights": [n_layers, out_channels]} # n_rotations = out_channels
        weights = {"weights": (torch.randn((n_layers, out_channels)).to(device),)}

        self.circuit = qml.qnn.TorchLayer(circuit, weights)


    def forward(self, vector):
        batch_size, in_channels, height, width = vector.size()

        output_height = height - self.kernel_size + 1
        output_width = width - self.kernel_size + 1

        ###### 1  #########
        #with the following line we will have strips of size kernel*kernel
        # for example, for MNIST, a batch of 64 and a kernel of 2 we will have [64, 4, 729]
        x_unfolded = F.unfold(vector, self.kernel_size)
        print('1. Shape after F.unfold(vector,kernel_size)',x_unfolded.shape)
        
        #device = next(self.circuit.parameters()).device
        #print('device in forward', device)
        x_unfolded = x_unfolded.to(device)

        ### 2 ######
        #now we separate the dimension of the input channels
        x_unfolded = x_unfolded.view(batch_size, in_channels, self.kernel_size *self.kernel_size , output_height * output_width)
        print('2. Shape after x.view(batch, in_channels, kernel *kernel , output_height * output_width)',x_unfolded.shape)

        ### 3 #########
        #And we permute, we want output_height*output_width in dimension 1
        x_unfolded=x_unfolded.permute(0, 3, 1, 2)
        print('3. Shape after x.permute',x_unfolded.shape)

        #### 4 ########
        # And the last dimension now combines the previous dimensions 2 and 3.
        # This is because, instead of the convolution operation, we will perform a dot product between the kernel and the corresponding pixels.
        # It should have a size of in_channels * kernel * kernel because, remember, when we convolve an input with multiple channels
        # we also sum over the channels.
        x_unfolded=x_unfolded.view(batch_size, output_height * output_width, in_channels*self.kernel_size *self.kernel_size)
        print('4. Shape before entering to circuit',x_unfolded.shape)


       

        conv = self.circuit(inputs=torch.Tensor(x_unfolded))
        print('shape of conv', conv.shape)
        conv =conv.permute(0, 2, 1)
        conv=conv.contiguous()
        conv = conv.view(batch_size, self.out_channels, output_height, output_width)
        print('we finish quanvolution')
        return conv


class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.quanv_layer = QuanvLayer1D(in_channels=1, out_channels=3, kernel_size=2).to(device)
        self.conv_layer = nn.Conv2d(3, 6, 3, stride=1, padding=1).to(device)
        self.relu = nn.ReLU().to(device)
        self.maxpool = nn.MaxPool2d(2).to(device)
        self.flatten = nn.Flatten().to(device)
        self.fc1 = nn.Linear(1014, 16).to(device)
        self.fc2 = nn.Linear(1, 10).to(device)

    def forward(self, x):
        x = self.quanv_layer(x)
        x = self.conv_layer(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Initialize the net
net = CNN().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)

epochs =10
losses = []
for epoch in range(epochs):
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(train_loader, 0):
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()

        
        outputs = net(inputs)
        
        loss = criterion(outputs, labels)


        # Perform backpropagation and optimization
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        print('train loader',i)
        print('Loss',loss)
        if i % 10 == 99:
            print(f'Epoch: {epoch + 1}, Batch: {i + 1}, Loss: {running_loss / 100:.3f}')

            #if is last batch of epoch
            #if i== train_loader.__len__()-1:
            last_running_loss_toSafe = running_loss
            running_loss = 0.0

    losses.append(last_running_loss_toSafe / 10)

Hello @Sandra_Juarez !

If I understood correctly, you want to implement a Quanvolutional Neural Network using Torch layers.

I took a look at your code and noticed some issues. Your code is quite extensive for a minimal working example, and it is hard to tell in which part you define your quantum kernel, or where the quantum pre-processing of the dataset happens. If you could send a smaller, simplified version, that would be great! It is also an opportunity to debug it yourself. :slight_smile:

Also, are you sure the inputs match the dimensions of the circuit?

On lines 44-47, you hard-coded some variables:

class QuanvLayer1D(nn.Module):
    def __init__(self, sim_dev="lightning.kokkos",in_channels=1, out_channels = 3, kernel_size=2, stride=1, padding=1, n_layers=1, seed=0):
        super(QuanvLayer1D, self).__init__()
        # init device

Are you sure out_channels matches the size of the data encoded into the circuit?

According to the documentation of qnn.TorchLayer, this class takes a QNode and converts it into a Torch layer. Note that a TorchLayer can be used within the torch.nn Sequential or Module classes for creating quantum and hybrid models. I wonder about your CNN… :thinking:
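
For reference, the basic pattern from that documentation looks roughly like this (a minimal sketch with 2 qubits on default.qubit, not your full model):

import torch
import pennylane as qml

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def qnode(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

# TorchLayer expects the *shapes* of the trainable weights, not the tensors themselves
weight_shapes = {"weights": (3, n_qubits)}
qlayer = qml.qnn.TorchLayer(qnode, weight_shapes)

# the quantum layer then composes with classical layers inside Sequential or a Module
model = torch.nn.Sequential(qlayer, torch.nn.Linear(n_qubits, 2))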

I need to check a few more details, but I'll get back to you soon! :slight_smile: