Shape error in TorchLayer using MottonenStatePreparation

I am trying to use MottonenStatePreparation for batched circuits in a TorchLayer and have been running into issues. Some of my input state vectors work fine, but others do not. While I understand MottonenStatePreparation has not been fully tested for differentiability, it is not clear to me that this is where the issue lies, as the error arises in pennylane\qnn\torch.py before I even try to update parameters. Here is the error:

File "User\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "User\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "User\site-packages\pennylane\qnn\torch.py", line 406, in forward
    results = torch.reshape(results, (*batch_dims, *results.shape[1:]))
RuntimeError: shape '[8]' is invalid for input of size 1

The code that produces this error:

import torch
import torch.nn as nn
import pennylane as qml
import numpy as np

def QuantumLayer():
    n_qubits = 3
    dev = qml.device("default.qubit", wires=n_qubits)

    def _circuit(inputs, weights):
        # Amplitude-encode the 8-dimensional input on 3 qubits
        qml.MottonenStatePreparation(state_vector=inputs, wires=[0, 1, 2])
        qml.RY(phi=weights, wires=[0])
        return qml.expval(qml.PauliZ(wires=0))

    qlayer = qml.QNode(_circuit, dev, interface="torch")
    weight_shapes = {"weights": (1,)}
    return qml.qnn.TorchLayer(qlayer, weight_shapes)

# Define a simple PyTorch model class
class SimpleQuantumModel(nn.Module):
    def __init__(self):
        super(SimpleQuantumModel, self).__init__()
        self.quantum_layer = QuantumLayer()

    def forward(self, x):
        return self.quantum_layer(x)

# Example usage
model = SimpleQuantumModel()

numpy_data = np.load("mottoen_test.npz", allow_pickle=True)
features = torch.tensor(numpy_data['data'], dtype=torch.float32, requires_grad=True)[160:168]

# Slices that FAIL:    0:168, 161:168, 160:170
# Slices that SUCCEED: 0:167

print(features.shape)
print(features)
output = model(features)
print(output)

The output before the error:

torch.Size([8, 8])
tensor([[0.0000, 1.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.0000, 1.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.0000, 0.3273, 0.9449, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.0000, 0.5672, 0.8236, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.5823, 0.4304, 0.6897, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.0219, 0.7780, 0.6279, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.7041, 0.4822, 0.5212, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.7641, 0.0000, 0.6451, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]],
       grad_fn=<SliceBackward0>)

The data in mottoen_test.npz from the minimal code has shape (1453, 8). When I try to batch all of it at once, I get the same error:

RuntimeError: shape '[1453]' is invalid for input of size 1

I cannot seem to isolate the problem to particular input data (all rows are normalized vectors): the slice 0:168 fails, for example, but 165:168 does not. Below is the example tensor from above that failed:

state_vectors = [[0.0000, 1.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
                 [0.0000, 1.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
                 [0.0000, 0.3273, 0.9449, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
                 [0.0000, 0.5672, 0.8236, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
                 [0.5823, 0.4304, 0.6897, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
                 [0.0219, 0.7780, 0.6279, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
                 [0.7041, 0.4822, 0.5212, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
                 [0.7641, 0.0000, 0.6451, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]]

features = torch.tensor(state_vectors, requires_grad=True)
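
Feeding this tensor into the model above reproduces the failure directly, without the .npz file:

# Reusing `model` from the minimal example above; this raises
# RuntimeError: shape '[8]' is invalid for input of size 1
output = model(features)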

Thank you for the help.

System Information:

Name: PennyLane
Version: 0.33.1
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/PennyLaneAI/pennylane
Author: 
Author-email: 
License: Apache License 2.0
Location: /usr/local/lib/python3.10/dist-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, requests, rustworkx, scipy, semantic-version, toml, typing-extensions
Required-by: PennyLane-Lightning

Platform info:           Linux-5.15.120+-x86_64-with-glibc2.35
Python version:          3.10.12
Numpy version:           1.23.5
Scipy version:           1.11.3

Hey @Anthony_Smaldone, welcome back!

I think the main issue here is that MottonenStatePreparation doesn’t support parameter broadcasting — you can’t input several states and get several parallel outputs. Using StatePrep instead is probably the way to go: qml.StatePrep — PennyLane 0.33.0 documentation. It supports broadcasting:

import pennylane as qml
import numpy as np

dev = qml.device("default.qubit")

@qml.qnode(dev)
def circuit(state):
    #qml.MottonenStatePreparation(state, wires=[0])
    qml.StatePrep(state, wires=[0])
    return qml.expval(qml.PauliZ(0))

states = np.array([[1, 0], [0, 1]])
print(circuit(states))
[ 1. -1.]

Let me know if this helps!

Thank you, this does help! One issue: I originally chose MottonenStatePreparation because the gradient of the input features needs to be tracked. To amplitude-encode data where the features are differentiable, would I need to execute MottonenStatePreparation sequentially for each circuit, since it cannot be parallelized in PennyLane? Are there any other ways to use amplitude encoding without having to run everything sequentially (while still tracking the gradient)?
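
To be concrete, the sequential fallback I would like to avoid looks something like this (a rough sketch; batched_mottonen is just an illustrative helper):

import torch
import pennylane as qml

n_qubits = 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def circuit(state, weights):
    qml.MottonenStatePreparation(state_vector=state, wires=range(n_qubits))
    qml.RY(phi=weights, wires=0)
    return qml.expval(qml.PauliZ(wires=0))

def batched_mottonen(states, weights):
    # One circuit execution per sample; torch.stack keeps the autograd
    # graph, so gradients still flow back to `states` and `weights`.
    return torch.stack([circuit(s, weights) for s in states])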

You should be able to use StatePrep still :slight_smile:

import pennylane as qml
import pennylane.numpy as np

dev = qml.device("default.qubit")

@qml.qnode(dev)
def circuit(state):
    qml.StatePrep(state, wires=[0])
    return qml.expval(qml.PauliZ(0))

states = np.array([[1, 0], [0, 1]], requires_grad=True)
qml.jacobian(circuit)(states)
array([[[ 2.,  0.],
        [ 0.,  0.]],

       [[ 0.,  0.],
        [ 0., -2.]]])

Thank you, however when I tried to run StatePrep, I got a RuntimeError:

Traceback (most recent call last):
  File "c:\Users\Desktop\qswapnet_transfer\qswapnet_gcn.py", line 590, in <module>
    outputs = quantum_model(atomic_features_batch, edge_features_batch, classical_to_quantum_weight_vector)
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\pennylane\interfaces\execution.py", line 371, in wrapper
    res = list(fn(tuple(execution_tapes.values()), **kwargs))
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\pennylane\devices\default_qubit.py", line 474, in execute
    results = tuple(
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\pennylane\devices\default_qubit.py", line 475, in <genexpr>
    simulate(
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\pennylane\devices\qubit\simulate.py", line 269, in simulate
    state, is_state_batched = get_final_state(circuit, debugger=debugger, interface=interface)
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\pennylane\devices\qubit\simulate.py", line 156, in get_final_state
    state = create_initial_state(sorted(circuit.op_wires), prep, like=INTERFACE_TO_LIKE[interface])
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\pennylane\devices\qubit\initialize_state.py", line 45, in create_initial_state
    return qml.math.asarray(prep_operation.state_vector(wire_order=list(wires)), like=like)
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\pennylane\ops\qubit\state_preparation.py", line 230, in state_vector
    ket[indices] = op_vector
  File "C:\Users\miniconda3\envs\QMLGPU\lib\site-packages\torch\_tensor.py", line 1032, in __array__
    return self.numpy().astype(dtype, copy=False)
RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

When I try to run my batched states in a new file and set the requires_grad flag to True, it runs without issue. Could the issue depend on which gradient functions are being used in my original program?

When I detach my inputs from the gradient in StatePrep, no errors occur.
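
To be explicit, by detaching I mean something like the following (state_wires is just a stand-in for the wires my program actually uses); it avoids the crash but of course stops gradients from flowing back to the inputs:

# Break the autograd graph before StatePrep: no more crash, but the
# inputs are no longer differentiable.
qml.StatePrep(inputs.detach(), wires=state_wires)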

Furthermore, in my original program I run the model inputs through a quantum circuit and feed the output of that first circuit into StatePrep. When I remove the first quantum circuit and feed the inputs directly into StatePrep (with requires_grad=True still), I get no error.

I was able to narrow down the bug to some extent, although I still do not know how to fix it. The issue seems to arise when I apply StatePrep to one set of wires and then apply a parameterized gate to a wire StatePrep does not touch. Here is an example of the problematic code:

import torch
import torch.nn as nn
import pennylane as qml
import pennylane.numpy as np

def QuantumLayer():
    n_qubits = 4
    dev = qml.device("default.qubit", wires=n_qubits)

    def _circuit(inputs, weights):
        # StatePrep acts on wires 1-3; wire 0 is only touched by RY
        qml.StatePrep(inputs, wires=[1, 2, 3])
        qml.RY(phi=weights, wires=[0])
        return qml.expval(qml.PauliZ(wires=0))

    qlayer = qml.QNode(_circuit, dev, interface="torch")
    weight_shapes = {"weights": (1,)}
    return qml.qnn.TorchLayer(qlayer, weight_shapes)

# Define a simple PyTorch model class
class SimpleQuantumModel(nn.Module):
    def __init__(self):
        super(SimpleQuantumModel, self).__init__()
        self.quantum_layer = QuantumLayer()
    def forward(self, x):
        return self.quantum_layer(x)

model = SimpleQuantumModel()
features = torch.tensor(
    np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]),
    requires_grad=True,
)

print(model(features))

Things to note:

  • The issue goes away when you remove the RY gate and leave the StatePrep wires as [1,2,3].
  • The issue goes away when you change the StatePrep wires to [0,1,2] and keep the RY gate (see the workaround sketch below).
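
Given these two observations, one workaround might be to pad the state so that StatePrep acts on every wire the circuit uses. A rough sketch (pad_state is just an illustrative helper, and I have not verified this in my full program):

import torch

def pad_state(states, n_extra):
    # Tensor product with |0...0> on the extra leading wire(s), so
    # StatePrep can be applied to all four wires at once. Since `zero`
    # is broadcast to shape (1, 2**n_extra), torch.kron maps each row
    # of a batch to kron([1, 0, ...], row).
    zero = torch.zeros(2**n_extra, dtype=states.dtype, device=states.device)
    zero[0] = 1.0
    return torch.kron(zero, states)

# Inside _circuit, instead of StatePrep on wires [1, 2, 3]:
# qml.StatePrep(pad_state(inputs, 1), wires=[0, 1, 2, 3])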

Hey @Anthony_Smaldone,

If I change this, I don't get any errors:

features = torch.tensor(np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]], requires_grad=True))

Let me know if this helps!

Thanks for the reply. I believe this only works because converting the PennyLane NumPy wrapper's version of an array to a torch tensor erases the gradient information associated with that array (which needs to be preserved).

# original features definition
features = torch.tensor(np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]), requires_grad=True)
print(features)

# prints
tensor([[1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0.]], dtype=torch.float64,
       requires_grad=True)


# suggested features definition
features = torch.tensor(np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]], requires_grad=True))
print(features)
# prints
tensor([[1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0.]], dtype=torch.float64)

Sadly, I do not think there is currently a functional way to amplitude-encode/prepare a state that can be differentiated. I have opened an issue on the GitHub page. Thank you for the help; hopefully we come up with a fix soon!

Thanks for opening the issue @Anthony_Smaldone !

We will work on a fix in the new year. Thank you for uncovering this bug!