"ValueError: array is not broadcastable to correct shape" when running opt.step method

I tried to optimize a quantum circuit’s parameters using AdamOptimizer via the opt.step method.
This is the list of the parameters:
[image: screenshot of the parameter list]

The weights_conv, thetas_conv, and weights_fc all go to the quantum circuit, while the classical_bias is added after the measurement of the circuit. When I run the opt.step method, I get the error message “ValueError: array is not broadcastable to correct shape”.

I tried removing the parameters one by one until I found the ones that cause the error. Both weights_fc and classical_bias cause the error, independently of each other.

Any suggestions on how to fix this error or how to arrange my parameters to not cause this error?
Thank you

Hey @eraraya-ricardo!

Sure, we should be able to help you out with this. Could you share some minimal code that reproduces the error? Note that instead of attaching a screenshot of the code, you can wrap it in three backticks (```). This will nicely render the code and allow others to interact with it.

From your screenshot, it looks like you are seeing a limitation of the autograd interface used by default in PennyLane. For example, the following code will not work:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

qml.enable_tape()

a = np.ones(3)
b = np.ones(1)

params = [a, b]

@qml.qnode(dev)
def f(params):
    qml.Rot(*params[0], wires=0)
    qml.RX(params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0))

opt = qml.AdamOptimizer()

opt.step(f, params)

The issue is that autograd expects the params to be composed of NumPy arrays of all the same shape.

For more flexibility, I recommend considering either the Torch or TensorFlow interfaces. For example, using Torch:

import pennylane as qml
import torch

dev = qml.device("default.qubit", wires=2)

qml.enable_tape()

a = torch.ones(3, requires_grad=True)
b = torch.ones(1, requires_grad=True)

params = [a, b]

@qml.qnode(dev, interface="torch")
def f(params):
    qml.Rot(*params[0], wires=0)
    qml.RX(*params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0))

out = f(params)

out.backward()
g_a = a.grad
g_b = b.grad

Thank you for your reply @Tom_Bromley .
That is exactly the kind of circuit I want to run. This is the code

dev = qml.device("default.qubit", wires=9)

@qml.qnode(dev)
def qcircuit(params, x=None, y=None):

    # conv layer iteration
    for l in range(len(params[0])):
        # qubit iteration
        for q in range(9):
            # gate iteration
            for g in range(3):
                qml.Rot(*(params[0][l][3*g:3*(g+1)] * x[q][3*g:3*(g+1)] + params[1][l][3*g:3*(g+1)]), wires=q)

                
    # fc layer iteration
    for l in range(len(params[2])):
        # qubit iteration
        for q in range(9):
            qml.Rot(*params[2][l][3*q:3*(q+1)], wires=q)
            
        # entangling layer
        if l%2 == 0:
            # for even layer (0, 2, 4, ...)
            qml.CZ(wires=[0,1])
            qml.CZ(wires=[2,3])
            qml.CZ(wires=[5,6])
            qml.CZ(wires=[7,8])
        if l%2 != 0:
            # for odd layer (1, 3, 5, ...)
            qml.CZ(wires=[1,2])
            qml.CZ(wires=[3,4])
            qml.CZ(wires=[6,7])
            
            qml.CZ(wires=[4,5])
            
            qml.CZ(wires=[0,8])


    return qml.expval(qml.Hermitian(y, wires=[0]))

def DRC_Conv(params, x=None, y=None):
    return qcircuit(params, x=x, y=y) + params[-1]

def cost(params, x, y, state_labels=None):
    # Compute prediction for each input in data batch
    loss = 0.0
    dm_labels = [density_matrix(s) for s in state_labels]
    for i in range(len(x)):
        f = DRC_Conv(params, x=x[i], y=dm_labels[y[i]])
        loss = loss + (1 - f) ** 2
    return loss / len(x)
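The `density_matrix` helper called in `cost` is not included in the snippet above. A minimal sketch, assuming it follows the usual definition |ψ⟩⟨ψ| (the outer product of a state vector with its complex conjugate) and that `state_labels` holds single-qubit state vectors. The two label states below are illustrative assumptions, not taken from the post:

```python
import numpy as np

def density_matrix(state):
    """Return |state><state| for a 1-D state vector (assumed definition)."""
    state = np.asarray(state, dtype=complex)
    return np.outer(state, np.conj(state))

# Hypothetical label states: |0> and |1> on a single qubit.
state_labels = [[1, 0], [0, 1]]
dm_labels = [density_matrix(s) for s in state_labels]
```

Each entry of `dm_labels` is a 2x2 Hermitian matrix, which is the kind of observable that `qml.Hermitian(y, wires=[0])` expects for a single wire.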

cost is the function that I want to minimize using the opt.step method.

“The issue is that autograd expects the params to be composed of NumPy arrays of all the same shape.”

But I saw that the Variational Classifier demo (Variational classifier | PennyLane Demos) is able to optimize parameters with different shapes (a 4x3 matrix as circuit parameters and a single number as classical bias) without using the Torch or TF interface. Is there any fundamental difference between my code and the one in the demo?

This is not exactly a ‘solution’, but I found a workaround for this problem: I managed to train the circuit as I wanted by wrapping the QNode as a Keras layer and training it in the ‘Keras way’.

Thanks @eraraya-ricardo, that’s a good point regarding the variational classifier demo.

So it looks like autograd is indeed fine with params being a list of differently-shaped objects. For example, the following works nicely:

import autograd
from autograd import numpy as np

def f(params):
    angles = params[0]
    bias = params[1]
    
    return np.cos(angles[0]) + np.sin(angles[1]) + bias

params = [np.ones(2), 0.5]
df = autograd.grad(f)

df(params)
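As a quick sanity check on what `df(params)` should return, the gradient of this `f` can be written down by hand: -sin(angles[0]) and cos(angles[1]) for the angles, and 1 for the bias. The sketch below compares that with central finite differences using only the standard library. The helper names are illustrative and are not part of autograd:

```python
import math

def f(params):
    angles, bias = params
    return math.cos(angles[0]) + math.sin(angles[1]) + bias

def analytic_grad(params):
    angles, _ = params
    # Same nested structure as params: [gradient w.r.t. angles, gradient w.r.t. bias]
    return [[-math.sin(angles[0]), math.cos(angles[1])], 1.0]

def finite_diff_grad(params, eps=1e-6):
    angles, bias = params
    grad_angles = []
    for i in range(len(angles)):
        up = list(angles)
        up[i] += eps
        down = list(angles)
        down[i] -= eps
        grad_angles.append((f([up, bias]) - f([down, bias])) / (2 * eps))
    grad_bias = (f([angles, bias + eps]) - f([angles, bias - eps])) / (2 * eps)
    return [grad_angles, grad_bias]

params = [[1.0, 1.0], 0.5]
exact = analytic_grad(params)
approx = finite_diff_grad(params)
```

The two results agree to within the finite-difference error, and the gradient has one entry per element of `params` even though the entries have different shapes.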

The issue arises when those parameters are within a QNode, e.g.,

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def f(params):
    qml.Rot(*params[0], 0, wires=0)
    qml.RX(params[1], wires=0)
    return qml.expval(qml.PauliZ(0))

params = [np.ones(2), 0.5]

df = qml.grad(f)

df(params)

The way that the variational classifier demo gets around this (probably by accident) is by defining an intermediate classical function variational_classifier which unpacks the parameters and feeds them to the QNode. For example, the above code can be adapted to:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def f(a, b):
    qml.Rot(*a, 0, wires=0)
    qml.RX(b, wires=0)
    return qml.expval(qml.PauliZ(0))

params = [np.ones(2), 0.5]

def cost(params):
    return f(*params)

dcost = qml.grad(cost)

dcost(params)

This will now work, because cost() unpacks the parameters.

For your code specifically, all you should need to do is update the signature of qcircuit() to something like:
def qcircuit(params0, params1, params2, x=None, y=None):
as well as update the contents of qcircuit correspondingly. Then, in DRC_Conv you can do:

def DRC_Conv(params, x=None, y=None):
    # params[-1] is the classical bias, so only the circuit parameters are unpacked
    return qcircuit(*params[:-1], x=x, y=y) + params[-1]

This should hopefully work. However, I agree that perhaps we should look at supporting this functionality for parameters processed within the QNode.
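To make the unpacking explicit, here is a sketch with a plain Python function standing in for the QNode, so it runs without PennyLane. The parameter names follow the original post; the shapes and the body of `qcircuit_stub` are assumptions chosen only for illustration. The last entry (the classical bias) is split off before unpacking, since the circuit takes only three positional parameter arguments:

```python
import numpy as np

def qcircuit_stub(params0, params1, params2, x=None, y=None):
    """Stand-in for the QNode: accepts three separate parameter arrays."""
    # Placeholder computation; the real circuit would apply gates here.
    return float(np.sum(params0 * x + params1) + np.sum(params2))

def DRC_Conv(params, x=None, y=None):
    # params = [weights_conv, thetas_conv, weights_fc, classical_bias]
    *circuit_params, bias = params
    return qcircuit_stub(*circuit_params, x=x, y=y) + bias

# Hypothetical shapes, deliberately different from one another.
weights_conv = np.ones((2, 9))
thetas_conv = np.zeros((2, 9))
weights_fc = np.ones((3, 27))
classical_bias = 0.5

params = [weights_conv, thetas_conv, weights_fc, classical_bias]
out = DRC_Conv(params, x=np.ones(9))
```

With this arrangement each array reaches the circuit as its own argument, and the bias stays outside the circuit as a purely classical term.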

@eraraya-ricardo, on closer inspection it looks like this should be an issue only when operating in tape mode. However, from the code you shared, it looks like qcircuit necessitates tape mode due to parameter addition (as mentioned here).

I’ve added this as an issue in PennyLane’s GitHub here.