Gradient of learnable circuit parameters is None when using torch interface

Hi, I am implementing a QCNN based on PennyLane. For flexibility, I chose the torch interface. The forward pass seems to go well. However, when I check the gradients of the circuit parameters, I find that some of them are None, which means those parameters will not get updated during optimization.

Here is my code example.

import math

import numpy as np
import pennylane as qml
import torch

def qconv_kernel(phi, params):
    '''
    Implementation of the quantum convolution kernel circuit.
    ------------------------------------------
    :param phi: image pixels, [n_qubits]
    :param params: learnable convolutional weights, [n_layers * n_qubits]
    '''
    # illustrative placeholder circuit (my actual circuit is omitted here)
    n_qubits = len(phi)
    qml.AngleEmbedding(phi, wires=range(n_qubits))
    qml.BasicEntanglerLayers(params.reshape(-1, n_qubits), wires=range(n_qubits))
    return qml.expval(qml.PauliZ(wires=0))

def qconv_torch(inputs, params, qnode, n_in_channels=1, kernel_size=[2, 2], stride=None, padding=False):
    """
    Convolves the input image with repeated applications of the same quantum circuit.
    ------------------------------------------------------------------------
    :param inputs: input image, [C_in, H, W]
    :param params: learnable convolutional weights, [n_out_channels, n_layers * n_qubits]
    :param qnode: QNode object
    :param n_in_channels: number of input channels from the previous layer
    :param kernel_size: size of the qconv kernel, [kernel_h, kernel_w]
    :param stride: step of the qconv, [stride_h, stride_w]
    :param padding: whether to pad, bool
    :return: a new image, [C_out, H_out, W_out]
    """
    if stride is None:
        stride = kernel_size
    in_h, in_w = inputs.shape[1], inputs.shape[2]
    out_h = (in_h - kernel_size[0]) // stride[0] + 1
    out_w = (in_w - kernel_size[1]) // stride[1] + 1
    out = torch.zeros(params.shape[0], out_h, out_w)
    for j in range(0, in_h - kernel_size[0] + 1, stride[0]):
        for k in range(0, in_w - kernel_size[1] + 1, stride[1]):
            channel_results = []
            for i in range(params.shape[0]):  # loop over output channels
                pixel = 0.0
                for c_in in range(inputs.shape[0]):
                    # gather the kernel_h x kernel_w patch as a flat list of pixels
                    img_patch = []
                    for h in range(kernel_size[0]):
                        for w in range(kernel_size[1]):
                            img_patch.append(inputs[c_in, j+h, k+w])
                    out_pixel = qnode(torch.tensor(img_patch), params[i])
                    pixel = pixel + out_pixel  # accumulate over input channels
                channel_results.append(pixel)
            # the tensor re-wrapping on assignment below is where I suspect
            # gradient propagation fails
            out[:, j//stride[0], k//stride[1]] = torch.tensor(channel_results)
    return out

class QCNN_torch(torch.nn.Module):
    def __init__(self, cfg):
        super().__init__()
        n_qubit = sum(cfg.MODEL.QCNN.KERNEL_SIZE)
        self.DIM = cfg.MODEL.QCNN.DIM
        dev = qml.device(cfg.CIRCUIT.BACKEND, wires=n_qubit)
        self.qnode = qml.QNode(qconv_kernel, dev, interface='torch')
        self.qconv_param = []
        for i in range(len(self.DIM)-1):
            # shape1 = n_out_channels, shape2 = n_layers * n_qubit; the exact
            # derivation from cfg is omitted here, assume a single circuit layer
            shape1, shape2 = self.DIM[i+1], n_qubit
            param_weight = np.random.uniform(0, 2*math.pi, shape1*shape2)
            param_weight = np.reshape(param_weight, (shape1, shape2))
            param_weight = torch.nn.Parameter(torch.tensor(param_weight, requires_grad=True))
            self.register_parameter('layer'+str(i+1), param_weight)
            self.qconv_param.append(param_weight)

        # FC
        n_feat = ((28//(2**(len(self.DIM)-1)))**2) * self.DIM[-1]
        stdv = 1./n_feat
        self.fc = np.random.uniform(-stdv, stdv, n_feat*10)
        self.fc = np.reshape(self.fc, (n_feat, 10))
        self.fc = torch.nn.Parameter(torch.FloatTensor(self.fc))

    def forward(self, x):
        out = []
        for b in range(len(x)):
            bx = x[b]
            for i in range(len(self.DIM)-1):
                bx = qconv_torch(bx, self.qconv_param[i], self.qnode, n_in_channels=self.DIM[i])
            out.append(bx.flatten().unsqueeze(0) @ self.fc)
        return torch.cat(out, dim=0)

if __name__ == '__main__':
    opt = get_opt()
    cfg = get_config(opt.config_file)

    loss_fn = torch.nn.CrossEntropyLoss()

    qcnn = QCNN_torch(cfg)
    optimizer = torch.optim.Adam(params=qcnn.parameters(), lr=0.1)
    optimizer.zero_grad()

    inputs = torch.rand(4, 1, 28, 28)
    out = qcnn(inputs)
    
    loss = loss_fn(out, torch.LongTensor([1, 2, 3, 4]))
    # debugging attempt: retain_grad() is a no-op on leaf parameters
    optimizer.param_groups[0]['params'][0].retain_grad()
    loss.backward()
    optimizer.step()

The problem is that after loss.backward() I only get a gradient for qcnn.fc; the grad of every qconv parameter is still None.
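
This is how I check the gradients (just inspecting .grad on every registered parameter):

for name, p in qcnn.named_parameters():
    print(name, p.grad)  # 'layer1', 'layer2', ... print None; only 'fc' has a gradient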

Any suggestions would be helpful. Thank you in advance.

Hi @Yang! Welcome to the forum :slightly_smiling_face:

Would you be able to reduce your code example above to a minimal, non-working example? That is, remove as much of the code as possible while still reproducing the problem.

This will help us pin down exactly what might be going wrong!

Yes, I have removed the unnecessary parts of the code. Since the failure of gradient propagation may come from the tensor assignment operations, I kept most of those. I hope it is clear and short enough.

Hi @josh, thank you for your help. I have solved this problem. It was about PyTorch operations rather than PennyLane. Could you please delete this topic?
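
Edit, for anyone who hits the same symptom: my understanding of the fix (the exact change is not shown above) is that torch.tensor() re-wraps an existing tensor by copying its values, which detaches the copy from the autograd graph, so every pixel written into out that way stops backpropagating into the circuit parameters. A minimal sketch of the difference, keeping the graph with torch.stack instead:

import torch

w = torch.nn.Parameter(torch.ones(3))

# breaks the graph: torch.tensor() copies values and drops grad_fn
bad = torch.tensor([(v * 2).item() for v in w])
print(bad.grad_fn)   # None

# keeps the graph: stack the result tensors instead of re-wrapping them
good = torch.stack([v * 2 for v in w])
good.sum().backward()
print(w.grad)        # tensor([2., 2., 2.])

In the code above, that means assigning torch.stack(channel_results) into out instead of torch.tensor(channel_results).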

No worries @Yang, I will close the topic. Feel free to open a new thread if you have any other questions!