A hybrid model was built using Torch layers, but the parameters were not updated during model training

I created a network connected entirely by qlayers, but the parameters of the network were not updated during training.
The network consists of three layers: the first layer has twenty-five 4-qubit circuits, the second layer five 5-qubit circuits, and the third layer one 5-qubit circuit.

Hey @bown! Welcome to the forum :sunglasses:

Can you share a minimal example that reproduces what you’re seeing? Once I have that, I’ll copy-paste your code and see if I can help :slight_smile:

Thanks for the reply. I seem to have found the cause, but I have no idea how to fix it. Since my task requires designing specific circuits, I can't use qml.AngleEmbedding or qml.BasicEntanglerLayers. This is one of the first-layer 4-qubit circuits:

import pennylane as qml

dev_layer3 = qml.device("default.qubit", wires=4)
@qml.qnode(dev_layer3)
def circuit_template_3(inputs, params): ##### ZXIZ-->YXYX-->ZXIZ
    
    qml.Hadamard(wires=0)
    qml.Hadamard(wires=1)
    qml.Hadamard(wires=2)
    qml.Hadamard(wires=3)
    qml.RZ(inputs[0], wires=0)
    qml.RZ(inputs[1], wires=1)
    qml.RZ(inputs[2], wires=2)
    qml.RZ(inputs[3], wires=3)      
    
    qml.RX(params[0], wires=0)
    qml.RY(params[1], wires=0)
    qml.RY(params[2], wires=1)
    qml.RZ(params[3], wires=1)
    qml.RX(params[6], wires=3)
    qml.RY(params[7], wires=3)
    
    qml.CNOT(wires=[0, 1])
    qml.CNOT(wires=[1, 2])
    qml.CNOT(wires=[2, 3])
    qml.CNOT(wires=[3, 0])
    
    qml.RZ(params[8], wires=0)
    qml.RX(params[9], wires=0)
    qml.RY(params[10], wires=1)
    qml.RZ(params[11], wires=1)
    qml.RZ(params[12], wires=2)
    qml.RX(params[13], wires=2)
    qml.RY(params[14], wires=3)
    qml.RZ(params[15], wires=3)
    
    qml.CNOT(wires=[3, 0])
    qml.CNOT(wires=[2, 3])
    qml.CNOT(wires=[1, 2])
    qml.CNOT(wires=[0, 1])
        
    return qml.expval(qml.PauliZ(0)@qml.PauliX(1)@qml.PauliZ(3))
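
For reference, the circuit can be sanity-checked with random angles; the shapes below are inferred from the indices used above (params[4] and params[5] are unused, so 16 slots are allocated):

import numpy as np

inputs = np.random.uniform(0, 2 * np.pi, size=4)   # one 4-feature sample
params = np.random.uniform(0, 2 * np.pi, size=16)  # indices 0-15
print(circuit_template_3(inputs, params))          # a single expectation value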

This is one of the second-layer 5-qubit circuits:

import pennylane as qml

dev_layer = qml.device("default.qubit", wires=5)   
@qml.qnode(dev_layer)
def circuit_template_other(inputs, params): 
    
    qml.Hadamard(wires=0)
    qml.Hadamard(wires=1)
    qml.Hadamard(wires=2)
    qml.Hadamard(wires=3)
    qml.Hadamard(wires=4)

    qml.RZ(inputs[0]*params[0], wires=0)
    qml.RZ(inputs[1]*params[1], wires=1)
    qml.RZ(inputs[2]*params[2], wires=2)
    qml.RZ(inputs[3]*params[3], wires=3)
    qml.RZ(inputs[4]*params[4], wires=4)
    qml.RX(params[5], wires=0)
    qml.RY(params[6], wires=0)
    qml.RX(params[7], wires=1)
    qml.RY(params[8], wires=1)
    qml.RX(params[9], wires=2)
    qml.RY(params[10], wires=2)
    qml.RX(params[11], wires=3)
    qml.RY(params[12], wires=3)
    qml.RX(params[13], wires=4)
    qml.RY(params[14], wires=4)
    return qml.expval(qml.PauliZ(0)@qml.PauliZ(1)@qml.PauliZ(2)@qml.PauliZ(3))

In the second layer, some of the parameters are multiplied with the input data. Is this written correctly, and can these parameters be updated during training?

Besides, my current batch size is 1; if I increase the batch size, my circuit can no longer accept the data. How can this circuit be modified so that it receives batched data and returns batched expectation values?

Just like the code in the example:

@qml.qnode(dev)
def qnode(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(wires=i)) for i in range(n_qubits)]

Thank you for your patience and look forward to your reply!

Hey @bown,

In order to try and replicate the issue, I need the full code. What you attached looks like valid PennyLane code and I don’t see any issues :sweat_smile:
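
That said, to your first question: multiplying data and trainable weights inside a rotation angle is valid, and the result is differentiable with respect to the weights, so those parameters can be updated during training. Here's a minimal sketch of how to check this, assuming circuit_template_other above (15 weights) wrapped in a TorchLayer:

import torch
import pennylane as qml

layer = qml.qnn.TorchLayer(circuit_template_other, {"params": 15})

x = torch.rand(5)  # a single 5-feature sample
out = layer(x)
out.backward()

# weights acting on the measured wires get non-zero gradients; weights that
# only touch wire 4 (which is not measured) will have zero gradient
for name, p in layer.named_parameters():
    print(name, p.grad)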

Hello, I've finally found the problem. Could you please take a look at these two pieces of code? I only changed the way the circuit is written, but the outputs are completely different.

The first code:

import pennylane as qml
import numpy as np
import torch

weights = np.array([1, 2, 3, 4])
n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

# X and y_hot come from the make_moons setup shown later in this thread
data_loader = torch.utils.data.DataLoader(
    list(zip(X, y_hot)), batch_size=5, shuffle=True, drop_last=True
)

@qml.qnode(dev)
def qnode(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(2))
    qml.RX(weights[0], wires=0)
    qml.RX(weights[1], wires=1)    
    qml.RX(weights[2], wires=0)
    qml.RX(weights[3], wires=1) 
    return [qml.expval(qml.PauliZ(wires=1))]

for xs, ys in data_loader:
    print(qnode(xs, weights))

Code output:

[tensor([0.9241, 0.8460, 0.9479, 0.9955, 0.9339], dtype=torch.float64)]
[tensor([0.9918, 0.7078, 0.9917, 0.7965, 0.9995], dtype=torch.float64)]
[tensor([0.9222, 0.7075, 0.9918, 0.7471, 0.7211], dtype=torch.float64)]
[tensor([0.9785, 0.7968, 0.8818, 0.8626, 0.9326], dtype=torch.float64)]

Then I use another QNode, and I find that its output is completely different from the first one. The circuit seems to process only the first sample in the batch, and the rest of the data never enters the circuit:

@qml.qnode(dev)
def qnode(inputs, weights):
    qml.RZ(inputs[:,0]*weights[0], wires=0)
    qml.RZ(inputs[:,1]*weights[1], wires=1)
    qml.RX(weights[2], wires=0)
    qml.RX(weights[3], wires=1)
    return [qml.expval(qml.PauliZ(wires=1))]

for xs, ys in data_loader:
    print(qnode(xs, weights))

Code output:

[tensor([-0.6536, -0.6536, -0.6536, -0.6536, -0.6536], dtype=torch.float64)]
[tensor([-0.6536, -0.6536, -0.6536, -0.6536, -0.6536], dtype=torch.float64)]
[tensor([-0.6536, -0.6536, -0.6536, -0.6536, -0.6536], dtype=torch.float64)]
[tensor([-0.6536, -0.6536, -0.6536, -0.6536, -0.6536], dtype=torch.float64)]

But my approach requires the second style of QNode; I can't use qml.AngleEmbedding, so please advise me on how to modify this code.

Hey @bown, there are two things wrong with your second QNode:

  1. By default, qml.AngleEmbedding uses RX rotations (see qml.AngleEmbedding — PennyLane 0.35.1 documentation), so you should replace your RZ operations with RX ones. An RZ acting on the initial |0⟩ state only adds a global phase, which is why your outputs don't depend on the inputs.

  2. When you have two rotations expressed as e^{i \theta X} and e^{i \phi X} (like RX), then e^{i \theta X} e^{i \phi X} = e^{i (\theta + \phi) X}, not e^{i \theta \phi X}. So multiplying the data and the weight inside a single rotation angle is not the same as applying a data rotation followed by a weight rotation; see the sketches below.
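
Here's a quick numerical check of point 2 (a minimal sketch):

import pennylane as qml

dev_check = qml.device("default.qubit", wires=1)

@qml.qnode(dev_check)
def two_rotations(theta, phi):
    qml.RX(theta, wires=0)
    qml.RX(phi, wires=0)
    return qml.expval(qml.PauliZ(0))

@qml.qnode(dev_check)
def summed_angle(theta, phi):
    qml.RX(theta + phi, wires=0)
    return qml.expval(qml.PauliZ(0))

@qml.qnode(dev_check)
def multiplied_angle(theta, phi):
    qml.RX(theta * phi, wires=0)
    return qml.expval(qml.PauliZ(0))

theta, phi = 0.3, 0.7
print(two_rotations(theta, phi), summed_angle(theta, phi))  # equal: cos(1.0)
print(multiplied_angle(theta, phi))                         # different: cos(0.21)

And putting both points together, your first QNode can be reproduced without qml.AngleEmbedding. This is a sketch reusing your dev, weights, and data_loader from above; the column indexing plus default.qubit's parameter broadcasting takes care of the batch dimension:

@qml.qnode(dev)
def qnode(inputs, weights):
    # RX instead of RZ, matching qml.AngleEmbedding's default rotation
    qml.RX(inputs[:, 0], wires=0)  # column indexing broadcasts over the batch
    qml.RX(inputs[:, 1], wires=1)
    qml.RX(weights[0], wires=0)
    qml.RX(weights[1], wires=1)
    qml.RX(weights[2], wires=0)
    qml.RX(weights[3], wires=1)
    return [qml.expval(qml.PauliZ(wires=1))]

for xs, ys in data_loader:
    print(qnode(xs, weights))  # should match your first QNode's batched outputs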

Hope this helps!

Sorry, that was careless of me; I overlooked this. I modified the example code from "Turning quantum nodes into Torch Layers": I removed the classical layers and used three qlayers. I feed the data into two qlayers and then use a third qlayer to receive the outputs of the previous layer. However, the model's loss does not change during training, and neither does the accuracy.

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_moons
import torch
import pennylane as qml


layer_1 = torch.nn.Linear(2, 2)
layer_2 = torch.nn.Linear(2, 2)
softmax = torch.nn.Softmax(dim=1)

layers = [layer_1, layer_2, softmax]
model = torch.nn.Sequential(*layers)
# Set random seeds
torch.manual_seed(42)
np.random.seed(42)

X, y = make_moons(n_samples=200, noise=0.1)
y_ = torch.unsqueeze(torch.tensor(y), 1)  # used for one-hot encoded labels
y_hot = torch.scatter(torch.zeros((200, 2)), 1, y_, 1)

c = ["#1f77b4" if y_ == 0 else "#ff7f0e" for y_ in y]  # colours for each class
plt.axis("off")
plt.scatter(X[:, 0], X[:, 1], c=c)
plt.show()

dev = qml.device("default.qubit", wires=1)
@qml.qnode(dev)
def qnode(inputs, params): 
    # load data
    qml.Hadamard(wires=0)
    qml.RZ(inputs[:,0]*params[0], wires=0)        
    # params
    qml.RY(params[1], wires=0)
    qml.RZ(params[2], wires=0)
        
    return [qml.expval(qml.PauliX(0))]

dev1 = qml.device("default.qubit", wires=2)
@qml.qnode(dev1)
def qnode_1(inputs, param): 
    # load data
    qml.Hadamard(wires=0)
    qml.RZ(inputs[:,0]*param[0], wires=0)        
    qml.RZ(inputs[:,1]*param[1], wires=1)        

    # params
    qml.RY(param[2], wires=0)
    qml.RZ(param[3], wires=1)
        
    return [qml.expval(qml.PauliX(0))]

opt = torch.optim.SGD(model.parameters(), lr=0.2)
loss = torch.nn.L1Loss()
weight_shapes = {"params": 3}
weight_shapes_1 = {"param": 4}

class HybridModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.qlayer_1 = qml.qnn.TorchLayer(qnode, weight_shapes)
        self.qlayer_2 = qml.qnn.TorchLayer(qnode, weight_shapes)
        self.qlayer_3 = qml.qnn.TorchLayer(qnode_1, weight_shapes_1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x_1, x_2 = torch.split(x, 1, dim=1)
        x_1 = self.qlayer_1(x_1)
        x_2 = self.qlayer_2(x_2)
        x = torch.cat([x_1, x_2], dim=1)
        x = self.qlayer_3(x)
        return self.sigmoid(x)

model = HybridModel()
opt = torch.optim.SGD(model.parameters(), lr=0.2)
epochs = 6
X = torch.tensor(X, requires_grad=True).float()
y_hot = y_hot.float()

batch_size = 5
batches = 200 // batch_size

data_loader = torch.utils.data.DataLoader(
    list(zip(X, y_hot)), batch_size=5, shuffle=True, drop_last=True
)

epochs = 6
for epoch in range(epochs):

    running_loss = 0

    for xs, ys in data_loader:
        opt.zero_grad()

        loss_evaluated = loss(model(xs), ys)
        loss_evaluated.backward()

        opt.step()

        running_loss += loss_evaluated

    avg_loss = running_loss / batches
    print("Average loss over epoch {}: {:.4f}".format(epoch + 1, avg_loss))

y_pred = model(X)
predictions = torch.argmax(y_pred, axis=1).detach().numpy()

correct = [1 if p == p_true else 0 for p, p_true in zip(predictions, y)]
accuracy = sum(correct) / len(correct)
print(f"Accuracy: {accuracy * 100}%")

Here's the output when the code runs:

D:\keyan\TEMP\ipykernel_14972\2438726278.py:3: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  X = torch.tensor(X, requires_grad=True).float()
Average loss over epoch 1: 0.5000
Average loss over epoch 2: 0.5000
Average loss over epoch 3: 0.5000
Average loss over epoch 4: 0.5000
Average loss over epoch 5: 0.5000
Average loss over epoch 6: 0.5000
Accuracy: 50.0%

Can you help me modify this code to make it work? Thank you so much for your patience in understanding my problem.

Hey @bown,

I was able to finagle your code to make it work :slight_smile:. Some changes to note:

  1. To get your code to work with the given dataset, your model's output should match the dimensionality of the dataset labels. The labels are one-hot encoded, so you could use sigmoid and reshape the data accordingly, but softmax matches the label format right from the start.
  2. I changed the forward pass a bit :slight_smile:

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_moons
import torch
import pennylane as qml


layer_1 = torch.nn.Linear(2, 2)
layer_2 = torch.nn.Linear(2, 2)
softmax = torch.nn.Softmax(dim=1)

layers = [layer_1, layer_2, softmax]
model = torch.nn.Sequential(*layers)

# Set random seeds
torch.manual_seed(42)
np.random.seed(42)

X, y = make_moons(n_samples=200, noise=0.1)
y_ = torch.unsqueeze(torch.tensor(y), 1)  # used for one-hot encoded labels
y_hot = torch.scatter(torch.zeros((200, 2)), 1, y_, 1)

c = ["#1f77b4" if y_ == 0 else "#ff7f0e" for y_ in y]  # colours for each class
plt.axis("off")
plt.scatter(X[:, 0], X[:, 1], c=c)
plt.show()

dev = qml.device("default.qubit", wires=1)   
@qml.qnode(dev)
def qnode(inputs, params): 
    # load data
    qml.Hadamard(wires=0)
    qml.RZ(inputs[:,0]*params[0], wires=0)        
    # params
    qml.RY(params[1], wires=0)
    qml.RZ(params[2], wires=0)
        
    # a bare expval (no list) gives this layer an output of shape (batch,)
    return qml.expval(qml.PauliX(0))

test_input = torch.rand((5, 1))
test_params = torch.rand((3,))

print(qnode(test_input, test_params))

dev1 = qml.device("default.qubit", wires=2)   
@qml.qnode(dev1)
def qnode_1(inputs, param): 
    # load data
    qml.Hadamard(wires=0)
    qml.RZ(inputs[:,0]*param[0], wires=0)        
    qml.RZ(inputs[:,1]*param[1], wires=1)        

    # params
    qml.RY(param[2], wires=0)
    qml.RZ(param[3], wires=1)
        
    # two expectation values make qlayer_3's output (batch, 2), matching the one-hot labels
    return [qml.expval(qml.PauliX(0)), qml.expval(qml.PauliX(1))]

test_input = torch.rand((5, 2))
test_params = torch.rand((4,))

print(qnode_1(test_input, test_params))

opt = torch.optim.SGD(model.parameters(), lr=0.2)
loss = torch.nn.L1Loss()
weight_shapes = {"params": 3}
weight_shapes_1 = {"param": 4}

class HybridModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.qlayer_1 = qml.qnn.TorchLayer(qnode, weight_shapes)
        self.qlayer_2 = qml.qnn.TorchLayer(qnode, weight_shapes)
        self.qlayer_3 = qml.qnn.TorchLayer(qnode_1, weight_shapes_1)
        self.softmax = torch.nn.Softmax(dim=1)

    def forward(self, x):
        x_1, x_2 = torch.split(x, 1, dim=1)
        x_1 = self.qlayer_1(x_1)
        x_2 = self.qlayer_2(x_2)
        x = torch.stack((x_1, x_2), dim=1)  # pair each sample's two outputs: shape (batch, 2)
        x = self.qlayer_3(x)
        return self.softmax(x)

model = HybridModel()

model(torch.from_numpy(X))

opt = torch.optim.SGD(model.parameters(), lr=0.2)
epochs = 6
X = torch.tensor(X, requires_grad=True).float()
y_hot = y_hot.float()

batch_size = 5
batches = 200 // batch_size

data_loader = torch.utils.data.DataLoader(
    list(zip(X, y_hot)), batch_size=5, shuffle=True, drop_last=True
)

epochs = 6
for epoch in range(epochs):

    running_loss = 0

    for xs, ys in data_loader:
        opt.zero_grad()

        #print(model(xs).size())
        #print(ys.size())

        loss_evaluated = loss(model(xs), ys)
        loss_evaluated.backward()

        opt.step()

        running_loss += loss_evaluated

    #print(model(xs))
    avg_loss = running_loss / batches
    print("Average loss over epoch {}: {:.4f}".format(epoch + 1, avg_loss))

y_pred = model(X)
predictions = torch.argmax(y_pred, axis=1).detach().numpy()

correct = [1 if p == p_true else 0 for p, p_true in zip(predictions, y)]
accuracy = sum(correct) / len(correct)
print(f"Accuracy: {accuracy * 100}%")
Average loss over epoch 1: 0.5231
Average loss over epoch 2: 0.5009
Average loss over epoch 3: 0.4763
Average loss over epoch 4: 0.4619
Average loss over epoch 5: 0.4844
Average loss over epoch 6: 0.5043
Accuracy: 46.5%