How to do long short-term memory (LSTM) with PennyLane

Hello, I am very new to PennyLane and could not find any demo or anything on LSTMs with PennyLane, and I am trying to understand how to do a QLSTM.


I want to mimic something like this very closely but just do not understand it well enough. If anyone could point me to a YouTube video, blog, or anything else that can help me understand how to use PennyLane for a QLSTM, that would be much appreciated!!

Hey @Aadi_Tiwari! Welcome to the forum :tada:!

Really cool project! What I recommend is starting simpler. Based on your notebook, it looks like you're trying to implement this paper. The recurrent unit there is relatively complicated!

What I suggest for a “simpler” QRNN is as follows. You can keep the same circuit design as in the paper referenced above, but the bulk of your RNN recurrent unit can literally just be that circuit, where the measurements are, say, Pauli Z expectations in your embedded space. Then, you can have a classical processing layer (a softmax layer) to turn the measurement data from your quantum RNN into a probability vector (the final output of your recurrent cell).

On paper this should work! Try this out and see if you can get that functioning (it may not be very “trainable”, per se, but this simpler algorithm should still have the same properties as any other RNN).
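To make that a bit more concrete, here's a rough sketch of the kind of recurrent cell I have in mind. To be clear, this is not the circuit from the paper: I'm just using default.qubit with AngleEmbedding and BasicEntanglerLayers as stand-ins, and all of the names and sizes here (SimpleQRNN, n_qubits = 4, etc.) are made up.

import torch
import torch.nn as nn
import pennylane as qml

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def recurrent_circuit(inputs, weights):
    # Encode the compressed [input, hidden state] vector as rotation angles
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    # Variational block that plays the role of the recurrent unit
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    # Pauli-Z expectation values become the new hidden state
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

class SimpleQRNN(nn.Module):
    def __init__(self, input_size, n_classes, n_layers=2):
        super().__init__()
        self.cell = qml.qnn.TorchLayer(recurrent_circuit, {"weights": (n_layers, n_qubits)})
        self.proj = nn.Linear(input_size + n_qubits, n_qubits)  # shrink [x_t, h_t] to the qubit count
        self.head = nn.Linear(n_qubits, n_classes)  # classical post-processing layer

    def forward(self, x):
        # x has shape (batch_size, seq_length, input_size)
        batch_size, seq_length, _ = x.shape
        h = torch.zeros(batch_size, n_qubits)
        for t in range(seq_length):
            # feed [x_t, h_t] through the circuit to get the next hidden state
            h = self.cell(self.proj(torch.cat([x[:, t, :], h], dim=1))).to(x.dtype)
        # softmax turns the final measurement data into a probability vector
        return torch.softmax(self.head(h), dim=1)

model = SimpleQRNN(input_size=3, n_classes=2)
probs = model(torch.randn(5, 8, 3))  # 5 sequences of length 8 -> (5, 2): one probability vector per sequence

The only thing happening between time steps is the circuit plus a couple of small classical layers; training it with PyTorch works the same way as for a classical RNN.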

Thank you so much for replying! To be honest, though, I understood very little of what you said since I am very new to PennyLane. I was wondering if you could give me a resource to help me learn regression with PennyLane.

Hey! Sorry about that! The notebook that you attached, is that your work? I assumed that it was :sweat_smile:! Are you just wanting to better understand how RNNs work, and then translate that knowledge into a better understanding of the paper & notebook that you attached?

Haha, I should've made it clear that the link wasn't mine, my bad. Yeah, I would like to understand RNNs so I can understand what he did and edit that code to work for my situation; otherwise I would have no idea what I was doing.

Important to note: I want to understand RNNs and LSTMs in order to use them for nonlinear regression.

That's awesome! I really recommend checking out Andrew Ng's Coursera course to learn about anything in classical machine learning. Here is the link specifically to the lesson on RNNs:

That's what I learned from when I was learning about RNNs!

Hello @isaacdevlugt,

I am training a Quantum LSTM using PyTorch and PennyLane as shown in the paper. I also save my model after training as a state-dictionary file. However, when I use the saved model state dict file in a different notebook to perform some experiments, I do not get the right results. This is not the case when I run the experiments in the same file right after training. Here is a snippet of my training code and my QLSTM model:

Qmodel = QShallowRegressionLSTM(num_sensors=len(features), hidden_units=num_hidden_units, n_qubits=4, output_size=len(target))
loss_function = nn.MSELoss()
optimizer = torch.optim.Adagrad(Qmodel.parameters(), lr=learning_rate)

scheduler = ReduceLROnPlateau(optimizer, 'min', patience=5, factor=0.2, verbose=True)

####################################################################################################

# Assuming the necessary imports and data loaders are already set up

quantum_loss_train = []
quantum_loss_test = []

# Directory to save the model states
save_dir = 'model_states'
os.makedirs(save_dir, exist_ok=True)

log_dir = 'logs_fs5k_struppi'
os.makedirs(log_dir, exist_ok=True)
logging.basicConfig(filename=os.path.join(log_dir, 'quantum.log'), level=logging.INFO)
logger = logging.getLogger()

print("Untrained test\n--------")
start = time.time()
test_loss = test_model(test_loader, Qmodel, loss_function)
end = time.time()
print("Execution time", end - start)

logger.info(f"Untrained test - Test Loss: {test_loss:.7f}, Execution time: {end - start:.2f} seconds")
quantum_loss_test.append(test_loss)

for ix_epoch in range(5000):
    print(f"Epoch {ix_epoch}\n---------")
    start = time.time()
    train_loss = train_model(train_loader, Qmodel, loss_function, optimizer=optimizer)
    test_loss = test_model(test_loader, Qmodel, loss_function)
    end = time.time()
    print("Execution time", end - start)
    logger.info(f"Epoch {ix_epoch} - Train Loss: {train_loss:.7f}, Test Loss: {test_loss:.7f}, Execution time: {end - start:.2f} seconds")
    quantum_loss_train.append(train_loss)
    quantum_loss_test.append(test_loss)
    # scheduler.step(test_loss)
    torch.save(Qmodel.state_dict(), os.path.join(save_dir, f'{batch_size}b_epoch_{ix_epoch+1}_state.pth'))

# Save the final model state
torch.save(Qmodel.state_dict(), f'Fullstate_{sequence_length}Quantum_state{batch_size}b_5kEpochs.pth')
logger.info("Final model state saved.")
####################################################################################################

class QLSTM(nn.Module):
    def __init__(self,
                 input_size,
                 output_size,
                 hidden_size,
                 n_qubits=4,
                 n_qlayers=1,
                 n_vrotations=3,
                 batch_first=True,
                 return_sequences=False,
                 return_state=False,
                 backend="lightning.qubit"):
        super(QLSTM, self).__init__()
        self.n_inputs = input_size
        self.hidden_size = hidden_size
        self.concat_size = self.n_inputs + self.hidden_size
        self.n_qubits = n_qubits
        self.n_qlayers = n_qlayers
        self.n_vrotations = n_vrotations
        self.backend = backend  # "default.qubit", "qiskit.basicaer", "qiskit.ibm"

        self.batch_first = batch_first
        self.return_sequences = return_sequences
        self.return_state = return_state

        self.wires_forget = [f"wire_forget_{i}" for i in range(self.n_qubits)]
        self.wires_input = [f"wire_input_{i}" for i in range(self.n_qubits)]
        self.wires_update = [f"wire_update_{i}" for i in range(self.n_qubits)]
        self.wires_output = [f"wire_output_{i}" for i in range(self.n_qubits)]

        self.dev_forget = qml.device(self.backend, wires=self.wires_forget)
        self.dev_input = qml.device(self.backend, wires=self.wires_input)
        self.dev_update = qml.device(self.backend, wires=self.wires_update)
        self.dev_output = qml.device(self.backend, wires=self.wires_output)

        #self.dev_forget = qml.device(self.backend, wires=self.n_qubits)
        #self.dev_input = qml.device(self.backend, wires=self.n_qubits)
        #self.dev_update = qml.device(self.backend, wires=self.n_qubits)
        #self.dev_output = qml.device(self.backend, wires=self.n_qubits)
        def ansatz(params, wires_type):
            # Entangling layer.
            for i in range(1, 3):
                for j in range(self.n_qubits):
                    if j + i < self.n_qubits:
                        qml.CNOT(wires=[wires_type[j], wires_type[j + i]])
                    else:
                        qml.CNOT(wires=[wires_type[j], wires_type[j + i - self.n_qubits]])

            # Variational layer.
            for i in range(self.n_qubits):
                qml.RX(params[0][i], wires=wires_type[i])
                qml.RY(params[1][i], wires=wires_type[i])
                qml.RZ(params[2][i], wires=wires_type[i])

        def VQC(features, weights, wires_type):
            # Preprocess input data to encode the initial state.
            #qml.templates.AngleEmbedding(features, wires=wires_type)
            # print(features.shape)  # (batch_size, 4)

            batch_size = features.shape[0]
            num_features = features.shape[1]

            for batch in range(batch_size):
                ry_params = [torch.arctan(feature) for feature in features[batch]]
                rz_params = [torch.arctan(feature**2) for feature in features[batch]]

                for i in range(self.n_qubits):
                    qml.Hadamard(wires=wires_type[i])
                    qml.RY(ry_params[i], wires=wires_type[i])
                    qml.RZ(rz_params[i], wires=wires_type[i])

            # ry_params = [torch.arctan(feature) for feature in features.squeeze()]
            # rz_params = [torch.arctan(feature**2) for feature in features.squeeze()]
            # for i in range(self.n_qubits):
            #     qml.Hadamard(wires=wires_type[i])
            #     qml.RY(ry_params[i], wires=wires_type[i])
            #     qml.RZ(ry_params[i], wires=wires_type[i])

            # Variational block.
            qml.layer(ansatz, self.n_qlayers, weights, wires_type=wires_type)

        def _circuit_forget(inputs, weights):
            VQC(inputs, weights, self.wires_forget)
            return [qml.expval(qml.PauliZ(wires=i)) for i in self.wires_forget]
        self.qlayer_forget = qml.QNode(_circuit_forget, self.dev_forget, interface="torch")

        def _circuit_input(inputs, weights):
            VQC(inputs, weights, self.wires_input)
            return [qml.expval(qml.PauliZ(wires=i)) for i in self.wires_input]
        self.qlayer_input = qml.QNode(_circuit_input, self.dev_input, interface="torch")

        def _circuit_update(inputs, weights):
            VQC(inputs, weights, self.wires_update)
            return [qml.expval(qml.PauliZ(wires=i)) for i in self.wires_update]
        self.qlayer_update = qml.QNode(_circuit_update, self.dev_update, interface="torch")

        def _circuit_output(inputs, weights):
            VQC(inputs, weights, self.wires_output)
            return [qml.expval(qml.PauliZ(wires=i)) for i in self.wires_output]
        self.qlayer_output = qml.QNode(_circuit_output, self.dev_output, interface="torch")

        weight_shapes = {"weights": (self.n_qlayers, self.n_vrotations, self.n_qubits)}
        print(f"weight_shapes = (n_qlayers, n_vrotations, n_qubits) = ({self.n_qlayers}, {self.n_vrotations}, {self.n_qubits})")

        self.clayer_in = torch.nn.Linear(self.concat_size, self.n_qubits)
        self.VQC = {
            'forget': qml.qnn.TorchLayer(self.qlayer_forget, weight_shapes),
            'input': qml.qnn.TorchLayer(self.qlayer_input, weight_shapes),
            'update': qml.qnn.TorchLayer(self.qlayer_update, weight_shapes),
            'output': qml.qnn.TorchLayer(self.qlayer_output, weight_shapes)
        }
        self.clayer_out = torch.nn.Linear(self.n_qubits, self.hidden_size)
        #self.clayer_out = [torch.nn.Linear(n_qubits, self.hidden_size) for _ in range(4)]

    def forward(self, x, init_states=None):
        '''
        x.shape is (batch_size, seq_length, feature_size)
        recurrent_activation -> sigmoid
        activation -> tanh
        '''
        if self.batch_first is True:
            batch_size, seq_length, features_size = x.size()
        else:
            seq_length, batch_size, features_size = x.size()

        hidden_seq = []
        if init_states is None:
            h_t = torch.zeros(batch_size, self.hidden_size)  # hidden state (output)
            c_t = torch.zeros(batch_size, self.hidden_size)  # cell state
        else:
            # for now we ignore the fact that in PyTorch you can stack multiple RNNs
            # so we take only the first elements of the init_states tuple init_states[0][0], init_states[1][0]
            h_t, c_t = init_states
            h_t = h_t[0]
            c_t = c_t[0]

        for t in range(seq_length):
            # get features from the t-th element in seq, for all entries in the batch
            x_t = x[:, t, :]

            # Concatenate input and hidden state
            v_t = torch.cat((h_t, x_t), dim=1)

            # match qubit dimension
            y_t = self.clayer_in(v_t)

            f_t_list = []
            i_t_list = []
            g_t_list = []
            o_t_list = []

            for b in range(batch_size):
                f_t_list.append(self.clayer_out(self.VQC['forget'](y_t[b].unsqueeze(0))))
                i_t_list.append(self.clayer_out(self.VQC['input'](y_t[b].unsqueeze(0))))
                g_t_list.append(self.clayer_out(self.VQC['update'](y_t[b].unsqueeze(0))))
                o_t_list.append(self.clayer_out(self.VQC['output'](y_t[b].unsqueeze(0))))

            # print('y_t', y_t)
            # print('shape of y_t', y_t.shape)
            f_t = torch.sigmoid(torch.cat(f_t_list, dim=0))
            i_t = torch.sigmoid(torch.cat(i_t_list, dim=0))
            g_t = torch.tanh(torch.cat(g_t_list, dim=0))
            o_t = torch.sigmoid(torch.cat(o_t_list, dim=0))

            # f_t = torch.sigmoid(self.clayer_out(self.VQC['forget'](y_t)))  # forget block
            # i_t = torch.sigmoid(self.clayer_out(self.VQC['input'](y_t)))  # input block
            # g_t = torch.tanh(self.clayer_out(self.VQC['update'](y_t)))  # update block
            # o_t = torch.sigmoid(self.clayer_out(self.VQC['output'](y_t)))  # output block

            c_t = (f_t * c_t) + (i_t * g_t)
            h_t = o_t * torch.tanh(c_t)

            hidden_seq.append(h_t.unsqueeze(0))
        hidden_seq = torch.cat(hidden_seq, dim=0)
        hidden_seq = hidden_seq.transpose(0, 1).contiguous()
        return hidden_seq, (h_t, c_t)

class QShallowRegressionLSTM(nn.Module):
    def __init__(self, num_sensors, hidden_units, n_qubits=0, n_qlayers=1, output_size=4):
        super().__init__()
        self.num_sensors = num_sensors  # this is the number of features
        self.hidden_units = hidden_units
        self.num_targets = output_size
        self.num_layers = 1

        #self.lstm = nn.LSTM(
        #    input_size=num_sensors,
        #    hidden_size=hidden_units,
        #    batch_first=True,
        #    num_layers=self.num_layers
        #)

        self.lstm = QLSTM(
            input_size=num_sensors,
            hidden_size=hidden_units,
            batch_first=True,
            n_qubits=n_qubits,
            n_qlayers=n_qlayers,
            output_size=self.num_targets
        )

        self.linear = nn.Linear(in_features=self.hidden_units, out_features=4)

    def forward(self, x):
        batch_size = x.shape[0]
        h0 = torch.zeros(self.num_layers, batch_size, self.hidden_units).requires_grad_()
        c0 = torch.zeros(self.num_layers, batch_size, self.hidden_units).requires_grad_()

        _, (hn, _) = self.lstm(x, (h0, c0))
        out = self.linear(hn).flatten()  # First dim of hn is num_layers, which is set to 1 above.

        return out

Please let me know how to resolve this issue. Thank you.

Hey @Siva_Karthikeya!

It would help if we could reduce your code down to something more manageable that still produces the discrepancy you're observing :pray:. In the meantime, I recommend checking out our documentation for TorchLayer (see Usage Details): qml.qnn.TorchLayer — PennyLane 0.37.0 documentation

Under “Usage Details”, there’s a section about model saving. Let me know if that helps!
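In case it's useful, the pattern I'd expect to work looks roughly like this. The filename is made up, and I'm assuming features, num_hidden_units, and target are defined identically in both notebooks, since the model has to be rebuilt with exactly the same hyperparameters before loading the state dict:

# In the training notebook
torch.save(Qmodel.state_dict(), "qlstm_state.pth")

# In the second notebook: recreate the model with the same arguments, then load the weights
Qmodel = QShallowRegressionLSTM(num_sensors=len(features), hidden_units=num_hidden_units,
                                n_qubits=4, output_size=len(target))
Qmodel.load_state_dict(torch.load("qlstm_state.pth"))
Qmodel.eval()  # switch out of training mode before running your experiments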

Dear @isaacdevlugt,

Thank you for your response. The QLSTM I'm using is the same as the one in this GitHub repository: GitHub - rdisipio/qlstm: Example of a Quantum LSTM
Except that I added a few model-saving instructions after the training loop, which apparently don't work. If I understood your message correctly, do you mean to say that you are looking into it? Thank you.

Apologies for the confusion! If you could reduce your code down to something way smaller that reproduces the discrepancy, that would greatly help. It takes a lot of effort for us to discern what’s happening and what’s important in large, custom code bases.
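As a rough idea of what I mean, something small and self-contained along these lines (a toy TorchLayer model that gets saved, reloaded, and evaluated on the same input) would be perfect if you can get it to show the problem. Everything here (the circuit, shapes, and filename) is arbitrary and just for illustration:

import torch
import pennylane as qml

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def circuit(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

def make_model():
    # a fresh model with randomly initialized quantum and classical weights
    qlayer = qml.qnn.TorchLayer(circuit, {"weights": (1, n_qubits)})
    return torch.nn.Sequential(qlayer, torch.nn.Linear(n_qubits, 1))

x = torch.rand(3, n_qubits)

model = make_model()
before = model(x)
torch.save(model.state_dict(), "tiny_model.pth")

model2 = make_model()
model2.load_state_dict(torch.load("tiny_model.pth"))
after = model2(x)

print(torch.allclose(before, after))  # should print True if saving/loading round-trips correctly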