Why does PennyLane need a lot of memory right at the very beginning?

I use Google Colab to run a QML model with 16 qubits, and I find that the RAM usage is high (about 28 GB) at the very beginning. After that, however, the RAM usage stays at almost 1 GB. I use diff_method="backprop". Can anyone tell me the reason?

Hey zyh1999,

Welcome to Xanadu’s discussion forum and thanks for the question!

Memory usage is certainly a key issue when simulating quantum systems. Would you be able to share the code that uses the initial 28 GB of RAM? This will help us determine the problem you're experiencing.

Many thanks!

The brief code is shown below.
I found that if there is no "backprop" step, the RAM usage is much lower.
What confuses me is that the RAM usage fluctuates: it does not become large on every "backprop" step, only sometimes. However, it definitely becomes large the first time a "backprop" step runs (i.e. the first call to opt.step_and_cost).

train_dev = qml.device("default.qubit.autograd", wires=n_wires)

def circuit(params, n_wires):
    # ... some layers; the depth is about 100 and n_wires is 16 (the number of qubits)
    ...
@qml.qnode(train_dev,diff_method="backprop")
def training_circuit(params,n_wires,qubit):
    circuit(params,n_wires)
    return qml.expval(qml.PauliZ(qubit.item()))

def loss(params,n_wires,qubit):
    return ( 1-np.mean(training_circuit(params,n_wires,qubit)) ) / 2


params = np.random.randn(n_params,3,requires_grad=True)

opt = AdagradOptimizer(stepsize=0.08)

for t in range(epochs):

    t_time = time.time()

    shuffled_idx = np.random.permutation(n_compress)

    for i in shuffled_idx:

        params, c_loss = opt.step_and_cost(lambda x: loss(x,n_wires,i), params)

        print(f"Iter : {t+1} | Id : {i} | Loss {c_loss}")

Hi zyh1999,

Thanks for sharing your code and your findings when using backprop.

We would indeed expect backprop to require a large amount of memory, since the algorithm stores the results of all intermediate subexpressions; the computation is then traversed backwards and the gradient is computed by repeatedly applying the chain rule.
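
To make this concrete, here is a minimal sketch (the device, circuit, and sizes below are just illustrative, not your workload) that measures the Python-level allocations around the first backprop gradient call using the standard-library tracemalloc module. tracemalloc does not see every native allocation, so the numbers are only indicative, but they do show the one-off cost of the first gradient evaluation:

import tracemalloc
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit.autograd", wires=4)

@qml.qnode(dev, diff_method="backprop")
def small_circuit(params):
    # toy circuit, only here to illustrate the measurement
    for w in range(4):
        qml.RY(params[w], wires=w)
    return qml.expval(qml.PauliZ(0))

params = np.random.randn(4, requires_grad=True)

tracemalloc.start()
qml.grad(small_circuit)(params)  # first backprop gradient call
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"Peak traced allocation during the first gradient: {peak / 1e6:.2f} MB")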

However, it is not clear why the fluctuations in memory are occurring. Would you be able to share your full code so we can replicate this issue?

Many thanks!

You mean that, in theory, PennyLane will use a large amount of memory every time it works on a "backprop" step? But then it is not clear why it sometimes does not use that much memory on a "backprop" step?

My hidden code is just some simple, random circuit layers, like this:

def circuit(params, n_wires):
    """
    params: array of shape (n_gates, 3), one row of U3 angles per gate
    """
    n_wires = n_wires if isinstance(n_wires, int) else n_wires.item()
    p_ngates = 2 * n_wires                        # U3 gates per layer (two sublayers of n_wires)
    for d in range(L):                            # L layers
        for i, off in enumerate([0, n_wires]):    # two sublayers per layer
            for w in range(n_wires):
                idx = d * p_ngates + w + off      # flat index into params
                w += i                            # shift the wire pattern by one on the second sublayer
                p = params[idx]
                qml.U3(*p, wires=w % n_wires)
                if w % 2 == (i + 1) % 2:          # entangle alternating neighbouring pairs
                    qml.CZ(wires=[(w - 1) % n_wires, w % n_wires])
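
For reference, here is a quick way to sanity-check the layer pattern by drawing the circuit on a small made-up example (L = 1 and 4 wires here are just for the drawing, not my real depth-100, 16-qubit settings):

import pennylane as qml
from pennylane import numpy as np

L = 1                 # made-up small depth, only for drawing
n_wires_small = 4     # made-up small width, only for drawing

dev = qml.device("default.qubit", wires=n_wires_small)

@qml.qnode(dev)
def small_training_circuit(params):
    circuit(params, n_wires_small)
    return qml.expval(qml.PauliZ(0))

params = np.random.randn(2 * L * n_wires_small, 3, requires_grad=True)
print(qml.draw(small_training_circuit)(params))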

Furthermore, are there any methods to lower the memory cost while still using the "backprop" step to calculate the gradient?

Hi zyh1999,

Thanks for sharing the circuit you are using. Could you please share the parameter values you are using that generate this behaviour with the memory fluctuations?

Please edit and share the code below such that it reproduces the issue you are seeing:

import pennylane as qml
from pennylane import numpy as np   # autograd-aware NumPy, needed for trainable parameters
import time

L = ?
n_wires = 16
n_params = 2 * L * n_wires   # params has shape (n_gates, 3) with n_gates = 2 * L * n_wires
epochs = ?
n_compress = ?

def circuit(params, n_wires):
    """
    params: array of shape (n_gates, 3), one row of U3 angles per gate
    """
    n_wires = n_wires if isinstance(n_wires, int) else n_wires.item()
    p_ngates = 2 * n_wires                        # U3 gates per layer (two sublayers of n_wires)
    for d in range(L):                            # L layers
        for i, off in enumerate([0, n_wires]):    # two sublayers per layer
            for w in range(n_wires):
                idx = d * p_ngates + w + off      # flat index into params
                w += i                            # shift the wire pattern by one on the second sublayer
                p = params[idx]
                qml.U3(*p, wires=w % n_wires)
                if w % 2 == (i + 1) % 2:          # entangle alternating neighbouring pairs
                    qml.CZ(wires=[(w - 1) % n_wires, w % n_wires])

train_dev = qml.device("default.qubit.autograd",wires = n_wires)

@qml.qnode(train_dev,diff_method="backprop")
def training_circuit(params,n_wires,qubit):
    circuit(params,n_wires)
    return qml.expval(qml.PauliZ(qubit.item()))

def loss(params,n_wires,qubit):
    return ( 1-np.mean(training_circuit(params,n_wires,qubit)) ) / 2


params = np.random.randn(n_params, 3, requires_grad=True)

opt = qml.AdagradOptimizer(stepsize=0.08)

for t in range(epochs):

    t_time = time.time()

    shuffled_idx = np.random.permutation(n_compress)

    for i in shuffled_idx:

        params, c_loss = opt.step_and_cost(lambda x: loss(x,n_wires,i), params)

        print(f"Iter : {t+1} | Id : {i} | Loss {c_loss}")

Thanks for sharing your questions and code so far. We are very interested in reproducing this behaviour and solving the issue!