Is every reset() doubling RAM footprint?

I’m working on a complex circuit with many feed-forward and reset operations. The circuit needs only a small number of qubits (3), but a large number of classical bits are processed in a single shot. I’m using qml.device('default.qubit', wires=3, shots=10000)
and I got this error:

 File "/usr/local/lib/python3.10/dist-packages/pennylane/devices/modifiers/", line 30, in execute
    results = untracked_execute(self, circuits, execution_config)
  File "/usr/local/lib/python3.10/dist-packages/pennylane/devices/modifiers/", line 32, in execute
    results = batch_execute(self, circuits, execution_config)
  File "/usr/local/lib/python3.10/dist-packages/pennylane/devices/", line 553, in execute
    return tuple(
  File "/usr/local/lib/python3.10/dist-packages/pennylane/devices/", line 554, in <genexpr>
  File "/usr/local/lib/python3.10/dist-packages/pennylane/devices/qubit/", line 260, in simulate
    state, is_state_batched = get_final_state(circuit, debugger=debugger, interface=interface)
    state = create_initial_state(sorted(circuit.op_wires), prep, like=INTERFACE_TO_LIKE[interface])
  File "/usr/local/lib/python3.10/dist-packages/pennylane/devices/qubit/", line 41, in create_initial_state
    state = np.zeros((2,) * num_wires)
ValueError: maximum supported dimension for an ndarray is 32, found 37

Can you tell me if I can somehow lift this limit to, say, 48 classical wires by editing some PennyLane source code? (Assume I have 1.5 TB of CPU RAM.) I’m not adding a reproducer at this time, unless you think it would be useful — also, should this error even kick in when only the number of classical bits exceeds 32?

Follow up: after more tests, it looks like every reset requests a new qubit behind the scenes, so it is not practical to recycle qubits with the PennyLane default device. What if I use a Qiskit simulator as the device? Qiskit supports reset() and truly recycles qubits, so the RAM footprint does not double after every reset. Would this work with the PennyLane optimizer and a Qiskit backend?

Hey @Jan_Balewski,

There is actually a cutoff imposed by NumPy: an ndarray cannot have more than 32 dimensions (NPY_MAXDIMS — see numpy/numpy/core/include/numpy/ndarraytypes.h on GitHub), and default.qubit stores the state as a (2,) * num_wires array. I think you can bypass this if you use the JAX interface, but I’m not 100% sure.
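Independent of the dimension cap, it's worth checking the raw memory cost. A back-of-the-envelope sketch (my own arithmetic, assuming a dense complex128 state vector of 2**n amplitudes at 16 bytes each):

```python
# Rough memory estimate for an n-wire state vector stored as a
# dense complex128 ndarray (16 bytes per amplitude, 2**n amplitudes).
def statevector_gib(n_wires: int) -> float:
    """GiB needed for a dense 2**n_wires complex128 state vector."""
    return 16 * 2**n_wires / 2**30

# The failing 37-wire case would already need 2 TiB, and 48 wires
# would need ~4 PiB, far beyond 1.5 TB of RAM even if the ndarray
# dimension limit were lifted.
print(statevector_gib(37))  # 2048.0 (GiB, i.e. 2 TiB)
print(statevector_gib(48))  # 4194304.0 (GiB, i.e. ~4 PiB)
```

So even with the 32-dimension limit patched out, a 48-wire dense state vector would not fit in 1.5 TB.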

My initial question is more refined now. I see it is not practical to run the PennyLane optimizer on a ~30-qubit circuit, because it may take hours per step.
Q: If my problem can be expressed as a 3-qubit circuit with 25 resets, can I use PennyLane in such a way that it never needs more than a 3-qubit unitary (and runs fast)?
I believe that, by default in PennyLane, each time I call qml.measure(iq, reset=True) a new qubit is added to the device, so in the end I’m forced to simulate a 28-qubit circuit even though only 3 qubits are in use at any time.
Can this be overcome? E.g. Qiskit switches from a state-vector simulator to a density-matrix simulator in such cases, which also slows down simulation, but multiple resets do not add an exponential cost of factor 2 per reset().
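To make the suspected cost concrete, here is a small sketch (my own arithmetic, assuming the deferred-measurement transform trades each mid-circuit measurement for one fresh wire, and a dense complex128 state vector):

```python
# Sketch of the cost under deferred measurements: each reset/measurement
# consumes one fresh wire, and a dense complex128 state vector doubles
# in size with every added wire.
def deferred_wires(n_qubits: int, n_resets: int) -> int:
    """Total wires simulated when every reset is traded for a new wire."""
    return n_qubits + n_resets

def statevector_bytes(n_wires: int) -> int:
    """Bytes for a dense 2**n_wires complex128 state vector."""
    return 16 * 2**n_wires

print(deferred_wires(3, 25))                          # 28 wires
print(statevector_bytes(28) / 2**30)                  # 4.0 GiB
print(statevector_bytes(29) / statevector_bytes(28))  # 2.0: each extra reset doubles it
```

Under that assumption a 3-qubit circuit with 25 resets is simulated as a 28-wire circuit, and every additional reset doubles the memory footprint.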

Hey @Jan_Balewski,

I think I understand your problem now :sweat_smile:. You are using mid-circuit measurements with resets, which still default to the deferred-measurement principle in analytic mode. You should try the finite-shots implementation we added in 0.35!

We are converging. But I do not see how to enable this feature. The example under the heading ‘MCM’ looks like my code, except the number of wires is not listed:

import pennylane as qml
dev = qml.device("default.qubit", shots=10)

and I’m already using version 0.35.1:

core@bdbe8d0dda5a:~$ pip3 list |grep Penny
PennyLane                       0.35.1
PennyLane-Cirq                  0.34.0
PennyLane_Lightning             0.35.1
PennyLane_Lightning_GPU         0.35.1
PennyLane-qiskit                0.35.1
PennyLane-SF                    0.29.0

Can you point me to example code that does not defer the mid-circuit measurements?

Hmm… can I see your code just to see if something weird is happening?

Mea culpa.
I use many Podman images, and the RAM-exhaustion issue is present in PennyLane 0.34, but you indeed fixed it in 0.35.
For the record, below is my tester code. You can dial the number of resets with nReset.
The results become interesting for nReset > 20. A simple proxy for the true number of simulated qubits is the execution time of one circuit: if reset() adds qubits to the register, the simulation time doubles with every reset.
And this is what I see for 0.34:

nReset  runTime/sec
22      2.8
24      11.2
26      46.6
30      ArrayMemoryError: Unable to allocate 32.0 GiB (my laptop has only 20 GB)

But after I switch to 0.35 I can dial in even 40 resets, and the runtime scales roughly linearly with nReset:

nReset  runTime/sec
22      5.0
24      5.8
26      6.3
30      6.8
40      8.8
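The two tables are consistent with exponential vs. roughly linear scaling. A quick check (my own arithmetic, not from the measurements themselves): if each reset doubles the cost, the 0.34 runtimes should quadruple per +2 resets, and the failed 32.0 GiB allocation matches a dense complex128 vector of 2**31 amplitudes (the exact wire count behind that allocation depends on device internals):

```python
# Hypothetical check that the v0.34 numbers follow "2x cost per reset".
def exp_runtime(t_ref: float, n_ref: int, n: int) -> float:
    """Runtime predicted from a reference point if each reset doubles the cost."""
    return t_ref * 2 ** (n - n_ref)

# Extrapolate the v0.34 table from nReset=22 (2.8 s):
print(exp_runtime(2.8, 22, 24))  # 11.2  (measured: 11.2)
print(exp_runtime(2.8, 22, 26))  # 44.8  (measured: 46.6)

# The failed 32.0 GiB allocation equals 16 bytes * 2**31 amplitudes:
print(16 * 2**31 / 2**30)        # 32.0
```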

So I’m good and I can continue my research. Thanks for your help.


import pennylane as qml
from pennylane import numpy as np
from time import time

num_qubit = 2; shots = 1000; iqt = 0
nReset = 22   # dial the number of resets here
x = 0.4       # input value
dev = qml.device('default.qubit', wires=num_qubit, shots=shots)


@qml.qnode(dev)
def circuit(x):
    ang = np.arccos(x);    qml.RX(ang, iqt)
    for j in range(nReset):
        m = qml.measure(0, reset=True)
        qml.cond(m, qml.PauliX)(iqt)
    return qml.expval(qml.PauliZ(iqt))

print(qml.draw(circuit, decimals=2)(x), '\n')
print(' run circ with %d resets ...' % nReset)
T0 = time(); y = circuit(x); elaT = time() - T0
print('input X=%.2f   Y=%.2f   shots=%d  elaT=%.1f sec  nReset=%d' % (x, y, shots, elaT, nReset))

Oh great! Glad you got it working :slight_smile: