QAOA with High Number of Qubits

Hello everyone,

I would like to use QAOA to solve an optimisation problem. I have already managed to build a prototype with a 10-qubit quantum circuit. However, when I try to scale the circuit up, for example to 30 qubits, I get the following error when I define the cost function, try to optimise the circuit parameters with gradient descent, or try to compute the probability distribution of the solutions:

MemoryError: Unable to allocate 16.0 GiB for an array with shape (1073741824,) and data type complex128

The complete error messages are long (I can post them if needed), but they all have the following in common:

--> 219     state = np.zeros(2**self.num_wires, dtype=np.complex128)
    220     state[index] = 1
    221     state = self._asarray(state, dtype=self.C_DTYPE)

MemoryError: Unable to allocate 16.0 GiB for an array with shape (1073741824,) and data type complex128

That means that a vector with one entry for every possible “collapsed” qubit state (each qubit being 0 or 1) has to be created (correct me if I’m wrong). Is there any way of scaling up QAOA in PennyLane without having to define such a vector, using only 8 GB to 32 GB of memory?
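The 16.0 GiB in the error can be reproduced with a little arithmetic: the reported shape (1073741824,) is 2**30 entries, and each complex128 entry takes 16 bytes. Here is a minimal sketch of that calculation; the function name is made up for illustration, and it only counts the state vector itself, not any extra working memory a simulator may need.

```python
GIB = 2 ** 30  # bytes in one GiB; note that 2**30 == 1073741824


def statevector_bytes(num_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Memory for a dense state vector of 2**num_qubits complex amplitudes.

    16 bytes per amplitude corresponds to complex128, 8 bytes to complex64.
    (Hypothetical helper, not part of PennyLane.)
    """
    return (2 ** num_qubits) * bytes_per_amplitude


for n in (10, 20, 30, 40):
    print(f"{n} qubits: {statevector_bytes(n) / GIB:g} GiB")
```

Each additional qubit doubles the figure, which is why a 10-qubit prototype is cheap while 30 qubits already needs 16 GiB for the vector alone.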

Are there other ways of working with QAOA and a high number of qubits (for example in the range of 200 qubits)?

Thank you very much in advance.

Hi @pormelrog,

On a laptop you can usually only run circuits with up to 20 qubits. Remember that the compute becomes exponentially harder with each extra qubit, so 30 qubits is a big ask for a classical computer. However, we have implemented a nice trick in PennyLane that can, in some cases, allow you to run bigger circuits. This nice trick is circuit cutting.

The PennyLane functionality for this is qml.cut_circuit(). You can learn more about how to use it in this blog post, and do a deeper dive into it with this demo. The key thing here is that we’re breaking apart our circuit into smaller-sized circuits that we can run with a lower memory requirement. There is a classical overhead so your circuit will take longer to run, but you may be able to run larger circuits.

I encourage you to try it out and let us know if this worked for you!

Hello @CatalinaAlbornoz,

thank you very much for your answer. You are right and I understand that the memory and computational cost rises exponentially in the number of qubits. But maybe there is some technique or method unknown to me to achieve what I need.

I have also tried the circuit cutting method. However, it still does not help, because I keep getting an error for exceeding the maximum allowed dimension when defining the qml.device, i.e., before any circuit cutting can be performed:

dev = qml.device("lightning.qubit", wires=range(100))


ValueError                                Traceback (most recent call last)
c:\Users\pormelrog\Documents\Code\Prototyping.ipynb Cell 57 in 1
----> 1 dev = qml.device("lightning.qubit", wires=range(100))

File ~\AppData\Roaming\Python\Python39\site-packages\pennylane\, in device(name, *args, **kwargs)
    319     raise DeviceError(
    320         f"The {name} plugin requires PennyLane versions {plugin_device_class.pennylane_requires}, "
    321         f"however PennyLane version {__version__} is installed."
    322     )
    324 # Construct the device
--> 325 dev = plugin_device_class(*args, **options)
    327 # Once the device is constructed, we set its custom expansion function if
    328 # any custom decompositions were specified.
    329 if custom_decomps is not None:

File ~\AppData\Roaming\Python\Python39\site-packages\pennylane_lightning\, in LightningQubit.__init__(self, wires, c_dtype, shots, batch_obs, analytic)
    186 self._batch_obs = batch_obs
    188 # Create the initial state. Internally, we store the
    189 # state as an array of dimension [2]*wires.
--> 190 self._state = self._create_basis_state(0)
    191 self._pre_rotated_state = self._state

File ~\AppData\Roaming\Python\Python39\site-packages\pennylane_lightning\, in LightningQubit._create_basis_state(self, index)
    210 def _create_basis_state(self, index):
    211     """Return a computational basis state over all wires.
    212     Args:
    213         index (int): integer representing the computational basis state
    217     Note: This function does not support broadcasted inputs yet.
    218     """
--> 219     state = np.zeros(2**self.num_wires, dtype=np.complex128)
    220     state[index] = 1
    221     state = self._asarray(state, dtype=self.C_DTYPE)

ValueError: Maximum allowed dimension exceeded

What is the maximum allowed dimension? Is there any way of increasing it? Is there some other device which is better suited for this?

The error is caused by the fact that a $2^N$-dimensional vector, for $N \in \mathbb{N}$ qubits, must be defined to save the probabilities of observing each of the possible solutions. Is there any way of avoiding this? Is there maybe some way of working with only a certain set of possible solutions, or of working with collapsed/size-reduced probabilities, without having to split the circuit manually myself?
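Judging from the traceback, the "Maximum allowed dimension exceeded" message is raised by the np.zeros(2**self.num_wires, ...) call, so it looks like an array-size limit rather than a PennyLane setting: the requested length 2**100 does not fit into the integer type used for array indices. The sketch below checks this with the standard library only. It assumes that the index type has the same width as Python's sys.maxsize (true on typical 64-bit builds), and the function name is hypothetical.

```python
import sys


def dimension_fits_index(num_qubits: int) -> bool:
    """True if 2**num_qubits fits in a signed platform-sized integer.

    Assumption: the array index type has the same width as sys.maxsize,
    which is the case on typical 64-bit Python builds.
    (Hypothetical helper, not part of NumPy or PennyLane.)
    """
    return 2 ** num_qubits <= sys.maxsize


for n in (30, 62, 63, 100):
    print(n, dimension_fits_index(n))
```

Even where the length does fit the index type, the memory needed for the vector runs out long before that limit is reached, so raising it would not help in practice.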

Kind regards,


Hi @pormelrog,

What’s happening at the moment is that lightning.qubit is trying to initialize itself in the |0⟩ state by running state = np.zeros(2**self.num_wires, dtype=np.complex128). However, 2**self.num_wires is simply too big when you have too many wires. For other devices such as default.qubit you can set the data types when you define the device (e.g. dev = qml.device("default.qubit", wires=range(30), r_dtype=np.float32, c_dtype=np.complex64)). This can help reduce the memory needed to run your circuit. However, the initial state is still created with dtype=np.complex128, so you will still have trouble running circuits with more than 30 qubits.

If you really need to run larger circuits you will need computers with much larger memory, such as those used in HPC centres.
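To get a rough feel for what a given amount of memory buys, here is a small sketch that turns a memory budget into a qubit count. It only counts the state vector itself at 16 bytes per complex128 amplitude; real simulators need additional working memory, so treat the result as an upper bound. The function name is hypothetical.

```python
GIB = 2 ** 30  # bytes in one GiB


def max_qubits_for_memory(budget_bytes: int, bytes_per_amplitude: int = 16) -> int:
    """Largest n such that 2**n amplitudes fit into budget_bytes.

    (Hypothetical helper; counts only the dense state vector.)
    """
    amplitudes = budget_bytes // bytes_per_amplitude
    if amplitudes < 1:
        raise ValueError("budget is smaller than a single amplitude")
    # bit_length() - 1 equals floor(log2(amplitudes)) for positive integers
    return amplitudes.bit_length() - 1


for gib in (8, 32, 1024):
    print(f"{gib} GiB: at most {max_qubits_for_memory(gib * GIB)} qubits")
```

Under these assumptions, going from 8 GiB to 32 GiB gains only two qubits, because every extra qubit doubles the memory. A dense state vector for 200 qubits is therefore far out of reach on any classical machine.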

This is sad news if you want to do classical simulation of quantum computers, but it really shows that actual quantum hardware can help us solve problems that we cannot even describe with a classical computer.

Please let me know if you have any further questions. :slight_smile:

Hi @pormelrog,

I wanted to add to my answer something that may not have been clear before. If you want to run circuits with more than 30 qubits on smaller devices, you need to define the device with a size that you can actually run.

In the code below we create a 2-qubit device. However, the circuit itself uses 3 wires: wires=0, wires=1, and wires=2. Since we have set up the cut_circuit transform with auto_cutter=True, PennyLane will automatically cut this circuit into fragments of at most 2 wires.

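from functools import partial

import pennylane as qml
from pennylane import numpy as np

# The device only has 2 wires, although the circuit below uses 3
dev = qml.device("default.qubit", wires=2)

# auto_cutter=True lets PennyLane place the cuts automatically
# (this may require the optional kahypar package to be installed)
@partial(qml.cut_circuit, auto_cutter=True)
@qml.qnode(dev)
def circuit(x):
    qml.RX(x, wires=0)
    qml.RY(0.9, wires=1)
    qml.RX(0.3, wires=2)

    qml.CZ(wires=[0, 1])
    qml.RY(-0.4, wires=0)

    qml.CZ(wires=[1, 2])

    return qml.expval(qml.pauli.string_to_pauli_word("ZZZ"))

x = np.array(0.531, requires_grad=True)
circuit(x)  # runs the cut fragments on the 2-wire device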


You can try a similar example by defining a device that you can run (e.g. lightning.qubit with wires=15) and then using more wires in the circuit itself; the circuit will then be split into smaller circuits.

Please let me know if this is more clear!


Hello @CatalinaAlbornoz,

thank you very much for the answer. I was missing the fact that I have to define the qml.device object with the maximum number of qubits/wires in each cut.

What would happen if I define the device with a number of qubits strictly less than the minimum number of qubits in a cut? In the example code that you posted, this would correspond to setting the device to only 1 qubit. If I have a big circuit, I might not know in advance the minimum size that the cut circuits would need to be.

Kind regards,


Hi @pormelrog !

You would define the device with the number of qubits that you actually have available. For example, say you have a quantum computer with 4 qubits, but you want to run a circuit with 5 qubits. You simply define your device with 4 qubits and use the automated circuit cutter, which figures out how to cut your 5-qubit circuit into parts that fit into your 4-qubit device.

If you’re using a laptop, I would set your device to a maximum of 16 to 20 qubits, or up to 30 qubits if you have an HPC machine.

Please let me know if this is more clear!

Kind regards,