Feed-forward on IBM HW failure

I want to execute a simple feed-forward circuit on IBM HW using PennyLane.
The attached code works for dev = qml.device('default.qubit', wires=range(4)), but it crashes when I run it against ‘ibm_cairo’. The compiler seems to drop the feed-forward classical logic and replace it with an ancilla and CNOT gates.
I’d like to use the feed-forward logic available on IBM devices.
Can you help me to fix my code?
This is the circuit I want:

0: ──H─╭●─╭●──┤↗├─────||─┤ ╭Probs
1: ────╰X─│────║──────||─┤ ├Probs
2: ───────╰X───║──────||─┤ ├Probs
3: ────────────║───X──||─┤ ╰Probs
               ╚═══╝  ||

This is the circuit I get:

0: ──H─╭●─╭●─╭●─────||─┤ ╭Probs
1: ────╰X─│──│──────||─┤ ├Probs
2: ───────╰X─│──────||─┤ ├Probs
3: ──────────│──╭X──||─┤ ╰Probs
4: ──────────╰X─╰●──||─┤

This is the error I see:
WireError: Did not find some of the wires <Wires = [0, 4]> on device with wires <Wires = [0, 1, 2, 3]>.

# Reproducer

import pennylane as qml

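def feedForwardGHZ(n=4):
    # GHZ state on wires 0..n-2, as in the circuit drawing above
    qml.Hadamard(wires=0)
    for i in range(1, n-1):
        qml.CNOT(wires=[0, i])
    m0 = qml.measure(0)  # mid-circuit measurement
    qml.cond(m0, qml.PauliX)(n-1)  # feed-forward: flip the last wire if m0 == 1
    return qml.probs(wires=range(n))

if 1:  # set to 0 to run on the simulator instead
    from qiskit_ibm_provider import IBMProvider
    provider = IBMProvider()
    backend = provider.get_backend('ibm_cairo')
    dev = qml.device('qiskit.ibmq', wires=range(4), backend=backend, shots=2000)
else:
    dev = qml.device('default.qubit', wires=range(4))
qnode = qml.QNode(feedForwardGHZ, dev)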

# run job
probTens = qnode()  
print("Probability of each computational basis state:")
for state, prob in enumerate(probTens):
    print("State %2d=%s  prob=%.3f"%(state,format(state, '04b'),prob))

The output of qml.about().

Name: PennyLane
Version: 0.34.0
Summary: PennyLane is a Python quantum machine learning library by Xanadu Inc.
Home-page: https://github.com/PennyLaneAI/pennylane
License: Apache License 2.0
Location: /usr/local/lib/python3.10/dist-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, requests, rustworkx, scipy, semantic-version, toml, typing-extensions
Required-by: PennyLane-Cirq, PennyLane-Lightning, PennyLane-Lightning-GPU, PennyLane-qiskit, PennyLane-SF

Platform info:           Linux-6.6.9-200.fc39.x86_64-x86_64-with-glibc2.35
Python version:          3.10.12
Numpy version:           1.23.5
Scipy version:           1.11.4
Installed devices:
- default.gaussian (PennyLane-0.34.0)
- default.mixed (PennyLane-0.34.0)
- default.qubit (PennyLane-0.34.0)
- default.qubit.autograd (PennyLane-0.34.0)
- default.qubit.jax (PennyLane-0.34.0)
- default.qubit.legacy (PennyLane-0.34.0)
- default.qubit.tf (PennyLane-0.34.0)
- default.qubit.torch (PennyLane-0.34.0)
- default.qutrit (PennyLane-0.34.0)
- null.qubit (PennyLane-0.34.0)
- cirq.mixedsimulator (PennyLane-Cirq-0.34.0)
- cirq.pasqal (PennyLane-Cirq-0.34.0)
- cirq.qsim (PennyLane-Cirq-0.34.0)
- cirq.qsimh (PennyLane-Cirq-0.34.0)
- cirq.simulator (PennyLane-Cirq-0.34.0)
- lightning.qubit (PennyLane-Lightning-0.34.0)
- lightning.gpu (PennyLane-Lightning-GPU-0.34.0)
- strawberryfields.fock (PennyLane-SF-0.29.0)
- strawberryfields.gaussian (PennyLane-SF-0.29.0)
- strawberryfields.gbs (PennyLane-SF-0.29.0)
- strawberryfields.remote (PennyLane-SF-0.29.0)
- strawberryfields.tf (PennyLane-SF-0.29.0)
- qiskit.aer (PennyLane-qiskit-0.34.0)
- qiskit.basicaer (PennyLane-qiskit-0.34.0)
- qiskit.ibmq (PennyLane-qiskit-0.34.0)
- qiskit.ibmq.circuit_runner (PennyLane-qiskit-0.34.0)
- qiskit.ibmq.sampler (PennyLane-qiskit-0.34.0)
- qiskit.remote (PennyLane-qiskit-0.34.0)

!pip3 list|grep qiskit

PennyLane-qiskit          0.34.0
qiskit                    0.45.1
qiskit-aer                0.13.2
qiskit-ibm-provider       0.8.0
qiskit-ibm-runtime        0.17.0
qiskit-ibmq-provider      0.20.2
qiskit-terra              0.45.1

Hi @Jan_Balewski! Currently PennyLane enacts mid-circuit measurements using deferred measurements, which is what you are seeing here.

We’d like to add support for native mid-circuit measurements on compatible devices, such as finite-shot simulators and some hardware (e.g., on IBM). Stay tuned for more on this in the coming few months!
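
To make the wire-4 circuit above less mysterious, here is a minimal sketch of the deferred-measurement idea in plain Python. It is a toy state-vector simulation, not PennyLane's implementation, and all helper names are made up for illustration. It checks that copying wire 0 onto an extra wire and using that wire as a CNOT control gives the same output probabilities as measuring wire 0 and applying X conditionally.

```python
from math import sqrt

def apply_h(state, w):
    """Apply a Hadamard to wire w of a {bit-tuple: amplitude} state."""
    out = {}
    for bits, amp in state.items():
        for b in (0, 1):
            new = bits[:w] + (b,) + bits[w + 1:]
            sign = -1.0 if (bits[w] == 1 and b == 1) else 1.0
            out[new] = out.get(new, 0.0) + sign * amp / sqrt(2)
    return out

def apply_cnot(state, c, t):
    """Flip wire t wherever wire c is 1."""
    out = {}
    for bits, amp in state.items():
        new = bits[:t] + (bits[t] ^ bits[c],) + bits[t + 1:]
        out[new] = out.get(new, 0.0) + amp
    return out

def probs(state, wires):
    """Marginal probabilities of the listed wires."""
    p = {}
    for bits, amp in state.items():
        key = "".join(str(bits[w]) for w in wires)
        p[key] = p.get(key, 0.0) + abs(amp) ** 2
    return {k: v for k, v in p.items() if v > 1e-12}

def ghz(n_wires):
    """GHZ state on wires 0, 1, 2; any remaining wires stay in |0>."""
    state = {(0,) * n_wires: 1.0}
    state = apply_h(state, 0)
    state = apply_cnot(state, 0, 1)
    return apply_cnot(state, 0, 2)

# Deferred version: copy wire 0 onto ancilla 4, then let the ancilla control X on wire 3.
deferred = apply_cnot(apply_cnot(ghz(5), 0, 4), 4, 3)
p_deferred = probs(deferred, wires=[0, 1, 2, 3])

# Feed-forward version: measure wire 0, then apply X to wire 3 only if the outcome was 1.
p_feedforward = {}
for bits, amp in ghz(4).items():
    m0 = bits[0]                        # measurement outcome on wire 0
    out = bits[:3] + (bits[3] ^ m0,)    # classically controlled X
    key = "".join(map(str, out))
    p_feedforward[key] = p_feedforward.get(key, 0.0) + abs(amp) ** 2

print(p_deferred)
print(p_feedforward)
```

Both dictionaries contain only the strings 0000 and 1111, which is why the simulator result is unaffected by the rewrite. The cost is the extra wire, which is what the 4-wire device rejects.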

Thanks Tom,
I see. So for now I need to go back to Qiskit; my real problem is way more complex, this was just a trial balloon.
How about more advanced logic performed by the classical portion of the algorithm?
This short note:

implies PennyLane does not support anything beyond a single-measurement dependency, because you excluded support for the ‘and’, ‘or’ and ‘not’ classical bit operators.
Qiskit (and IBMQ HW) allows for that:

The circuit I want to study needs a lookup table taking several (~8) input bits and returning a Boolean decision that forks the circuit. Will PennyLane support LUT-based feed-forward in the near future, for an arbitrary (small) lookup table uploaded by the user? At least for dev = qml.device('default.qubit')?

Hi @Jan_Balewski,

Conditioning on multiple mid-circuit measurements is currently possible:

import pennylane as qml

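dev = qml.device("default.qubit")

@qml.qnode(dev)
def f():
    qml.Hadamard(0)  # shown in the drawing below
    qml.CNOT([0, 1])
    qml.CNOT([1, 2])

    m0 = qml.measure(0)
    m1 = qml.measure(1)

    # apply the CNOT only if both measurements returned 1
    qml.cond(m0 + m1 == 2, qml.CNOT)([1, 2])
    return qml.expval(qml.PauliZ(2))
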
>>> f()
tensor(1., requires_grad=True)
>>> print(qml.draw(f)())
0: ──H─╭●─────┤↗├─────────┤     
1: ────╰X─╭●───║───┤↗├─╭●─┤     
2: ───────╰X───║────║──╰X─┤  <Z>

You can also capture logic like ‘and’ via the & operator. The warning in the documentation of qml.cond only cautions against using the Python and keyword.

PennyLane might be able to do an 8-bit lookup table depending upon how complex it is. However, you can also try adding @qml.qjit when defining your QNode, allowing you to use Catalyst to capture more complex control flow like while loops and elif clauses in qml.cond.
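
As a rough illustration of how a small lookup table could be turned into the kind of &/| expression mentioned above, here is a plain-Python sketch. The function name lut_condition and the example table are hypothetical, and the code is only exercised with ordinary integers. Whether the same expression works unchanged when bits are the results of qml.measure is an assumption that this sketch does not test.

```python
from functools import reduce
from itertools import product
from operator import and_, or_

def lut_condition(bits, lut):
    """Build an OR of AND-terms, one term per input pattern the table accepts.

    The table must accept at least one pattern, otherwise there is nothing to OR.
    """
    terms = [
        reduce(and_, [bit == want for bit, want in zip(bits, pattern)])
        for pattern, accept in lut.items() if accept
    ]
    return reduce(or_, terms)

# Hypothetical 3-bit table: accept when at least two inputs are 1.
lut = {pattern: sum(pattern) >= 2 for pattern in product((0, 1), repeat=3)}

for bits in product((0, 1), repeat=3):
    print(bits, lut_condition(bits, lut))
```

The number of AND-terms equals the number of accepted patterns, so an 8-input table can need up to 256 of them.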

Please let me know if this would work for you, we’re always interested to learn more about new features we could add.

Hi Tom,
I’m building up toward my target circuit (not sure if this should still be a follow-up or a new issue?).
The feed-forward code below has 3 input qubits (q0-q2) and 1 output qubit (q3). The 3 inputs are set to generate 0 or 1 at random, to sample all possible 3-bit strings.
It sets q3 to 1 if the Hamming weight of the bits measured on the 3 inputs is 2 or 3. The Hamming-weight threshold (tghw) is the input to the circuit.

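import pennylane as qml

ninp = 3  # number of input qubits
dev = qml.device("default.qubit",wires=4,shots=1000)

@qml.qnode(dev)
def f4(tghw):
    mL=[None for i in range(ninp)] # one slot per mid-circuit measurement
    for i in range(ninp):
        qml.Hadamard(i) # random input bit
        mL[i] = qml.measure(i)
    hw=sum(mL) #... compute hamming weight
    qml.cond(hw >= tghw, qml.PauliX)(ninp) # set output
    return qml.counts()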


This circuit produces the expected result: the rightmost bit is ‘1’ only if 2 or more of the other bits are ‘1’.

{'0000': tensor(127, requires_grad=True),
 '0010': tensor(118, requires_grad=True),
 '0100': tensor(109, requires_grad=True),
 '0111': tensor(118, requires_grad=True),
 '1000': tensor(137, requires_grad=True),
 '1011': tensor(147, requires_grad=True),
 '1101': tensor(128, requires_grad=True),
 '1111': tensor(116, requires_grad=True)}

In short this circuit uses the logic:
if hw==2 or hw==3 : apply X(q3)

Is it possible to encode more complex logic that would be stochastic?
Let pL=[0.3,0.5,0.7,1] be the probabilities of ‘accepting’ a given Hamming weight.
In pure Python, this would be the logic I’d like to encode:

    qml.cond(  np.random.rand() < pL[hw] , qml.PauliX)(ninp)
    return qml.counts()

Thanks, Jan
P.S. I know I could add an ancilla qubit and set it with RX such that the probability of measuring 1 is pL[hw], depending on hw, and then add AND conditions. But the FPGA, which typically runs the classical portion of the feed-forward decision, must have some random generator available. I’d rather use it than sacrifice an ancilla qubit.
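
For the RX-based option mentioned in the P.S., the rotation angle follows from RX(theta) acting on |0>, which yields outcome 1 with probability sin^2(theta/2). A minimal sketch, with rx_angle and prob_one as made-up helper names:

```python
from math import asin, sin, sqrt, pi

def rx_angle(p):
    """Angle theta such that RX(theta)|0> is measured as 1 with probability p."""
    return 2 * asin(sqrt(p))

def prob_one(theta):
    """P(1) after RX(theta) on |0>: sin^2(theta / 2)."""
    return sin(theta / 2) ** 2

pL = [0.3, 0.5, 0.7, 1]
angL = [rx_angle(p) for p in pL]
for p, theta in zip(pL, angL):
    print(p, round(theta, 3), round(prob_one(theta), 3))
```

The last column reproduces pL, confirming the round trip between probability and angle.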

Hi @Jan_Balewski !
Thank you for your question.
We have forwarded this question to members of our technical team who will be getting back to you within a week. Feel free to post any updates to your question here in the thread in the meantime!

Hi @Jan_Balewski!

Sorry for the late response. I managed to hack together something that will do what you are asking for:

import pennylane as qml
from pprint import pprint
import numpy as np
from functools import reduce

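ninp = 3  # number of input qubits
pL = [0.3, 0.5, 0.7, 1]  # acceptance probability per Hamming weight, from the question above
dev = qml.device("default.qubit",wires=4,shots=[1] * 1000)

@qml.qnode(dev)
def f4():
    mL=[None for i in range(ninp)] # one slot per mid-circuit measurement
    for i in range(ninp):
        qml.Hadamard(i) # random input bit
        mL[i] = qml.measure(i)
    hw=sum(mL) # Hamming weight of the measured bits
    bit = np.random.random()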
    conditions = [(hw == i) & (pL[i] > bit) for i in range(4)]
    condition = reduce(lambda a, b: a | b, conditions)

    qml.cond(condition, qml.PauliX)(ninp)

    return qml.sample()

samples = f4()
counts = qml.counts(wires=[0, 1, 2, 3])._samples_to_counts(samples)

The code above avoids the pL[hw] indexing, which does not work with pure PennyLane because hw is not a concrete integer. Have you had a look into trying this with Catalyst?

Hi Tom,
Thank you for thinking about my problem.
Yes, I also realized hw can’t be used as index to the list. But I’m not sure if this code works as intended.

To add some meaning to this feed-forward algorithm, let’s make an analogy to a high energy physics (HEP) experiment. A HEP experiment typically has many raw triggers, such as zero-bias, min-bias, and a few pT-dependent rare triggers, which count at very different rates, and one can’t afford to record them all.
In such a case, we add prescalers, which discard at random a (large) portion of each trigger type, so that only the weighted mix of raw triggers is recorded. This is what I am trying to accomplish here.

Back to the circuit I want to build:

  • The Hamming weight values hw=sum(mL) are equivalent to the many raw triggers, labeled by hw.
  • pL[.] list maps to prescale probabilities for each raw trigger

The reason I doubt the code above is working properly is that the output (aka MSB)
should be a linear weighted sum of the 3-bit input ‘words’.

Case 1: I choose weights pL=[1, 0, 0, 0, 1]. I should get ‘1’ as MSB for inputs ‘000’ or ‘111’, and indeed that is the case:

{'0001': 110,
 '0010': 129,
 '0100': 125,
 '0110': 126,
 '1000': 120,
 '1010': 129,
 '1100': 148,
 '1111': 113}

Case 2: I choose pL=[0.5, 0, 0, 0, 1]. I should get one more state, ‘0000’, with ~60 shots, and ‘0001’ should have only ~60 shots, half of what I got above. But I see no ‘0001’ bitstrings:

{'0000': 135,
 '0010': 112,
 '0100': 132,
 '0110': 115,
 '1000': 126,
 '1010': 126,
 '1100': 127,
 '1111': 127}
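
One plausible reading of the Case 2 result, offered as an assumption rather than a confirmed diagnosis: in plain PennyLane the Python body of the quantum function runs once per QNode call, so np.random.random() is drawn once and the same value of bit is reused for every shot. The pure-Python sketch below (no PennyLane; accepted_fraction and its arguments are hypothetical names) contrasts drawing once with drawing per shot for hw=0 and pL[0]=0.5.

```python
import random

def accepted_fraction(p, shots, per_shot, seed):
    """Fraction of shots accepted when comparing a uniform draw against p."""
    rng = random.Random(seed)
    bit = rng.random()              # single draw, made before any shot runs
    accepted = 0
    for _ in range(shots):
        if per_shot:
            bit = rng.random()      # fresh draw for every shot
        accepted += p > bit
    return accepted / shots

once = accepted_fraction(0.5, shots=1000, per_shot=False, seed=1)
fresh = accepted_fraction(0.5, shots=1000, per_shot=True, seed=1)
print(once, fresh)
```

With a single draw the acceptance is all-or-nothing, which would match seeing only ‘0000’ and no ‘0001’. With a fresh draw per shot the accepted fraction lands near 0.5.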

It is worth noting that the prescaling can also be accomplished by adding state memory to the circuit, which would deterministically accept only 1 in K triggers with a given hw.
This would mean pL=[1/ka, 1/kb, … ], where ka, kb, … are integers defining the prescale probabilities for each hw.
Since input triggers happen at random, we still have random acceptance, even though the order in which a given value of hw is accepted is deterministic. This is what is typically done in HEP settings.
(It hints at a very different implementation for the quantum circuit, with no need for bit = np.random.random().)
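
The counter-based prescaling described here can be sketched classically as follows. This is plain Python with made-up names and made-up prescale factors. It only illustrates the 1-in-K acceptance rule and says nothing about how such memory would be kept between shots on a device.

```python
import random
from collections import Counter

def run_prescaler(k_by_hw, n_triggers, seed=0):
    """Accept every k-th trigger of each Hamming weight; return (seen, accepted) counters."""
    rng = random.Random(seed)
    seen, accepted = Counter(), Counter()
    for _ in range(n_triggers):
        hw = sum(rng.randint(0, 1) for _ in range(3))   # 3 random input bits
        seen[hw] += 1
        if seen[hw] % k_by_hw[hw] == 0:                 # deterministic 1-in-k acceptance
            accepted[hw] += 1
    return seen, accepted

k_by_hw = {0: 2, 1: 3, 2: 10, 3: 1}   # hypothetical prescale factors ka, kb, ...
seen, accepted = run_prescaler(k_by_hw, n_triggers=8000)
for hw in sorted(seen):
    print(hw, seen[hw], accepted[hw], accepted[hw] / seen[hw])
```

The accepted fraction for each hw approaches 1/k without any random number being drawn in the decision itself. The randomness comes only from the order of the incoming triggers.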

Another issue: does your example run 1 circuit for 1k shots or 1k circuits with 1 shot?
What does this mean: shots=[1] * 1000?
It may not matter for PennyLane simulations, but eventually I want to run this circuit on real HW, and the penalty for loading a new circuit is very high.
Thanks, Jan

Hi @Jan_Balewski. Sorry it doesn’t seem to be working; this is definitely pushing PennyLane’s current capabilities for classical postprocessing. But it’s also great feedback as we are actively developing PennyLane for improved dynamic/adaptive circuit capabilities. For now I’d recommend investigating Catalyst and seeing if you run into any issues there. Catalyst is built on top of PennyLane and allows us to more easily capture classical pre- and post-processing in a circuit/program.

Another issue: does your example run 1 circuit for 1k shots or 1k circuits with 1 shot?
What does this mean: shots=[1] * 1000?
It may not matter for PennyLane simulations, but eventually I want to run this circuit on real HW, and the penalty for loading a new circuit is very high.

Right, it is running 1k circuits of 1 shot each in the example I gave above. For more efficient hardware runs, we’d need to capture the full PennyLane program (e.g., with Catalyst) and have this sent off to a runtime that sits next to the hardware device and dynamically interacts with it (e.g., by telling the hardware device whether to run the conditional operation in qml.cond).

Hi Tom,
you were close. Below is the code which implements plan B: I used an ancilla as a random generator, combined with your implicit loop mapping hw to an index in pL[.]. The results are as expected. It is 1 circuit with 8k shots, rather than 8k circuits with 1 shot (as I wanted). Will it run on real HW? I’ll investigate that next.
I think this thread is long enough - so I’ll stop posting here

You suggested I learn Catalyst. Perhaps you can point me to a tutorial showing that it works for a non-trivial feed-forward circuit executed on real HW from any cloud vendor?


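import pennylane as qml
import numpy as np

ninp = 3     # number of input qubits
qout = ninp  # output qubit
dev = qml.device("default.qubit",wires=4,shots=8000)

@qml.qnode(dev)
def f5():
    mL=[None for i in range(ninp)] # one slot per mid-circuit measurement
    for i in range(ninp):
        qml.Hadamard(i) # random input bit
        mL[i] = qml.measure(i)
    hw=sum(mL) # Hamming weight of the measured bits
    pL=np.array([0.5, 0.7, 0.,  1.])
    angL=2*np.arcsin(np.sqrt(pL)) # RX angle with P(1)=pL[i]; matches the angles in the drawing below
    [qml.cond(hw ==i,  qml.RX)(angL[i],qout) for i in range(ninp+1)]
    return qml.sample()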

samples = f5()
counts = qml.counts(wires=[0, 1, 2, 3])._samples_to_counts(samples)


0: ──H──┤↗├─────────────────────────────────────────────────────────┤  Sample
1: ──────║───H──┤↗├─────────────────────────────────────────────────┤  Sample
2: ──────║───────║───H──┤↗├─────────────────────────────────────────┤  Sample
3: ──────║───────║───────║───RX(1.57)──RX(1.98)──RX(0.00)──RX(3.14)─┤  Sample
         ╚═══════║═══════║═══╬═════════╬═════════╬═════════╣           Sample
                 ╚═══════║═══╬═════════╬═════════╬═════════╣           Sample
                         ╚═══╩═════════╩═════════╩═════════╝           Sample
{'0000': 497,
 '0001': 522,
 '0010': 322,
 '0011': 690,
 '0100': 308,
 '0101': 708,
 '0110': 994,
 '1000': 316,
 '1001': 694,
 '1010': 1016,
 '1100': 948,
 '1111': 985}

Analysis of the output for requested probability vs. hw : pL=[0.5, 0.7, 0., 1.]

  • hw=0 has one input, ‘000’, which should have ‘1’ appended with 50% probability. That is the case:
 '0000': 497,
 '0001': 522,  --> p=0.5
  • hw=1 has 3 inputs: 001, 010, 100. All should have ‘1’ appended with probability 0.7. That is the case:
 '0010': 322,
 '0011': 690,  --> p=0.7
 '0100': 308,
 '0101': 708,  --> p=0.7
 '1000': 316,
 '1001': 694,  --> p=0.7
  • hw=2 has inputs 011, 101, 110, which should never have ‘1’ appended. Correct, these are the only strings I see:
 '0110': 994,
 '1010': 1016,
 '1100': 948,
  • hw=3 has one input, 111, which should always have ‘1’ appended. Correct, I see only:
 '1111': 985  --> p=1.0
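
The same check can be done numerically. The sketch below is plain Python and assumes that each 3-bit input is equally likely. It computes the expected count for every bitstring from pL and compares it with the counts listed above, using a simple per-string binomial z-score as a rough yardstick rather than a full multinomial test.

```python
from itertools import product
from math import sqrt

shots = 8000
pL = [0.5, 0.7, 0.0, 1.0]

# Counts copied from the run above.
observed = {'0000': 497, '0001': 522, '0010': 322, '0011': 690,
            '0100': 308, '0101': 708, '0110': 994, '1000': 316,
            '1001': 694, '1010': 1016, '1100': 948, '1111': 985}

expected = {}
for bits in product((0, 1), repeat=3):
    p_in = 1 / 8                      # each 3-bit input is equally likely after H + measure
    p1 = pL[sum(bits)]                # probability that the output bit is 1
    prefix = "".join(map(str, bits))
    expected[prefix + "1"] = shots * p_in * p1
    expected[prefix + "0"] = shots * p_in * (1 - p1)

# Strings with zero expected count must not show up at all.
forbidden = [k for k, mean in expected.items() if mean == 0 and k in observed]

worst = 0.0
for key, mean in expected.items():
    if mean == 0:
        continue
    p = mean / shots
    z = (observed.get(key, 0) - mean) / sqrt(shots * p * (1 - p))
    worst = max(worst, abs(z))
    print(key, round(mean), observed.get(key, 0), round(z, 2))
print("forbidden strings seen:", forbidden)
print("largest |z|:", round(worst, 2))
```

No forbidden string appears, and every observed count lies within a few standard deviations of its expectation, in line with the bullet-by-bullet reading above.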