Integrating Catalyst QIR - CUDA for Quantum Algo challenge

D_Karli · June 6, 2025, 9:59am

We are designing a challenge where innovator submit their circuits for pre-image search for keccak-256 4-round hash. We are aiming to get clear benchmarks for cryptanalysis and incentives algorithmic innovation. We are running couple of issues:

We wanna make sure innovators can’t cheat with oracle, as far as I know oracle doesn’t know the answer in Grover’s search algorithm, it only output yes/no ( correct me if it’s wrong)

We want innovators to design oracles, but in Grover’s algorithm, the oracle is constrained to a very specific role: it should only apply a phase shift to target states (leaving other states unchanged)
Therefore, we need a way to verify that a submitted oracle conforms to this requirement: it performs only a phase shift (it should not be altering amplitude or applying some other transform)

So we need help with how to prevent innovators to cheat and make sure they are running quantum simulation such as Catalyst to find the solution. (e.g How to quantum just-in-time (QJIT) compile Grover’s algorithm with Catalyst | PennyLane Demos )

We are thinking about setting constrains for valid oracle such as:

Gate Whitelisting: Only allows X, H, and multi-controlled X gates
Phase Verification: Mathematically verifies phase flip behavior
Circuit Constraints: Prevents hidden operations with barrier gates
Input Validation: Checks all computational basis states
Cryptographic Integrity: Maintains Keccak’s security properties
Solution Isolation: Solutions only appear in control strings

Can someone guide us to implement this challenge safely using Pennylane modules? or suggest alternative path to with oracle design? Thank you

daniela.murcillo · June 9, 2025, 2:03pm

Hi @D_Karli
Welcome to the forum

A member of our team has the following suggestion based on the information you provided.

Define the oracle on your company’s server side.
Don’t give competitors access to the oracle source code
Have two oracles with different methods, one which only runs for the team overseeing the challenge

(Just adding that it is not clear to us whether users will be coding up oracles or what your challenge looks like exactly)

I hope this is helpful, let us know.

YingTIG · June 10, 2025, 11:23am

Hi @daniela.murcillo, thank you for the warm welcome and for your response.

I’m Ying Chan, CTO at The Innovation Game (TIG). We’re collaborating with @D_Karli to design a challenge on our platform centered on Grover’s Search. The goal is to incentivize open development of performant oracles by independent contributors.

To give some clarity to our context/setup:

In TIG, “innovators” submit custom oracles, and “benchmarkers” evaluate their performance. Here’s a simplified interface of what innovators implement:

def generate_oracle(solution: List[bool]):
    """
    The oracle must perform a phase flip for the `solution` state only.
    """
    def oracle():
        # gate logic defined by innovator here
        # e.g. for i in range(num_wires): qml.PauliZ(i)
        pass

    return oracle

These oracles are open-source and submitted to our platform, where they are executed by benchmarkers:

def run_simulation(seed: int, num_wires: int, iterations: int):
    # Deterministically generate solution
    random.seed(seed)
    n = random.getrandbits(num_wires)
    solution = [(n >> i) & 1 == 1 for i in reversed(range(num_wires))]

    oracle = generate_oracle(solution)

    dev = qml.device("lightning.qubit", wires=num_wires)

    @qml.qnode(dev)
    def circuit():
        for i in range(num_wires):
            qml.Hadamard(i)
        for _ in range(iterations):
            oracle()
            diffusion_operator()
        return qml.probs(wires=range(num_wires))

    # Compile & execute with Catalyst
    circuit_qjit = qml.qjit(circuit)
    results_qjit = circuit_qjit()

To verify that benchmarkers are running simulations honestly, each run produces a runtime signature. We intend to either modify the compiled IR by Catalyst to produce this signature, or we will customise the Catalyst runtime.

What we need your help with:

The key issue we’re seeking guidance on is oracle verification:

How can we verify that submitted oracles conform to Grover’s specification, i.e. they only apply a phase shift, rather than altering amplitude, etc?

For example, can we apply the Oracle to another circuit, and wrap that with some pennylane code to perform the verification?

Also, how can we ensure that the simulation is performed deterministically?

Best,
Ying

Diego · June 10, 2025, 3:05pm

Hi @YingTIG
Thanks for the additional context, it’s very helpful.

How can we verify that submitted oracles conform to Grover’s specification, i.e. they only apply a phase shift, rather than altering amplitude, etc?

Assuming oracles are written using PennyLane here are some options I can think of:

[Computationally expensive] To be completely sure an oracle does what’s expected, we can obtain the matrix of the oracle with qml.matrix(). The circuit matrix should be almost equal to Identity. The only exception should be -1 in the entry corresponding to the solution.
[Computationally expensive] We can apply the oracle in a circuit, return qml.state() at the end, and compare the state vector with the correct outcome. Note each oracle generated should be tested with several input states.
[Computationally expensive] Run a quantum circuit with the oracle, return qml.probs(). Verify that the probabilities remain unchanged after the oracle is applied. This provides less information about the correctness of the final state, but ensures no amplitudes are changed. Note each oracle generated should be tested with several input states.

Ensuring that the simulation is performed deterministically

Quantum circuit simulations in PennyLane are probabilistic when we define a number of shots for a device or qnode. Simulations on devices that don’t set a number of shots are typically analytical. For example, returning qml.state() or qml.expval() on lightning.qubit will give you the exact state or expectation value, not averaged from a set of samples.

Hope this helps

YingTIG · June 10, 2025, 4:11pm

Thanks for the tips! To confirm, is this what you were suggesting for 1):

import numpy as np
import pennylane as qml

num_wires = 5
epsilon = 1e-5

def oracle():
    for i in range(num_wires): 
        qml.PauliZ(i)

def cheating_oracle():
    """Incorrect oracle that artificially boosts |1000⟩'s amplitude"""
    # Apply Hadamards to wire 0 to create interference favoring |1⟩
    qml.Hadamard(0)
    qml.RY(1.2, wires=0)
    qml.RZ(0.7, wires=0)
    qml.Hadamard(0)
    # Apply controlled rotations to entangle wire 0 with others
    qml.CRY(0.9, wires=[0, 1])
    qml.CRY(0.9, wires=[0, 2])
    qml.CRY(0.9, wires=[0, 3])
    # Add some phase shift to create bias
    qml.PhaseShift(1.2, wires=0)

def verify_oracle(oracle):
    diff = np.identity(2**num_wires) - qml.matrix(oracle, wire_order=range(num_wires))()
    assert np.all(np.absolute(diff) <= epsilon), "Oracle not performing phase shift"

verify_oracle(oracle)
verify_oracle(cheating_oracle)

Quantum circuit simulations in PennyLane are probabilistic when we define a number of shots for a device or qnode . Simulations on devices that don’t set a number of shots are typically analytical. For example, returning qml.state() or qml.expval() on lightning.qubit will give you the exact state or expectation value, not averaged from a set of samples.

By default, are shots defined or undefined? And to double confirm, if shots are undefined, then it will be deterministic?

Thanks!

Ying

Diego · June 10, 2025, 9:39pm

To confirm, is this what you were suggesting for 1):

It’s similar. Some points:

The oracle defined in that code changes the phase for multiple input states. Is this intentional?
I don’t think I understand how the cheating_oracle is cheating. Does this result in an increased amplitude for |1000\rangle after applying Grover’s diffusion operator?
In my suggestion, verify_oracle needs to compare with a slightly different matrix. Instead of np.identity it would be more like this:

mat = np.identity(2**num_wires)
mat[solution, solution] = -1
diff = mat - qml.matrix(oracle, wire_order=range(num_wires))()

By default, are shots defined or undefined?

Shots are usually undefined by default. Although this might depend on the device.

And to double confirm, if shots are undefined, then it will be deterministic?

When shots are not defined, the simulation is analytical/deterministic and can be used to return properties like the state or exact statistical measures like the expectation value expval

YingTIG · June 11, 2025, 9:33am

The oracle defined in that code changes the phase for multiple input states. Is this intentional?

Yes. We just want to verify that the oracle is performing a phase shift. We don’t need to verify its performing the correct phase shift as benchmarkers will be able to report this from the simulation results.

I don’t think I understand how the cheating_oracle is cheating.

I should of labelled it as an invalid oracle rather than cheating. Its invalid because it is not applying a phase shift.

In my suggestion, verify_oracle needs to compare with a slightly different matrix. Instead of np.identity it would be more like this

Yes got it. That would be checking that the correct phase shift is being applied. A follow up question I have on this is, would it be possible to reduce computation by checking that particular state? i.e. rather than computing the entire matrix, just compute the row specific to the solution state

CatalinaAlbornoz · June 12, 2025, 7:11pm

Hi @YingTIG , thanks for following up!

We’re a bit busy closing up some projects at the moment. How time-critical is your question? We should be able to get back to you within a week or so.

Diego · June 23, 2025, 10:47pm

would it be possible to reduce computation by checking that particular state?

Not 100% sure here. I can share some suggestions, but these would need to be benchmarked for performance.

You could try preparing the basis (solution) state you’re interested in with qml.BasisState, applying the oracle and returning qml.probs, the probability should be 1 for the input basis state.

Another option is to again create the basis state, but apply the oracle, then apply qml.FlipSign and return qml.state. if the oracle is equivalent to FlipSign, you should get back the original basis state.

D_Karli · June 24, 2025, 8:34pm

Hi Diego - the goal here is to verify the solution( target state ) without running all the computation. I think, statistical sampling from non-target states can be our shot here. And Yes, both option seems viable for our oracle based challenge set up. Another question will be, what is the max wires ( including ancilla qubits ) we can use with catalyst plugin? Thanks

Diego · June 25, 2025, 5:58pm

statistical sampling from non-target states can be our shot here.

This could work, but if the simulation device is calculating the full distribution to provide samples from that distribution, the problem remains. Would be good to benchmark this to make sure

what is the max wires ( including ancilla qubits ) we can use with catalyst plugin?

For a personal laptop ~15 qubits usually runs well. But this will depend on the simulation device used (e.g. lightning.qubit vs lightning.tensor) and the hardware available (e.g. GPU, HPC center, personal computer). Other than limitations on computational time and memory, Catalyst should not impose a hard limit.

D_Karli · July 2, 2025, 7:31pm

Ok I see, how do you suggest we benchmark this efficiently? Here is the sampling approach spesific to our challenge:

def smart_statistical_verification(oracle, solution_message, confidence=0.999):
“”“Probabilistically verify oracle with high confidence”“”

# 1. Always test the solution state (must pass)
verify_solution_state(oracle, solution_message)

# 2. Smart sampling of non-solution states
sample_strategies = [
    lambda: random_uniform_sampling(1000),           # Random states
    lambda: hamming_distance_sampling(solution_message, distances=[1,2,3]), # Near misses
    lambda: bit_pattern_sampling(["000...000", "111...111"]), # Edge cases
    lambda: structured_sampling(keccak_weak_points)  # Cryptographically interesting
]

for strategy in sample_strategies:
    test_states = strategy()
    for state in test_states:
        verify_non_solution_state(oracle, state)

print(f"Oracle verified with {confidence*100}% confidence")

Structural Verification practical:

def structural_verification(oracle, solution_message):

# 1. Test solution state (O(1))
verify_solution_gets_minus_phase(oracle, solution_message)

# 2. Test oracle properties without exhaustive state checking
verify_oracle_is_diagonal(oracle)      # O(circuit_depth) not O(2^n)
verify_oracle_is_unitary(oracle)       # O(circuit_depth) not O(4^n) 
verify_ancilla_cleanup(oracle)         # O(ancilla_count) not O(2^n)

# 3. Mathematical verification
verify_keccak_component_correctness(oracle)  # Analyze circuit structure

print(" Oracle verified through structural analysis")

For qubits / wires, we are planning to use GPU and HPCs. Do you think that 40 qubit( keccak round is 40bit) + 50 ancilla workpace for registry is feasible to set up with catalyst plugin + cuQuantum? We expect shortest preimage 40 bit output ( flip state ) with clean ancillas to verify results. If that’s sounds not efficient, would love to hear your thoughts what is the max qubit registry we can go for 40 bit classical data ? Thanks again

Diego · July 3, 2025, 6:42pm

Ok I see, how do you suggest we benchmark this efficiently?

It’s probably enough to time each approach for a few points between 1 and 20 qubits and extrapolate.

Feasibility of 40 qubit + 50 ancilla qubit simulation

A total of 90 qubits sounds difficult to simulate exactly for general circuits, even with GPU and HPC. A state vector representing the state of 90 qubits will have 10^{27} complex elements. This is something like 3.2 \times 10^{19} gigabytes of memory in python (infeasible).

Catalyst should be able to handle compilation for these circuits, but I’m not sure about simulating them with cuQuantum.

Tensor network simulators like lightning.tensor can reach a high number of qubits if the structure of the circuit is suitable or if we use approximate tensor network methods. This demo explains how we can use default.tensor to simulate large circuits. It even reaches 200 qubits. But this is for specific circuits and tensor network bond dimension (level of error).

Topic		Replies	Views
Issues using step_and_cost as part of an optimization method PennyLane Help	15	828	July 26, 2023
Quantum gate deployment laws for Oracle on Grover's algorithm PennyLane Help	1	331	October 3, 2023
A mathematically rigorous proof of quantum ORACLE completeness and quantum gate realizability PennyLane Help	1	38	May 26, 2025
QHack 2023 challenge: The Super Parameter PennyLane Challenges	6	70	November 4, 2024
NVIDIA cu-Quantum support PennyLane Help	4	799	June 10, 2022

Integrating Catalyst QIR - CUDA for Quantum Algo challenge

What we need your help with:

Related topics