Pennylane Quantum Tapes yield different outcomes

Hi! I am trying to optimize the way I measure the expectation value of certain observables in a given quantum circuit. To give some context, I want to calculate the classical shadow of a given state not with random Pauli matrices, but providing which Paulis to use in each shot.

This was my previous code to do this:

def calculate_classical_shadow(circuit_template, meas_scheme):
    """
    Function to calculate the classical shadow of a quantum circuit.
    
    Args:
        circuit_template (function): circuit that prepares the quantum state.
        meas_scheme (np.array): this array has shape (n_shots, n_qubits) and each row represents which Pauli matrices (coded by integers X:1, Y:2, Z:3) to measure for the classical shadow.
    
    Returns:
        array: classical shadow of the quantum circuit. (outcomes, recipes)
    """
    
    # use mapping to convert integers to Pauli operators
    pauli_map = {0: qml.Identity, 1: qml.PauliX, 2: qml.PauliY, 3: qml.PauliZ}
    # get some parameters and initialize outcomes array
    shadow_size, n_qubits = meas_scheme.shape
    outcomes = np.zeros((shadow_size, n_qubits))
    
    # for each measurement setting in the measurement scheme, calculate the expectation value of the Pauli operators provided by the measurement setting
    for snapshot in range(shadow_size):
        obs = [pauli_map[int(meas_scheme[snapshot, i])](i) for i in range(n_qubits)]
        outcomes[snapshot] = circuit_template(obs)

    return (outcomes, meas_scheme)

This code is working and gives me accurate predictions of expectation values when increasing the number of shots. However, it is really slow when going past 100 shots.

Then, I learned about qml.tape.QuantumTape(). I would like to execute all these measurements using this class, since I believe it will be faster. To this end, I wrote the following code:

def calculate_classical_shadow(circuit_template, meas_scheme):

    # same as before
    pauli_map = {0: qml.Identity, 1: qml.PauliX, 2: qml.PauliY, 3: qml.PauliZ}
    shadow_size, n_qubits = meas_scheme.shape
    outcomes = np.zeros((shadow_size, n_qubits))

    #### new code ####
    # create a device with just one shot
    dev = qml.device("lightning.qubit", wires=n_qubits, shots=1)

    # parameters to prepare the quantum state
    x = np.arange(n_qubits, requires_grad=False)
    ops = [qml.RY(x[i], wires=i) for i in range(n_qubits)]
    
    # list that will contain the tapes, one for each shot
    tape_list = []
    # create tapes
    for shot in range(shadow_size):
        # obtain Pauli matrices to measure for this shot
        measurements = [qml.expval(pauli_map[int(meas_scheme[shot, j])](j)) for j in range(n_qubits)]
        # create a tape
        tape = qml.tape.QuantumTape(ops=ops, measurements=measurements, shots=1)
        tape_list.append(tape)
    
    # execute all tapes separately
    outcomes = np.array(qml.execute(tape_list, device=dev, diff_method=None))

    return (outcomes, meas_scheme)

The code is compiling and running, but I obtain completely different results in terms of the accuracy of prediction of expectation values from the shadow calculated from qml.tape.QuantumTape(). Namely, using the tape class decreases the accuracy.

I believe the problem has to be in the implementation of these tapes from my part, since if the new code did the same as the old one, the expectation value estimations should be the same, just with faster calculations.

With that said, does someone know what am I missing?

Hi @marsiuniub
I would like to know why you believe using QuantumTape() will be faster.
There is a Warning in the documentation for qml.tape that says Unless you are a PennyLane or plugin developer, you likely do not need to use these classes directly.
Could you provide a simplified version of your full code so I can run it and see if I or anyone from the team can suggest a way to make it faster?

Thanks!

Hi! Thank you for your answer. I do not really know if using QuantumTape() will improve the computation time. Here is the relevant part of the code:

for i, n_shots in enumerate(n_shotss):
        
        ### this part is not slow, but I have included it for completeness
        # obtain measurement scheme until the current n_shots
        meas_scheme = complete_meas_scheme[:n_shotss[i]]
        # get qwc observables
        obs_compat = [[idx for idx in compat if idx < n_shots] for compat in complete_obs_compat]

        for _ in range(number_of_runs):
            ### !!! THIS calculation of the classical shadow is really slow!!!
            # calculate classical shadow using the measurement scheme
            shadow = calculate_classical_shadow(qnode, meas_scheme)
            
            ### from here on, it is not slow anymore (0.008 seconds to execute)
            # estimate the (exp_val of the) Hamiltonian by estimating (the exp_val of) each observable O^(i) in the decomposition of the Hamiltonian
            sg_estimate = estimate_hamiltonian((coeffs, H), shadow, obs_compat, non_idle_wires) + offset
            rmsds.append(rmsd(sg_estimate, ideal_H))

As you can see, the slowest part of the code is the calculation of the classical shadow, which for 10,000 shots takes about 10 seconds. The function calculate_classical_shadow() is implemented as shown in my original question.

My task is to compute, for each shot, the outcomes of measuring the expectation value of the observables provided in each measurement setting, i.e. building the classical shadow, as explained in the original question. That process is the one slowing things down, but I cannot think of a better way of doing it. That’s what made me explore the QuantumTape() class.

It is especially frustrating when comparing it to Pennylane’s classical_shadow() measurement, which runs much faster than mine (in less than a second for 10,000 shots, whereas mine runs in 10 seconds) and has to choose Pauli matrices at random. Mine, however, is already provided with the Pauli operators to measure!

I really appreciate your help! Thank you. If you need further information, just let me know :blush:

Ah!
I think I understand this better now.
It seems like you are implementing a slightly different version of calculate_classical_shadow of the one presented in this demo. Could I phrase your question then in terms of the code of that demo as well? I mean, since your implementation and the one from the demo are very similar, the question is then "How to get the function calculate_classical_shadow faster?
If you look at their results for the time taken to obtain a classical shadow, those are consistent with your estimated times. It seems that you are doing everything just right and the manual method is just slow.
However, what is the reason for you to not use the optimized PennyLane function qml.ClassicalShadow?
Take a look at this demo as well. They talked about qml.shadow_expval which could be useful as well.

Let me know.

Hi! That’s exactly it! I adapted the code given in this demo to my needs, and this links to the reason for not using qml.ClassicalShadow nor qml.shadow_expval. Basically, the difference between Pennylane’s classical shadow function and mine is that, in your function, you randomly choose the Pauli observables you will measure (in agreement to Preskill’s paper). However, in my case, I provide which Pauli matrices should be measured, i.e. they are not chosen at random, they come as lists in the meas_scheme array in my code. I know this is not the “main” approach to classical shadows, but I am trying this for research purposes. The main incompatibility with your built-in function is that I cannot choose the recipes there, and therefore have to do it this way.

And you are right: my main problem is to make the calculate_classical_shadow function much faster, but I do not know how to do it.

Ok, I understand. I will ask for advice from someone from the team. Maybe there is a way.

However, for that to be easier, I really need a simplified and self-contained version of the code.
Something relatively simple that I can copy and paste and be able to run. I guess it will include the circuit template and how you set up your measurements.
I do have a question. When you used qml.tape.QuantumTape() were you able to verify that it was faster?

Thanks!

Hi! Here is a simplified version of my code. I really cannot make it shorter :sweat_smile:. I have tried running it in my computer and it works. The structure of the code is the following:

  • Imports.
  • Setup of the Hamiltonian I use for testing.
  • Utility functions: to define measurement schemes, compute classical shadows based on those schemes, and estimate the expectation value of a given Hamiltonian.
  • Device configuration.
  • Plotting results.

If you use verbose = True in the function shadow_grouping_estimation_rmsd, you will see the time it takes for some steps, and you will be able to verify that calculate_classical_shadow doesn’t scale well with larger measurement budgets. Note also that the function get_measurement_scheme also takes some time, but you should not worry about it, since it will typically be done once for a given Hamiltonian and then loaded from a file.

Also, when I used qml.QuantumTape, the code was definitely faster, but the RMSDs did not decrease with the number of shots, which makes no sense! Maybe I was using them wrongly… It seemed as if using qml.QuantumTape did not properly construct the classical shadow…

The code is commented for clarity, but if anything’s unclear or you have further questions, feel free to ask!

### Imports
import pennylane as qml
from pennylane import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from datetime import datetime 


### Set up of the Hamiltonian
C1 = qml.PauliX(1) @ qml.PauliZ(2) + qml.PauliX(0) + qml.PauliX(0) @ qml.PauliZ(2) 
C2 = qml.PauliX(0) @ qml.PauliZ(1) + qml.PauliZ(1) 
C3 = qml.PauliX(0) @ qml.PauliY(1) @ qml.PauliX(2) + qml.PauliX(0) @ qml.PauliX(2)

terms = [*C1, *C2, *C3]

np.random.seed(66)
n_qubits = 3
coeffs = np.round(np.random.rand(len(terms)), 2)

H = qml.Hamiltonian(coeffs, terms, grouping_type="qwc")


### Important Functions
def format_pauli_string(op, n_qubits):
    """Format Pauli string as integers. E.g. X0 @ Z1 @ I2 -> [1, 3, 0]
    Args:
        op (qml.PauliX @ qml.PauliZ @ qml.Identity): Pauli string to be formatted.
        n_qubits (int): number of qubits in the system.
    Returns:
        np.array: array of integers representing the Pauli string.
    """
    
    pauli_map = {
        qml.PauliX: 1,
        qml.PauliY: 2,
        qml.PauliZ: 3,
        qml.Identity: 0,
    }
    
    pauli_array = np.zeros(n_qubits, dtype=int)
    
    # check if term is product of Paulis
    if isinstance(op, qml.ops.Prod):
        for sub_op in op:
            wire = sub_op.wires[0]
            pauli_array[wire] = pauli_map.get(type(sub_op), 0)

    # elif isinstance(op, qml.Identity):
    #     pass  # already all zeros

    # check if term is single Pauli operator
    else:
        wire = op.wires[0]
        pauli_array[wire] = pauli_map.get(type(op), 0)

    return pauli_array



def are_pauli_words_qwc(obs1, obs2):
    """Check qubit-wise commutativity between two Pauli strings in integer format.
    
    Args:
        pauli_a (np.ndarray): Array of integers representing Pauli string A.
        pauli_b (np.ndarray): Array of integers representing Pauli string B.
        
    Returns:
        bool: True if A and B are QWC, False otherwise.
    """
    
    for a, b in zip(obs1, obs2):
        if a == 0 or b == 0:
            continue # identity always commutes
        elif a != b:
            return False
    return True
        


def format_hamiltonian(H):
    """Format Hamiltonian as a list of Pauli strings in integer format.

    Args:
        H (qml.Hamiltonian): Hamiltonian to be formatted.
        n_qubits (int): number of qubits in the system.

    Returns:
        tuple (n_qubits, formatted Hamiltonian): Formatted Hamiltonian is an array representing the Hamiltonian. Each row is a Pauli string.
    """
    n_qubits = len(H.wires)
    ham = np.zeros((len(H.ops), n_qubits), dtype=int)
    for i in range(len(H.ops)):
        ham[i] = format_pauli_string(H.ops[i], n_qubits)
        
    return H.coeffs, ham



def get_measurement_scheme(Ham, n_shots, return_weights=False, return_compat_obs=False):
    """
    Function to obtain the measurement scheme for the Hamiltonian H.
    The measurement scheme is obtained by first ordering the terms in H
    by their weight and then iteratively replacing the Identity
    operators in the measurement setting with the terms of the Hamiltonian
    that are compatible with the measurement setting.

    Args:
        Ham (tuple): (coeffs, Pauli strings as int arrays), output of format_hamiltonian().
        n_shots (int): Number of measurement settings (rounds) to generate.
        return_weights (bool): If True, return weight history for each term.
        return_compat_obs (bool): If True, return observables compatible with each setting.

    Returns:
        meas_scheme: Array of integer-encoded measurement settings.
        Optionally: weights_tracker, obs_compat.
    """

    coeffs, H_ops = Ham
    n_terms, n_qubits = H_ops.shape

    now = datetime.now()

    meas_scheme = np.zeros((n_shots, n_qubits), dtype=int)  # Measurement settings
    weights_tracker = np.zeros((n_shots, n_terms)) if return_weights else None
    N = np.ones(n_terms, dtype=int)  # QWC counts per observable (start at 1 to avoid division by 0)

    for i in range(n_shots):
        # Efficient weight calculation using closed formula
        weights = np.abs(coeffs) * (np.sqrt(N + 1) - np.sqrt(N)) / np.sqrt(N * (N + 1))

        if return_weights:
            weights_tracker[i] = weights

        # Sort terms by descending weight
        sorted_indices = np.argsort(weights)[::-1]
        sorted_terms = H_ops[sorted_indices]

        Q = np.zeros(n_qubits, dtype=int)  # Start with identity on all qubits
        n_changes = 0

        # Fill in Q by trying to add high-weight compatible terms
        for term in sorted_terms:
            # Check if we have already filled all qubits
            if n_changes >= n_qubits:
                break

            if are_pauli_words_qwc(Q, term):
                # Substitute Pauli operators into available identity slots
                for j in range(n_qubits):
                    # if Q is identity on qubit j and the term is non-identity on qubit j, replace Q_j with term_j
                    if Q[j] == 0 and term[j] != 0:
                        Q[j] = term[j]
                        n_changes += 1
                        if n_changes >= n_qubits:
                            break

        # Store the new measurement setting
        meas_scheme[i] = Q  

        # Update QWC compatibility counts for each term
        for t in range(n_terms):
            if are_pauli_words_qwc(H_ops[t], Q):
                N[t] += 1

    # Optional: build list of compatible measurement settings per observable
    # Get, for each term in H, the indexes in meas_scheme of the settings that QWC with that observable
    obs_compat = None
    if return_compat_obs:
        obs_compat = np.empty(n_terms, dtype=object)
        for t in range(n_terms):
            compat_idxs = np.array(
                [i for i, Q in enumerate(meas_scheme) if are_pauli_words_qwc(H_ops[t], Q)]
            )
            obs_compat[t] = compat_idxs

    finish = datetime.now()
    print(f"Shots: {n_shots} \t Time for measurement scheme: {(finish - now).total_seconds()} s")

    # Return according to flags
    if return_weights:
        if return_compat_obs:
            return meas_scheme, weights_tracker, obs_compat
        else:
            return meas_scheme, weights_tracker
    elif return_compat_obs:
        return meas_scheme, obs_compat
    else:
        return meas_scheme

    

def calculate_classical_shadow(circuit_template, meas_scheme):
    """
    Function to calculate the classical shadow of a quantum circuit.
    
    Args:
        circuit_template (function): circuit that prepares the quantum state.
        meas_scheme (list): measurement scheme.
        shadow_size (int): number of shots to be used in the measurement scheme.
        n_qubits (int): number of qubits in the circuit.
    
    Returns:
        array: classical shadow of the quantum circuit. (outcomes, recipes)
    """
    
    # use mapping to convert Pauli operators to integers
    pauli_map = {0: qml.Identity, 1: qml.PauliX, 2: qml.PauliY, 3: qml.PauliZ}
    shadow_size, n_qubits = meas_scheme.shape
    outcomes = np.zeros((shadow_size, n_qubits), dtype=int)
    
    # for each measurement setting in the measurement scheme, calculate the expectation value of the Pauli operators provided by the measurement setting
    for snapshot in range(shadow_size):
        obs = [pauli_map[int(meas_scheme[snapshot, i])](i) for i in range(n_qubits)]
        outcomes[snapshot] = circuit_template(obs)
    
    return (outcomes, meas_scheme)


def estimate_hamiltonian(Ham, shadow, obs_compat, non_idle_wires):
    """Function to estimate the expectation value of a Hamiltonian given its Pauli decomposition and a classical shadow obtained from a measurement scheme.
    The expectation value is calculated using formula (12) from the original paper.

    Args:
        H (qml.Hamiltonian): Hamiltonian to estimate
        shadow (tuple): classical shadow obtained from a measurement scheme.

    Returns:
        float: estimation of expected value of the Hamiltonian.
    """
    
    coeffs, H = Ham
    outcomes = shadow[0]  # shape: (n_shots, n_qubits)
    estimates = np.zeros(len(H))  # Store one estimate per term in the Hamiltonian

    for i, (compat, wires) in enumerate(zip(obs_compat, non_idle_wires)):

        # Skip if no compatible measurements were found for this observable
        if len(compat) == 0:
            continue

        # Extract the subset of outcomes for compatible measurement settings and only on the relevant qubits (non-identity positions)
        compat_outcomes = outcomes[compat][:, wires]

        # Estimate the expected value of O^(i) as the mean of the product of outcome bits across qubits
        estimates[i] = np.mean(np.prod(compat_outcomes, axis=1))

    # Return the full Hamiltonian expectation value
    return np.dot(coeffs, estimates)



def shadow_grouping_estimation_rmsd(Ham, qnode, number_of_runs, n_shotss, ideal_H, write_to_file=True, folder_name="Data", offset=0, verbose=False, k=10):
    
    coeffs, H = format_hamiltonian(Ham)
    n_terms, n_qubits = H.shape
    # distances = np.zeros((number_of_runs, len(n_shotss)))
    rmsds = []
    # rmsds_std_devs = np.zeros(len(n_shotss))

    # Precompute non-idle wires: for each observable O^(i) in the Hamiltonian decomposition, obtain the wires on which O^(i)_j != Id(wire=j)
    non_idle_wires = [np.nonzero(H[i])[0] for i in range(n_terms)]
    
    # Compute full measurement scheme and compatibility map for max shots.
    # For each term in H, return which measurement settings (their indices in complete_meas_scheme) QWC with it
    max_shots = n_shotss[-1]
    complete_meas_scheme, complete_obs_compat = get_measurement_scheme((coeffs, H), max_shots, return_compat_obs=True)
    
    # Populate the table with data
    if verbose:
        print(f"{'n_shots':20} {'time/run (s)':20} {'total_time (s)':25}")

    for i, n_shots in enumerate(n_shotss):
        
        meas_scheme = complete_meas_scheme[:n_shotss[i]]
        obs_compat = [[idx for idx in compat if idx < n_shots] for compat in complete_obs_compat]
        # get Paulis for each measurement setting in the measurement scheme
        start_time = datetime.now()
        # estimates of H for each run
        for _ in range(number_of_runs):
            shadow_time = datetime.now()
            # calculate classical shadow using the measurement scheme
            shadow = calculate_classical_shadow(qnode, meas_scheme)
            if verbose:
                print(f"\t Shadow {_:<20} {(datetime.now() - shadow_time).total_seconds():<20.4f}")
            # estimate the (exp_val of the) Hamiltonian by estimating (the exp_val of) each observable O^(i) in the decomposition of the Hamiltonian
            estimate_time = datetime.now()
            sg_estimate = estimate_hamiltonian((coeffs, H), shadow, obs_compat, non_idle_wires) + offset
            if verbose:
                print(f"\t Estimate {_:<20} {(datetime.now() - estimate_time).total_seconds():<20.4f}")
                
            rmsds.append(rmsd(sg_estimate, ideal_H))
        
        total_time = (datetime.now() - start_time).total_seconds()
        # time_per_run = total_time / number_of_runs
        if verbose:
            print(f"{n_shots:<20} {total_time / number_of_runs:<20.4f} {total_time:<25.4f}")
        

    rmsds_arr = np.array(rmsds).reshape(len(n_shotss), number_of_runs)
    means, std_dev = np.mean(rmsds_arr, axis=1), np.std(rmsds_arr, axis=1)
    
    if write_to_file:
        data = np.column_stack((n_shotss, means, std_dev))
        # os.makedirs(folder_name, exist_ok=True)
        np.savetxt(f"{folder_name}/rmsds_{n_shotss[0]}_{n_shotss[-1]}_opt.csv", data, delimiter=",", header="shots, rmsd, std_dev", comments='')
        
    return means, std_dev


def rmsd(x, y):
    """root mean square difference"""
    return np.sqrt(np.mean((x - y)**2))


### Device configuration
dev_ideal = qml.device("default.qubit", wires=n_qubits, shots=None)
dev = qml.device("lightning.qubit", wires=n_qubits, shots=1)

# circuit template to prepare the state
x = np.arange(2*n_qubits, dtype="float64")
def circuit():
    for i in range(n_qubits):
        qml.RY(x[i], wires=i)
    for i in range(n_qubits-1):
        qml.CNOT(wires=[i, i+1])
    for i in range(n_qubits):
        qml.RY(x[i+n_qubits], wires=i)    


# ideal device to calculate the exact expectation value of the Hamiltonian
@qml.qnode(dev_ideal)
def qnode_ideal():
    circuit()
    return qml.expval(H)

ideal_H = qnode_ideal()
print("Ideal H =", np.round(ideal_H, 5))


# circuit to prepare state
@qml.qnode(dev)
def qnode(observables):
    circuit()
    return [qml.expval(obs) for obs in observables]



### Run algorithm and plot results
n_shotss = (
    list(range(10, 110, 10)) +  
    list(range(200, 1100, 100)) +
    list(range(2000, 11000, 1000))
)

number_of_runs = 10


means, std_dev = shadow_grouping_estimation_rmsd(H, qnode, number_of_runs, n_shotss, ideal_H, write_to_file=False, folder_name="Data", offset=0, verbose=True)


plt.rcParams['font.size'] = 14
plt.rcParams['axes.labelsize'] = 16
plt.rcParams['axes.titlesize'] = 18
plt.rcParams['xtick.labelsize'] = 14
plt.rcParams['ytick.labelsize'] = 14

# Create figure
fig, ax = plt.subplots(figsize=(8,6))

# Plot with error bars
ax.errorbar(n_shotss, means, yerr=std_dev, fmt='o-', color="teal", ecolor='lightgray', elinewidth=2, capsize=4, markersize=5)

# Titles and labels
ax.set_title(f"RMSD for different number of shots ({number_of_runs} runs for each)", pad=15)
ax.set_xlabel("Shots", labelpad=10)
ax.set_ylabel("RMSD", labelpad=10)
ax.set_xscale("log")

# Tight layout
plt.tight_layout()

# Show plot
plt.show()

Hi!
I spoke to someone from the team and also did some experimenting myself. These two methods should be consistent with each other if implemented correctly, that’s the expected behaviour.
We realized that in the first post of this thread, the list of ops that you provided to build the quantum tape differs from the circuit that is actually implemented in the circuit_template shown in the full code. The tape one only has one layer of RY and the circuit has CNOTs and another layer of RYs. Could that be the problem?

I tried to do some simple experimenting by comparing the behaviour standard deviation of samples obtained by both methods in a simple circuit and they seem to be consistent with each other. See the two codes below.
Circuit based:

import pennylane as qml
from pennylane import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from datetime import datetime 

np.random.seed(666)

pauli_map = {0: qml.Identity, 1: qml.PauliX, 2: qml.PauliY, 3: qml.PauliZ}

n_qubits=2
# circuit template to prepare the state

dev = qml.device("lightning.qubit", wires=n_qubits, shots=10000)

@qml.qnode(dev)
def circuit(observables):
    qml.RY(0.5,wires=1)
    return [qml.sample(obs) for obs in observables]

observables = [pauli_map[i](i) for i in range(n_qubits)]

a,b=circuit(observables)

print(np.std(b))

Tapes based:

import pennylane as qml
from pennylane import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from datetime import datetime 

np.random.seed(666)

pauli_map = {0: qml.Identity, 1: qml.PauliX, 2: qml.PauliY, 3: qml.PauliZ}

n_qubits=2
# circuit template to prepare the state

dev = qml.device("lightning.qubit", wires=n_qubits, shots=1)
measurements = [qml.sample(pauli_map[i](i)) for i in range(n_qubits)]
ops=[qml.RY(0.5, wires=1)]
tape = qml.tape.QuantumTape(ops=ops, measurements=measurements, shots=10000)

outcomes=qml.execute([tape], device=dev, diff_method=None)
b=outcomes[0][1]
print(np.std(b))

Let me know if this is helpful.

This is helpful and I really appreciate your time! Oh and, by the way, the different circuits that I used for tapes in my first question was just because I was messing around with the code and forgot to restore it, but that was not causing the problem.

Based on your answer, I have tried the following two approaches recalling that, for each shot, I have to perform different measurements of Pauli strings:

The first one is to do a tape for each measurement that I want to make (of a Pauli string) and immediately measure it:

pauli_map = {0: qml.Identity, 1: qml.PauliX, 2: qml.PauliY, 3: qml.PauliZ}
shadow_size, n_qubits = meas_scheme.shape
outcomes = np.zeros((shadow_size, n_qubits), dtype=int)

dev = qml.device("lightning.qubit", wires=n_qubits, shots=1)

x = np.arange(2*n_qubits, dtype="float64")
ops = [qml.RY(x[i], wires=i) for i in range(n_qubits)]
    
tape_list = []
for shot in range(shadow_size):
     measurements = [qml.sample(pauli_map[int(meas_scheme[shot, j])](j)) for j in range(n_qubits)]
     tape = qml.tape.QuantumTape(ops=ops, measurements=measurements, shots=1)
     outcomes[shot, :] = qml.execute([tape], device=dev, diff_method=None)[0]

This works fine and is a little bit faster than the previous method!

The second approach is to create a list of tapes to measure and execute them all at once. However, when I try this, the results make no sense, since the RMSD just goes up and down randomly as the number of shots increases.

### same as before
pauli_map = {0: qml.Identity, 1: qml.PauliX, 2: qml.PauliY, 3: qml.PauliZ}
shadow_size, n_qubits = meas_scheme.shape
outcomes = np.zeros((shadow_size, n_qubits), dtype=int)

dev = qml.device("lightning.qubit", wires=n_qubits, shots=1)

x = np.arange(2*n_qubits, dtype="float64")
ops = [qml.RY(x[i], wires=i) for i in range(n_qubits)]
    
# append tapes to be measured
tape_list = []
for shot in range(shadow_size):
     measurements = [qml.sample(pauli_map[int(meas_scheme[shot, j])](j)) for j in range(n_qubits)]
     tape = qml.tape.QuantumTape(ops=ops, measurements=measurements, shots=1)
     tape_list.append(tape)

# get results all at once
outcomes = np.array(qml.execute(tape_list, device=dev, diff_method=None), dtype=int)

I was expecting the second code to do the same as the first one, but it does not. Why would that be? Aren’t they different implementations of the same procedure?

Hi!
I will get back to you on this executing tape list vs execute each tape sequentially issue either later today or next week.
However, I wanted to pass along a suggestion from a member of the team. They say that the preferred usage with pennylane for this kind of problem will be to execute a single circuit with a single circuit with n_shots and post-process the results accordingly. Instead of executing a single shot circuit n_shots times. Just to illustrate:

# single shot circuit executed n times
dev = qml.device("default.qubit", shots=1)

@qml.qnode(dev)
def f(x):
    ...

res = [f(x) for _ in range(n)]

# n shot circuit executed once
dev = qml.device("default.qubit", shots=n)

@qml.qnode(dev)
def f(x):
    ...

res = f(x)

Is this something you can implement? Spending more work in the post-processing and less time overall.

Let us know.

Hi! I can’t really implement this since, for each shot, I have to make diffrent measurements of Pauli observables.

Hi!
You are right, the behaviours are different and specifically, when you use the list of tapes you will get the wrong results in your application because randomness is lost.
I tried two codes that implement what we are discussing and verified this.
For the first one, I use the same tape for 10 shots.

### Imports
import pennylane as qml
from pennylane import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from datetime import datetime 

np.random.seed(666)

pauli_map = {0: qml.Identity, 1: qml.PauliX, 2: qml.PauliY, 3: qml.PauliZ}

n_qubits=2

dev = qml.device("lightning.qubit", wires=n_qubits, shots=1)
measurements = [qml.sample(pauli_map[i](i)) for i in range(n_qubits)]
ops = [qml.RY(0.5, wires=1)]
tape = qml.tape.QuantumTape(ops=ops, measurements=measurements, shots=10)

outcomes = qml.execute([tape], device=dev, diff_method=None)
b = outcomes[0][1]
#print(np.std(b))
print(outcomes)

and then I built a list of tapes. Basically, repeated the same tape 10 times and execute it hoping to get something that made sense, but got the same result ten times unfortunately. See the code.

### List of tapes (same tape n times)
import pennylane as qml
from pennylane import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from datetime import datetime 

np.random.seed(666)

pauli_map = {0: qml.Identity, 1: qml.PauliX, 2: qml.PauliY, 3: qml.PauliZ}

n_qubits=2
n_tapes=10

dev = qml.device("lightning.qubit", wires=n_qubits, shots=1)

measurements = [qml.expval(pauli_map[i](i)) for i in range(n_qubits)]
ops=[qml.RY(0.5, wires=1)]
tape_list=[]
for i in range(n_tapes):
	tape = qml.tape.QuantumTape(ops=ops, measurements=measurements, shots=1)
	tape_list.append(tape)

outcomes=qml.execute(tape_list, device=dev, diff_method=None)
b=outcomes[0][1]
#print(np.std(b))
print(outcomes)

The only way you could implement the list of tapes and get results that made sense would be if we could input a different random seed for each tape. But that’s not possible.

I have reported this to the team and will keep you posted. However, as of now, the only option is to execute the tapes individually, which is a bit faster than the circuit-based implementation. The best would be if you could execute each tape for more than one shot, but as you have told me, this is not suitable for your application.

Thanks for your question.

Thank you very much for your time! I appreciate the effort and the attention :blush: .

Sorry about the confusion. Caching is turned on by default with qml.execute. You can turn off caching of the result by using qml.execute(tape_list, device=dev, diff_method=None, cache=False) instead.

Hi!
Just letting you know that I have reported this as a Github issue here and it has been addressed by the team (see the PR attached to the issue).
This will make qml.execute to match the behaviour of QNode by default so your issue will never happen in the first place and always warn about caching, in case people are using finite shots.

Have a nice day and thanks again for your question!