"unsupported type SequenceBox" error encountered in gradient descent

Hello! I have been running an optimization with gradient descent. Evaluating the cost function on its own gives no error, but inside the optimization loop I run into "Unsupported type in the model: <class 'autograd.builtins.SequenceBox'>", raised from unflatten.

import pennylane as qml
from pennylane import numpy as np

def xyzyz_layer(params, n_qubits): #params is matrix of n_qubits*5
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i,0],wires=i)
    
    # now chained c-Ry layer
    if n_qubits> 1:
        for i in range(n_qubits):
            if i+1 < n_qubits:
                ctrl_idx = i+1
            else:
                ctrl_idx = 0
            qml.CRY(params[i,1],wires=[ctrl_idx,i])
    
    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i,2],wires=i)
        
    # now chained c-Ry layer
    if n_qubits> 1:
        for i in range(n_qubits):
            if i+1 < n_qubits:
                ctrl_idx = i+1
            else:
                ctrl_idx = 0
            qml.CRY(params[i,3],wires=[ctrl_idx,i])
            
    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i,4],wires=i)
        
def xyzy_layer(params, n_qubits): #params is matrix of n_qubits*4
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i,0],wires=i)
    
    # now chained c-Ry layer
    if n_qubits> 1:
        for i in range(n_qubits):
            if i+1 < n_qubits:
                ctrl_idx = i+1
            else:
                ctrl_idx = 0
            qml.CRY(params[i,1],wires=[ctrl_idx,i])
    
    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i,2],wires=i)
        
    # now chained c-Ry layer
    if n_qubits> 1:
        for i in range(n_qubits):
            if i+1 < n_qubits:
                ctrl_idx = i+1
            else:
                ctrl_idx = 0
            qml.CRY(params[i,3],wires=[ctrl_idx,i])
    return
    
def encoder_ansatz(params, n_qubits): # for now, just let encoder be one xyzyz layer
    xyzyz_layer(params, n_qubits) 
    return
    
def decoder_ansatz(params, n_qubits): # for now, just let decoder be one xyzy layer
    xyzy_layer(params, n_qubits) 
    return
    
def accumulate_signal(phi, n_probes): # assume qubits 0-n_probes-1 are probes
    for i in range(n_probes):
        qml.RZ(phi, wires=i)

n_probes = 2
n_ancillas = 1
n_qubits = n_probes+n_ancillas
dev = qml.device('default.qubit', analytic=True, wires=n_probes+n_ancillas)
@qml.qnode(dev)
def measure_full_sys(phi, params):
    enc = params[:5*n_qubits].reshape(n_qubits,5)
    dec = params[-4*n_qubits:].reshape(n_qubits,4)
    encoder_ansatz(enc, n_qubits)
    accumulate_signal(phi, n_probes)
    decoder_ansatz(dec, n_qubits)
    return qml.probs(wires=range(n_qubits))

    
def CFI_full(phi, params):
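    # classical Fisher information: F(phi) = sum_x (d p_x / d phi)^2 / p_x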
    probs = measure_full_sys(phi, params) # len=2**n_qubits
    grad_phi = qml.jacobian(measure_full_sys,argnum=0)# len=2**n_qubits
    grad_val = grad_phi(phi,params)
    cfi = np.sum(grad_val**2/probs) # each term is grad^2/prob
    return cfi
    

def avg_CFI_full(params):
    phi_vec = [0.1]
    avg_cfi = 0
    for i in range(len(phi_vec)):
        avg_cfi += CFI_full(phi_vec[i],params)
    avg_cfi /= len(phi_vec)
    return -avg_cfi

steps = 200

        
params_init = np.random.rand(n_qubits*9)

gd_cost = []
opt = qml.RMSPropOptimizer(0.01)

theta = params_init
for _ in range(steps):
    theta = opt.step(avg_CFI_full, theta)
    gd_cost.append(avg_CFI_full(theta))

The error I am getting is:
when the jacobian is evaluated inside my cost function (note that my cost function includes a jacobian evaluation even without any optimization), _flatten is called somewhere and throws an "unsupported type SequenceBox" error.
However, when I run the cost function on its own, without gradient descent, there is no error.

Hi @ziqi-ma! Welcome :slightly_smiling_face:

Just double-checking: are you converting the measure_full_sys function into a QNode, either by using the @qml.qnode(device) decorator or the qml.QNode(measure_full_sys, device) function?
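
For reference, a minimal sketch of the two equivalent ways, using a toy single-qubit circuit just for illustration:

import pennylane as qml

dev = qml.device("default.qubit", wires=1)

# option 1: the decorator
@qml.qnode(dev)
def circuit(x):
    qml.RX(x, wires=0)
    return qml.probs(wires=0)

# option 2: wrap an ordinary function explicitly
def circuit_fn(x):
    qml.RX(x, wires=0)
    return qml.probs(wires=0)

circuit_qnode = qml.QNode(circuit_fn, dev)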

Thanks! Yes I did; I have now added that code above, so my full code is shown there.

I tried this minimal example, basically using a cost function that calls jacobian, and it gives the same error:

@qml.qnode(dev)
def circuit(x,y):
    qml.RX(x, wires=0)
    qml.RY(y, wires=0)
    return qml.probs(0)

def cost(x):
    dcircuit = qml.jacobian(circuit, argnum=0)
    return dcircuit(x,1)[0]

steps = 10

        
params_init = np.random.rand()

gd_cost = []
opt = qml.RMSPropOptimizer(0.01)

theta = params_init
for _ in range(steps):
    theta = opt.step(cost, theta)
    cur_cost = cost(theta)
    gd_cost.append(cur_cost)
    print(cur_cost)

Hi @ziqi-ma and welcome to the forum!

Thanks for sharing your full code. I managed to replicate the error, which is due to your cost function being incompatible with the NumPy/Autograd interface of PennyLane.

As you point out, your cost function itself includes the gradient. Unfortunately, training on the gradient with the NumPy/Autograd interface is not yet supported. However, it can be done if you switch to the Torch interface; the code below shows how.

In case you’re not familiar with the different interfaces, using the Torch interface can be specified in the qml.qnode decorator:
@qml.qnode(dev, interface="torch")
The output of the qnode (in your case, measure_full_sys) is then a PyTorch tensor, which requires a few changes to the CFI_full and avg_CFI_full functions.

Here is the code:

import pennylane as qml
import torch

n_probes = 2
n_ancillas = 1
n_qubits = n_probes + n_ancillas
dev = qml.device("default.qubit", analytic=True, wires=n_probes + n_ancillas)

##############################################################################
# Layers
##############################################################################


def xyzyz_layer(params, n_qubits):  # params is matrix of n_qubits*5
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i, 0], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 1], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 2], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 3], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 4], wires=i)


def xyzy_layer(params, n_qubits):  # params is matrix of n_qubits*4
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i, 0], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 1], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 2], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 3], wires=[ctrl_idx, i])
    return


##############################################################################
# Ansatz
##############################################################################


def encoder_ansatz(params, n_qubits):  # for now, just let encoder be one xyzyz layer
    xyzyz_layer(params, n_qubits)
    return


def decoder_ansatz(params, n_qubits):  # for now, just let decoder be one xyzy layer
    xyzy_layer(params, n_qubits)
    return


def accumulate_signal(phi, n_probes):  # assume qubits 0-n_probes-1 are probes
    for i in range(n_probes):
        qml.RZ(phi, wires=i)


##############################################################################
# Cost
##############################################################################


@qml.qnode(dev, interface="torch")
def measure_full_sys(phi, params):
    enc = params[: 5 * n_qubits].reshape(n_qubits, 5)
    dec = params[-4 * n_qubits :].reshape(n_qubits, 4)
    encoder_ansatz(enc, n_qubits)
    accumulate_signal(phi, n_probes)
    decoder_ansatz(dec, n_qubits)
    return qml.probs(wires=range(n_qubits))


def CFI_full(phi, params):
    probs = measure_full_sys(phi, params)
    jac = torch.autograd.functional.jacobian(measure_full_sys, (phi, params))
    cfi = torch.sum(jac[0] ** 2 / probs)
    return cfi


def avg_CFI_full(params):
    phi_vec = torch.tensor([0.1, 0.2], requires_grad=True)
    return -torch.mean(torch.stack([CFI_full(phi, params) for phi in phi_vec]))


##############################################################################
# Optimization
##############################################################################

params = torch.randn(9 * n_qubits, requires_grad=True)
opt = torch.optim.RMSprop([params], lr=0.01)

steps = 200
gd_cost = []

for _ in range(steps):
    opt.zero_grad()
    cost = avg_CFI_full(params)
    cost.backward()
    opt.step()
    gd_cost.append(cost)
    print(cost)

This definitely works without error, but I found the cost function behaves a bit unpredictably. Note that I also had to upgrade my version of PyTorch (pip install torch --upgrade) to use the jacobian function.

Hopefully this helps, and let us know if you have any more questions!


Thank you so much! Yes this works - very much appreciated!

A follow-up:
Are all the gradients currently derived analytically?

Is the jacobian in my cost function obtained via the parameter-shift rule? And if I want to run this with sampling (i.e. get the measurement probabilities from samples rather than deriving them analytically, and perhaps also add noise to my circuit), how should I do that?

Thank you!

That’s a great question @ziqi-ma!

In the example above, the gradient is calculated analytically using the parameter-shift rule (which gives the answer exactly, whereas the finite-difference method is an approximation).
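
For a single Pauli rotation gate such as RZ(theta), this rule gives the exact derivative df/dtheta = [f(theta + s) - f(theta - s)] / (2 sin(s)); with the usual choice s = pi/2 this reduces to [f(theta + pi/2) - f(theta - pi/2)] / 2.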

However, in PennyLane you can set whether expectation values are analytic or approximate and whether the gradient calculation method is parameter-shift or finite-difference.

Analytic vs. approximate devices

You can control whether expectation values and variances are calculated analytically or approximately by specifying the device.

The default.qubit device is a simulator, and we can specify either an analytic or a sampling mode via an argument to qml.device:

dev = qml.device('default.qubit', wires=2, analytic=False)

However, there is a wide range of different devices available. Some devices, such as hardware devices, only allow for a sampling mode. For example, a qiskit.aer device can be loaded with:

dev = qml.device('qiskit.aer', wires=2)

(note that you need to install the PennyLane-Qiskit plugin)

Gradient calculation method

The gradient can be calculated using either the parameter-shift or the finite-difference method. Both of these work with analytic or approximate devices. The method can be specified in qml.qnode:

@qml.qnode(dev, diff_method='finite-diff')

In general, we recommend using the (default*) parameter-shift method: it is exact when the simulator is analytic and more accurate in most situations when the simulator is sample-based.

*Actually, some devices such as default.tensor.tf support analytic calculation of the gradient.
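
For example, a minimal sketch combining a sampling device with an explicitly chosen differentiation method (toy wire and shot numbers, just for illustration):

dev = qml.device('default.qubit', wires=2, analytic=False, shots=1000)

@qml.qnode(dev, diff_method='parameter-shift')
def circuit(x):
    qml.RX(x, wires=0)
    return qml.probs(wires=[0, 1])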

Adding noise

The default.qubit device is a noise-free simulator. For noise, I recommend checking out the IBM device or the Rigetti device. A description of adding noise for the IBM device can be found here, although one caveat is that it doesn't yet support returning probabilities, which is relevant for your problem.
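
As a rough sketch of how a noise model could be attached (this assumes the qiskit.aer device accepts a noise_model keyword and uses Qiskit's noise module; please check the PennyLane-Qiskit documentation for the exact interface):

import pennylane as qml
from qiskit.providers.aer.noise import NoiseModel, depolarizing_error

# build a simple Qiskit noise model: 1% depolarizing noise on single-qubit gates
noise_model = NoiseModel()
noise_model.add_all_qubit_quantum_error(depolarizing_error(0.01, 1), ["u1", "u2", "u3"])

# assumed keyword argument: pass the noise model when loading the device
dev = qml.device("qiskit.aer", wires=2, noise_model=noise_model)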


Thank you so much!

Yes, actually the reason I am using PennyLane is that I had been using qiskit + scipy.optimize and computing the gradient via finite differences. Things worked fine when I calculated the probabilities analytically, but when I sampled, finite differences stopped working because of statistical noise.

That's why I wanted the built-in parameter-shift gradient calculation provided by PennyLane.

One thing that surprised me a little is that now, after switching to the PyTorch optimizer and automatic differentiation, my optimization stopped working; it seems to be going in the opposite direction:

import pennylane as qml
import torch

n_probes = 2
n_ancillas = 1
n_qubits = n_probes + n_ancillas
dev = qml.device("default.qubit", analytic=True, wires=n_probes + n_ancillas)

##############################################################################
# Layers
##############################################################################


def xyzyz_layer(params, n_qubits):  # params is matrix of n_qubits*5
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i, 0], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 1], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 2], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 3], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 4], wires=i)


def xyzy_layer(params, n_qubits):  # params is matrix of n_qubits*4
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i, 0], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 1], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 2], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 3], wires=[ctrl_idx, i])
    return


##############################################################################
# Ansatz
##############################################################################


def encoder_ansatz(params, n_qubits):  # for now, just let encoder be one xyzyz layer
    xyzyz_layer(params, n_qubits)
    return


def decoder_ansatz(params, n_qubits):  # for now, just let decoder be one xyzy layer
    xyzy_layer(params, n_qubits)
    return


def accumulate_signal(phi, n_probes):  # assume qubits 0-n_probes-1 are probes
    for i in range(n_probes):
        qml.RZ(phi, wires=i)


##############################################################################
# Cost
##############################################################################


@qml.qnode(dev, interface="torch")
def measure_full_sys(phi, params):
    #enc = params[: 5 * n_qubits].reshape(n_qubits, 5)
    #dec = params[-4 * n_qubits :].reshape(n_qubits, 4)
    #encoder_ansatz(enc, n_qubits)
    #accumulate_signal(phi, n_probes)
    #decoder_ansatz(dec, n_qubits)
    qml.RY(params[0],wires=0)
    qml.RZ(phi,wires=0)
    qml.RY(params[1],wires=0)
    return qml.probs(wires=0)
    #return qml.probs(wires=range(n_qubits))
    
def log_measure_full_sys(phi, params):
    return torch.log(measure_full_sys(phi,params))


def CFI_full(phi, params):
    probs = measure_full_sys(phi, params)
    #print(probs)
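    # equivalent log-derivative form of the CFI: F(phi) = sum_x p_x * (d log p_x / d phi)^2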
    jac = torch.autograd.functional.jacobian(log_measure_full_sys, (phi, params))
    cfi = torch.sum(jac[0]**2 * probs)
    #jac = torch.autograd.functional.jacobian(measure_full_sys, (phi, params))
    #cfi = torch.sum(jac[0] ** 2 / probs)
    return cfi


def avg_CFI_full(params):
    phi_vec = torch.tensor([0.1], requires_grad=True)
    mean_cfi = torch.mean(torch.stack([CFI_full(phi, params) for phi in phi_vec]))
    #print(mean_cfi)
    return mean_cfi

##############################################################################
# Optimization
##############################################################################

#params = torch.randn(9 * n_qubits, requires_grad=True)
params = torch.randn(2, requires_grad=True)
opt = torch.optim.SGD([params],lr=0.5)

steps = 500
gd_cost = []

for i in range(steps):
    def closure():
        opt.zero_grad()
        loss = -avg_CFI_full(params)
        gd_cost.append(loss)
        loss.backward()
        if i % 10 == 0:
            print("loss")
            print(loss)
            print(params)
            print("grad")
            print(params.grad)
        return loss
    opt.step(closure)

I modified it to take only 2 parameters, and I know the minimum of this cost function should be -1, but this doesn't seem to be working... Am I doing anything wrong? (I want to maximize mean_cfi, so I am just setting the loss to be its negative.)

Actually, I think the gradient part is still problematic: look at the simplified example below:

@qml.qnode(dev, interface="torch")
def measure_full_sys(phi, params):
    #enc = params[: 5 * n_qubits].reshape(n_qubits, 5)
    #dec = params[-4 * n_qubits :].reshape(n_qubits, 4)
    #encoder_ansatz(enc, n_qubits)
    #accumulate_signal(phi, n_probes)
    #decoder_ansatz(dec, n_qubits)
    qml.RY(params[0],wires=0)
    qml.RZ(phi,wires=0)
    qml.RY(params[1],wires=0)
    return qml.probs(wires=0)

params = torch.tensor([1.57,1.57], requires_grad=True)
phi_vec = torch.tensor([0.1], requires_grad=True)
phi = phi_vec[0]
probs = measure_full_sys(phi, params)
print(probs)
jac = torch.autograd.functional.jacobian(measure_full_sys, (phi, params),create_graph=True, strict=True)
print(jac[0])

This is giving the error:
RuntimeError: The jacobian of the user-provided function is independent of input 0. This is not allowed in strict mode when create_graph=True.

However, the jacobian does depend on input 0 (if you set strict=False and vary phi, you can see jac[0] change).

Is this due to some sort of incompatibility between PyTorch autograd and PennyLane?

(To provide more info: in this simplified example, measure_full_sys should output sin^2(phi)/4.)

The reason I think it's an incompatibility is that even if I make measure_full_sys simply output sin^2(phi)/4, with the qml decorator it gives type errors (Variable vs. Tensor vs. ndarray), but without the decorator it works fine.

Hi @ziqi-ma,

You're right! This problem boils down to PennyLane not yet supporting second-order derivatives; adding a hessian() method to the QNode is in the pipeline, but it has not been implemented yet.

The above means that, while torch.autograd.functional.jacobian evaluates to the right answer, its gradient is inaccessible. I think the strange behaviour of the cost function came from the line cfi = torch.sum(jac[0] ** 2 / probs), which has an accessible gradient through probs but not through jac.

The best option for now is to hard-code the derivative of the QNode with respect to phi. For example, you could swap CFI_full with:

import math
shift = torch.tensor(0.01)

def CFI_full(phi, params):
    probs = measure_full_sys(phi, params)
    probs_plus = measure_full_sys(phi + shift, params)
    probs_minus = measure_full_sys(phi - shift, params)
    jac = (probs_plus - probs_minus) / (2 * shift)
    cfi = torch.sum(jac ** 2 / probs)
    return cfi

and the result should train. I checked and the cost function does get smaller. Note that the above uses the finite-difference method. Parameter-shift should be possible but needs a bit more care because of the multiple uses of phi in accumulate_signal.

Thanks a lot for this feedback, you’ve asked some great questions which help us plan our priorities going forward!

@ziqi-ma, I just had a go at a hard-coded parameter-shift method; here is the full code:

import pennylane as qml
import torch
import math

n_probes = 2
n_ancillas = 1
n_qubits = n_probes + n_ancillas
dev = qml.device("default.qubit", analytic=True, wires=n_probes + n_ancillas)

shift = torch.tensor(math.pi / 2)

##############################################################################
# Layers
##############################################################################


def xyzyz_layer(params, n_qubits):  # params is matrix of n_qubits*5
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i, 0], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 1], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 2], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 3], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 4], wires=i)


def xyzy_layer(params, n_qubits):  # params is matrix of n_qubits*4
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i, 0], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 1], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 2], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 3], wires=[ctrl_idx, i])
    return


##############################################################################
# Ansatz
##############################################################################


def encoder_ansatz(params, n_qubits):  # for now, just let encoder be one xyzyz layer
    xyzyz_layer(params, n_qubits)
    return


def decoder_ansatz(params, n_qubits):  # for now, just let decoder be one xyzy layer
    xyzy_layer(params, n_qubits)
    return


def accumulate_signal(phi, n_probes):  # assume qubits 0-n_probes-1 are probes
    for i in range(n_probes):
        qml.RZ(phi[i], wires=i)


##############################################################################
# Cost
##############################################################################


@qml.qnode(dev, interface="torch")
def measure_full_sys(phi, params):
    enc = params[: 5 * n_qubits].reshape(n_qubits, 5)
    dec = params[-4 * n_qubits :].reshape(n_qubits, 4)
    encoder_ansatz(enc, n_qubits)
    accumulate_signal(phi, n_probes)
    decoder_ansatz(dec, n_qubits)
    return qml.probs(wires=range(n_qubits))


def CFI_full(phi, params):
    phi_split = torch.stack([phi] * n_probes)
    probs = measure_full_sys(phi_split, params)
    jac = []
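    # phi enters every probe's RZ gate, so d(probs)/d(phi) is the sum of the partial
    # derivatives with respect to each of those RZ arguments (chain rule); each partial
    # is computed with the parameter-shift rule, hence the 2 * torch.sin(shift) denominator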
    
    for shift_pos in torch.eye(n_probes):
        probs_plus = measure_full_sys(phi_split + shift_pos * shift, params)
        probs_minus = measure_full_sys(phi_split - shift_pos * shift, params)
        jac.append((probs_plus - probs_minus) / (2 * torch.sin(shift)))

    jac = torch.sum(torch.stack(jac), 0)
    cfi = torch.sum(jac ** 2 / probs)
    return cfi


def avg_CFI_full(params):
    phi_vec = torch.tensor([0.1, 0.2], requires_grad=True)
    return -torch.mean(torch.stack([CFI_full(phi, params) for phi in phi_vec]))


##############################################################################
# Optimization
##############################################################################

params = torch.randn(9 * n_qubits, requires_grad=True)
opt = torch.optim.RMSprop([params], lr=0.01)

steps = 200
gd_cost = []

for _ in range(steps):
    opt.zero_grad()
    cost = avg_CFI_full(params)
    cost.backward()
    opt.step()
    gd_cost.append(cost)
    print(cost)

Thank you so much! This is super helpful!

Hi Tom!

The analytic version with parameter-shift is working. Thanks for that!

However, now when I switch to sampling mode, i.e. by setting
dev = qml.device('default.qubit', wires=2, analytic=False), it seems like I am still getting analytic results: I am printing out my probabilities repeatedly and they are always the same values, even when I set shots=10.

No problem @ziqi-ma, happy to help!

Not sure what's happening regarding the seemingly analytic results in sampling mode. I had a quick look and managed to get results that look sample-based (i.e., they are different each time) with the following code:

import pennylane as qml
import torch
import math

shots = 1000

n_probes = 2
n_ancillas = 1
n_qubits = n_probes + n_ancillas
dev = qml.device("default.qubit", analytic=False, wires=n_probes + n_ancillas, shots=shots)

shift = torch.tensor(math.pi / 2)

##############################################################################
# Layers
##############################################################################


def xyzyz_layer(params, n_qubits):  # params is matrix of n_qubits*5
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i, 0], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 1], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 2], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 3], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 4], wires=i)


def xyzy_layer(params, n_qubits):  # params is matrix of n_qubits*4
    # first Rx layer
    for i in range(n_qubits):
        qml.RX(params[i, 0], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 1], wires=[ctrl_idx, i])

    # Rz layer
    for i in range(n_qubits):
        qml.RZ(params[i, 2], wires=i)

    # now chained c-Ry layer
    if n_qubits > 1:
        for i in range(n_qubits):
            if i + 1 < n_qubits:
                ctrl_idx = i + 1
            else:
                ctrl_idx = 0
            qml.CRY(params[i, 3], wires=[ctrl_idx, i])
    return


##############################################################################
# Ansatz
##############################################################################


def encoder_ansatz(params, n_qubits):  # for now, just let encoder be one xyzyz layer
    xyzyz_layer(params, n_qubits)
    return


def decoder_ansatz(params, n_qubits):  # for now, just let decoder be one xyzy layer
    xyzy_layer(params, n_qubits)
    return


def accumulate_signal(phi, n_probes):  # assume qubits 0-n_probes-1 are probes
    for i in range(n_probes):
        qml.RZ(phi[i], wires=i)


##############################################################################
# Cost
##############################################################################


@qml.qnode(dev, interface="torch")
def measure_full_sys(phi, params):
    enc = params[: 5 * n_qubits].reshape(n_qubits, 5)
    dec = params[-4 * n_qubits :].reshape(n_qubits, 4)
    encoder_ansatz(enc, n_qubits)
    accumulate_signal(phi, n_probes)
    decoder_ansatz(dec, n_qubits)
    return qml.probs(wires=range(n_qubits))


def CFI_full(phi, params):
    phi_split = torch.stack([phi] * n_probes)
    probs = measure_full_sys(phi_split, params)
    jac = []
    
    for shift_pos in torch.eye(n_probes):
        probs_plus = measure_full_sys(phi_split + shift_pos * shift, params)
        probs_minus = measure_full_sys(phi_split - shift_pos * shift, params)
        jac.append((probs_plus - probs_minus) / (2 * torch.sin(shift)))

    jac = torch.sum(torch.stack(jac), 0)
    cfi = torch.sum(jac ** 2 / probs)
    return cfi


def avg_CFI_full(params):
    phi_vec = torch.tensor([0.1, 0.2], requires_grad=True)
    return -torch.mean(torch.stack([CFI_full(phi, params) for phi in phi_vec]))

params = torch.ones(9 * n_qubits, requires_grad=True)
phi = torch.tensor([0.1] * n_qubits, requires_grad = True)

print(measure_full_sys(phi, params))
print(CFI_full(phi[0], params))
print(avg_CFI_full(params))

Does that work for you?

One thing I did notice is that CFI_full can be inf when the number of shots is very low, since torch.sum(jac ** 2 / probs) has probs in the denominator, which can evaluate to zero.
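
As a quick sketch, one way to guard against this would be to clamp the sampled probabilities away from zero before dividing (eps here is an arbitrary small value):

eps = 1e-6

def CFI_full(phi, params):
    phi_split = torch.stack([phi] * n_probes)
    # clamp the sampled probabilities so the division below cannot produce inf
    probs = torch.clamp(measure_full_sys(phi_split, params), min=eps)
    jac = []

    for shift_pos in torch.eye(n_probes):
        probs_plus = measure_full_sys(phi_split + shift_pos * shift, params)
        probs_minus = measure_full_sys(phi_split - shift_pos * shift, params)
        jac.append((probs_plus - probs_minus) / (2 * torch.sin(shift)))

    jac = torch.sum(torch.stack(jac), 0)
    return torch.sum(jac ** 2 / probs)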

Hmm, that's very weird: this is giving me the same output every time. This is the result I get even when I set shots=10:

tensor([0.5396, 0.1406, 0.0543, 0.0562, 0.0621, 0.0451, 0.0628, 0.0392],
       dtype=torch.float64, grad_fn=<_TorchQNodeBackward>)
tensor(1.4880, dtype=torch.float64, grad_fn=<SumBackward0>)
tensor(-1.4002, dtype=torch.float64, grad_fn=<NegBackward>)

And a follow-up question: eventually I hope to add noise, so I will need to switch to a Qiskit backend. However, I got an error saying that Qiskit does not support "probs". I know Qiskit doesn't directly provide a probability vector, but it provides result.counts(), which could be converted to probabilities. In PennyLane, though, it seems like I can only use sample(): when I try to loop over all qubits, sample PauliZ on each, put the samples in a matrix and do the calculation, that doesn't work, since each sample is of type PauliZ and I'm not sure how to compute with it. Do you have suggestions on how to get the probability vector? Thanks!

Hi @ziqi-ma, the analytic probabilities in sampling mode were caused by a bug that has been corrected here: https://github.com/XanaduAI/pennylane/pull/573. Also, PennyLane-Qiskit has recently been updated to support probabilities.

We’re due to release new versions of PennyLane and PennyLane-Qiskit shortly, but you can access the latest master branches using:

  • PennyLane: pip install git+https://github.com/XanaduAI/pennylane.git
  • PennyLane-Qiskit: pip install git+https://github.com/XanaduAI/pennylane-qiskit.git

With those updated,

dev = qml.device("qiskit.aer", analytic=False, wires=n_probes + n_ancillas, shots=shots)

should work!

Thank you! It works now.