Specific derivatives of a quantum circuit

Hi,

I have a parametrized quantum circuit (PQC) where I measure probs() at the end, and I want to get its derivatives with respect to the input vector (not with respect to the parameters).
I tried using jacobian(circuit)(input_points, params), but it has the following problem:
For each probability, I only need the derivative with respect to a single point (or a few points) in the input vector, so using the jacobian function (and especially jacobian(jacobian())) is a very costly computation where most of the result is never used.
Is it possible to specify the probabilities and the corresponding points I want to differentiate with respect to, in order to get something like a sparse Jacobian matrix?
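For example, something like the following sketch is the kind of access I have in mind (this assumes the circuit defined in my code below; the index 3 is just an arbitrary example):

# Sketch: gradient of a single probability p_i with respect to all inputs,
# i.e. one row of the Jacobian, in a single backward pass instead of
# building the full matrix.
def prob_i(input_points, params, i):
    return circuit(input_points, params)[i]

row_3 = qml.grad(prob_i, argnum=0)(input_points, params, 3)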

Another problem (probably due to my own implementation) is that when I optimize I don't get the results I'm expecting:
In my QC the first step is AmplitudeEmbedding(input_points). I treat the input as points on a line (the amplitude of state number <#state> is the location of that point). The output of the circuit is supposed to mimic a function by updating the gates' parameters, where each probability is an amplitude encoding of the field value at the corresponding point (similar to the input).
As I see it, if I differentiate each output probability of the QC with respect to the corresponding input point, that should be similar to taking spatial derivatives, right? Or did I misunderstand something?
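To be concrete, by "spatial derivative" I mean the classical finite-difference derivative along the line, something like this (a small reference sketch, not part of my circuit code):

# Classical reference for what I call a "spatial derivative":
# for values y_i sampled on a line, np.gradient approximates dy/dx.
import numpy as np

y = np.sin(np.linspace(0, np.pi, 32))
dy = np.gradient(y, np.pi / 31)  # finite-difference derivative with spacing pi/31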

Thanks a lot!

Code:
# Pennylane
import pennylane as qml
from pennylane import numpy as pnp
import matplotlib.pyplot as plt

seed = 42
n_qubits = 5
dev = qml.device('default.qubit')
range_qubits = range(n_qubits)
n_layers = 5
n_input_points = 2 ** n_qubits
steps = 50
stepsize = 0.01
opt = qml.AdamOptimizer(stepsize=stepsize)
n_parameters_per_layer = 2 * (n_qubits - 1)
n_total_parameters = n_layers * n_parameters_per_layer + n_qubits
params = pnp.random.rand(n_total_parameters, requires_grad=True)


def init_input_points():
    """
    initializing the collocation points.
    """
    x = pnp.linspace(0, pnp.pi, n_input_points)  # remove the requires_grad kwarg
    xnorm = x / pnp.linalg.norm(x)
    xnorm = xnorm + 1e-10  # to avoid division by zero
    xnorm = xnorm / pnp.linalg.norm(xnorm)
    plt.plot(xnorm)
    return xnorm, x


input_points, x = init_input_points()
input_points.requires_grad = True
x.requires_grad = False
cos = pnp.cos(x)  # I want the answer to be a sine, so its derivative is a cosine
cos_norm = cos / pnp.linalg.norm(cos)

def fixedStructureAnsatz(params):
    """
    create a hardware-efficient, fixed-structure parameterized ansatz
    on the qubits given by 'range_qubits'.

    Args:
        params (array): the angles for the quantum gates.
    """
    n_sublayers = 2
    j = 0
    # Base layer: apply RZ on all qubits
    for q in range_qubits:
        qml.RZ(params[j], q)
        j += 1

    ctrlq = []
    trgtq = []
    ctrlq.append(range_qubits[:-1:2])
    ctrlq.append(range_qubits[1:-1:2])
    trgtq.append(range_qubits[1::2])
    trgtq.append(range_qubits[2::2])

    for l in range(1, n_layers + 1):
        for s in range(n_sublayers):
            for i in range(len(ctrlq[s])):
                qml.CZ([ctrlq[s][i], trgtq[s][i]])
            # Edited this to make it work
            a = set(ctrlq[s])
            a |= set(trgtq[s])
            for q in list(a):
                qml.RY(params[j], q)
                j += 1


@qml.qnode(dev)
def circuit(input_points, params, max_diff=2):  # Added max_diff and input_points
    qml.AmplitudeEmbedding(features=input_points, wires=range_qubits,
                           normalize=True)  # AmplitudeEmbedding instead of StatePrep
    fixedStructureAnsatz(params)  # this is some HW-efficient ansatz preparation according to the parameters
    return qml.probs(wires=range_qubits)


def loss_fn(input_points, params):
    d_psi_dx_vec = qml.jacobian(circuit, argnum=0)(input_points, params)
    d_psi_dx_vec = pnp.diag(d_psi_dx_vec)
    pdeloss = d_psi_dx_vec - cos_norm
    loss = pnp.mean(pdeloss ** 2)  # MSE of the pdeloss using pnp instead of np
    return loss


def optimization(input_points, params):
    best_loss = pnp.inf
    for n in range(steps):  # Edited this to make it work
        [_, params], loss = opt.step_and_cost(loss_fn, input_points, params)
        print(f"Step {n}, Loss: {loss}, Params: {params}")
        if loss < best_loss:
            best_loss = loss
            best_params = params.copy()
    return best_params, best_loss


best_params, best_loss = optimization(input_points, params)

Hi @ZivChen,

I see several things here.

On one hand, max_diff should be set in the QNode decorator, not passed as an argument of the circuit function. So: @qml.qnode(dev, max_diff=2)

On the other hand, I'm not sure what you mean about AmplitudeEmbedding. The state produced by your circuit just after AmplitudeEmbedding is indeed the same state as the one given by your input vector. If you run the code below you'll see that they're the same.

# Amplitude Embedding
# PennyLane
import pennylane as qml
from pennylane import numpy as pnp
import matplotlib.pyplot as plt

n_qubits = 5
dev = qml.device('default.qubit')
range_qubits = range(n_qubits)
n_layers = 5
n_input_points = 2 ** n_qubits
n_parameters_per_layer = 2 * (n_qubits - 1)
n_total_parameters = n_layers * n_parameters_per_layer + n_qubits
params = pnp.random.rand(n_total_parameters, requires_grad=True)

fig,ax = plt.subplots()

def init_input_points():
    """
    initializing the collocation points.
    """
    x = pnp.linspace(0, pnp.pi, n_input_points)
    xnorm = x / pnp.linalg.norm(x)
    xnorm = xnorm + 1e-10  # to avoid division by zero
    xnorm = xnorm / pnp.linalg.norm(xnorm)
    
    ax.plot(xnorm,marker='o')
    return xnorm, x

input_points, x = init_input_points()
print(input_points)

@qml.qnode(dev, max_diff=2) # max_diff should be used in the QNode definition instead of the circuit
def circuit(input_points, params):
    qml.AmplitudeEmbedding(features=input_points, wires=range_qubits,
                           normalize=True)  # AmplitudeEmbedding instead of StatePrep
    #fixedStructureAnsatz(params)  # this is some HW-efficient ansatz preparation according to the parameters
    return qml.state()

state = circuit(input_points, params)
print(state)
ax.plot(state,marker='x')

print(state-input_points)

When you mention spatial derivatives I’m not super clear about what you’re trying to do.
I’ll give you an example and please let me know if this is what you’re trying to do:
Let’s say you have 10 points (X) equally spaced between 0 and 2pi. Your function is Y(X)=sin(X). So you want your quantum circuit to ‘learn’ to always produce an output cos(X) because this is dY/dX.

For this particular example you can construct your circuit using a workflow very similar to the one in this blog post. I’ll add a code example below.

However, there's a big disclaimer here: this code doesn't work for all functions. In fact, because of its very simple structure it will only work for very simple sine-like functions; it won't work for a straight line, for example. Quantum circuits behave like Fourier series in this way. So you'll need to test different cost functions, Ansätze, or embeddings to see if any of them work for other, more general functions.
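To illustrate the Fourier-series point, here is a small sketch (separate from the example code further below; the parameter values are arbitrary) showing that a single-qubit circuit with one encoding gate only contains frequencies 0 and 1:

# The expectation value of a single-qubit circuit with one data-encoding
# rotation is a degree-1 Fourier series in the input,
# f(x) = a0 + a1*cos(x) + b1*sin(x), so it cannot fit a straight line.
import numpy as np
import pennylane as qml

dev1 = qml.device('default.qubit', wires=1)

@qml.qnode(dev1)
def f(x, params):
    qml.RX(x, wires=0)  # one data-encoding gate -> one frequency
    qml.Rot(params[0], params[1], params[2], wires=0)
    return qml.expval(qml.PauliZ(0))

params = [0.4, 0.7, 0.1]
xs = np.linspace(0, 2 * np.pi, 64, endpoint=False)
ys = np.array([f(x, params) for x in xs])

coeffs = np.abs(np.fft.rfft(ys)) / len(ys)
print(np.round(coeffs[:5], 6))  # only frequencies 0 and 1 are non-negligible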

Some other algorithms (like the ones here) might be necessary to achieve this goal, and even then there are no guarantees. For most problems quantum algorithms perform worse than classical ones, so you'll have to test whether this is one of the good use cases or not.

See my example below of how to train a small circuit to learn a cosine function. Note that we don’t use the sine function at all in the code.

#Import your favourite libraries
import pennylane as qml
from pennylane import numpy as pnp
# We import matplotlib.pyplot to get a nice graph of the results
import matplotlib.pyplot as plt

# Our data consists of 10 datapoints equally spaced from 0 to 2pi
X = pnp.linspace(0, 2*pnp.pi, 10) # Training data

# Our function is the sine of the input
Y = pnp.sin(X)

# The labels for training follow a cosine function
dYdX = pnp.cos(X)

# Create the test data
# 10 test datapoints, shifted from the training data by 0.2
X_test = pnp.linspace(0.2, 2*pnp.pi+0.2, 10)
Y_test = pnp.sin(X_test) 
dYdX_test = pnp.cos(X_test) # The labels for the test datapoints

dev = qml.device('default.qubit', wires=1)

@qml.qnode(dev)
def quantum_circuit(datapoint, params):
    # Encoding
    qml.RX(datapoint, wires=0)
    # Ansatz
    qml.Rot(params[0],params[1],params[2], wires=0)
    return qml.expval(qml.PauliZ(wires=0))

qml.draw_mpl(quantum_circuit, style='pennylane')(0.1, [0.2, 0.3, 0.3]);

def loss_func(predictions, targets):
    # This is a postprocessing step. Here we use a least squares metric
    # based on the predictions of the quantum circuit and the labels
    # of the training data points.

    total_loss = 0
    for i in range(len(targets)):
        label = targets[i]
        prediction = predictions[i]
        loss = (prediction - label)**2
        total_loss += loss
    return total_loss

def cost_fn(params, X, Y):
    # For a single value of the parameters we run the quantum circuit 10 times, once per datapoint (x)
    predictions = [quantum_circuit(x, params)  for x in X]
    cost = loss_func(predictions, Y)
    return cost

# Choose an optimizer
opt = qml.GradientDescentOptimizer(stepsize=0.3)

# Make an initial guess for the parameters and set them as trainable
init_params = pnp.array([1.,1.,1.],requires_grad=True)

# Set your data and labels as non-trainable
X.requires_grad = False
dYdX.requires_grad = False

# Choose a number of steps
max_iter = 100

def optimization_routine(max_iter,opt,cost_fn,params,X,Y):
  # Iterate over a number of defined steps (max_iter)
  for i in range (max_iter):
      # At each step the parameters change to give a better cost
      [params,_,_], prev_cost = opt.step_and_cost(cost_fn,params,X,Y)
      if i%10 == 0:
          # We print the result after every 10 steps
          print(f'Step = {i} Cost = {cost_fn(params,X,Y)}')

  # Print the value of your optimized parameters
  print('Final parameters: ', params)
  return params

final_params = optimization_routine(max_iter,opt,cost_fn,init_params,X,dYdX)

# Find the output of your quantum circuit for your test data
test_predictions = [quantum_circuit(x_test,final_params) for x_test in X_test]

# Create a scatter plot showing the training and test data as well as the test predictions
def plot_results(X,dYdX,X_test,dYdX_test,test_predictions):
  fig = plt.figure()
  ax1 = fig.add_subplot(111)

  ax1.scatter(X, dYdX, s=30, c='b', marker="s", label='Train labels')
  ax1.scatter(X_test,dYdX_test, s=60, c='r', marker="o", label='Test labels')
  ax1.scatter(X_test,test_predictions, s=30, c='k', marker="x", label='Test predictions')
  plt.xlabel("Inputs")
  plt.ylabel("Labels")
  plt.title("QML results")

  plt.legend(loc='upper right');
  plt.show()

plot_results(X,dYdX,X_test,dYdX_test,test_predictions)

Hi @CatalinaAlbornoz,

Thanks for the thorough reply, I really appreciate it!

  1. AmplitudeEmbedding: the fact that the state after AmplitudeEmbedding matches the input vector is exactly what I want. But the circuit then continues with the hardware-efficient ansatz, which makes it variational.
  2. The example you gave is not exactly what I'm trying to do; what I want to implement here is an Ordinary Differential Equation (ODE) solver (which will later become a PDE solver if all goes well). This means the circuit tries to imitate the original function Y (a sine in this case), but it needs to learn it only from its derivative dYdX (boundary conditions are left out here for simplicity). This is where my deep interest in jacobian() comes from.
  3. I want to encode all of the points into the circuit together and solve them all at once. I agree with you that quantum circuits usually perform worse than classical ones on such tasks, but if this gives an exponential speed-up it could be better (at least worth checking). That is why I want to differentiate each probability with respect to the corresponding point in the input, and why I want to understand whether this kind of differentiation resembles taking spatial derivatives.

If we look at it classically:
We have a 1D function Y, a neural network (NN) and parameters (params). The input to the NN are the X coordinates of several points, and the output is U(X, params).
The loss function is:

def loss_fn(NN, params, X, dYdX, Y):  # not the exact function, but I think it makes clear what I want to achieve
    U = NN(params, X)
    dUdX = differentiate(U, X)  # this is pseudocode for the derivative of U with respect to X
    train_loss = dUdX - dYdX
    test_loss = U - Y
    return train_loss, test_loss

The optimization process will update the parameters according to train_loss, and I hope that test_loss will go down as well.

I hope it is clearer now what I'm trying to do; sorry for not making the first post more informative.

Thanks a lot in advance,
Ziv

Hi @ZivChen ,

If you're looking to solve ODEs then you'll indeed need something completely different. You'll need a ParametrizedHamiltonian in this case.

I would recommend looking at the demos here to see how this works, especially the demos on pulse programming and ODEgen. Maybe the ones on optimal control and implicit differentiation can help too.
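For a first idea of what a ParametrizedHamiltonian looks like, here is a minimal sketch along the lines of the pulse-programming demos (the envelope function and its parameter values are placeholders I made up; this interface requires JAX):

import pennylane as qml
import jax.numpy as jnp

def envelope(p, t):
    # time-dependent coefficient with the signature f(params, t)
    return p[0] * jnp.sin(p[1] * t)

# H(params, t) = 2.0 * X(0) + envelope(params, t) * Y(1)
H = 2.0 * qml.PauliX(0) + envelope * qml.PauliY(1)

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev, interface="jax")
def pulse_circuit(params):
    qml.evolve(H)(params, t=2.0)  # integrates the time-dependent Schrodinger equation
    return qml.expval(qml.PauliZ(0))

print(pulse_circuit([jnp.array([1.0, 2.0])]))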

Let me know if you have any questions after taking a look at these demos :smiley:


Thanks a lot! I’ll look into those links.
Anyway, I've simplified the code a lot and realized that the jacobian is really not what I'm looking for, as it doesn't give the result of a spatial derivative…

Oh that’s good to know @ZivChen !

Hi,

I looked into the links you sent, and they are not exactly what I meant.
To simplify everything: I want to see whether I can differentiate the circuit with respect to input_points and get the spatial derivative of the points as the output, which is what I expect to get.

To check that, I created the following circuit:

@qml.qnode(dev)
def circuit(input_points):
    qml.MottonenStatePreparation(state_vector=input_points, wires=range_qubits)
    return qml.probs(wires=range_qubits)

My input_points are 2^(#qubits) equispaced values on a line, normalized.
But when I calculate its derivative (using jacobian), I expect to get a constant; instead I get something that looks a lot like the linear case.

My questions are:

  1. Did I misunderstand what the derivative does?
  2. Is there a way to actually get the spatial derivative with respect to the input?

Thanks!

Here is some working code:
import pennylane as qml
from pennylane import numpy as pnp
import matplotlib.pyplot as plt
import pylab as pl

seed = 42
n_qubits = 5
dev = qml.device('default.qubit')
range_qubits = range(n_qubits)
n_collocation_points = 2 ** n_qubits

def normalize(x):
    return x / pnp.linalg.norm(x)

def init_collocation_points(): # original
    """
    initializing the collocation points.
    """
    x = pnp.linspace(0, pnp.pi, n_collocation_points)  # remove the requires_grad kwarg
    xnorm = normalize(x)
    xnorm = xnorm + 1e-10  # to avoid division by zero
    xnorm = normalize(xnorm)
    return xnorm, x


collocation_points, x = init_collocation_points()
xnorm = collocation_points.copy()
collocation_points.requires_grad = True
x.requires_grad = False
grad_x = pnp.gradient(xnorm) # the actual spatial derivative of the collocation points
grad_x = normalize(grad_x)

@qml.qnode(dev)
def circuit(collocation_points):
    qml.MottonenStatePreparation(state_vector=collocation_points, wires=range_qubits)
    prob = qml.probs(wires=range_qubits)
    return prob

def plot_grad(derivative):
    fig, axs = pl.subplots(1)
    axs.plot(derivative)
    axs.plot(grad_x)
    axs.plot(xnorm)
    axs.legend(['derivative', 'grad_x', 'collocation_points'])
    plt.show()

def grad_diff_check(collocation_points):
    d_psi_dx_mat = qml.jacobian(circuit)(collocation_points)
    d_psi_dx_vec = pnp.diag(d_psi_dx_mat) # I only want the derivative of each probability with respect to the corresponding x
    plot_grad(d_psi_dx_vec)

grad_diff_check(collocation_points)

Hi @ZivChen,

Thank you for these additional details. Mottonen isn't always differentiable, so it might cause issues here. Let me check and get back to you next week.
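In the meantime, one quick thing worth testing (just a sketch, not a verified fix, reusing your dev and range_qubits) is forcing a numerical gradient on the QNode, so the result doesn't depend on the analytic differentiability of the decomposition:

# Possible check: use finite differences for the QNode gradient instead of
# the default analytic method.
@qml.qnode(dev, diff_method="finite-diff")
def circuit(collocation_points):
    qml.MottonenStatePreparation(state_vector=collocation_points, wires=range_qubits)
    return qml.probs(wires=range_qubits)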


Hi @CatalinaAlbornoz,

Thanks for your quick reply!
I tried it with StatePrep and AmplitudeEmbedding but didn't get the correct results either.

I also tried to implement a "manual" derivative using the finite-difference method, where I shift each collocation_point (the name of my input points) by a small amount h and measure the specific observable for that point.
Using this method I got the same results as with the Mottonen derivative, so I think there is something deeper going on.
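This is roughly what my manual check looks like (a sketch of the idea, reusing circuit, xnorm, normalize and plot_grad from my code above):

# Central finite differences for d p_i / d x_i, one entry at a time:
# only the i-th probability is measured for the i-th input point.
def finite_difference_diag(x, h=1e-6):
    diag = pnp.zeros(len(x))
    for i in range(len(x)):
        x_plus = x.copy()
        x_minus = x.copy()
        x_plus[i] += h
        x_minus[i] -= h
        diag[i] = (circuit(x_plus)[i] - circuit(x_minus)[i]) / (2 * h)
    return diag

plot_grad(normalize(finite_difference_diag(xnorm)))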

Thanks a lot for your kind help,
Ziv

You’re right @ZivChen ,

There might be something else going on. I’m making some tests and will get back to you with any findings.


Hi @ZivChen ,

I’ll get someone else in the team to take a look at your question. We’ll be back soon.
