Using `argnum` in parameter shift rule

Hello,

I am trying to use parameter shift rule to obtain the gradient of a QNode with respect to one of the input parameters. To speed-up the process, I want to use argnum= parameter to take the gradient w.r.t just one parameter of my choice. I need to specify the parameter index in the argnum=. I am not able to understand how to count the index of the parameter, when there are two sets of inputs. In particular, I have two sets of parameters, weights and X as shown in the code below. I want to take the derivative w.r.to the parameters in X (it has just 1 element).

By trial and error, I found that the index goes like: the elements of weights and X and then weights again, as it is the order of gate layers in the circuit. Therefore, to obtain the gradient w.r.t the elements in X, I chose the argnum=3*x +1.

I tried to take the gradient w.r.t all parameters without specifying argnum and by specifying argnum to be 3*x+1 to match the results. While these results match for 2 qubits, for 3 qubits, these results do not match as shown below. When I tried to make similar tests for the argnum=1,2,3.. which correspond the elements W, they seem to match in both the cases as well.

I might be doing a simple arithmatic error. However, I would truly appreciate if someone gives an idea about how to count the index of the parameters when there are multiple sets of input parameters. Thanks in advance!

# Put code here
import pennylane as qml
from pennylane import numpy as np
from pennylane.templates import StronglyEntanglingLayers
dev = qml.device('default.qubit', wires=3)

r=3

np.random.seed(42)
def S(x):
    """Data encoding circuit block."""
    for w in range(r):
        qml.RX(x, wires=w)


def W(theta):
    """Trainable circuit block."""
    StronglyEntanglingLayers(theta, wires=range(r))


@qml.qnode(dev)
def circuit(w, x):
    W(w[0])
    S(x)
    W(w[1])
    return qml.expval(qml.PauliZ(wires=0))

X = np.linspace(0, 1, 2, requires_grad=True)
weights = 2 * np.pi * np.random.random(size=(2, 1, r, 3), requires_grad=True)

d1 = []
d2 = []
for i in X:
    d1.append(qml.gradients.param_shift(circuit, argnum=r*3+1)(weights, i)[1])
    d2.append(qml.gradients.param_shift(circuit)(weights, i)[1])
print(d1)
print(d2)

If you want help with diagnosing an error, please put the full error message below:

Output

[tensor(-0.1784049, requires_grad=True), tensor(-0.35629666, requires_grad=True)]
[tensor(-0.27362937, requires_grad=True), tensor(-0.43677208, requires_grad=True)]
And, finally, make sure to include the versions of your packages. Specifically, show us the output of qml.about().
Name: PennyLane
Version: 0.35.1
Summary: PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.
Home-page: GitHub - PennyLaneAI/pennylane: PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.
Author:
Author-email:
License: Apache License 2.0
Location: c:\users\hedge\miniconda3\envs\quantum_fourier_model\lib\site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, requests, rustworkx, scipy, semantic-version, toml, typing-extensions
Required-by: PennyLane_Lightning

Platform info: Windows-10-10.0.22631-SP0
Python version: 3.10.13
Numpy version: 1.26.4
Scipy version: 1.12.0
Installed devices:

  • default.clifford (PennyLane-0.35.1)
  • default.gaussian (PennyLane-0.35.1)
  • default.mixed (PennyLane-0.35.1)
  • default.qubit (PennyLane-0.35.1)
  • default.qubit.autograd (PennyLane-0.35.1)
  • default.qubit.jax (PennyLane-0.35.1)
  • default.qubit.legacy (PennyLane-0.35.1)
  • default.qubit.tf (PennyLane-0.35.1)
  • default.qubit.torch (PennyLane-0.35.1)
  • default.qutrit (PennyLane-0.35.1)
  • null.qubit (PennyLane-0.35.1)
  • lightning.qubit (PennyLane_Lightning-0.35.1)

Hi @Pratibha_Hegde , welcome to the Forum!

argnum counts the arguments of your circuit so argnum=0 indicates ‘w’ and argnum=1 indicates ‘x’. Note that argnum starts counting at zero.

I think your issue here was in how you were calculating the gradients. I’ve adapted your code below to show the output of calculating the gradient with respect to both arguments and each of them individually. I hope this helps!

import pennylane as qml
from pennylane import numpy as np
from pennylane.templates import StronglyEntanglingLayers
dev = qml.device('default.qubit', wires=3)

r=3 # num_qubits

np.random.seed(42)
def S(x):
    """Data encoding circuit block."""
    for w in range(r):
        qml.RX(x, wires=w)


def W(theta):
    """Trainable circuit block."""
    StronglyEntanglingLayers(theta, wires=range(r))

# You can add the differentiation method when you define your QNode
@qml.qnode(dev,diff_method='parameter-shift')
def circuit(w, x):
    W(w[0])
    S(x)
    W(w[1])
    return qml.expval(qml.PauliZ(wires=0))

X = np.linspace(0., 1., num=2, requires_grad=True)
weights = 2 * np.pi * np.random.random(size=(2, 1, r, 3), requires_grad=True)

# Draw your circuit
qml.draw_mpl(circuit,decimals=1,expansion_strategy='device')(weights,X[0])

# Calculate your gradient function with respect to all or just one argument
g = qml.grad(circuit) # Gradient wrt w and x
g0 = qml.grad(circuit,argnum=0) # Gradient wrt w
g1 = qml.grad(circuit,argnum=1) # Gradient wrt x


d = []
d0 = []
d1 = []

for i in X:
    d.append(g(weights, i))
    d0.append(g0(weights, i))
    d1.append(g1(weights, i))
  
print('d: ',d)
print('d0: ',d0)
print('d1: ',d1)

Dear Catalina,

Thank you very much! This helps a lot! Just a follow-up question. If I want to calculate the second order derivative w.r.t x, can I use the higher order gradients as shown in qml.gradients — PennyLane 0.36.0 documentation, i.e, when I use @qml.qnode, I add an additional argument “max_diff”

@qml.qnode(dev,diff_method='parameter-shift', max_diff=2)

and then define second order derivative with few additional lines as follows

#Defining second order derivative
g2 = qml.jacobian(qml.jacobian(circuit, argnum=1), argnum=1)
d2 = []
for i in X:
     d2.append(g2(weights, i))

Thank you once again!

Hi @Pratibha_Hegde ,

Yes I think you should be able to calculate second order derivatives in this way. With the code that you have you’re calculating a hessian (so a matrix). You just need to be careful if using this within an optimization loop because it can sometimes be tricky to make it work. So please let me know if you have any issues with this!

Hi @CatalinaAlbornoz, thank you once again and I apologize for getting back quite late on the topic. I am not using the second derivate for the optimization loop for now. I used qml.jacobian(qml.jacobian(circuit, argnum=1), argnum=1) to compute the second gradient and it works fine. I also tried to use qml.gradients.param_shift_hessian() function to compute the second derivatives directly w.r.t. x. Again, I am stuck in using argnum in this case. I resorted to count the arguments like before(starting from 0 this time) where I was counting the rotation angles parameters of the first layer of gates followed the by the second layer which encodes the variable x. Since x appears r times in the second layer, it seems I have to give all the argnum= between 3r to 4r. With this method, the hessian that I obtained is the same as qml.jacobian(qml.jacobian(circuit, argnum=1), argnum=1) method, and it is considerably faster and that is why I would like to use the qml.gradients.param_shift_hessian(). However, I wanted to make sure that counting argument number this way is correct (which was clearly wrong when using qml.grad()). I would be very grateful if you could share your thoughts and suggestions on this. I have attached the code that I am using below:

import pennylane as qml
from pennylane import numpy as np
from pennylane.templates import StronglyEntanglingLayers
import time

dev = qml.device('default.qubit', wires=3)

r=3 # num_qubits

np.random.seed(42)
def S(x):
    """Data encoding circuit block."""
    for w in range(r):
        qml.RX(x, wires=w)


def W(theta):
    """Trainable circuit block."""
    StronglyEntanglingLayers(theta, wires=range(r))

# You can add the differentiation method when you define your QNode
@qml.qnode(dev,diff_method='parameter-shift', max_diff=2)
def circuit(w, x):
    W(w[0])
    S(x)
    W(w[1])
    return qml.expval(qml.PauliZ(wires=0))

X = np.linspace(0., 1., num=10, requires_grad=True)
weights = 2 * np.pi * np.random.random(size=(2, 1, r, 3), requires_grad=True)

# Draw your circuit
qml.draw_mpl(circuit,decimals=1,expansion_strategy='device')(weights, X[0])

# Calculate your gradient function with respect to all or just one argument
g2 = qml.jacobian(qml.jacobian(circuit, argnum=1), argnum=1)

#Defining Hessian
def hessian(func, w, x):
    argnum = np.arange(3*r, 4*r, 1) #argument index corresponding to x
    hess = [qml.gradients.param_shift_hessian(func, argnum=argnum)(w, x_)[1] for x_ in x]
    return hess

d2 = []
h = []

t1 = time.time()
for i in X:
    d2.append(g2(weights, i))
t2 = time.time()
print('time to compute jacobian(jacobian):', t2-t1)

t3 = time.time()
h.append(hessian(circuit, weights, X))
t4 = time.time()
print('time to compute hessian:', t4-t3)

print('d2 using jacobian(jacobian):', d2)
print('h using hessian:', h)

Hi @Pratibha_Hegde ,

Your method does seem to work because it’s counting the parameters from the tape that runs the circuit under the hood. So you have 3 parameters per wire per layer of StronglyEntanglingLayers, meaning that parameters 0-8 correspond to this. Then you have 3 parameters corresponding to each of your RX gates, one per wire, and then you have more parameters for your other StronglyEntanglingLayers. So your choice of argnum = np.arange(3*r, 4*r, 1) does seem correct in this case.

I’m glad to see that you see such a big performance improvement!

Thank you very much @CatalinaAlbornoz ! This clarifies my questions and I can run my codes with confidence now. Have a great day and a great summer :slight_smile:

Same @Pratibha_Hegde !