TypeError while computing the gradient

Hi,

I’m facing an error with qml.GradientDescentOptimizer. The issue is with computing the gradient of my function, which results in the error

TypeError: must be real number, not ArrayBox

Note that I’m also computing gradients using qml.grad() within the function being optimized, and those inner gradients are computed without any issue.

Here is a snippet of the function that throws the error when I try to compute its gradient:

def FI(probe_state_params, encoded_phase=encoded_phase):
    mu_x, sigma_x = experiment(probe_state_params, encoded_phase)
    # derivative of mean
    grad_function = qml.grad(qnode1)
    mu_x2 = (grad_function(probe_state_params, encoded_phase)[1])
    # derivative of variance
    grad_function = qml.grad(qnode2)
    sigma_x2 = (grad_function(probe_state_params, encoded_phase)[1])
    FI = (mu_x2 ** 2)*(1/sigma_x) + 0.5*(((1/sigma_x)*sigma_x2) ** 2)
    print("FI =", FI)

    return FI

Basically, I’m trying to find the gradient of FI() in order to optimize it.
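
For context, the gradient of FI is requested like this (probe_state_params and encoded_phase are defined earlier in my notebook):

grad_function = qml.grad(FI)
grad = grad_function(probe_state_params, encoded_phase)  # raises TypeError: must be real number, not ArrayBox
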
Could you perhaps suggest what could be wrong and how I could rectify it?

Regards,
Kannan

Hey Kannan, thanks for the question!

I think we’ll need more context around the error. Can you post the code you have for the qnode?

Edit: can you also give us the full traceback of the error?


Hey @Chase_Roberts,

Thank you for the reply. The qnodes return the expectation value and variance of an observable for a single-mode squeezed state.

def qnode1(probe_state_params, encoded_phase):
    preparation(probe_state_params[0],probe_state_params[1],probe_state_params[2],probe_state_params[3])
    encoding(encoded_phase)
    detection(HD_angle)
    return qml.expval(qml.X(0))

The qnode returning the variance is defined similarly, as sketched below.
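
For reference, the variance qnode is along these lines (a sketch assuming the same preparation, encoding, and detection helpers; only the measurement changes):

def qnode2(probe_state_params, encoded_phase):
    preparation(probe_state_params[0], probe_state_params[1],
                probe_state_params[2], probe_state_params[3])
    encoding(encoded_phase)
    detection(HD_angle)
    return qml.var(qml.X(0))  # variance instead of the expectation value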

And the full error is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-0d143eebdd14> in <module>
      1 grad_function = qml.grad(FI)
----> 2 grad = (grad_function(probe_state_params, encoded_phase))

~\Anaconda3\lib\site-packages\pennylane\_grad.py in __call__(self, *args, **kwargs)
     94         """Evaluates the gradient function, and saves the function value
     95         calculated during the forward pass in :attr:`.forward`."""
---> 96         grad_value, ans = self._get_grad_fn(args)(*args, **kwargs)
     97         self._forward = ans
     98         return grad_value

~\Anaconda3\lib\site-packages\autograd\wrap_util.py in nary_f(*args, **kwargs)
     18             else:
     19                 x = tuple(args[i] for i in argnum)
---> 20             return unary_operator(unary_f, x, *nary_op_args, **nary_op_kwargs)
     21         return nary_f
     22     return nary_operator

~\Anaconda3\lib\site-packages\pennylane\_grad.py in _grad_with_forward(fun, x)
    111         difference being that it returns both the gradient *and* the forward pass
    112         value."""
--> 113         vjp, ans = _make_vjp(fun, x)
    114 
    115         if not vspace(ans).size == 1:

~\Anaconda3\lib\site-packages\autograd\core.py in make_vjp(fun, x)
      8 def make_vjp(fun, x):
      9     start_node = VJPNode.new_root()
---> 10     end_value, end_node =  trace(start_node, fun, x)
     11     if end_node is None:
     12         def vjp(g): return vspace(x).zeros()

~\Anaconda3\lib\site-packages\autograd\tracer.py in trace(start_node, fun, x)
      8     with trace_stack.new_trace() as t:
      9         start_box = new_box(x, t, start_node)
---> 10         end_box = fun(start_box)
     11         if isbox(end_box) and end_box._trace == start_box._trace:
     12             return end_box._value, end_box._node

~\Anaconda3\lib\site-packages\autograd\wrap_util.py in unary_f(x)
     13                 else:
     14                     subargs = subvals(args, zip(argnum, x))
---> 15                 return fun(*subargs, **kwargs)
     16             if isinstance(argnum, int):
     17                 x = args[argnum]

<ipython-input-24-b1eb58ab96f4> in FI(probe_state_params, encoded_phase)
      6     # derivative of mean
      7     grad_function = qml.grad(qnode1)
----> 8     grad = (grad_function(probe_state_params, encoded_phase)[1])
      9     mu_x2 = grad
     10 

~\Anaconda3\lib\site-packages\pennylane\_grad.py in __call__(self, *args, **kwargs)
     94         """Evaluates the gradient function, and saves the function value
     95         calculated during the forward pass in :attr:`.forward`."""
---> 96         grad_value, ans = self._get_grad_fn(args)(*args, **kwargs)
     97         self._forward = ans
     98         return grad_value

~\Anaconda3\lib\site-packages\autograd\wrap_util.py in nary_f(*args, **kwargs)
     18             else:
     19                 x = tuple(args[i] for i in argnum)
---> 20             return unary_operator(unary_f, x, *nary_op_args, **nary_op_kwargs)
     21         return nary_f
     22     return nary_operator

~\Anaconda3\lib\site-packages\pennylane\_grad.py in _grad_with_forward(fun, x)
    119             )
    120 
--> 121         grad_value = vjp(vspace(ans).ones())
    122         return grad_value, ans
    123 

~\Anaconda3\lib\site-packages\autograd\core.py in vjp(g)
     12         def vjp(g): return vspace(x).zeros()
     13     else:
---> 14         def vjp(g): return backward_pass(g, end_node)
     15     return vjp, end_value
     16 

~\Anaconda3\lib\site-packages\autograd\core.py in backward_pass(g, end_node)
     19     for node in toposort(end_node):
     20         outgrad = outgrads.pop(node)
---> 21         ingrads = node.vjp(outgrad[0])
     22         for parent, ingrad in zip(node.parents, ingrads):
     23             outgrads[parent] = add_outgrads(outgrads.get(parent), ingrad)

~\Anaconda3\lib\site-packages\autograd\core.py in <lambda>(g)
     65                     "VJP of {} wrt argnum 0 not defined".format(fun.__name__))
     66             vjp = vjpfun(ans, *args, **kwargs)
---> 67             return lambda g: (vjp(g),)
     68         elif L == 2:
     69             argnum_0, argnum_1 = argnums

~\Anaconda3\lib\site-packages\pennylane\tape\interfaces\autograd.py in gradient_product(g)
    200             # pass, so we do not need to re-unwrap the parameters.
    201             self.set_parameters(self._all_params_unwrapped, trainable_only=False)
--> 202             jac = self.jacobian(device, params=params, **self.jacobian_options)
    203             self.set_parameters(self._all_parameter_values, trainable_only=False)
    204 

~\Anaconda3\lib\site-packages\pennylane\tape\tapes\qubit_param_shift.py in jacobian(self, device, params, **options)
    122         self._append_evA_tape = True
    123         self._evA_result = None
--> 124         return super().jacobian(device, params, **options)
    125 
    126     def parameter_shift(self, idx, params, **options):

~\Anaconda3\lib\site-packages\pennylane\tape\tapes\jacobian_tape.py in jacobian(self, device, params, **options)
    563 
    564         # execute all tapes at once
--> 565         results = device.batch_execute(all_tapes)
    566 
    567         # post-process the results with the appropriate function to fill jacobian columns with gradients

~\Anaconda3\lib\site-packages\pennylane\_device.py in batch_execute(self, circuits)
    360             self.reset()
    361 
--> 362             res = self.execute(circuit.operations, circuit.observables)
    363             results.append(res)
    364 

~\Anaconda3\lib\site-packages\pennylane\_device.py in execute(self, queue, observables, parameters, **kwargs)
    288 
    289             for operation in queue:
--> 290                 self.apply(operation.name, operation.wires, operation.parameters)
    291 
    292             self.post_apply()

~\Anaconda3\lib\site-packages\pennylane\devices\default_gaussian.py in apply(self, operation, wires, par)
    738 
    739         # get the symplectic matrix
--> 740         S = self._operation_map[operation](*par)
    741 
    742         # expand the symplectic to act on the proper subsystem

~\Anaconda3\lib\site-packages\pennylane\devices\default_gaussian.py in squeezing(r, phi)
    205     cp = math.cos(phi)
    206     sp = math.sin(phi)
--> 207     ch = math.cosh(r)
    208     sh = math.sinh(r)
    209     return np.array([[ch - cp * sh, -sp * sh], [-sp * sh, ch + cp * sh]])

TypeError: must be real number, not ArrayBox

Thank you in advance for your time.

Regards,
Kannan

Aw, that definitely looks like a bug then. Let me see if I can find someone to fix it.


Hi @kannan_v :slightly_smiling_face:

Based on the description, I created the following example, which executes well:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device('default.gaussian', wires=1)

@qml.qnode(dev)
def circuit(x,y):
    qml.Squeezing(x,y, wires=[0])
    return qml.expval(qml.X(0))

opt = qml.GradientDescentOptimizer(0.3)

params = np.array([0.3, 0.4], requires_grad=True)

for i in range(100):
    params, _ = opt.step_and_cost(circuit, *params)

Could you perhaps provide a non-working example that reproduces the error?

Just a thought based on the error message: could it be that the r argument to Squeezing (the squeezing amount) is not meant to be a free parameter within the circuit, but was defined as one? :thinking:

Hello @antalszava,

Thank you for your reply.
Based on the example that you gave, the error can be reproduced as follows:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device('default.gaussian', wires=1)

@qml.qnode(dev)
def circuit(x,y):
    qml.Squeezing(x,y, wires=[0])
    return qml.var(qml.X(0))

def myfun(x,y):
    var = circuit(x,y)
    grad_fn = qml.grad(circuit)
    var_grad = grad_fn(x,y)
    return var

opt = qml.GradientDescentOptimizer(0.3)

params = np.array([0.3, 0.4], requires_grad=True)

for i in range(100):
    params, _ = opt.step_and_cost(myfun, *params)

As far as I can tell, the problem comes from the use of qml.grad() in the cost function, myfun().
If my diagnosis is right, the ArrayBox values produced by autograd / qml.grad() cannot be differentiated again, which is what the optimizer attempts to do.

Is there any way around this problem?

Regards,
Kannan

Hi @kannan_v,

As far as I can tell, the problem comes from the use of qml.grad() in the cost function, myfun().

Yes, this looks like the source of the problem. Currently, the default.gaussian device does not support second derivatives.

We’re working hard to add support for second derivatives; default.qubit already supports them, but we have not yet started upgrading default.gaussian. A quick sketch of the pattern that already works on default.qubit is shown below.
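
For illustration only, here is a minimal sketch (not taken from your code; the circuit and parameter values are made up) of the same pattern, a cost function that calls qml.grad internally, running on default.qubit with diff_method="backprop", which does support the second derivative:

import pennylane as qml
from pennylane import numpy as np

dev_q = qml.device("default.qubit", wires=1)

@qml.qnode(dev_q, diff_method="backprop")
def qubit_circuit(x, y):
    qml.RX(x, wires=0)
    qml.RY(y, wires=0)
    return qml.var(qml.PauliZ(0))

def cost(x, y):
    # the cost depends on a gradient of the circuit, so differentiating
    # the cost requires a second derivative of the circuit
    grad_fn = qml.grad(qubit_circuit)
    return grad_fn(x, y)[0]

x = np.array(0.3, requires_grad=True)
y = np.array(0.4, requires_grad=True)

# differentiating the cost works here; the analogous call on
# default.gaussian is what raises the ArrayBox TypeError above
print(qml.grad(cost)(x, y))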


One solution that might work in the meantime is to use the strawberryfields.tf device; this device does support second derivatives!

import pennylane as qml
import tensorflow as tf

dev = qml.device('strawberryfields.tf', wires=1, cutoff_dim=12)


@qml.qnode(dev, interface="tf")
def circuit(x, y):
    qml.Squeezing(x, y, wires=[0])
    return qml.var(qml.X(0))


def myfun(x, y):
    with tf.GradientTape() as tape:
        var = circuit(x, y)

    var_grad = tape.gradient(var, [x, y])

    # this cost function depends on both the variance
    # and the gradient of the variance
    return tf.reduce_sum(var_grad) - tf.cast(var, tf.float64)


x = tf.Variable(0.3, dtype=tf.float64)
y = tf.Variable(0.3, dtype=tf.float64)

opt = tf.keras.optimizers.SGD(learning_rate=0.1)

for i in range(100):
    with tf.GradientTape() as tape:
        loss = myfun(x, y)

    gradients = tape.gradient(loss, [x, y])
    opt.apply_gradients(zip(gradients, [x, y]))

    print(f"loss: {loss}")

Note that:

  • Since the device is written using TensorFlow, we must also write the optimization using TensorFlow.
  • This device is a Fock-based simulator, so we need to provide a cutoff dimension.

Hello,

Thank you for the solution. I will try it out.

In the meantime, I think I have figured out a workaround for the problem by computing the gradient manually using the parameter-shift rule instead of qml.grad():

def myfun(x,y):
    var = circuit(x,y)
    shift = 0.1
    plus = circuit(x + shift, y)
    minus = circuit(x - shift, y)
    grad = (0.5 * (plus - minus))/np.sinh(shift)
    var_grad = grad
    
    return var_grad
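
To check the shift-rule values, I compared them against the first derivative from qml.grad, which works fine on default.gaussian (only the second derivative was the problem). A quick sketch, with arbitrary parameter values:

x = np.array(0.3, requires_grad=True)
y = np.array(0.4, requires_grad=True)

manual = myfun(x, y)                      # parameter-shift estimate of d var / d x
reference = qml.grad(circuit)(x, y)[0]    # first derivative via qml.grad, for comparison
print(manual, reference)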

I verified that the gradient values are correct, and the optimizer now runs without errors. I’m not sure whether the second derivative taken through this workaround is right, though. Do you think this approach is technically correct?

Regards,
Kannan

I verified that the gradient values are correct, and the optimizer now runs without errors. I’m not sure whether the second derivative taken through this workaround is right, though. Do you think this approach is technically correct?

This is a very nice solution! It does look correct to me.


Hello @josh,

I was trying out the strawberryfields.tf device and the sample code you provided, but it throws an error:

ValueError: Tensor conversion requested dtype float64 for Tensor with dtype float32: <tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.5772516], dtype=float32)>

I haven’t used this simulator or TensorFlow before. Could you perhaps help me with this?

Thank you in advance.

Regards,
Kannan

Are you initializing anything with dtype=float32? Do you know which variable is throwing this error?

No, I’m not. The minimal code that throws the error is:

import pennylane as qml
import tensorflow as tf

dev = qml.device('strawberryfields.tf', wires=1, cutoff_dim=5)

x = tf.Variable(0.3, dtype=tf.float64)
y = tf.Variable(0.3, dtype=tf.float64)

@qml.qnode(dev, interface="tf")
def circuit(x, y):
    qml.Squeezing(x, y, wires=[0])
    return qml.var(qml.X(0))

var = circuit(x, y)

Hey @kannan_v, I can’t seem to reproduce your error.

Could it be that you need to upgrade your TensorFlow version? Locally, I am able to get the script running using TensorFlow 2.3, PennyLane 0.14.1, and Strawberry Fields 0.17.0.

That’s odd. I seem to be running TensorFlow 2.4.1 and the same versions of PennyLane and Strawberry Fields.

@kannan_v I upgraded my local version of TensorFlow to 2.4.1, but unfortunately I still cannot reproduce the issue :thinking:

@josh Here is the entire error log if that helps:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-22-246ad8e4d46a> in <module>
     12     return qml.var(qml.X(0))
     13 
---> 14 var = circuit(x, y)

~\Anaconda3\lib\site-packages\pennylane\tape\qnode.py in __call__(self, *args, **kwargs)
    531 
    532         # execute the tape
--> 533         res = self.qtape.execute(device=self.device)
    534 
    535         # FIX: If the qnode swapped the device, increase the num_execution value on the original device.

~\Anaconda3\lib\site-packages\pennylane\tape\tapes\tape.py in execute(self, device, params)
   1068             params = self.get_parameters()
   1069 
-> 1070         return self._execute(params, device=device)
   1071 
   1072     def execute_device(self, params, device):

~\Anaconda3\lib\site-packages\tensorflow\python\ops\custom_gradient.py in __call__(self, *a, **k)
    259 
    260   def __call__(self, *a, **k):
--> 261     return self._d(self._f, a, k)
    262 
    263 

~\Anaconda3\lib\site-packages\tensorflow\python\ops\custom_gradient.py in decorated(wrapped, args, kwargs)
    213 
    214     if context.executing_eagerly():
--> 215       return _eager_mode_decorator(wrapped, args, kwargs)
    216     else:
    217       return _graph_mode_decorator(wrapped, args, kwargs)

~\Anaconda3\lib\site-packages\tensorflow\python\ops\custom_gradient.py in _eager_mode_decorator(f, args, kwargs)
    436   """Implement custom gradient decorator for eager mode."""
    437   with tape_lib.VariableWatcher() as variable_watcher:
--> 438     result, grad_fn = f(*args, **kwargs)
    439   args = nest.flatten(args)
    440   all_inputs = list(args) + list(kwargs.values())

~\Anaconda3\lib\site-packages\pennylane\tape\interfaces\tf.py in _execute(self, params, **input_kwargs)
    166             res = np.hstack(res)
    167 
--> 168         return tf.convert_to_tensor(res, dtype=self.dtype), grad
    169 
    170     @classmethod

~\Anaconda3\lib\site-packages\tensorflow\python\util\dispatch.py in wrapper(*args, **kwargs)
    199     """Call target, and fall back on dispatchers if there is a TypeError."""
    200     try:
--> 201       return target(*args, **kwargs)
    202     except (TypeError, ValueError):
    203       # Note: convert_to_eager_tensor currently raises a ValueError, not a

~\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor_v2_with_dispatch(value, dtype, dtype_hint, name)
   1403   """
   1404   return convert_to_tensor_v2(
-> 1405       value, dtype=dtype, dtype_hint=dtype_hint, name=name)
   1406 
   1407 

~\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor_v2(value, dtype, dtype_hint, name)
   1413       name=name,
   1414       preferred_dtype=dtype_hint,
-> 1415       as_ref=False)
   1416 
   1417 

~\Anaconda3\lib\site-packages\tensorflow\python\profiler\trace.py in wrapped(*args, **kwargs)
    161         with Trace(trace_name, **trace_kwargs):
    162           return func(*args, **kwargs)
--> 163       return func(*args, **kwargs)
    164 
    165     return wrapped

~\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, dtype_hint, ctx, accepted_result_types)
   1507       raise ValueError(
   1508           "Tensor conversion requested dtype %s for Tensor with dtype %s: %r" %
-> 1509           (dtype.name, value.dtype.name, value))
   1510     return value
   1511 

ValueError: Tensor conversion requested dtype float64 for Tensor with dtype float32: <tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.5772516], dtype=float32)> 

The code used is the same as above.

Thanks @kannan_v! Could you try updating the QNode decorator to explicitly request backpropagation mode?

@qml.qnode(dev, interface="tf", diff_method="backprop") 

@josh Yes, that did it! The error was resolved. Thank you very much for the support :slight_smile:

Regards,
Kannan


Glad to hear it @kannan_v!

Hello again :slight_smile:

I wanted to know whether I can use the parameter-shift rule to compute gradients on the X8 chip.

I read another thread (Using PyTorch Gradients - #5 by josh) suggesting that it’s possible.

I understand that the gates are Gaussian, but given that the measurement operation, i.e. photon counting, is non-Gaussian, can I still use the parameter-shift rule to find the gradient with respect to, say, a Rotation gate?

Regards,
Kannan