Training with QNG optimizer on circuit with data argument

Hi PennyLane community!

I recently implemented a circuit for a Quantum Autoencoder.
Using a classical optimizer like Adam, everything works great, but I would like to use Quantum Natural Gradient Descent, and I am running into some problems trying to implement it.

My circuit is a QNode, which takes the trainable parameters and some input data as arguments and returns an expectation value:

@qml.qnode(dev1)
def circuit(params, data):
    # data encoding and parametrized circuit ...
    return qml.expval(qml.PauliZ(TOTAL_QBITS-1))

To train the model I defined a cost function for a single data sample:

def cost(params, single_sample):
    return (1 - circuit(params, single_sample)) ** 2

Now I would like to iterate over all training samples and optimize the cost with the QNGOptimizer. All the examples I found had the data for the QNode hard-coded into the circuit, but of course I want to optimize over a large dataset, and I can't seem to make the QNGOptimizer work with a QNode that has two arguments.

opt = qml.QNGOptimizer(learning_rate)

for it in range(epochs):
    for j, sample in enumerate(x_train):
        metric_fn = qml.metric_tensor(circuit, approx="block-diag")
        params, _ = opt.step(cost, params, sample,  metric_tensor_fn=metric_fn)

    loss = cost(params)

    print(f"Epoch: {it} | Loss: {loss} |")

I checked the output of the metric function metric_fn for some arbitrary weights and a data sample, and it returns a tuple of two metric tensors: one for the weights and one for the input data. This tuple can't be used by opt.step for the update.
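For reference, this is roughly what that check looked like (the weight values are arbitrary, and sample is one entry of x_train):

metric_fn = qml.metric_tensor(circuit, approx="block-diag")
# With both arguments treated as trainable, one metric tensor is returned per argument:
mt_params, mt_data = metric_fn(np.array([[1.0, 2.0, 3.0, 4.0]]), sample)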

Any suggestions how I can fix this?

A second question would be if it is possible to extend this training to batches.

Thanks!

Greetings
Tom

Hey @ToMago! Looking at the documentation for the QNG optimizer, it’s tough to say why your code isn’t working without seeing the exact error readout. Can you provide that?

Thanks for your answer!
Of course I can post the full error trace; however, I found it to be a bit misleading:

Error trace
TypeError                                 Traceback (most recent call last)
Input In [72], in <cell line: 3>()
      6     metric_fn = qml.metric_tensor(circuit, approx="block-diag")
      7     print(metric_fn(np.array([[1,2,3,4]]),sample))
----> 8     params, _ = opt.step(cost_sample, params, sample, metric_tensor_fn=metric_fn)
      9     print(j, end="\r")
     11 loss = cost(params)

File ~/.conda/envs/tfq/lib/python3.9/site-packages/pennylane/optimize/qng.py:269, in QNGOptimizer.step(self, qnode, grad_fn, recompute_tensor, metric_tensor_fn, *args, **kwargs)
    245 def step(
    246     self, qnode, *args, grad_fn=None, recompute_tensor=True, metric_tensor_fn=None, **kwargs
    247 ):
    248     """Update the parameter array :math:`x` with one step of the optimizer.
    249 
    250     Args:
   (...)
    267         array: the new variable values :math:`x^{(t+1)}`
    268     """
--> 269     new_args, _ = self.step_and_cost(
    270         qnode,
    271         *args,
    272         grad_fn=grad_fn,
    273         recompute_tensor=recompute_tensor,
    274         metric_tensor_fn=metric_tensor_fn,
    275         **kwargs,
    276     )
    277     return new_args

File ~/.conda/envs/tfq/lib/python3.9/site-packages/pennylane/optimize/qng.py:212, in QNGOptimizer.step_and_cost(self, qnode, grad_fn, recompute_tensor, metric_tensor_fn, *args, **kwargs)
    210 shape = qml.math.shape(_metric_tensor)
    211 size = qml.math.prod(shape[: len(shape) // 2])
--> 212 self.metric_tensor = qml.math.reshape(_metric_tensor, (size, size))
    213 # Add regularization
    214 self.metric_tensor = self.metric_tensor + self.lam * qml.math.eye(
    215     size, like=_metric_tensor
    216 )

File ~/.conda/envs/tfq/lib/python3.9/site-packages/autoray/autoray.py:85, in do(fn, like, *args, **kwargs)
     82 else:
     83     backend = infer_backend(like)
---> 85 return get_lib_fn(backend, fn)(*args, **kwargs)

File <__array_function__ internals>:180, in reshape(*args, **kwargs)

File ~/.conda/envs/tfq/lib/python3.9/site-packages/numpy/core/fromnumeric.py:298, in reshape(a, newshape, order)
    198 @array_function_dispatch(_reshape_dispatcher)
    199 def reshape(a, newshape, order='C'):
    200     """
    201     Gives a new shape to an array without changing its data.
    202 
   (...)
    296            [5, 6]])
    297     """
--> 298     return _wrapfunc(a, 'reshape', newshape, order=order)

File ~/.conda/envs/tfq/lib/python3.9/site-packages/numpy/core/fromnumeric.py:54, in _wrapfunc(obj, method, *args, **kwds)
     52 bound = getattr(obj, method, None)
     53 if bound is None:
---> 54     return _wrapit(obj, method, *args, **kwds)
     56 try:
     57     return bound(*args, **kwds)

File ~/.conda/envs/tfq/lib/python3.9/site-packages/numpy/core/fromnumeric.py:43, in _wrapit(obj, method, *args, **kwds)
     41 except AttributeError:
     42     wrap = None
---> 43 result = getattr(asarray(obj), method)(*args, **kwds)
     44 if wrap:
     45     if not isinstance(result, mu.ndarray):

TypeError: 'numpy.float64' object cannot be interpreted as an integer

In this case the error is thrown because metric_fn returns a tuple of two metric tensors, one for the weights and one for the data, and of course the reshaping in the QNGOptimizer class does not work on a tuple.

I tried to work around this but never got anywhere, so I suppose the QNG optimizer only works on functions with a single argument?

But how would I use it in a real application with a larger dataset and a custom loss function?

Hi @ToMago! When you create the data, have you tried setting requires_grad=False?

data = np.array(data, requires_grad=False)

Thanks @josh!
This does indeed fix the first issue: when I call metric_fn with
data = np.array(data, requires_grad=False)
it only gives one metric tensor and not a tuple. Yay!

However I still get an error:

Error Trace
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [74], in <cell line: 3>()
      6     metric_fn = qml.metric_tensor(circuit, approx="block-diag")
      7     print(metric_fn(np.array([[1,2,3,4]]),np.array(sample,requires_grad=False)))
----> 8     params, _ = opt.step(cost_sample, params, np.array(sample,requires_grad=False), metric_tensor_fn=metric_fn)
      9     print(j, end="\r")
     11 loss = cost(params)

File ~/.conda/envs/tfq/lib/python3.9/site-packages/pennylane/optimize/qng.py:269, in QNGOptimizer.step(self, qnode, grad_fn, recompute_tensor, metric_tensor_fn, *args, **kwargs)
    245 def step(
    246     self, qnode, *args, grad_fn=None, recompute_tensor=True, metric_tensor_fn=None, **kwargs
    247 ):
    248     """Update the parameter array :math:`x` with one step of the optimizer.
    249 
    250     Args:
   (...)
    267         array: the new variable values :math:`x^{(t+1)}`
    268     """
--> 269     new_args, _ = self.step_and_cost(
    270         qnode,
    271         *args,
    272         grad_fn=grad_fn,
    273         recompute_tensor=recompute_tensor,
    274         metric_tensor_fn=metric_tensor_fn,
    275         **kwargs,
    276     )
    277     return new_args

File ~/.conda/envs/tfq/lib/python3.9/site-packages/pennylane/optimize/qng.py:219, in QNGOptimizer.step_and_cost(self, qnode, grad_fn, recompute_tensor, metric_tensor_fn, *args, **kwargs)
    214     self.metric_tensor = self.metric_tensor + self.lam * qml.math.eye(
    215         size, like=_metric_tensor
    216     )
    218 g, forward = self.compute_grad(qnode, args, kwargs, grad_fn=grad_fn)
--> 219 new_args = np.array(self.apply_grad(g, args), requires_grad=True)
    221 if forward is None:
    222     forward = qnode(*args, **kwargs)

File ~/.conda/envs/tfq/lib/python3.9/site-packages/pennylane/optimize/qng.py:293, in QNGOptimizer.apply_grad(self, grad, args)
    291 grad_flat = np.array(list(_flatten(grad)))
    292 x_flat = np.array(list(_flatten(args)))
--> 293 x_new_flat = x_flat - self.stepsize * np.linalg.solve(self.metric_tensor, grad_flat)
    294 return unflatten(x_new_flat, args)

File ~/.conda/envs/tfq/lib/python3.9/site-packages/pennylane/numpy/tensor.py:155, in tensor.__array_ufunc__(self, ufunc, method, *inputs, **kwargs)
    151 args = [i.unwrap() if hasattr(i, "unwrap") else i for i in inputs]
    153 # call the ndarray.__array_ufunc__ method to compute the result
    154 # of the vectorized ufunc
--> 155 res = super().__array_ufunc__(ufunc, method, *args, **kwargs)
    157 if ufunc.nout == 1:
    158     res = (res,)

ValueError: operands could not be broadcast together with shapes (24,) (20,) 

I only changed

params, _ = opt.step(cost, params, np.array(sample, requires_grad=False),  metric_tensor_fn=metric_fn)

from my above code.

I guess the optimizer still treats the input data as parameters and calculates gradients for them, so the dimensions no longer match the metric tensor?

For clarity I should mention that the shape of the parameters is (5, 4) and the data points are 4-dimensional, leading to the dimension mismatch of 20 and 24.
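To illustrate how those two sizes arise (hypothetical arrays with the shapes described above, and my reading of the traceback):

params = np.random.random((5, 4), requires_grad=True)   # 5 * 4 = 20 trainable values
sample = np.random.random(4, requires_grad=False)        # 4 data values

# Inside apply_grad, all positional arguments are flattened together:
# 20 parameter values + 4 data values = 24 entries in x_flat,
# while the natural-gradient update built from the 20 x 20 metric tensor
# has only 20 entries -- hence the (24,) vs (20,) broadcast error.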


Glad @josh was able to help you! Is it possible that you need to change the problematic line to

params = opt.step(cost, params, np.array(sample, requires_grad=False), metric_tensor_fn=metric_fn)

The function opt.step returns an array (your new parameters), not a tuple.
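For a cost function and metric-tensor function that accept only the trainable parameters, the two return signatures look like this (an illustrative sketch, not your exact code):

new_params = opt.step(cost_fn, params, metric_tensor_fn=metric_fn)
# step returns only the updated parameters

new_params, prev_cost = opt.step_and_cost(cost_fn, params, metric_tensor_fn=metric_fn)
# step_and_cost additionally returns the cost evaluated before the step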

Yes, that is probably true now that the data has no grad; however, that does not resolve the error with the two different dimensions in apply_grad.

Hi @ToMago, I believe I have a solution, which entails creating lambda functions so that cost and metric_fn only accept params as input:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=3)

@qml.qnode(dev)
def circuit(params, data):
    qml.AngleEmbedding(data, wires=[0, 1, 2])
    qml.StronglyEntanglingLayers(params, wires=[0, 1, 2])
    return qml.expval(qml.PauliZ(2))

data = np.random.random([3], requires_grad=False)
params = np.random.random(qml.StronglyEntanglingLayers.shape(3, 3), requires_grad=True)

def cost(params, single_sample):
    return (1 - circuit(params, single_sample)) ** 2

opt = qml.QNGOptimizer()

for it in range(10):
    cost_fn = lambda p: cost(p, data)
    metric_fn = lambda p: qml.metric_tensor(circuit, approx="block-diag")(p, data)

    params, loss = opt.step_and_cost(cost_fn, params,  metric_tensor_fn=metric_fn)

    print(f"Epoch: {it} | Loss: {loss} |")
Epoch: 0 | Loss: 0.5582732050743116 |
Epoch: 1 | Loss: 0.41371200375696354 |
Epoch: 2 | Loss: 0.3163450001396077 |
Epoch: 3 | Loss: 0.2494155603676695 |
Epoch: 4 | Loss: 0.2020055226900949 |
Epoch: 5 | Loss: 0.16737305015180642 |
Epoch: 6 | Loss: 0.1413480948325721 |
Epoch: 7 | Loss: 0.12129520018671695 |
Epoch: 8 | Loss: 0.10550239221136215 |
Epoch: 9 | Loss: 0.09282592043979121 |

Let me know if that ends up working for you! [Note: for simplicity, I slightly modified your example to remove the minibatches]
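To restore the per-sample loop from the original question, the lambdas can simply capture the current sample (a sketch along the same lines, assuming x_train, epochs, and params are defined as before):

for it in range(epochs):
    for sample in x_train:
        sample = np.array(sample, requires_grad=False)
        cost_fn = lambda p: cost(p, sample)
        metric_fn = lambda p: qml.metric_tensor(circuit, approx="block-diag")(p, sample)

        params, loss = opt.step_and_cost(cost_fn, params, metric_tensor_fn=metric_fn)

    print(f"Epoch: {it} | Loss: {loss} |")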


Yes, this way it works!
Thanks so much, this is super helpful!

So would it be possible to implement a version with minibatches and the QNG optimizer?

I’m not sure how this would be done, since the Fubini-Study metric can only be calculated for a single sample of input data. If I use the QNG update rule
$$w := w - \eta\, g^{-1} \nabla_w \mathcal{L}_i(w),$$
where $\mathcal{L}_i$ is the loss for a single sample, and replace it with $\mathcal{L}_i \rightarrow \frac{1}{n}\sum_{i=1}^{n} \mathcal{L}_i$, how do I update the weights?

Sorry, I’m not too familiar with natural gradient descent; do I also average the metric tensors for the different data points?

@josh Correct me if I’m wrong here, but we’d just need to modify the cost function to accommodate batches as follows:

batch_size = 10
data = [np.random.random([3], requires_grad=False) for _ in range(batch_size)]
params = np.random.random(qml.StronglyEntanglingLayers.shape(3, 3), requires_grad=True)

def cost(params, batch):
    return np.sum([(1 - circuit(params, single_sample)) ** 2 for single_sample in batch]) / batch_size

(this would be for a “mean” reduction method of the cost function). Then, you loop over all of your batches, perform the parameter update in the same way, and that constitutes one “iteration”.
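To make that loop concrete, here is a minimal sketch of a batched training loop. It assumes x_train, epochs, params, circuit, and the batched cost above are defined, and it averages the metric tensor over the samples in each batch; that averaging is my own assumption, since the thread does not settle how the metric tensor should be handled for a batch.

batch_size = 10
batches = [x_train[i:i + batch_size] for i in range(0, len(x_train), batch_size)]

opt = qml.QNGOptimizer()

for it in range(epochs):
    for batch in batches:
        # mark the data as non-trainable, as discussed above
        batch = [np.array(s, requires_grad=False) for s in batch]

        # cost and metric tensor as functions of the parameters only
        cost_fn = lambda p: cost(p, batch)
        metric_fn = lambda p: sum(
            qml.metric_tensor(circuit, approx="block-diag")(p, s) for s in batch
        ) / len(batch)

        params, loss = opt.step_and_cost(cost_fn, params, metric_tensor_fn=metric_fn)

    print(f"Epoch: {it} | Loss: {loss} |")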