Hi, everyone!

I recently learned about the adjoint differentiation method and its superior performance on the lightning.qubit simulator.

I’m working with hybrid models (using keras and pennylane), and it would be great to have such performance increases, to iterate faster over my experiments.

The basic code I’m using is the following (this is adapted to the iris dataset, so 4 features and 3 categorical levels for the target):

```
n_qubits = X_train.shape[1]
n_classes = y_train.nunique()
n_var_layers = 2
# ================================================
dev = qml.device("default.qubit", wires=n_qubits)
@qml.qnode(dev)
def vqc_layer(inputs, weights):
qml.AngleEmbedding(inputs, wires=range(n_qubits))
qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]
weight_shapes = {"weights": (n_var_layers, n_qubits, 3)}
# ================================================
ql = qml.qnn.KerasLayer(vqc_layer, weight_shapes, output_dim=n_qubits)
cl = tf.keras.layers.Dense(n_classes, activation='softmax')
qcnn = tf.keras.models.Sequential([ql, cl])
# ================================================
opt = tf.keras.optimizers.Adam(learning_rate=0.01)
qcnn.compile(loss='categorical_crossentropy', optimizer=opt, metrics=["accuracy"])
# ===========================================
history = qcnn.fit(X_train, y_train_tf,
epochs=10, batch_size=10,
validation_data=(X_val, y_val_tf))
```

The code above works perfectly, but it takes ~2 minutes to run the 10 epochs – and that’s entirely because of the vqc_layer: for a similar but fully classical architecture, the training is almost instantaneous. (And an analogous behavior is observed for larger datasets, although, in these cases, the times for training the hybrid models easily reach hours).

Given that, I had high hopes that adjoint differentiation would help to reduce the training times, so, at first I tried to simply change the device and decorator of the qnode to this (keeping everything else the same):

```
dev = qml.device("lightning.qubit", wires=n_qubits)
@qml.qnode(dev, diff_method="adjoint")
def vqc_layer(inputs, weights):
qml.AngleEmbedding(inputs, wires=range(n_qubits))
qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]
```

But then I got the following error:

```
InvalidArgumentError: cannot compute Mul as input #1(zero-based) was expected to be a float tensor but is a double tensor [Op:Mul]
```

And I think I know the reason: as far as I know, the adjoint differentiation only works for a scalar output of the qnode. But, as you can see, I’m measuring all qubits and returning a list of expectation values, which is the reason why adjoint differentiation is failing, as far as I can tell.

I tried to make a qnode which returns a single measurement (only to see if it works), but it didn’t either (although in this case, the error was due to a dimensionality mismatch between layers). Anyway though, this wouldn’t be the best solution, since it really changes the overall model architecture.

Given all this context, my question is the following: **is there a way to use adjoint differentiation to speedup simulations of hybrid models whose quantum layers return a list of measurements?**

In case it’s helpful: I’m using pennylane version 0.22.1 and tensorflow version 2.8.0.

If you need any further details, please let me know.

Thank you very much!