Hybrid Network not differentiating

Hey @Daniel63656!

Could you point me in the right direction on how I could do that?

Regarding the suggestion of using MultiControlledX or PauliRot, the idea would be just to swap out the lines

U = np.array([[m.cos(phi), m.sin(phi)], [-m.sin(phi), m.cos(phi)]])
qml.ControlledQubitUnitary(U, control_wires=range(num_con), wires=num_con)

with something like

qml.MultiControlledX(wires=targ_wire, control_wires=range(num_con))
qml.RY(phi, wires=targ_wire)
qml.MultiControlledX(wires=targ_wire, control_wires=range(num_con))

This approach may not be exactly what you have above, but it is similar.

I make this suggestion because it seems like your issues involve trying to manually construct a unitary and then using qml.ControlledQubitUnitary or qml.DiagonalQubitUnitary. Although this is something we’re improving on, it’s a use case that quite often breaks differentiability. Instead, it’s better to use gates that have a well-defined input parameter, such as qml.RX and qml.Rot. Finding derivatives with respect to the parameters of such gates is a much more established use case in PennyLane.
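As a minimal illustration of the difference (a toy circuit, not your model; the rotation here is not claimed to match your U exactly), a parametrized gate differentiates out of the box:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def circuit(phi):
    # the angle enters a gate with a well-defined parameter,
    # so PennyLane knows its derivative rule
    qml.RY(phi, wires=0)
    return qml.expval(qml.PauliZ(0))

phi = np.array(0.3, requires_grad=True)
print(qml.grad(circuit)(phi))  # finite, well-defined gradient, no manual unitary needed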

Also, another piece of advice when debugging errors in your model is to try to break it down into the elementary nodes/layers and see if the gradient is accessible for each. For example, instead of training the whole hybrid_model, it’s easier to focus on differentiating diagonal_embedding.
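For instance, a quick stand-alone check in the Torch interface might look like this (sub_circuit is a hypothetical stand-in for your diagonal_embedding):

import pennylane as qml
import torch

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev, interface="torch")
def sub_circuit(phi):
    qml.RY(phi[0], wires=0)
    qml.RY(phi[1], wires=1)
    return qml.expval(qml.PauliZ(0))

phi = torch.tensor([0.1, 0.2], requires_grad=True)
sub_circuit(phi).backward()
print(phi.grad)  # should be finite and not None; otherwise this layer is the culprit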

qml.MultiControlledX(wires=targ_wire, control_wires=range(num_con))
qml.RY(phi, wires=targ_wire)
qml.MultiControlledX(wires=targ_wire, control_wires=range(num_con))

I doubt that this is equivalent to the previous formulation. I added the control_values and it doesn’t produce the same result.

qml.MultiControlledX(wires=num_con, control_wires=range(num_con), control_values=binary_index)
qml.RY(angles[idx], wires=num_con)
qml.MultiControlledX(wires=num_con, control_wires=range(num_con), control_values=binary_index)

Unfortunately, using this gate I can’t even print the circuit.

Hey @Daniel63656!

I doubt that this is equivalent to the previous formulation. I added the control_values and it doesn’t produce the same result.

Agreed. It looks like you are trying to do a multi-controlled-Y gate. PennyLane supports qml.CRY for control on an additional wire, but beyond that we need to think a bit more carefully. One approach to exactly performing a multi-controlled-Y gate is provided in Sec. 7 of Barenco et al. (e.g., Lemma 7.9). You should be able to set this up using qml.MultiControlledX and qml.CRot, etc.

However, are you aiming to perform this operation exactly, or simply to have an operation that entangles all of the qubits and carries a trainable parameter? In the latter case, the MultiControlledX - RY - MultiControlledX approach may be sufficient. I’d also recommend prioritizing getting something to work, even if it isn’t exactly your expected transformation, and then evolving from there.
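As a hedged sketch of that construction (following the MultiControlledX signature used in this thread): conjugating half-angle rotations with multi-controlled X gates exploits X · RY(α) · X = RY(−α), so the two rotations cancel unless all controls are 1:

import pennylane as qml

dev = qml.device("default.qubit", wires=4)

@qml.qnode(dev)
def mcry(theta):
    target, controls = 0, range(1, 4)
    qml.RY(theta / 2, wires=target)
    qml.MultiControlledX(wires=target, control_wires=controls)
    qml.RY(-theta / 2, wires=target)
    qml.MultiControlledX(wires=target, control_wires=controls)
    # net effect: RY(theta) on the target iff all controls are |1>
    return qml.expval(qml.PauliZ(target))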

Unfortunately, using this gate I can’t even print the circuit.

Ah, good point! This was a bug that we have fixed in the development version of PennyLane. To access that, you can install it by following these installation instructions. The circuit should then be printable:

import pennylane as qml

dev = qml.device("default.qubit", wires=5)

targ_wire = 0
control_wires = range(1, 5)

@qml.qnode(dev)
def f(phi):
    qml.MultiControlledX(wires=targ_wire, control_wires=control_wires)
    qml.RY(phi, wires=targ_wire)
    qml.MultiControlledX(wires=targ_wire, control_wires=control_wires)
    return qml.expval(qml.PauliZ(0))

f(0.2)
print(f.draw())

Hi Tom,

I’d also recommend prioritizing getting something to work, even if it isn’t exactly your expected transformation, and then evolving from there

I am just experimenting around a bit to find a more efficient QNode, since the standard approach (rotations and entanglement) doesn’t work so well. I came up with a more “sophisticated” variational approach which does converge a lot faster, but I would like to have a data embedding where the input parameters are independent of each other (which is not the case with rotating each qubit by RY, RZ, RY, which is what I do now). I basically just want to try whether using amplitude embedding leads to even faster convergence in combination with my variational approach :slight_smile:

I also hope to get a circuit where I can have 2^(number of qubits) output neurons (measuring probabilities) without just “inflating” more information out of the qubits’ expectation values. I guess with that as the goal, using something like amplitude encoding or a diagonal unitary applied after a layer of Hadamards is a key factor.

Unfortunately, as you can see, none of my encoding attempts work in the context of differentiability.

Thanks @Daniel63656!

I basically just want to try whether using amplitude embedding leads to even faster convergence in combination with my variational approach

Coming back to the original discussion: if you just want to train a model using an amplitude-based embedding, why don’t you consider switching to the TensorFlow interface?

The code below shows how you can obtain the gradient of your model with respect to the quantum parameters:

import pennylane as qml
import tensorflow as tf

wires = 2
layers = 3

dev = qml.device("default.qubit.tf", wires=wires)

@qml.qnode(dev, interface="tf", diff_method="backprop")
def f(inputs, weights):
    qml.QubitStateVector(inputs, wires=range(wires))
    qml.templates.StronglyEntanglingLayers(weights, wires=range(wires))
    return qml.probs(range(wires))

weight_shapes = {"weights": (layers, wires, 3)}

qlayer = qml.qnn.KerasLayer(f, weight_shapes, output_dim = 2 ** wires)
clayer1 = tf.keras.layers.Dense(2 ** wires, activation="sigmoid")

def normalize(x):
    x = tf.cast(x, tf.complex128)
    return tf.stack([x_ / tf.sqrt(tf.reduce_sum(tf.math.conj(x_) * x_)) for x_ in x])

clayer_interface = tf.keras.layers.Lambda(normalize)
clayer2 = tf.keras.layers.Dense(2, activation="softmax")

model = tf.keras.Sequential([clayer1, clayer_interface, qlayer, clayer2])

inputs = tf.ones((2, 4))

with tf.GradientTape() as tape:
    output = model(inputs)

# Jacobian of the model output w.r.t. the quantum layer's trainable weights only
qlayer_weights = qlayer.trainable_weights

tape.jacobian(output, qlayer_weights)

I didn’t consider this an option. I thought I could only use Cirq with TensorFlow. I would need to train the whole model in TensorFlow then. I know that using TensorFlow for something new like this requires some deeper knowledge of it (which I don’t really have).

I need to define my own training loop and can’t use model.fit, right?

But if this makes QubitStateVector and possibly DiagonalQubitUnitary differentiable, I will certainly give it a try!
But the problem is that PennyLane can’t differentiate these unitaries, so I don’t see why switching to TensorFlow would change anything.

Hey @Daniel63656,

I need to define my own training loop and can’t use model.fit, right?

The code provided above converts the PennyLane QNode into a Keras layer, so you can use all the standard tools you would normally when dealing with Keras models. You can check out our tutorial here.
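For example, a minimal sketch with made-up toy data, assuming the `model` defined in the earlier snippet:

import tensorflow as tf

# toy data matching the Dense(4) input layer and the 2-class softmax output
X = tf.random.uniform((8, 4))
y = tf.one_hot(tf.random.uniform((8,), maxval=2, dtype=tf.int32), 2)

model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(X, y, epochs=2, batch_size=4)  # no custom training loop needed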

But the problem is that PennyLane can’t differentiate these unitaries, so I don’t see why switching to TensorFlow would change anything.

Right, differentiating the arbitrary unitaries will probably also not work in the TensorFlow interface. However, I’d recommend using parametrized gates rather than arbitrary unitaries.

Hi, it’s me again,

This topic went a bit out of focus because I didn’t get good training results with amplitude encoding (using transfer learning). But now I have found a good approach, so the question of the differentiability of amplitude state preparation is on the table again.
I’ll sum up the options we considered, so maybe we can just make this an open issue?

-MottonenStatePreparation:

The git repository for the cast fix isn’t available, but the suggested fix in the file works. Are the changes already incorporated into PennyLane?
With that fixed, the values become NaN directly.
From the docs: “Due to non-trivial classical processing of the state vector, this template is not always fully differentiable.”

-qml.QubitStateVector: decomposes into the Möttönen method -> same problem

-qml.templates.AmplitudeEmbedding
I didn’t try that one because the docs clearly state non-differentiability. What even is the difference between those three methods?

-My own embedding attempt based on https://www.nature.com/articles/s41598-021-85474-1.pdf?origin=ppub
Pre-processing is involved (findAngles), but it can be made differentiable with PyTorch. The problem is the multi-controlled RY rotations (currently implemented via qml.ControlledQubitUnitary, which is not differentiable!).

I am also pretty confused because in the thread “Differentiation with AmplitudeEmbedding” the same problem was seemingly solved by making inputs a keyword argument, which doesn’t work for me at all (in fact, leaving inputs as a non-keyword argument works completely fine with autograd when not using amplitude encoding).

I looked into Qiskit’s initializer class
https://qiskit.org/documentation/_modules/qiskit/extensions/quantum_initializer/initializer.html

but I’m not sure whether this is a new method, or whether it would be differentiable if implemented in PennyLane:
“Note that Initialize is an Instruction and not a Gate since it contains a reset instruction, which is not unitary.”

Hi @Daniel63656,

Are the changes already incorporated into PennyLane?

The fix should be in the latest release (v0.16.0). You can find it in the second entry under Bug fixes in the release notes.


The three state preparations are very similar indeed.

  • QubitStateVector might be supported natively on a quantum device, and if there’s no need to differentiate it, it’s much quicker to simply use that one instead of decomposing it using the Möttönen state preparation.

  • AmplitudeEmbedding is a template that basically applies a QubitStateVector operation after doing some preprocessing, such as padding the state and normalizing it (see the sketch after this list).

  • MottonenStatePreparation prepares a specific state according to this paper by Möttönen et al. It can be used when the device in question does not have native support for a direct state preparation operation, but it will likely not be as fast.
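As a rough sketch of the AmplitudeEmbedding/QubitStateVector relationship (a toy example, not the template’s internal code):

import pennylane as qml
import numpy as np

dev = qml.device("default.qubit", wires=2)

state = np.array([1.0, 2.0, 3.0, 4.0])  # unnormalized amplitudes

@qml.qnode(dev)
def f():
    # roughly: normalize (and pad, if necessary), then apply QubitStateVector
    qml.templates.AmplitudeEmbedding(state, wires=range(2), normalize=True)
    return qml.probs(wires=range(2))

print(f())  # probabilities proportional to [1, 4, 9, 16]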

I am also pretty confused because in the thread “Differentiation with AmplitudeEmbedding” the same problem was seemingly solved by making inputs a keyword argument, which doesn’t work for me at all

The syntax for marking inputs as differentiable or not has changed: it should now be done with a requires_grad flag at declaration for NumPy and Torch, or by declaring the input as a tf.constant for TensorFlow. You can read more about that on the interfaces page in the documentation.
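For illustration, the two variants side by side (toy values):

from pennylane import numpy as np
import tensorflow as tf

# NumPy/autograd (and analogously Torch): flag trainability explicitly
inputs = np.array([0.5, 0.1, 0.2, 0.7], requires_grad=False)  # data, not trained
weights = np.array([0.1, 0.2, 0.3], requires_grad=True)       # trainable

# TensorFlow: constants are non-trainable, Variables are trainable
tf_inputs = tf.constant([0.5, 0.1, 0.2, 0.7])
tf_weights = tf.Variable([0.1, 0.2, 0.3])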

I hope this clears some things up!


Hi theodor,

this clears some things up for me indeed :slight_smile:.

might be supported natively on a quantum device

You mean an actual quantum computer?

My QNode returns probabilities, not expectation values:
return qml.probs(wires=measureWires)

Does the automatic differentiation via the parameter-shift rule calculate the correct gradients w.r.t. the specified cost in this case? I mean, I don’t return a measured observable.

Hi @Daniel63656,

You mean an actual quantum computer?

No, some statevector simulators might support the qml.QubitStateVector operation natively (e.g., by performing matrix-vector multiplications or more specialized methods). When running on an actual quantum computer, its decomposition would be used instead, which is defined via qml.MottonenStatePreparation.
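For instance, a minimal sketch on the default.qubit simulator:

import pennylane as qml
import numpy as np

state = np.array([1, 0, 0, 1]) / np.sqrt(2)  # Bell-state amplitudes

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def f():
    # a statevector simulator can set this state directly; a hardware
    # device would instead expand it via MottonenStatePreparation
    qml.QubitStateVector(state, wires=[0, 1])
    return qml.probs(wires=[0, 1])

print(f())  # [0.5, 0.0, 0.0, 0.5]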

Does the automatic differentiation via the parameter-shift rule calculate the correct gradients w.r.t. the specified cost in this case? I mean, I don’t return a measured observable.

Could you elaborate on this question? Gradients are computed with regard to the trainable parameters in the circuit. These are specified by setting the requires_grad=True argument:

from pennylane import numpy as np

trainable_param = np.array([0.123], requires_grad=True)
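To make the qml.probs case concrete, here is a minimal sketch (a toy circuit): wrap the probabilities in a scalar cost and differentiate that. Each probability is itself the expectation value of a basis-state projector, so the parameter-shift rule applies to it as well.

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(phi):
    qml.RY(phi, wires=0)
    qml.CNOT(wires=[0, 1])
    return qml.probs(wires=[0, 1])

def cost(phi):
    # any scalar function of the probability vector works as a cost
    return circuit(phi)[0]

phi = np.array(0.4, requires_grad=True)
print(qml.grad(cost)(phi))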