Hybrid Network not differentiating

Hey @Daniel63656!

Could you point me in the right direction on how I could do that?

Regarding the suggestion of using MultiControlledX or PauliRot, the idea would be just to swap out the lines

U = np.array([[m.cos(phi), m.sin(phi)], [-m.sin(phi), m.cos(phi)]])
qml.ControlledQubitUnitary(U, control_wires=range(num_con), wires=num_con)

with something like

qml.MultiControlledX(wires=targ_wire, control_wires=range(num_con))
qml.RY(phi, wires=targ_wire)
qml.MultiControlledX(wires=targ_wire, control_wires=range(num_con))

This approach may not be exactly what you have above, but it is similar.

I make this suggestion because it seems like your issues involve trying to manually construct a unitary and then using qml.ControlledQubitUnitary or qml.DiagonalQubitUnitary. Although this is something we’re improving on, it’s a use case that quite often breaks differentiability. Instead, it’s better to use gates that have a well-defined input parameter, such as qml.RX and qml.Rot. Finding derivatives with respect to the parameters of such gates is a much more established use case in PennyLane.
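As a minimal illustration of the difference (a toy circuit, not your model; the rotation here is not claimed to match your U exactly), a parametrized gate differentiates out of the box:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def circuit(phi):
    # the angle enters a gate with a well-defined parameter,
    # so PennyLane knows its derivative rule
    qml.RY(phi, wires=0)
    return qml.expval(qml.PauliZ(0))

phi = np.array(0.3, requires_grad=True)
print(qml.grad(circuit)(phi))  # finite, well-defined gradient, no manual unitary needed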

Also, another piece of advice when debugging errors in your model is to try to break it down into the elementary nodes/layers and see if the gradient is accessible for each. For example, instead of training the whole hybrid_model, it’s easier to focus on differentiating diagonal_embedding.
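For instance, a quick stand-alone check in the Torch interface might look like this (sub_circuit is a hypothetical stand-in for your diagonal_embedding):

import pennylane as qml
import torch

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev, interface="torch")
def sub_circuit(phi):
    qml.RY(phi[0], wires=0)
    qml.RY(phi[1], wires=1)
    return qml.expval(qml.PauliZ(0))

phi = torch.tensor([0.1, 0.2], requires_grad=True)
sub_circuit(phi).backward()
print(phi.grad)  # should be finite and not None; otherwise this layer is the culprit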

qml.MultiControlledX(wires=targ_wire, control_wires=range(num_con))
qml.RY(phi, wires=targ_wire)
qml.MultiControlledX(wires=targ_wire, control_wires=range(num_con))

I doubt that this is equivalent to the previous formulation. I added the control_values and it doesn’t produce the same result.

qml.MultiControlledX(wires=num_con, control_wires=range(num_con), control_values=binary_index)
qml.RY(angles[idx], wires=num_con)
qml.MultiControlledX(wires=num_con, control_wires=range(num_con), control_values=binary_index)

Unfortunately, using this gate I can’t even print the circuit.

Hey @Daniel63656!

I doubt that this is equivalent to the previous formulation. I added the control_values and it doesn’t produce the same result.

Agreed. It looks like you are trying to do a multi-controlled-Y gate. PennyLane supports qml.CRY for control on an additional wire, but beyond that we need to think a bit more carefully. One approach to exactly performing a multi-controlled-Y gate is provided in Sec. 7 of Barenco et al. (e.g., Lemma 7.9). You should be able to set this up using qml.MultiControlledX and qml.CRot, etc.

However, are you aiming to perform this operation exactly, or simply to have an operation that entangles all of the qubits and carries a trainable parameter? In the latter case, the MultiControlledX - RY - MultiControlledX approach may be sufficient. I’d also recommend prioritizing getting something to work, even if it isn’t exactly your expected transformation, and then evolving from there.
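As a hedged sketch of that construction (following the MultiControlledX signature used in this thread): conjugating half-angle rotations with multi-controlled X gates exploits X · RY(α) · X = RY(−α), so the two rotations cancel unless all controls are 1:

import pennylane as qml

dev = qml.device("default.qubit", wires=4)

@qml.qnode(dev)
def mcry(theta):
    target, controls = 0, range(1, 4)
    qml.RY(theta / 2, wires=target)
    qml.MultiControlledX(wires=target, control_wires=controls)
    qml.RY(-theta / 2, wires=target)
    qml.MultiControlledX(wires=target, control_wires=controls)
    # net effect: RY(theta) on the target iff all controls are |1>
    return qml.expval(qml.PauliZ(target))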

Unfortunately, using this gate I can’t even print the circuit.

Ah, good point! This was a bug that we have fixed in the development version of PennyLane. To access that, you can install it by following these installation instructions. The circuit should then be printable:

import pennylane as qml

dev = qml.device("default.qubit", wires=5)

targ_wire = 0
control_wires = range(1, 5)

@qml.qnode(dev)
def f(phi):
    qml.MultiControlledX(wires=targ_wire, control_wires=control_wires)
    qml.RY(phi, wires=targ_wire)
    qml.MultiControlledX(wires=targ_wire, control_wires=control_wires)
    return qml.expval(qml.PauliZ(0))

f(0.2)
print(f.draw())

Hi Tom,

I’d also recommend prioritizing getting something to work, even if it isn’t exactly your expected transformation, and then evolving from there

I am just experimenting around a bit to find a more efficient QNode, since the standard approach (rotations and entanglement) doesn’t work so well. I came up with a more “sophisticated” variational approach which does converge a lot faster, but I would like to have a data embedding where the input parameters are independent of each other (which is not the case with rotating each qubit by RY, RZ, RY, which is what I do now). I basically just want to try whether using amplitude embedding leads to even faster convergence in combination with my variational approach :slight_smile:

I also hope to get a circuit where I can have 2^(number of qubits) output neurons (measuring probabilities) without just “inflating” more information out of the qubits’ expectation values. I guess with that as the goal, using something like amplitude encoding or a diagonal unitary applied after a layer of Hadamards is a key factor.

Unfortunately, as you can see, none of my encoding attempts work in the context of differentiability.

Thanks @Daniel63656!

I basically just want to try whether using amplitude embedding leads to even faster convergence in combination with my variational approach

Coming back to the original discussion: if you just want to train a model using an amplitude-based embedding, why don’t you consider switching to the TensorFlow interface?

The code below shows how you can obtain the gradient of your model with respect to the quantum parameters:

import pennylane as qml
import tensorflow as tf

wires = 2
layers = 3

dev = qml.device("default.qubit.tf", wires=wires)

@qml.qnode(dev, interface="tf", diff_method="backprop")
def f(inputs, weights):
    qml.QubitStateVector(inputs, wires=range(wires))
    qml.templates.StronglyEntanglingLayers(weights, wires=range(wires))
    return qml.probs(range(wires))

weight_shapes = {"weights": (layers, wires, 3)}

qlayer = qml.qnn.KerasLayer(f, weight_shapes, output_dim = 2 ** wires)
clayer1 = tf.keras.layers.Dense(2 ** wires, activation="sigmoid")

def normalize(x):
    x = tf.cast(x, tf.complex128)
    return tf.stack([x_ / tf.sqrt(tf.reduce_sum(tf.math.conj(x_) * x_)) for x_ in x])

clayer_interface = tf.keras.layers.Lambda(normalize)
clayer2 = tf.keras.layers.Dense(2, activation="softmax")

model = tf.keras.Sequential([clayer1, clayer_interface, qlayer, clayer2])

inputs = tf.ones((2, 4))

with tf.GradientTape() as tape:
    output = model(inputs)

# Jacobian of the model output w.r.t. the quantum layer's trainable weights only
qlayer_weights = qlayer.trainable_weights

tape.jacobian(output, qlayer_weights)

I didn’t consider this an option. I thought I could only use Cirq with TensorFlow. I would need to train the whole model in TensorFlow then. I know that using TensorFlow for something new like this requires some deeper knowledge of it (which I don’t really have).

I need to define my own training loop and can’t use model.fit, right?

But if this makes QubitStateVector and possibly DiagonalQubitUnitary differentiable, I will certainly give it a try!
But the problem is that PennyLane can’t differentiate these unitaries, so I don’t see why switching to TensorFlow would change anything.

Hey @Daniel63656,

I need to define my own training loop and can’t use model.fit, right?

The code provided above converts the PennyLane QNode into a Keras layer, so you can use all the standard tools you would normally when dealing with Keras models. You can check out our tutorial here.
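For example, a minimal sketch with made-up toy data, assuming the `model` defined in the earlier snippet:

import tensorflow as tf

# toy data matching the Dense(4) input layer and the 2-class softmax output
X = tf.random.uniform((8, 4))
y = tf.one_hot(tf.random.uniform((8,), maxval=2, dtype=tf.int32), 2)

model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(X, y, epochs=2, batch_size=4)  # no custom training loop needed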

But the problem is that PennyLane can’t differentiate these unitaries, so I don’t see why switching to TensorFlow would change anything.

Right, differentiating the arbitrary unitaries will probably also not work in the TensorFlow interface. However, I’d recommend using parametrized gates rather than arbitrary unitaries.

Hi, it’s me again,

This topic went a bit out of focus because I didn’t get good training results with amplitude encoding (using transfer learning). But now I have found a good approach, so the question of the differentiability of amplitude state preparation is on the table again.
I’ll sum up the options we considered, so maybe we can just make this an open issue?

-MottonenStatePreparation:

The git repository for the cast fix isn’t available, but the suggested fix in the file works. Are the changes already incorporated into PennyLane?
With that fixed, the values become NaN directly.
From the docs: “Due to non-trivial classical processing of the state vector, this template is not always fully differentiable.”

-qml.QubitStateVector: decomposes into the Möttönen method -> same problem

-qml.templates.AmplitudeEmbedding
I didn’t try that one because the docs clearly state non-differentiability. What even is the difference between those three methods?

-My own embedding attempt based on https://www.nature.com/articles/s41598-021-85474-1.pdf?origin=ppub
Pre-processing is involved (findAngles), but it can be made differentiable with PyTorch. The problem is the multi-controlled RY rotations (currently implemented via qml.ControlledQubitUnitary, which is not differentiable!).

I am also pretty confused because in the thread “Differentiation with AmplitudeEmbedding” the same problem was seemingly solved by making inputs a keyword argument, which doesn’t work for me at all (in fact, leaving inputs as a non-keyword argument works completely fine with autograd when not using amplitude encoding).

I looked into Qiskit’s initializer class
https://qiskit.org/documentation/_modules/qiskit/extensions/quantum_initializer/initializer.html

but I’m not sure whether this is a new method, or whether it would be differentiable if implemented in PennyLane:
“Note that Initialize is an Instruction and not a Gate since it contains a reset instruction, which is not unitary.”

Hi @Daniel63656,

Are the changes already incorporated into PennyLane?

The fix should be in the latest release (v0.16.0). You can find it in the second entry under Bug fixes in the release notes.


The three state preparations are very similar indeed.

  • QubitStateVector might be supported natively on a quantum device, and if there’s no need to differentiate it, it’s much quicker to simply use that one instead of decomposing it using the Möttönen state preparation.

  • AmplitudeEmbedding is a template that basically applies a QubitStateVector operation after doing some preprocessing, such as padding the state and normalizing it (see the sketch after this list).

  • MottonenStatePreparation prepares a specific state according to this paper by Möttönen et al. It can be used when the device in question does not have native support for a direct state preparation operation, but it will likely not be as fast.
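As a rough sketch of the AmplitudeEmbedding/QubitStateVector relationship (a toy example, not the template’s internal code):

import pennylane as qml
import numpy as np

dev = qml.device("default.qubit", wires=2)

state = np.array([1.0, 2.0, 3.0, 4.0])  # unnormalized amplitudes

@qml.qnode(dev)
def f():
    # roughly: normalize (and pad, if necessary), then apply QubitStateVector
    qml.templates.AmplitudeEmbedding(state, wires=range(2), normalize=True)
    return qml.probs(wires=range(2))

print(f())  # probabilities proportional to [1, 4, 9, 16]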

I am also pretty confused because in the thread “Differentiation with AmplitudeEmbedding” the same problem was seemingly solved by making inputs a keyword argument, which doesn’t work for me at all

The syntax for marking inputs as differentiable or not has changed: it should now be done with a requires_grad flag at declaration for NumPy and Torch, or by declaring the input as a tf.constant for TensorFlow. You can read more about that on the interfaces page in the documentation.
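For illustration, the two variants side by side (toy values):

from pennylane import numpy as np
import tensorflow as tf

# NumPy/autograd (and analogously Torch): flag trainability explicitly
inputs = np.array([0.5, 0.1, 0.2, 0.7], requires_grad=False)  # data, not trained
weights = np.array([0.1, 0.2, 0.3], requires_grad=True)       # trainable

# TensorFlow: constants are non-trainable, Variables are trainable
tf_inputs = tf.constant([0.5, 0.1, 0.2, 0.7])
tf_weights = tf.Variable([0.1, 0.2, 0.3])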

I hope this clears some things up!


Hi theodor,

this clears some things up for me indeed :slight_smile:.

might be supported natively on a quantum device

You mean an actual quantum computer?

My QNode returns probabilities, not expectation values:
return qml.probs(wires=measureWires)

Does the automatic differentiation via the parameter-shift rule calculate the correct gradients w.r.t. the specified cost in this case? I mean, I don’t return a measured observable.

Hi @Daniel63656,

You mean an actual quantum computer?

No, some statevector simulators might support the qml.QubitStateVector operation natively (e.g., by performing matrix-vector multiplications or more specialized methods). When running on an actual quantum computer, its decomposition would be used instead, which is defined via qml.MottonenStatePreparation.
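For instance, a minimal sketch on the default.qubit simulator:

import pennylane as qml
import numpy as np

state = np.array([1, 0, 0, 1]) / np.sqrt(2)  # Bell-state amplitudes

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def f():
    # a statevector simulator can set this state directly; a hardware
    # device would instead expand it via MottonenStatePreparation
    qml.QubitStateVector(state, wires=[0, 1])
    return qml.probs(wires=[0, 1])

print(f())  # [0.5, 0.0, 0.0, 0.5]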

Does the automatic differentiation via the parameter-shift rule calculate the correct gradients w.r.t. the specified cost in this case? I mean, I don’t return a measured observable.

Could you elaborate on this question? Gradients are computed with regard to the trainable parameters in the circuit. These are specified by setting the requires_grad=True argument:

from pennylane import numpy as np

trainable_param = np.array([0.123], requires_grad=True)
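To make the qml.probs case concrete, here is a minimal sketch (a toy circuit): wrap the probabilities in a scalar cost and differentiate that. Each probability is itself the expectation value of a basis-state projector, so the parameter-shift rule applies to it as well.

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(phi):
    qml.RY(phi, wires=0)
    qml.CNOT(wires=[0, 1])
    return qml.probs(wires=[0, 1])

def cost(phi):
    # any scalar function of the probability vector works as a cost
    return circuit(phi)[0]

phi = np.array(0.4, requires_grad=True)
print(qml.grad(cost)(phi))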