Hey @NikSchet,
This is indeed an interesting thing to think about, i.e., taking @Maria_Schuld’s paper and exploring how our understanding of circuit expressivity applies to a simple ML problem like the moons dataset.
The code block you shared appears to be along the right lines! However, each repetition of `StronglyEntanglingLayers` should have a different set of weights. For example:
```python
import pennylane as qml
import numpy as np
import tensorflow as tf

qml.enable_tape()  # tape mode (only needed in older PennyLane versions)

n_qubits = 4
layers_per_block = 1
blocks = 2

dev = qml.device("default.qubit.tf", wires=n_qubits)

@qml.qnode(dev, interface="tf", diff_method="backprop")
def qnode(inputs, weights):
    # Repeat the embedding in every block, with independent weights per block
    for i in range(blocks):
        qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
        qml.templates.StronglyEntanglingLayers(weights[i], wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

weights_shape = (blocks, layers_per_block, n_qubits, 3)
weights = tf.Variable(np.random.random(weights_shape))
inputs = tf.constant(np.random.random(n_qubits))

print("Output of QNode:", qnode(inputs, weights).numpy())

# Optionally convert to a Keras layer:
tf.keras.backend.set_floatx("float64")
weight_shapes = {"weights": weights_shape}
qlayer = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits)

batch_size = 10
inputs = tf.constant(np.random.random((batch_size, n_qubits)))
print("Output of quantum layer:\n", qlayer(inputs).numpy())
```
Using the terminology here (see second diagram), we apply the embedding of the same input in each of multiple blocks. Each block also has a trainable element with independent weights.
We can treat `AngleEmbedding` and `StronglyEntanglingLayers` together as one block, although we need to be careful with the terminology of "layers" versus "blocks". In the code above, there can be multiple "layers" of `StronglyEntanglingLayers` per block, set by the `layers_per_block` variable. We can then vary the number of blocks (see the `blocks` variable) and see how well the circuit can learn. Of course, in the `blocks = 1` case, we recover the tutorial here.
Once set up, I see no reason why the above couldn't be combined with other classical layers into a hybrid model and applied to, e.g., the moons dataset. What would be really cool is to plot the accuracy as a function of `blocks` and/or `layers_per_block`. As @josh mentioned, if you have any luck with this then it would make an awesome community demo (instructions here).
As a side note, when I plot the decision boundary for the KerasLayer demo, it appears to be a straight line. This is probably linked to having just one block and hence limited expressivity. It would be interesting to check out the boundary if `blocks` is higher.
Also, to answer an earlier question:

> Moreover, i noticed in the HYBRID you use default LINEAR activation function in the classical layers (before quantum node) instead of Tanh or Relu etc. Is there a reason for that?
Not really! Quite possibly another activation would have trained as well or better. Something like a tanh or sigmoid activation, which is nicely bounded in an interval, makes sense since the outputs are fed into an angle embedding. The choice is linked to that embedding, and we could also have rescaled the inputs to $[-\pi, \pi]$.
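For instance, such a rescaling is only a few lines of NumPy. The helper below (the name `rescale_to_pi` is hypothetical, just for illustration) maps each feature column linearly onto $[-\pi, \pi]$ before it would be fed into `AngleEmbedding`:

```python
import numpy as np

def rescale_to_pi(x):
    """Linearly map each feature column onto [-pi, pi].
    Hypothetical pre-processing helper for use before AngleEmbedding."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return 2 * np.pi * (x - x_min) / (x_max - x_min) - np.pi

# Example: two features spread over arbitrary ranges
features = np.array([[0.0, 10.0],
                     [5.0, 20.0],
                     [10.0, 30.0]])
print(rescale_to_pi(features))  # each column now spans [-pi, pi]
```

The mid-range point of each feature maps to zero, and the extremes map to $\pm\pi$, matching the full period of the rotation angles used by the embedding.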