Hey @NikSchet,

This is indeed an interesting thing to think about, i.e., taking @Maria_Schuld's paper and exploring how our understanding of circuit expressivity applies to a simple ML problem like the moons dataset.

The code block you shared appears to be along the right lines! However, each repetition of `StronglyEntanglingLayers` should have a *different* set of weights. For example:

```
import numpy as np
import pennylane as qml
import tensorflow as tf

qml.enable_tape()
tf.keras.backend.set_floatx("float64")

n_qubits = 4
layers_per_block = 1  # layers of StronglyEntanglingLayers within each block
blocks = 2  # number of (embedding + entangling) blocks

dev = qml.device("default.qubit.tf", wires=n_qubits)

@qml.qnode(dev, interface="tf", diff_method="backprop")
def qnode(inputs, weights):
    # Re-embed the same inputs in every block, but give each repetition of
    # StronglyEntanglingLayers its own slice of the weights
    for i in range(blocks):
        qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
        qml.templates.StronglyEntanglingLayers(weights[i], wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

weights_shape = (blocks, layers_per_block, n_qubits, 3)
weights = tf.Variable(np.random.random(weights_shape))
inputs = tf.constant(np.random.random(n_qubits))

print("Output of QNode:", qnode(inputs, weights).numpy())

# Optionally convert to a Keras layer:
weight_shapes = {"weights": weights_shape}
qlayer = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits)

batch_size = 10
inputs = tf.constant(np.random.random((batch_size, n_qubits)))
print("Output of quantum layer:\n", qlayer(inputs).numpy())
```

Using the terminology here (see the second diagram), we apply an embedding of the same input in each of multiple blocks. Each block also has a trainable element with its own independent weights.

We can treat `AngleEmbedding` and `StronglyEntanglingLayers` together as one block, although we need to be careful with the terminology of “layers” versus “blocks”. In the code above, there can be multiple “layers” of `StronglyEntanglingLayers` per block, set by the `layers_per_block` variable. We can then vary the number of blocks (see the `blocks` variable) and see how well the circuit can learn. Of course, in the `blocks = 1` case, we recover the tutorial here.
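
To make it easy to scan over these settings, one option (just a sketch; `make_qnode` is a hypothetical helper, not a PennyLane function) is to wrap the circuit construction in a small factory:

```
import pennylane as qml


def make_qnode(blocks, layers_per_block, n_qubits):
    """Build a QNode with the given block structure; also return its weights shape."""
    dev = qml.device("default.qubit.tf", wires=n_qubits)

    @qml.qnode(dev, interface="tf", diff_method="backprop")
    def qnode(inputs, weights):
        for i in range(blocks):
            qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
            qml.templates.StronglyEntanglingLayers(weights[i], wires=range(n_qubits))
        return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

    return qnode, (blocks, layers_per_block, n_qubits, 3)
```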

Once set up, I see no reason why the above couldn't be combined into a hybrid with other classical layers and applied to, e.g., the moons dataset. What would be really cool is to plot the accuracy as a function of `blocks` and/or `layers_per_block`. As @josh mentioned, if you have any luck with this then it would make an awesome community demo (instructions here).
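
For concreteness, here is a rough sketch of such a hybrid, loosely following the KerasLayer tutorial and reusing the hypothetical `make_qnode` helper above (the layer sizes, optimizer, and training settings are placeholder choices):

```
import pennylane as qml
import tensorflow as tf
from sklearn.datasets import make_moons

tf.keras.backend.set_floatx("float64")

n_qubits = 4
blocks = 2
layers_per_block = 1

qnode, weights_shape = make_qnode(blocks, layers_per_block, n_qubits)
weight_shapes = {"weights": weights_shape}

# Classical layers sandwiching the quantum layer
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(n_qubits, activation="tanh"),
    qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits),
    tf.keras.layers.Dense(2, activation="softmax"),
])

X, y = make_moons(n_samples=200, noise=0.1)
y_hot = tf.keras.utils.to_categorical(y, num_classes=2)

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.2),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y_hot, epochs=6, batch_size=5, validation_split=0.25)
```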

As a side note, when I plot the decision boundary for the KerasLayer demo, it appears to be a straight line. This is probably linked to having just one block and hence limited expressivity. It would be interesting to check out the boundary if `blocks` is higher.
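
Something like the following (assuming the trained `model` and the moons data `X`, `y` from the sketch above) would visualize the boundary:

```
import numpy as np
import matplotlib.pyplot as plt

# Evaluate the trained model on a grid covering the data
xx, yy = np.meshgrid(np.linspace(-1.5, 2.5, 50), np.linspace(-1.0, 1.5, 50))
grid = np.column_stack([xx.ravel(), yy.ravel()])
probs = model.predict(grid)[:, 1].reshape(xx.shape)  # probability of class 1

plt.contourf(xx, yy, probs, levels=20, cmap="RdBu", alpha=0.7)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="RdBu", edgecolors="k")
plt.show()
```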

Also, to answer an earlier question:

> Moreover, i noticed in the HYBRID you use default LINEAR activation function in the classical layers (before quantum node) instead of Tanh or Relu etc. Is there a reason for that?

Not really! Quite possibly we could have used another activation and it may have trained as well or better. Something like a tanh or sigmoid activation that is nicely bounded in an interval makes sense, since those outputs are fed straight into the angle embedding; alternatively, we could have rescaled the outputs to $[-\pi, \pi]$.
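
For example, one way to keep the classical outputs bounded in $[-\pi, \pi]$ before the embedding (just a sketch of the idea):

```
import numpy as np
import tensorflow as tf

n_qubits = 4

# tanh bounds the outputs in [-1, 1]; scaling by pi maps them to [-pi, pi],
# a natural range for the rotation angles used by AngleEmbedding
clayer = tf.keras.layers.Dense(n_qubits, activation="tanh")
rescale = tf.keras.layers.Lambda(lambda x: np.pi * x)
```

These two layers could then sit directly in front of the quantum layer in the Sequential model above.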