Hey @NikSchet,

This is indeed an interesting thing to think about, i.e., taking @Maria_Schuld's paper and exploring how our understanding of circuit expressivity applies to a simple ML problem like the moons dataset.

The code block you shared appears to be along the right lines! However, each repetition of `StronglyEntanglingLayers` should have a *different* set of weights. For example:

```
import numpy as np
import pennylane as qml
import tensorflow as tf

qml.enable_tape()
tf.keras.backend.set_floatx("float64")

n_qubits = 4
layers_per_block = 1  # layers of StronglyEntanglingLayers within each block
blocks = 2  # number of (embedding + entangling) blocks

dev = qml.device("default.qubit.tf", wires=n_qubits)

@qml.qnode(dev, interface="tf", diff_method="backprop")
def qnode(inputs, weights):
    # Re-embed the same inputs in every block, but give each repetition of
    # StronglyEntanglingLayers its own slice of the weights
    for i in range(blocks):
        qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
        qml.templates.StronglyEntanglingLayers(weights[i], wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

weights_shape = (blocks, layers_per_block, n_qubits, 3)
weights = tf.Variable(np.random.random(weights_shape))
inputs = tf.constant(np.random.random(n_qubits))

print("Output of QNode:", qnode(inputs, weights).numpy())

# Optionally convert to a Keras layer:
weight_shapes = {"weights": weights_shape}
qlayer = qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits)

batch_size = 10
inputs = tf.constant(np.random.random((batch_size, n_qubits)))
print("Output of quantum layer:\n", qlayer(inputs).numpy())
```

Using the terminology here (see the second diagram), we apply an embedding of the same input in each of multiple blocks. Each block also has a trainable element with its own independent weights.

We can treat `AngleEmbedding` and `StronglyEntanglingLayers` together as one block, although we need to be careful with the terminology of “layers” versus “blocks”. In the code above, there can be multiple “layers” of `StronglyEntanglingLayers` per block, set by the `layers_per_block` variable. We can then vary the number of blocks (see the `blocks` variable) and see how well the circuit can learn. Of course, in the `blocks = 1` case, we recover the tutorial here.
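
To make it easy to scan over these settings, one option (just a sketch; `make_qnode` is a hypothetical helper, not a PennyLane function) is to wrap the circuit construction in a small factory:

```
import pennylane as qml


def make_qnode(blocks, layers_per_block, n_qubits):
    """Build a QNode with the given block structure; also return its weights shape."""
    dev = qml.device("default.qubit.tf", wires=n_qubits)

    @qml.qnode(dev, interface="tf", diff_method="backprop")
    def qnode(inputs, weights):
        for i in range(blocks):
            qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
            qml.templates.StronglyEntanglingLayers(weights[i], wires=range(n_qubits))
        return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

    return qnode, (blocks, layers_per_block, n_qubits, 3)
```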

Once set up, I see no reason why the above couldn't be combined into a hybrid with other classical layers and applied to, e.g., the moons dataset. What would be really cool is to plot the accuracy as a function of `blocks` and/or `layers_per_block`. As @josh mentioned, if you have any luck with this then it would make an awesome community demo (instructions here).
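
For concreteness, here is a rough sketch of such a hybrid, loosely following the KerasLayer tutorial and reusing the hypothetical `make_qnode` helper above (the layer sizes, optimizer, and training settings are placeholder choices):

```
import pennylane as qml
import tensorflow as tf
from sklearn.datasets import make_moons

tf.keras.backend.set_floatx("float64")

n_qubits = 4
blocks = 2
layers_per_block = 1

qnode, weights_shape = make_qnode(blocks, layers_per_block, n_qubits)
weight_shapes = {"weights": weights_shape}

# Classical layers sandwiching the quantum layer
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(n_qubits, activation="tanh"),
    qml.qnn.KerasLayer(qnode, weight_shapes, output_dim=n_qubits),
    tf.keras.layers.Dense(2, activation="softmax"),
])

X, y = make_moons(n_samples=200, noise=0.1)
y_hot = tf.keras.utils.to_categorical(y, num_classes=2)

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.2),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y_hot, epochs=6, batch_size=5, validation_split=0.25)
```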

As a side note, when I plot the decision boundary for the KerasLayer demo, it appears to be a straight line. This is probably linked to having just one block and hence limited expressivity. It would be interesting to check out the boundary if `blocks` is higher.
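
Something like the following (assuming the trained `model` and the moons data `X`, `y` from the sketch above) would visualize the boundary:

```
import numpy as np
import matplotlib.pyplot as plt

# Evaluate the trained model on a grid covering the data
xx, yy = np.meshgrid(np.linspace(-1.5, 2.5, 50), np.linspace(-1.0, 1.5, 50))
grid = np.column_stack([xx.ravel(), yy.ravel()])
probs = model.predict(grid)[:, 1].reshape(xx.shape)  # probability of class 1

plt.contourf(xx, yy, probs, levels=20, cmap="RdBu", alpha=0.7)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="RdBu", edgecolors="k")
plt.show()
```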

Also, to answer an earlier question:

> Moreover, i noticed in the HYBRID you use default LINEAR activation function in the classical layers (before quantum node) instead of Tanh or Relu etc. Is there a reason for that?

Not really! Quite possibly we could have used another activation and it may have trained as well or better. Something like a tanh or sigmoid activation that is nicely bounded in an interval makes sense, since those outputs are fed straight into the angle embedding; alternatively, we could have rescaled the outputs to $[-\pi, \pi]$.
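
For example, one way to keep the classical outputs bounded in $[-\pi, \pi]$ before the embedding (just a sketch of the idea):

```
import numpy as np
import tensorflow as tf

n_qubits = 4

# tanh bounds the outputs in [-1, 1]; scaling by pi maps them to [-pi, pi],
# a natural range for the rotation angles used by AngleEmbedding
clayer = tf.keras.layers.Dense(n_qubits, activation="tanh")
rescale = tf.keras.layers.Lambda(lambda x: np.pi * x)
```

These two layers could then sit directly in front of the quantum layer in the Sequential model above.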