Data re-uploading classifier in a hybrid network

Hello all!

A variational classifier using angle encoding fails to capture patterns in simple datasets (e.g. moons) that are easily captured by classical support vector machines. In my opinion, the main reason for this is the data embedding (angle encoding).

On the other hand, the data re-uploading classifier easily captures patterns where the variational classifier fails, and (in my opinion) this is because it uses a more advanced data embedding technique.

My question is whether it is possible to make a hybrid network using a data re-uploading classifier instead of a variational classifier.

Thanks in advance

Moreover, I noticed that in the hybrid model you use the default linear activation function in the classical layers (before the quantum node) instead of tanh, ReLU, etc. Is there a reason for that? In my understanding, this choice relates to the data embedding: for example, since angle encoding uses rotations and maps data through sines, it makes sense to use a trigonometric activation function rather than ReLU or the default linear one.

Hi @NikSchet!

My question is whether it is possible to make a hybrid network using a data re-uploading classifier instead of a variational classifier.

It definitely should be possible! Have you tried downloading the data re-uploading tutorial and modifying it to classify simple datasets (such as moons)?
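For example, the moons dataset can be generated in a couple of lines with scikit-learn (a quick sketch, assuming scikit-learn is installed):

from sklearn.datasets import make_moons

# 200 two-dimensional points in two interleaving half-circles, with noise
X, y = make_moons(n_samples=200, noise=0.1)
print(X.shape, y.shape)  # (200, 2) (200,)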

I believe your intuition regarding data re-uploading being more expressive is correct; this tutorial on expressivity of quantum models delves into this topic in more detail (and is based on the paper The effect of data encoding on the expressive power of variational quantum machine learning models by Schuld, Sweke, and Meyer).

Moreover, I noticed that in the hybrid model you use the default linear activation function in the classical layers (before the quantum node) instead of tanh, ReLU, etc. Is there a reason for that?

Could you point me to the specific tutorial/example you are referring to here?

Thank you for your fast reply and suggestions:

  1. Yes, I have benchmarked the data re-uploading classifier against the variational classifier (with angle encoding) on a variety of 2D and 3D artificial datasets (moons, circles, squares, etc.). The data re-uploading classifier always captures the pattern, while the variational classifier struggles on certain datasets.

  2. I haven’t tried to implement a hybrid network with a data re-uploading classifier yet, so I am asking whether there is a reason why it wouldn’t work, or maybe for a hint on how to do this.

  3. Regarding activation functions: I am referring to the tutorial Turning quantum nodes into Keras Layers.

The first classical layer is a Dense(2) layer. When you don’t specify the activation function, Keras defaults to a linear activation.
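For example (if I read the Keras documentation correctly):

import tensorflow as tf

linear_layer = tf.keras.layers.Dense(2)                   # activation=None, i.e. linear
tanh_layer = tf.keras.layers.Dense(2, activation="tanh")  # a bounded alternative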

Also, correct me if I am wrong :slight_smile:

The data re-uploading classifier is an extension of the code in Quantum models as Fourier series, so essentially they are based on the same idea, right? (A minimal sketch of what I mean is below.)
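Here is a minimal single-qubit sketch of what I understand to be the shared idea (my own toy example, not code from either tutorial):

import pennylane as qml
import numpy as np

dev1 = qml.device("default.qubit", wires=1)

@qml.qnode(dev1)
def fourier_model(x, thetas):
    # alternating trainable rotations and data encodings: the expectation
    # value becomes a truncated Fourier series in x
    for theta in thetas:
        qml.Rot(*theta, wires=0)
        qml.RX(x, wires=0)
    return qml.expval(qml.PauliZ(0))

thetas = np.random.random((3, 3))  # 3 repetitions, 3 Euler angles each
print(fourier_model(0.5, thetas))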

Yes, I have benchmarked the data re-uploading classifier against the variational classifier (with angle encoding) on a variety of 2D and 3D artificial datasets (moons, circles, squares, etc.). The data re-uploading classifier always captures the pattern, while the variational classifier struggles on certain datasets.

Awesome! Are you interested in submitting your benchmarking demo to the QML website? It would make a very interesting addition!

I haven’t tried to implement a hybrid network with a data re-uploading classifier yet, so I am asking whether there is a reason why it wouldn’t work, or maybe for a hint on how to do this.

I can’t think of any reasons why it might not work, so I would recommend giving it a shot and seeing what happens :slightly_smiling_face:

Regarding activation functions: I am referring to the tutorial Turning quantum nodes into Keras Layers.

Perhaps @Tom_Bromley can provide more details here.


I want to make a fair comparison between the variational classifier and the data re-uploading classifier, so I want to better understand the data embedding for the variational classifier. Looking at just the template embeddings, I am not sure any of them would work better, so I probably need to go to custom trainable embeddings (a rough sketch of what I mean is below).
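Something along these lines is what I have in mind; the trainable scaling weights emb_weights are my own invention:

import pennylane as qml

n_qubits = 2
device = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(device)
def trainable_embedding_qnode(inputs, emb_weights, weights):
    # trainable embedding: each feature is rescaled by a learnable weight
    # before being used as a rotation angle
    for i in range(n_qubits):
        qml.RX(emb_weights[i] * inputs[i], wires=i)
    qml.templates.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]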

No worries @NikSchet, let us know if you have any questions as you explore this!


Thank you very much.

So in the tutorial Quantum models as Fourier series, the idea is similar to the one in the data re-uploading classifier.

In my hybrid model I use a QNode sandwiched between two classical layers:

import pennylane as qml

# assuming, e.g.:
n_qubits = 2
device = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(device)
def qnode(inputs, weights):
    qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.templates.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

So what if I repeat the templates inside the QNode several times? Wouldn’t that be similar to data re-uploading? For example:

@qml.qnode(device)
def qnode(inputs, weights):
    qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.templates.StronglyEntanglingLayers(weights, wires=range(n_qubits))

    qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.templates.StronglyEntanglingLayers(weights, wires=range(n_qubits))

    qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.templates.StronglyEntanglingLayers(weights, wires=range(n_qubits))

    qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.templates.StronglyEntanglingLayers(weights, wires=range(n_qubits))

    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

Maybe it would even make sense to use a different “input” for every repeated block?

Thanks in advance

Hey @NikSchet,

This is indeed an interesting thing to think about, i.e., taking @Maria_Schuld’s paper and exploring how our understanding of circuit expressivity applies to a simple ML problem like the moons dataset.

The code block you shared appears to be along the right lines! However, each repetition of StronglyEntanglingLayers should have a different set of weights. For example:

import pennylane as qml
import numpy as np
import tensorflow as tf

qml.enable_tape()

n_qubits = 4
layers_per_block = 1
blocks = 2
weights_shape = (blocks, layers_per_block, n_qubits, 3)

dev = qml.device("default.qubit.tf", wires=n_qubits)

@qml.qnode(dev, interface="tf", diff_method="backprop")
def qnode(inputs, weights):
    # re-upload the same inputs in every block, each followed by an
    # independently parametrized trainable block
    for i in range(blocks):
        qml.templates.AngleEmbedding(inputs, wires=range(n_qubits))
        qml.templates.StronglyEntanglingLayers(weights[i], wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

weights = tf.Variable(np.random.random(weights_shape))
inputs = tf.constant(np.random.random(n_qubits))

print("Output of QNode:", qnode(inputs, weights).numpy())

# Optionally convert to a Keras layer:
tf.keras.backend.set_floatx("float64")

weight_shapes = {"weights": weights_shape}
qlayer = qml.qnn.KerasLayer(qnode, weight_shapes, n_qubits)

batch_size = 10
inputs = tf.constant(np.random.random((batch_size, n_qubits)))
print("Output of quantum layer:\n", qlayer(inputs).numpy())

Using the terminology here (see the second diagram), we apply an embedding of the same input in each of several blocks. Each block also has a trainable element with independent weights.

We can treat AngleEmbedding and StronglyEntanglingLayers together as one block, although we need to be careful with the terminology of “layers” versus “blocks”. In the code above, there can be multiple “layers” of StronglyEntanglingLayers per block, set by the layers_per_block variable. We can then vary the number of blocks (see the blocks variable) and see how well the circuit can learn. Of course, in the blocks = 1 case, we recover the tutorial here.

Once set up, I see no reason why the above couldn’t be combined into a hybrid with other classical layers and applied to, e.g., the moons dataset. What would be really cool is to plot the accuracy as a function of blocks and/or layers_per_block. As @josh mentioned, if you have any luck with this then it would make an awesome community demo (instructions here).

As a side note, when I plot the decision boundary for the KerasLayer demo, it appears to be a straight line. This is probably linked to having just one block and hence limited expressivity. It would be interesting to check out the boundary when blocks is higher.
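For reference, the boundary can be plotted with something like the following rough sketch. Here model, X, and y are placeholders for a trained hybrid Keras model on two features and the corresponding dataset:

import numpy as np
import matplotlib.pyplot as plt

# evaluate the (hypothetical) trained model on a grid over the feature plane
xx, yy = np.meshgrid(np.linspace(-2, 3, 50), np.linspace(-2, 2, 50))
grid = np.column_stack([xx.ravel(), yy.ravel()])
preds = model.predict(grid)                          # shape (2500, n_classes)
labels = np.argmax(preds, axis=1).reshape(xx.shape)

plt.contourf(xx, yy, labels, alpha=0.3)              # decision regions
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")   # data points
plt.show()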

Also, to answer an earlier question:

Moreover, I noticed that in the hybrid model you use the default linear activation function in the classical layers (before the quantum node) instead of tanh, ReLU, etc. Is there a reason for that?

Not really! Quite possibly we could have used another activation and it may have trained as well or better. Something like a tanh or sigmoid activation that is nicely bounded in an interval makes sense. Indeed, this is linked to the choice of data embedding, and we could also have rescaled the inputs to [-π, π].
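For example, one option would be a bounded activation followed by an explicit rescaling (a sketch of one possibility, not what the tutorial actually does):

import numpy as np
import tensorflow as tf

clayer_in = tf.keras.layers.Dense(2, activation="tanh")  # outputs in (-1, 1)
rescale = tf.keras.layers.Lambda(lambda x: np.pi * x)    # stretch to (-pi, pi)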

I would like to thank you for your detailed answer. Yes, I succeeded in using your code combined with other classical layers. Some remarks:

  1. I do not need the initial classical input layer to get high accuracy. Even with just the quantum node and a final classical layer I get excellent results (a sketch of this minimal setup is shown after this list).

  2. each repetition of StronglyEntanglingLayers should have a different set of weights.
    -> Interestingly, even without a different set of weights (see my previous code) I get very good results.

  3. Increasing blocks significantly improves the prediction grid (using 3 blocks and 2 layers per block is sufficient to get amazing results in just 10 epochs).

  4. I think adding the qml.enable_tape() line greatly improves the running time.
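Here is a rough sketch of that minimal setup from point 1 (reusing qml, tf, n_qubits, and the qnode from the code above; the optimizer and loss are just placeholder choices):

blocks = 3
layers_per_block = 2
weights_shape = (blocks, layers_per_block, n_qubits, 3)

# quantum layer followed directly by a single classical output layer
qlayer = qml.qnn.KerasLayer(qnode, {"weights": weights_shape}, n_qubits)
clayer_out = tf.keras.layers.Dense(2, activation="softmax")

model = tf.keras.models.Sequential([qlayer, clayer_out])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")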

I am in the process of making such a demo. Thanks!

Thanks @NikSchet, that sounds great, and those are some interesting observations. Let us know if there’s anything else we can help with!


Feel free to check the code here: https://github.com/nsansen/Quantum-Machine-Learning

  1. Demo v5: just plots the prediction grid for a given setup.
  2. Demo v5 - feature exploring: prints prediction grids for different numbers of blocks, so you can see the evolution of the grid.

For dataset construction I use code written by Adrián Pérez-Salinas, Alba Cervera-Lierta, Elies Gil-Fuster, and José I. Latorre. I think it makes more sense to have these datasets as a common reference.


So I am not sure whether the code on GitHub makes a useful demo or not; a presentation with results might be more meaningful.

In any case, thank you very much for your help.

Hi @NikSchet - it’s up to you how you would like to structure a demo submission, but the demos we’ve found most useful in the past are ones that tell a story.

That is, they start off with a question that is going to be explored, space out the code cells with text explanations, and finish by presenting some results. Even if the results are negative, the demo as a whole could still tell an interesting story!
