Data-reuploading classifier: predictions grid

Dear all.

I am using the demo “Data-reuploading classifier” to play around with different datasets.

I have failed to reproduce the contour plot shown in the “variational classifier” demo, because I get many errors when I try to produce the predictions grid.

If I understand correctly, in order to evaluate the predictions grid you need to:

1. Load params (the array holding the last trained values of qcircuit) into the test function, as shown:

predicted_test, fidel_test = test(params, x, y, state_labels)

2. Produce x and y as np.arrays (this is where I fail, because x has a weird structure).

3. Plot the predicted_test values over x and y.

Can someone provide more details? I personally believe that the contour plot at the end of the “variational classifier” demo is very important, so maybe you should use it in all demos :slight_smile:

P.S. We are trying to generalize this demo to 2 qubits; any insights?

Thanks in advance.

Hi @NikSchet!

In the variational classifier demo, the points of the grid are regarded as data points and are fed into the variational classifier together with the bias and gate parameters obtained after training (stored as var). The classification results are then used to create the contour plot.
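Roughly, the relevant steps from that demo look like this (a sketch that assumes the demo’s get_angles, variational_classifier, and trained var are in scope, and that the grid points get the same preprocessing as the training data):

import numpy as np
import matplotlib.pyplot as plt

# treat every point of a grid over the input plane as a data point
xx, yy = np.meshgrid(np.linspace(0.0, 1.5, 30), np.linspace(0.0, 1.5, 30))
X_grid = [np.array([x, y]) for x, y in zip(xx.flatten(), yy.flatten())]

# map each (padded/normalized) grid point to feature angles, as for the training data
features_grid = np.array([get_angles(x) for x in X_grid])

# classify every grid point with the trained parameters
predictions_grid = [variational_classifier(var, angles=f) for f in features_grid]

# reshape back onto the grid and draw the contour plot
Z = np.reshape(predictions_grid, xx.shape)
plt.contourf(xx, yy, Z, cmap="RdBu")
plt.show()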

Could you share how you’ve tried giving it a go? Seeing the approach could help spot what might be going wrong. Posting a minimal (non-)working example should suffice, and we can build on that from there. :slightly_smiling_face:

I personally believe that the contour plot at the end of the “variational classifier” demo is very important, so maybe you should use it in all demos :slight_smile:

Thanks for noting that, that’s definitely helpful to know! :slightly_smiling_face:

As for the generalization of the demo to multiple qubits, section 4 of the original paper has some pointers and descriptions. Were there specific questions that you’ve come across?


Suppose that after training we obtain the following params:

params
array([[-1.39102862,  1.55168641, -0.32320414],
       [ 0.16806471,  0.02874544,  0.75835448],
       [ 0.4688555 ,  0.23718656,  0.21633758]])

We first calculate the grid data points:

xx, yy = np.meshgrid(np.linspace(0.0, 1, 100), np.linspace(0.0, 1, 100)) 
X_grid = [np.array([x, y]) for x, y in zip(xx.flatten(), yy.flatten())]


Finally, we need to calculate predictions for the grid using

qcircuit(params, x=None, y=None)

So my question is: what inputs should I pass to x and y?

In the variational classifier demo they use:

predictions_grid = [variational_classifier(var, angles=f) for f in features_grid]
Z = np.reshape(predictions_grid, xx.shape)

I guess I must be doing something terribly wrong, since I have tried a lot of different inputs for x and y.

Thanks in advance!

Hey @NikSchet,

The following should work:

# assumes the demo's imports and its test() and state_labels are in scope
params = np.array([[ 0.53841959,  1.21036237, -0.08101526],
       [-0.33445488,  0.64181687, -0.59442591],
       [-2.29400846, -1.18534645,  0.32099704]])
# params = np.random.random((3, 3))

num = 20
xx, yy = np.meshgrid(np.linspace(-1, 1, num), np.linspace(-1, 1, num))

center = np.array([0, 0])
radius = np.sqrt(2 / np.pi)

zz = []       # true labels of the grid points
zz_pred = []  # predicted fidelities for the grid points

for i in range(num):
    for j in range(num):
        x = xx[i, j]
        y = yy[i, j]
        # true label: 1 inside the circle, 0 outside
        z = 1 if np.linalg.norm(np.array((x, y)) - center) < radius else 0

        # pad the 2D point to the three features the circuit expects
        r = test(params, [[x, y, 0]], [z], state_labels)
        zz_pred.append(r[1][0][0])

        zz.append(z)

zz = np.reshape(zz, (num, num))
zz_pred = np.reshape(zz_pred, (num, num))

plt.contourf(xx, yy, zz_pred)

Note that I didn’t write this tutorial, so things are a little unoptimized! Hopefully this can provide something for you to build off of.
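One optional extra (not from the demo, just a suggestion): since zz holds the true labels, you can overlay the true decision boundary on the predicted fidelities:

plt.contourf(xx, yy, zz_pred)
plt.colorbar()
# draw the true circle boundary (the 0/1 transition in zz) on top
plt.contour(xx, yy, zz, levels=[0.5], colors="k")
plt.show()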

Thanks!


Thank you very much, it worked!!

This code should work straightaway

# assumes params, X_test, predicted_test, test(), and state_labels
# from the demo / the training run above are in scope
num = 50
xx, yy = np.meshgrid(np.linspace(-1, 1, num), np.linspace(-1, 1, num))

center = np.array([0, 0])
radius = np.sqrt(2 / np.pi)

zz = []       # true labels of the grid points
zz_pred = []  # predicted fidelities for the grid points

for i in range(num):
    for j in range(num):
        x = xx[i, j]
        y = yy[i, j]
        # true label: 1 inside the circle, 0 outside
        z = 1 if np.linalg.norm(np.array((x, y)) - center) < radius else 0

        r = test(params, [[x, y, 0]], [z], state_labels)
        zz_pred.append(r[1][0][0])

        zz.append(z)

zz = np.reshape(zz, (num, num))
zz_pred = np.reshape(zz_pred, (num, num))

cnt = plt.contourf(xx, yy, zz_pred)
plt.colorbar(cnt, ticks=[0, 0.5, 1])

# overlay the test points, colored by their predicted class
plt.scatter(
    X_test[:, 0][predicted_test == 1],
    X_test[:, 1][predicted_test == 1],
    c="b",
    marker="o",
    edgecolors="k",
    label="predicted class 1 (test)",
)

plt.scatter(
    X_test[:, 0][predicted_test == 0],
    X_test[:, 1][predicted_test == 0],
    c="r",
    marker="o",
    edgecolors="k",
    label="predicted class 0 (test)",
)

plt.legend()
plt.show()

Thanks for sharing, NikSchet!

Hello again.

Somehow I get much better training and testing accuracy when I scale the training data from zero to np.pi.

So, I get much better accuracy when I use

X = minmax_scale(X, feature_range=(0, np.pi))

instead of

scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
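In case it helps, here is a minimal sketch of the two scalings side by side on toy data (the random data below is only illustrative):

import numpy as np
from sklearn.preprocessing import minmax_scale, StandardScaler

X = np.random.uniform(-1, 1, (100, 2))  # toy data, just to compare output ranges

X_minmax = minmax_scale(X, feature_range=(0, np.pi))  # every feature ends up in [0, pi]
X_std = StandardScaler().fit_transform(X)             # zero mean, unit variance per feature

print(X_minmax.min(), X_minmax.max())         # ~0.0, ~3.14
print(X_std.mean(axis=0), X_std.std(axis=0))  # ~[0, 0], ~[1, 1]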

Hi @NikSchet. Could you elaborate a bit on exactly what you’re doing, and perhaps share a minimal code example? It’s difficult to say why you’re getting better accuracy when scaling the training data; there could be several reasons. Often, normalizing and/or centering the data can make learning easier, e.g. by helping with potential instability or precision issues.

Hello, I am happy to share code, even though I am not sure it makes sense. The difference between my code and the PennyLane code is the scaling: my scaling of the training data goes from 0 to π. My main question is whether there is a fundamental reason why certain scalings lead to better accuracy. If so, why not scale everything to positive values? In my limited experience, scaling datasets around zero (which implies negative values) results in lower metrics (accuracy, AUROC, etc.).

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import minmax_scale

# x, df, and test_size are defined earlier in my script
X = x
y = df["class"].values

# scale every feature into [0, pi]
Xnorm = minmax_scale(X, feature_range=(0, np.pi))
X_train, X_test, y_train, y_test = train_test_split(
    Xnorm, y, test_size=test_size, random_state=1
)

# pad the inputs with a zero third feature, as the circuit expects
X_train = np.c_[X_train, np.zeros((len(X_train), 1))]
X_test = np.c_[X_test, np.zeros((len(X_test), 1))]

# binarize the labels: 0 if the class value is <= 0, else 1
y_train = np.where(y_train <= 0, 0, 1)
y_test = np.where(y_test <= 0, 0, 1)

Hey @NikSchet,

Interesting find, and maybe not so surprising if you think about it: Pauli rotations (which is how your inputs enter the circuit, if I see correctly) are 4\pi-periodic in the input. This means that the quantum circuit cannot distinguish between x and x + 4\pi.
To see this, check the mathematical definition: R(x) = e^{-i\frac{x}{2} \sigma}, where \sigma is a Pauli operator. You then get e^{-i\frac{x + 4\pi}{2} \sigma} = e^{-i\frac{x}{2} \sigma} e^{-i 2\pi \sigma} = e^{-i\frac{x}{2} \sigma}, since e^{-i 2\pi \sigma} = \cos(2\pi)\mathbb{1} - i \sin(2\pi)\sigma = \mathbb{1} (recall \sigma^2 = \mathbb{1}).
As a result, it is important to scale your inputs to lie in [0, 4\pi] (or any other interval of that length).

However, it may be beneficial to restrict yourself even further to [0, 2\pi], since e^{-i\frac{x + 2\pi}{2} \sigma} = e^{-i\frac{x}{2} \sigma} e^{-i\pi \sigma} = -e^{-i\frac{x}{2} \sigma} (here e^{-i\pi \sigma} = \cos(\pi)\mathbb{1} - i \sin(\pi)\sigma = -\mathbb{1}). This effectively means that the second half of the [0, 4\pi] interval just multiplies the gate by a sign, which restricts what the ML model can learn.
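Both identities are easy to verify numerically; here is a small NumPy sketch, using \sigma = X as an example:

import numpy as np

X = np.array([[0, 1], [1, 0]])  # Pauli-X as an example sigma

def R(x):
    # R(x) = exp(-i x/2 sigma) = cos(x/2) * 1 - i sin(x/2) * sigma, since sigma^2 = 1
    return np.cos(x / 2) * np.eye(2) - 1j * np.sin(x / 2) * X

x = 0.7
print(np.allclose(R(x + 4 * np.pi), R(x)))   # True: 4pi-periodic
print(np.allclose(R(x + 2 * np.pi), -R(x)))  # True: a 2pi shift only flips the sign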

I am not sure whether there are further benefits to restricting to [0, \pi]. Looking at the trigonometric functions involved, this subselects an almost linear part of the cosine and sine, which means your model is a lot less nonlinear in the data than with a wider scale; that could be beneficial for a given task. Maybe worth trying X = minmax_scale(X, feature_range=(0, 2*np.pi)) to see if it performs equally well? Would be an interesting research question!
