Data-reuploading classifier: predictions grid

Dear all.

I am using the demo “Data-reuploading classifier” to play around with different datasets.

I have failed to reproduce the contour plot shown in the “variational classifier” demo, because I get many errors when I try to produce the predictions grid.

If I understand correctly, in order to evaluate the predictions grid you need to:

1. Load params (the array holding the last trained values of qcircuit) into the test function, as shown:

predicted_test, fidel_test = test(params, x, y, state_labels)

2. Produce x and y as np.arrays (this is where I fail, because x has a weird structure).

3. Plot the predicted_test values over x and y.

Can someone provide more details? I personally believe that the contour plot at the end of the “variational classifier” demo is very important, so maybe you should use it in all demos :slight_smile:

P.S. We are trying to generalize this demo to 2 qubits; any insights?

Thanks in advance.

Hi @NikSchet!

In the variational classifier demo, the points of the grid are regarded as data points and are fed into the variational classifier together with the bias and gate parameters obtained after training (stored as var). The classification results are then used to create the contour plot.
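Roughly, the relevant steps from that demo look like this (a sketch that assumes the demo’s get_angles, variational_classifier, and trained var are in scope, and that the grid points get the same preprocessing as the training data):

import numpy as np
import matplotlib.pyplot as plt

# treat every point of a grid over the input plane as a data point
xx, yy = np.meshgrid(np.linspace(0.0, 1.5, 30), np.linspace(0.0, 1.5, 30))
X_grid = [np.array([x, y]) for x, y in zip(xx.flatten(), yy.flatten())]

# map each (padded/normalized) grid point to feature angles, as for the training data
features_grid = np.array([get_angles(x) for x in X_grid])

# classify every grid point with the trained parameters
predictions_grid = [variational_classifier(var, angles=f) for f in features_grid]

# reshape back onto the grid and draw the contour plot
Z = np.reshape(predictions_grid, xx.shape)
plt.contourf(xx, yy, Z, cmap="RdBu")
plt.show()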

Could you share how you’ve tried giving it a go? Seeing the approach could help spot what might be going wrong. Posting a minimal (non-)working example should suffice, and we can build on that from there. :slightly_smiling_face:

I personally believe that the contour plot at the end of the “variational classifier” demo is very important, so maybe you should use it in all demos :slight_smile:

Thanks for noting that, that’s definitely helpful to know! :slightly_smiling_face:

As for the generalization of the demo to multiple qubits, section 4 of the original paper has some pointers and descriptions. Were there specific questions that you’ve come across?


Suppose that after training we obtain the following params:

params
array([[-1.39102862,  1.55168641, -0.32320414],
       [ 0.16806471,  0.02874544,  0.75835448],
       [ 0.4688555 ,  0.23718656,  0.21633758]])

We first calculate the grid data points:

xx, yy = np.meshgrid(np.linspace(0.0, 1, 100), np.linspace(0.0, 1, 100)) 
X_grid = [np.array([x, y]) for x, y in zip(xx.flatten(), yy.flatten())]


Finally, we need to calculate predictions for the grid using

qcircuit(params, x=None, y=None)

So my question is: what inputs should I pass to x and y?

In the variational classifier demo they use:

predictions_grid = [variational_classifier(var, angles=f) for f in features_grid]
Z = np.reshape(predictions_grid, xx.shape)

I guess I must be doing something terribly wrong, since I have tried a lot of different inputs for x and y.

Thanks in advance!

Hey @NikSchet,

The following should work:

# assumes the demo's imports and its test() and state_labels are in scope
params = np.array([[ 0.53841959,  1.21036237, -0.08101526],
       [-0.33445488,  0.64181687, -0.59442591],
       [-2.29400846, -1.18534645,  0.32099704]])
# params = np.random.random((3, 3))

num = 20
xx, yy = np.meshgrid(np.linspace(-1, 1, num), np.linspace(-1, 1, num))

center = np.array([0, 0])
radius = np.sqrt(2 / np.pi)

zz = []       # true labels of the grid points
zz_pred = []  # predicted fidelities for the grid points

for i in range(num):
    for j in range(num):
        x = xx[i, j]
        y = yy[i, j]
        # true label: 1 inside the circle, 0 outside
        z = 1 if np.linalg.norm(np.array((x, y)) - center) < radius else 0

        # pad the 2D point to the three features the circuit expects
        r = test(params, [[x, y, 0]], [z], state_labels)
        zz_pred.append(r[1][0][0])

        zz.append(z)

zz = np.reshape(zz, (num, num))
zz_pred = np.reshape(zz_pred, (num, num))

plt.contourf(xx, yy, zz_pred)

Note that I didn’t write this tutorial, so things are a little unoptimized! Hopefully this can provide something for you to build off of.
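One optional extra (not from the demo, just a suggestion): since zz holds the true labels, you can overlay the true decision boundary on the predicted fidelities:

plt.contourf(xx, yy, zz_pred)
plt.colorbar()
# draw the true circle boundary (the 0/1 transition in zz) on top
plt.contour(xx, yy, zz, levels=[0.5], colors="k")
plt.show()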

Thanks!


Thank you very much, it worked!!

This code should work straightaway

# assumes params, X_test, predicted_test, test(), and state_labels
# from the demo / the training run above are in scope
num = 50
xx, yy = np.meshgrid(np.linspace(-1, 1, num), np.linspace(-1, 1, num))

center = np.array([0, 0])
radius = np.sqrt(2 / np.pi)

zz = []       # true labels of the grid points
zz_pred = []  # predicted fidelities for the grid points

for i in range(num):
    for j in range(num):
        x = xx[i, j]
        y = yy[i, j]
        # true label: 1 inside the circle, 0 outside
        z = 1 if np.linalg.norm(np.array((x, y)) - center) < radius else 0

        r = test(params, [[x, y, 0]], [z], state_labels)
        zz_pred.append(r[1][0][0])

        zz.append(z)

zz = np.reshape(zz, (num, num))
zz_pred = np.reshape(zz_pred, (num, num))

cnt = plt.contourf(xx, yy, zz_pred)
plt.colorbar(cnt, ticks=[0, 0.5, 1])

# overlay the test points, colored by their predicted class
plt.scatter(
    X_test[:, 0][predicted_test == 1],
    X_test[:, 1][predicted_test == 1],
    c="b",
    marker="o",
    edgecolors="k",
    label="predicted class 1 (test)",
)

plt.scatter(
    X_test[:, 0][predicted_test == 0],
    X_test[:, 1][predicted_test == 0],
    c="r",
    marker="o",
    edgecolors="k",
    label="predicted class 0 (test)",
)

plt.legend()
plt.show()

Thanks for sharing, NikSchet!

Hello again.

Somehow I get much better training and testing accuracy when I scale the training data from zero to np.pi.

So, I get much better accuracy when I use

X = minmax_scale(X, feature_range=(0, np.pi))

instead of

scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
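In case it helps, here is a minimal sketch of the two scalings side by side on toy data (the random data below is only illustrative):

import numpy as np
from sklearn.preprocessing import minmax_scale, StandardScaler

X = np.random.uniform(-1, 1, (100, 2))  # toy data, just to compare output ranges

X_minmax = minmax_scale(X, feature_range=(0, np.pi))  # every feature ends up in [0, pi]
X_std = StandardScaler().fit_transform(X)             # zero mean, unit variance per feature

print(X_minmax.min(), X_minmax.max())         # ~0.0, ~3.14
print(X_std.mean(axis=0), X_std.std(axis=0))  # ~[0, 0], ~[1, 1]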

Hi @NikSchet. Could you elaborate a bit on exactly what you’re doing, and perhaps share a minimal code example? It’s difficult to say why you’re getting better accuracy when scaling the training data; there could be several reasons. Often, normalizing and/or centering the data can make learning easier, e.g. by helping with potential instability or precision issues.

Hello, I am happy to share code, even though I am not sure it makes sense. The difference between my code and the PennyLane code is the scaling: my scaling of the training data goes from 0 to π. My main question is whether there is a fundamental reason why certain scalings lead to better accuracy. If so, why not scale everything to positive values? In my limited experience, scaling datasets around zero (which implies negative values) results in lower metrics (accuracy, AUROC, etc.).

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import minmax_scale

# x, df, and test_size are defined earlier in my script
X = x
y = df["class"].values

# scale every feature into [0, pi]
Xnorm = minmax_scale(X, feature_range=(0, np.pi))
X_train, X_test, y_train, y_test = train_test_split(
    Xnorm, y, test_size=test_size, random_state=1
)

# pad the inputs with a zero third feature, as the circuit expects
X_train = np.c_[X_train, np.zeros((len(X_train), 1))]
X_test = np.c_[X_test, np.zeros((len(X_test), 1))]

# binarize the labels: 0 if the class value is <= 0, else 1
y_train = np.where(y_train <= 0, 0, 1)
y_test = np.where(y_test <= 0, 0, 1)

Hey @NikSchet,

Interesting find, and maybe not so surprising if you think about it: Pauli rotations (which is how your inputs enter the circuit, if I see correctly) are 4\pi-periodic in the input. This means that the quantum circuit cannot distinguish between x and x + 4\pi.
To see this, check the mathematical definition: R(x) = e^{-i\frac{x}{2} \sigma}, where \sigma is a Pauli operator. You then get e^{-i\frac{x + 4\pi}{2} \sigma} = e^{-i\frac{x}{2} \sigma} e^{-i 2\pi \sigma} = e^{-i\frac{x}{2} \sigma}, since e^{-i 2\pi \sigma} = \cos(2\pi)\mathbb{1} - i \sin(2\pi)\sigma = \mathbb{1} (recall \sigma^2 = \mathbb{1}).
As a result, it is important to scale your inputs to lie in [0, 4\pi] (or any other interval of that length).

However, it may be beneficial to restrict yourself even further to [0, 2\pi], since e^{-i\frac{x + 2\pi}{2} \sigma} = e^{-i\frac{x}{2} \sigma} e^{-i\pi \sigma} = -e^{-i\frac{x}{2} \sigma} (here e^{-i\pi \sigma} = \cos(\pi)\mathbb{1} - i \sin(\pi)\sigma = -\mathbb{1}). This effectively means that the second half of the [0, 4\pi] interval just multiplies the gate by a sign, which restricts what the ML model can learn.
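Both identities are easy to verify numerically; here is a small NumPy sketch, using \sigma = X as an example:

import numpy as np

X = np.array([[0, 1], [1, 0]])  # Pauli-X as an example sigma

def R(x):
    # R(x) = exp(-i x/2 sigma) = cos(x/2) * 1 - i sin(x/2) * sigma, since sigma^2 = 1
    return np.cos(x / 2) * np.eye(2) - 1j * np.sin(x / 2) * X

x = 0.7
print(np.allclose(R(x + 4 * np.pi), R(x)))   # True: 4pi-periodic
print(np.allclose(R(x + 2 * np.pi), -R(x)))  # True: a 2pi shift only flips the sign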

I am not sure whether there are further benefits to restricting to [0, \pi]. Looking at the trigonometric functions involved, this subselects an almost linear part of the cosine and sine, which means your model is a lot less nonlinear in the data than with a wider scale; that could be beneficial for a given task. Maybe worth trying X = minmax_scale(X, feature_range=(0, 2*np.pi)) to see if it performs equally well? Would be an interesting research question!
