Ensemble classification with Forest and Qiskit devices: Parameters

Dear all,

Can someone please explain to me how the pre-trained set of parameters (the params.npy file) was obtained?

If I understand correctly, those parameters have to do with the state preparation stage, in which we try to encode classical information into quantum states. In that case, did you use a function similar to get_angles(x) from the “variational classifier” demo to obtain params?

Thank you in advance

Hey @NikSchet,

The params were obtained by training the hybrid model. For example, let’s define:

def softmax_ensemble(params, x_point=None):
    results = qnodes(params, x=x_point)
    softmax = torch.nn.functional.softmax(results, dim=1)
    choice = torch.where(softmax == torch.max(softmax))[0][0]
    return softmax[choice]

This function evaluates both circuits, converts to softmax vectors, and then returns the “most confident” vector (that is, the one with the largest entry). For example:

n_layers = 2
# the first index is for the two models
params = torch.tensor(np.random.random((2, n_layers, n_wires, 3)), requires_grad=True)

print(softmax_ensemble(params, x_train[0]))

The above gives a vector such as [0.2624, 0.4455, 0.2921], denoting a 44.55% probability of class 1.
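As an aside, if you want the predicted class label rather than the probability vector, you can take the index of the largest entry:

pred_class = torch.argmax(softmax_ensemble(params, x_train[0]))
print(pred_class)  # e.g. tensor(1) for the vector above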

This can be compared to the one-hot encoded version of y_train:

import tensorflow as tf
y_soft = torch.tensor(tf.one_hot(y_train, n_classes).numpy(), requires_grad=True)

For example, y_soft[0] = [1, 0, 0].
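As a side note (my suggestion, not part of the original tutorial): if you’d rather avoid the TensorFlow dependency here, the same one-hot targets can be built with PyTorch alone, assuming y_train holds integer class labels:

import torch

labels = torch.tensor(y_train, dtype=torch.long)
y_soft = torch.nn.functional.one_hot(labels, num_classes=n_classes).double()

The targets don’t need gradients to train params, so requires_grad can simply be omitted.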

We then just need to define a cost function to compare the predicted vector with the target, for example:

def cost(params, y_point, x_point=None):
    return torch.sum(torch.abs(softmax_ensemble(params, x_point=x_point) - y_point))

This is a simple cost function that gives the absolute error.

Training is then as simple as:

opt = torch.optim.Adam([params], lr=0.1)

for epoch in range(3):
    for x_point, y_point in zip(x_train, y_soft):
        opt.zero_grad()
        c = cost(params, y_point=y_point, x_point=x_point)
        c.backward()
        opt.step()

The training could be improved with random selection of points from the training set, batching, and use of a different cost function.
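For illustration, here is a minimal sketch of two of those improvements (per-epoch shuffling and a negative log-likelihood cost). This is a variation on the loop above, not the code that produced params.npy:

import numpy as np
import torch

opt = torch.optim.Adam([params], lr=0.1)

for epoch in range(3):
    # visit the training points in a fresh random order each epoch
    for idx in np.random.permutation(len(x_train)):
        opt.zero_grad()
        probs = softmax_ensemble(params, x_point=x_train[idx])
        # negative log-likelihood of the true class (y_soft[idx] is one-hot)
        c = -torch.log(torch.sum(probs * y_soft[idx]) + 1e-8)
        c.backward()
        opt.step()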

All the code together (also requires other parts of the tutorial to be run):

import tensorflow as tf

n_layers = 2
# the first index is for the two models
params = torch.tensor(np.random.random((2, n_layers, n_wires, 3)), requires_grad=True)


def softmax_ensemble(params, x_point=None):
    results = qnodes(params, x=x_point)
    softmax = torch.nn.functional.softmax(results, dim=1)
    choice = torch.where(softmax == torch.max(softmax))[0][0]
    return softmax[choice]

def cost(params, y_point, x_point=None):
    return torch.sum(torch.abs(softmax_ensemble(params, x_point=x_point) - y_point))


y_soft = torch.tensor(tf.one_hot(y_train, n_classes).numpy(), requires_grad=True)

opt = torch.optim.Adam([params], lr=0.1)

for epoch in range(3):
    for x_point, y_point in zip(x_train, y_soft):
        opt.zero_grad()
        c = cost(params, y_point=y_point, x_point=x_point)
        c.backward()
        opt.step()


Thank you very much for the detailed answer!!

Hello, I have the same problem. In my case, how do I get this params.npy file? In the “Ensemble classification with Forest and Qiskit devices” tutorial the dataset has 2 features and 3 classes, whereas my dataset has 10 features and 2 classes, so I want to create the params.npy file for my own dataset.

I am now facing this error: IndexError: index 4 is out of bounds for axis 2 with size 4

Hi @Ubaid_Ullah, welcome to the forum!

Could you please share the code you’re using? Since you have a different size for your dataset you might have to change certain parts of the code.

Also, what version of the different libraries are you using?

n_features = 10
n_classes = 2
n_samples = 3042
cols = [22, 38, 41, 24, 35, 1, 40, 5, 4, 16]

sample_train = sample_train[:, [i - 1 for i in cols]]
sample_test = sample_test[:, [i - 1 for i in cols]]
sample_test.shape

n_wires = 10

dev0 = qml.device("forest.qvm", device="4q-qvm")
dev1 = qml.device("qiskit.aer", wires=10)
devs = [dev0, dev1]

def circuit0(params, x=None):
    for i in range(n_wires):
        qml.RX(x[i % n_features], wires=i)
        qml.Rot(*params[1, 0, i], wires=i)

    qml.CZ(wires=[0, 1])
    qml.CZ(wires=[1, 2])
    qml.CZ(wires=[2, 3])
    qml.CZ(wires=[3, 4])
    qml.CZ(wires=[4, 5])
    qml.CZ(wires=[5, 6])
    qml.CZ(wires=[6, 7])
    qml.CZ(wires=[7, 8])
    qml.CZ(wires=[8, 9])
    qml.CZ(wires=[9, 0])

    for i in range(n_wires):
        qml.Rot(*params[1, 1, i], wires=i)
    return qml.expval(qml.PauliZ(0)), qml.expval(qml.PauliZ(1)), qml.expval(qml.PauliZ(2))

def circuit1(params, x=None):
    for i in range(n_wires):
        qml.RX(x[i % n_features], wires=i)
        qml.Rot(*params[0, 0, i], wires=i)

    qml.CZ(wires=[0, 1])
    qml.CZ(wires=[1, 2])
    qml.CZ(wires=[2, 3])
    qml.CZ(wires=[3, 4])
    qml.CZ(wires=[4, 5])
    qml.CZ(wires=[5, 6])
    qml.CZ(wires=[6, 7])
    qml.CZ(wires=[7, 8])
    qml.CZ(wires=[8, 9])
    qml.CZ(wires=[9, 0])

    for i in range(n_wires):
        qml.Rot(*params[0, 1, i], wires=i)
    return qml.expval(qml.PauliZ(0)), qml.expval(qml.PauliZ(1)), qml.expval(qml.PauliZ(2))

qnodes = qml.QNodeCollection(
    [qml.QNode(circuit0, dev0, interface="torch"),
     qml.QNode(circuit1, dev1, interface="torch")]
)

params = np.load("ensemble_multi_qpu/params.npy")

print("Predicting on training dataset")
p_train, p_train_0, p_train_1, choices_train = predict(params, x=sample_train)
print("Predicting on test dataset")
p_test, p_test_0, p_test_1, choices_test = predict(params, x=sample_test)

I am using this code, with the following library versions:

Package Version


absl-py 1.0.0
appdirs 1.4.4
argon2-cffi 20.1.0
astunparse 1.6.3
async-generator 1.10
attrs 20.3.0
autograd 1.3
autoray 0.2.5
backcall 0.2.0
bleach 4.0.0
cachetools 4.2.4
certifi 2021.10.8
cffi 1.14.6
charset-normalizer 2.0.9
cloudpickle 2.0.0
cryptography 36.0.0
cvxpy 1.1.11
cycler 0.11.0
dask 2021.12.0
debugpy 1.5.1
decorator 5.1.0
defusedxml 0.7.1
dill 0.3.4
dlx 1.0.4
docker 5.0.3
docplex 2.22.213
ecos 2.0.8
entrypoints 0.3
fastdtw 0.3.4
flatbuffers 2.0
fonttools 4.28.3
fsspec 2021.11.1
future 0.18.2
gast 0.4.0
google-auth 2.3.3
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
grpcio 1.42.0
h11 0.9.0
h5py 3.2.1
httpcore 0.11.1
httpx 0.15.5
idna 3.3
immutables 0.6
importlib-metadata 4.8.1
inflection 0.5.1
ipykernel 6.4.1
ipython 7.29.0
ipython-genutils 0.2.0
ipywidgets 7.6.5
iso8601 0.1.16
jax 0.2.26
jaxlib 0.1.75
jedi 0.18.0
Jinja2 3.0.2
joblib 1.1.0
jsonschema 3.2.0
jupyter 1.0.0
jupyter-client 7.0.6
jupyter-console 6.4.0
jupyter-core 4.9.1
jupyterlab-pygments 0.1.2
jupyterlab-widgets 1.0.0
keras 2.7.0
Keras-Preprocessing 1.1.2
kiwisolver 1.3.2
lark 0.11.3
libclang 12.0.0
llvmlite 0.37.0
locket 0.2.1
lxml 4.6.4
Markdown 3.3.6
MarkupSafe 2.0.1
matplotlib 3.5.0
matplotlib-inline 0.1.2
mistune 0.8.4
more-itertools 8.12.0
mpmath 1.2.1
msgpack 0.6.2
multitasking 0.0.10
nbclient 0.5.3
nbconvert 6.1.0
nbformat 5.1.3
nest-asyncio 1.5.1
networkx 2.6.3
ninja 1.10.2.3
notebook 6.4.6
ntlm-auth 1.5.0
numba 0.54.1
numpy 1.20.3
oauthlib 3.1.1
opt-einsum 3.3.0
osqp 0.6.2.post0
packaging 21.3
pandas 1.3.4
pandocfilters 1.4.3
parso 0.8.2
partd 1.2.0
pbr 5.8.0
PennyLane 0.20.0
PennyLane-Forest 0.20.0
PennyLane-Lightning 0.20.1
PennyLane-qiskit 0.20.0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.4.0
pip 21.2.4
ply 3.11
prometheus-client 0.12.0
prompt-toolkit 3.0.20
protobuf 3.19.1
psutil 5.8.0
ptyprocess 0.7.0
py 1.11.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycparser 2.21
pydantic 1.8.2
Pygments 2.10.0
PyJWT 1.7.1
pylatexenc 2.10
pyparsing 3.0.4
pyquil 2.28.2
pyrsistent 0.18.0
python-constraint 1.4.0
python-dateutil 2.8.2
python-rapidjson 1.5
pytz 2021.3
PyYAML 6.0
pyzmq 22.3.0
qcs-api-client 0.8.0
qdldl 0.1.5.post0
qiskit 0.34.0
qiskit-aer 0.10.1
qiskit-aer-gpu 0.9.1
qiskit-aqua 0.9.5
qiskit-ibmq-provider 0.18.3
qiskit-ignis 0.7.0
qiskit-machine-learning 0.2.1
qiskit-terra 0.19.1
qtconsole 5.1.1
QtPy 1.10.0
Quandl 3.7.0
requests 2.26.0
requests-ntlm 1.1.0
requests-oauthlib 1.3.0
retry 0.9.2
retrying 1.3.3
retworkx 0.10.2
rfc3339 6.2
rfc3986 1.5.0
rpcq 3.9.2
rsa 4.8
ruamel.yaml 0.17.17
ruamel.yaml.clib 0.2.6
scikit-learn 1.0.1
scipy 1.7.3
scs 2.1.4
semantic-version 2.6.0
Send2Trash 1.8.0
setuptools 58.0.4
setuptools-scm 6.3.2
sip 4.19.13
six 1.16.0
sniffio 1.2.0
sparse 0.13.0
stevedore 3.5.0
symengine 0.8.1
sympy 1.9
tensorboard 2.7.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.0
tensorflow 2.7.0
tensorflow-estimator 2.7.0
tensorflow-io-gcs-filesystem 0.23.0
termcolor 1.1.0
terminado 0.9.4
testpath 0.5.0
threadpoolctl 3.0.0
toml 0.10.2
tomli 1.2.2
toolz 0.11.2
torch 1.10.0
torchvision 0.11.1
tornado 6.1
traitlets 5.1.1
tweedledum 1.1.1
typing_extensions 4.0.1
urllib3 1.26.7
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 1.2.3
Werkzeug 2.0.2
wheel 0.37.0
widgetsnbextension 3.5.1
wrapt 1.13.3
yfinance 0.1.67
zipp 3.6.0

Hi @Ubaid_Ullah, could you please share your dataset?

I’m trying to replicate your problem and this will make it much easier.
You can upload .txt and .py files here.

@CatalinaAlbornoz I converted my code and dataset into .txt files and tried to send them to you, but it seems new users cannot upload attachments. How can I send you my code and dataset?

Hi @Ubaid_Ullah!

Thank you for sharing your data. I finally made it work. Here’s what I found:

In the “Make predictions” section you had changed
params = torch.tensor(np.random.random((2, n_layers, n_wires, 3)), requires_grad=True)

to

params = torch.tensor(np.random.random((2, n_layers, n_wires, 10)), requires_grad=True)

However, if you look carefully at the circuits, you’ll notice that the rotations assume the last parameter dimension has size 3, not 10. This then raises an error:

qml.Rot(*params[1, 0, i], wires=i)

The next thing to take into account is that in the cost function you are subtracting two values.

return torch.sum(torch.abs(softmax_ensemble(params, x_point=x_point) - y_point))

Hence they must have the same shape. y_point comes from y_soft, and if you look carefully, y_soft depends on the number of classes:

y_soft = torch.tensor(tf.one_hot(label_train, n_classes).numpy(), requires_grad=True)

However, the number of classes does not affect the softmax part of the cost equation. The ideal solution would be to change the softmax logic, but you can also simply hardwire “3” instead of n_classes.
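For illustration, a minimal sketch of the “change the softmax logic” route (my suggestion, untested on your data) would be to have each circuit return one expectation value per class, so the softmax vector automatically matches the shape of y_point. This assumes n_wires and n_features are defined as in your code above:

n_classes = 2

def circuit0(params, x=None):
    for i in range(n_wires):
        qml.RX(x[i % n_features], wires=i)
        qml.Rot(*params[0, 0, i], wires=i)
    # ring of CZ entanglers, written as a loop for brevity
    for i in range(n_wires):
        qml.CZ(wires=[i, (i + 1) % n_wires])
    for i in range(n_wires):
        qml.Rot(*params[0, 1, i], wires=i)
    # one PauliZ expectation value per class, so the softmax vector
    # has the same length as the one-hot target y_point
    return [qml.expval(qml.PauliZ(i)) for i in range(n_classes)]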

These 2 changes should get you running.

Let me know if this works for you!

Hi, I haven’t quite understood what some of the indices in the params variable represent.

My questions for now are:

  1. Why is the value of n_layers 2?
  2. What is the fourth index (3) for? Is it for the classes? If so, since I have 2 classes, can the fourth index be changed to 2?

Hi @quirkyMouse,

You can change the number of layers to any number; this determines the depth of your circuits. You can try different values to see which one gives you better results. If the number of layers is too high, however, you may run into different kinds of issues.

The last value is 3 because we use qml.Rot in the circuits. This function takes 3 parameters, so we need to specify this when we create the parameters tensor.
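For reference, here is a tiny self-contained example of qml.Rot, which implements a general single-qubit rotation parametrized by three Euler angles; this is why the trailing dimension of params must be 3:

import pennylane as qml

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def rot_example(angles):
    # qml.Rot takes three angles per call
    qml.Rot(angles[0], angles[1], angles[2], wires=0)
    return qml.expval(qml.PauliZ(0))

print(rot_example([0.1, 0.2, 0.3]))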

I hope this answers your question!

Thanks for the explanation @CatalinaAlbornoz,

Can I change the value of n_layers without changing the quantum circuit (still using the circuit from the tutorial)? Or will it affect the accuracy?

Also, I’m experimenting on a two-class dataset. I already tried changing the learning rate, and it does improve the accuracy:
opt = torch.optim.Adam([params], lr=0.1)
What else should I change from the tutorial to improve the accuracy on my two-class dataset?

Hi @quirkyMouse,
It’s very possible that changing n_layers will change your accuracy. Changing your ansatz and your initial parameters will surely affect it too, but it’s hard to know exactly what needs to be done to improve it; I guess this is a question for you to answer! One way to explore is sketched below. If you do get the accuracy to improve, it would be great if you could share your work here or as a Community demo. :smiley:
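For what it’s worth, here is a minimal sketch of how such an experiment could be organized. It assumes you wrap the tutorial’s training and evaluation in a hypothetical train_and_score(n_layers, lr) helper that returns validation accuracy:

best = None
for n_layers in [1, 2, 3]:
    for lr in [0.01, 0.05, 0.1]:
        # train_and_score is a hypothetical helper wrapping the
        # tutorial's training loop; it returns validation accuracy
        acc = train_and_score(n_layers=n_layers, lr=lr)
        if best is None or acc > best[0]:
            best = (acc, n_layers, lr)

print("best (accuracy, n_layers, lr):", best)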
