Quantum transfer learning code (Mari et al., 2019) - IBMQDevice endless execution

I am curious to understand why the IBMQDevice keeps executing the code endlessly.

I am referring to the Quantum transfer learning code (ants vs. bees classification) from Mari et al., 2019 (paper: https://arxiv.org/pdf/1912.08278.pdf; code: https://pennylane.ai/qml/app/tutorial_quantum_transfer_learning.html).

Though I specify number_of_epochs = 1 and shots=1 for the 'ibmqx2' device, the execution seems endless on the quantum hardware.

In addition, I am unable to track the training progress on my system (this only worked with the simulator).

Am I missing some code segment to get back the results from the quantum hardware?

At present I have only replaced the 'default.qubit' device with the actual IBM machine, i.e.

From: p_device = qml.device("default.qubit", wires=n_qubits)

To: p_device = IBMQDevice(wires=n_qubits, backend="ibmqx2", shots=1)
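For completeness, the full swap I made looks like this (a minimal sketch; it assumes the PennyLane-Qiskit plugin is installed and my IBMQ account is already enabled):

import pennylane as qml
from pennylane_qiskit import IBMQDevice

n_qubits = 4

# Tutorial's simulator device:
# p_device = qml.device("default.qubit", wires=n_qubits)

# Hardware device used instead:
p_device = IBMQDevice(wires=n_qubits, backend="ibmqx2", shots=1)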

I am a novice in quantum computing, so please pardon my query. I assumed that 1 shot = 1 run on the quantum hardware.

On checking the IBMQ account, I see the number of shots = 1 displayed correctly, yet the code seems to execute many runs, i.e. I could see 415 results (with status: COMPLETED) displayed for that particular job.

Does that have any connection with the stability of the actual quantum device? Or are the learning rate and its decay responsible for this?

Further details:

The following piece of code seems to execute endlessly:

model_hybrid = train_model(
model_hybrid, loss_function, optimizer_hybrid, exp_lr_scheduler, num_epochs=num_epochs
)

Could you please help me understand the same?

Thank you.

Hi @angelinaG, thanks for your post!
Using a real device can take a long time, especially if the device is busy (long queues of IBM users), so what you are reporting is probably normal. However, I would like to give you some tips that could be useful:

  1. In our paper we first trained the model with a simulator and then only executed it (with fixed parameters) on a real device. This is much easier than training on hardware. This is the code that we used: https://github.com/XanaduAI/quantum-transfer-learning/tree/master/quantum_processors
  2. You could first try to use the IBM cloud simulator, just to check that everything runs smoothly with your settings of the IBM plugin. If I remember correctly, this can be done by just replacing the keyword ibmqx2 with qasm_simulator (see the sketch just after this list).
  3. I think it is normal that you see many jobs even if shots=1. Even with a single shot, the number of jobs is at least equal to the number of expectation values that you need to compute. If you classify many input images or train many parameters, you need to evaluate many expectation values. So this looks normal.
  4. You linked the transfer learning tutorial; however, if you are interested in reproducing the results of the paper, you may find the actual quantum transfer learning repository more useful: https://github.com/XanaduAI/quantum-transfer-learning
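To make tip 2 concrete, here is a minimal sketch (my assumption: through the PennyLane-Qiskit plugin the cloud simulator is exposed under the backend name ibmq_qasm_simulator; substitute whatever name your provider lists):

import pennylane as qml

n_qubits = 4
# Same plugin device, but pointed at IBM's cloud simulator instead of hardware:
dev = qml.device("qiskit.ibmq", wires=n_qubits, backend="ibmq_qasm_simulator", shots=1024)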

Thank you so much Dr. Andrea Mari for this explanation.
The link https://github.com/XanaduAI/quantum-transfer-learning/tree/master/quantum_processors is what I was looking for.
I shall try that out.
Also, I missed mentioning that I was able to reproduce your results using the simulator on PennyLane.
Since I was trying to train the model on the quantum hardware, it seemed like an endless execution. I understand that I can instead use the saved weights obtained by training on the simulator, which will save time.
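For reference, here is a minimal sketch of that workflow (PyTorch; the filename quantum_weights.pt matches the repo's run_on_QPU.py script):

import torch

# After training model_hybrid on the simulator:
torch.save(model_hybrid.state_dict(), "quantum_weights.pt")

# Later, with the PennyLane device switched to the IBMQ backend,
# restore the trained weights and run forward passes only:
model_hybrid.load_state_dict(torch.load("quantum_weights.pt"))
model_hybrid.eval()  # inference only; no gradient jobs on the QPU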

I also wish to acknowledge Dr. Maria Schuld for her timely inputs and direction to reach out to you! I shall keep you posted with the results of the experiment.

Thank you once again!

I was able to successfully execute the code on the IBMQ machine.


I was waiting for this moment to announce that, with the prompt responses received from the Xanadu-PennyLane team, special mention to @Maria_Schuld, @andreamari, and @josh (for liking the post), I was able to publish the results of this experiment in the International Journal of Quantum Information. Link to the paper: https://www.worldscientific.com/doi/10.1142/S0219749920500240
Thank you so much for all the support.


Congrats @angelinaG! :slightly_smiling_face:


Thank you so much Josh :smiley:

Hey, I was wondering what it would take to train this Transfer Learning code on IBM’s QCs?

From my understanding, it would currently take >512 jobs to calculate the gradients (due to the 512 x N_Qubits fully connected layer before the qnet).


Do you think freezing that layer would allow us to calculate gradients in a reasonable time? Perhaps initializing those weights through a classical pretraining routine.

Also, are there QCs out there that can run that many jobs (>10000) in a reasonable time? Like, through a paid/corporate IBM quantum computer?
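For clarity, by freezing I mean something like this (a PyTorch sketch; pre_net stands in for that 512 x N_Qubits layer):

import torch.nn as nn

pre_net = nn.Linear(512, 4)  # the 512 -> n_qubits layer in question
for p in pre_net.parameters():
    p.requires_grad = False  # keep these weights fixed during training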

Hi @Jerry2001Qu,
Well, the code that you are referring to is part of the dressed quantum circuit. In the dressed quantum circuit we have a hybrid classical-quantum-classical connection, i.e. a classical pre-processing layer, a quantum network, and a classical post-processing layer. What you observe as the nn.Linear layers (PyTorch implementation) are actually classical layers running on the classical system itself. Here, the 512 output neurons (or connections) of the layer before ResNet18's final classification layer are passed to a classical linear layer, nn.Linear(512, nqubits), which outputs 4 values (rotation angles for the input qubits).
The next layer is the actual quantum network, which accepts only 4 inputs and produces 4 outputs. The 4 classical values (obtained from the previous classical layer) are embedded in the quantum circuit by applying a Hadamard gate and single-qubit rotations. After embedding them, the variational circuit (comprising 6 layers) operates on the qubits, and finally the output is measured and passed on to the classical post-processing layer of the dressed quantum circuit.
So, the last nn.Linear layer accepts the 4 measured qubit outputs (Pauli-Z expectation values recorded in the classical register) and produces the 2 output values (nn.Linear(4, 2) = nn.Linear(nqubits, nclasses)) for classifying ants vs. bees, as in the illustrated example by Mari et al. (2019). For further details, you can also refer to my paper mentioned on the forum to see how we applied Mari et al.'s method to detecting image splicing forgery.
In short, essentially only 4 qubit states are being processed iteratively on the quantum simulator and the processor.
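As a minimal sketch of the layer shapes described above (PyTorch; the 4-qubit variational circuit in the middle is elided):

import torch.nn as nn

n_qubits, n_classes = 4, 2
pre_net = nn.Linear(512, n_qubits)         # classical pre-processing: 512 ResNet18 features -> 4 angles
# ... 4-qubit variational circuit here: Hadamard + RY embedding,
#     entangling layers, Pauli-Z expectation values ...
post_net = nn.Linear(n_qubits, n_classes)  # classical post-processing: 4 expectations -> 2 class scores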
Hope this helps. Thank you.

Then, why does it take >100 jobs to calculate the gradient for a single training example? If it were just calculating parameter shifts for the RY gates, that'd be around N_Qubits x Q_Depth x 2 jobs, from my understanding. However, even when I strip the model down to 2 qubits and a depth of 3, it still goes over 100 jobs for a single training example (batch size 1).

Hi @Jerry2001Qu, and thanks @angelinaG for your answer!

We recently introduced a new attribute on the device, dev.num_executions, which makes it easy to track the number of device executions. You could check this on a simulator before trying to run on hardware. This feature can be accessed by installing the development version of PennyLane.

For example, the following shows a benchmark of the number of device executions on a 4-qubit, 6-layer circuit:

import pennylane as qml
import torch

nqubits = 4
nlayers = 6
dev = qml.device("default.qubit", wires=nqubits)


@qml.qnode(dev, interface="torch")
def qcircuit(inputs, weights):
    for i in range(nqubits):
        qml.Hadamard(wires=i)
        qml.RY(inputs[i], wires=i)
    qml.templates.BasicEntanglerLayers(weights, wires=range(nqubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(nqubits)]


weight_shapes = {"weights": (nlayers, nqubits)}
inputs = torch.ones(nqubits, requires_grad=True)

qlayer = qml.qnn.TorchLayer(qcircuit, weight_shapes)

out = torch.sum(qlayer(inputs))
out.backward()

print(f"Number of executions: {dev.num_executions}")

n_exec_basic = nqubits * nlayers * 2
n_ry = nqubits * 2
n_expected = n_ry + n_exec_basic + 1  # the 1 comes from the forward pass

print(f"Expected number of executions: {dev.num_executions}")

The result is 57 device executions. We can also look at the dressed quantum circuit:

clayer1 = torch.nn.Linear(512, 4)
clayer2 = torch.nn.Linear(4, 2)

hybrid = torch.nn.Sequential(clayer1, qlayer, clayer2)
inputs = torch.ones(512, requires_grad=True)

dev._num_executions = 0  # reset the device's execution counter

out = torch.sum(hybrid(inputs))
out.backward()

print(f"Number of executions: {dev.num_executions}")

This also gives 57 executions, so it doesn’t look like the hybrid element is increasing things (as expected).

In terms of training on IBMQ, we had a discussion on improving performance in another thread. I would say that this is quite a heavy task for optimization on hardware right now. One thing you could consider is training on a simulator and testing (i.e., forward passes, which are much cheaper) on hardware.


Awesome! Thanks for the help @Tom_Bromley

May I ask, what is n_ry?

I think I understand n_exec_basic (to do param shift for the BasicEntanglerLayers I assume).

But I don't understand where the executions for n_ry come from.

Yes, n_exec_basic is equal to the number of gates in the BasicEntanglerLayers multiplied by 2.

However, we also want to differentiate with respect to the RY gates, which are used to input the data. We hence have to do a forward and backward shift (for the parameter-shift rule) to evaluate the gradient with respect to the input parameters, which is something we need when the quantum circuit is placed within a larger hybrid model. The n_ry term is just the number of circuit executions needed to find the gradient with respect to the inputs: # of RY gates (= # of qubits = 4) * 2 (for the parameter-shift rule).
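Putting the two terms together gives a quick check of the 57 executions reported above:

n_qubits, n_layers = 4, 6
n_exec_basic = n_qubits * n_layers * 2  # parameter-shift pairs for the entangler rotations
n_ry = n_qubits * 2                     # parameter-shift pairs for the RY input gates
print(n_exec_basic + n_ry + 1)          # 57; the +1 is the single forward pass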


Thank you @Tom_Bromley. I shall check this feature and revert in case of further queries.

Hi, I am having an issue getting "https://github.com/XanaduAI/quantum-transfer-learning/blob/master/quantum_processors/run_on_QPU.py" to run on IBM Quantum hardware. I am not sure what I am doing wrong. My only modification to the code is where the connection to the IBM Q network takes place:

IBMQ.load_account()
provider = IBMQ.get_provider(hub="HUB", group="GROUP", project="PROJECT")
dev = qml.device("qiskit.ibmq", wires=n_wires, backend="ibmq_bogota", provider=provider, shots=100)

It never runs on the quantum hardware. I get results in the notebook but it never executes. What am I doing wrong? I am correctly loading quantum_weights.pt.

Hey @dancbeaulieu!

My only modification to the code is where the connection to the IBM Q network takes place

Yes the code you’ve added looks great!

It never runs on the quantum hardware. I get results in the notebook but it never executes.

To help me understand, is your issue that:

  1. The whole script runs but you can’t see the jobs executed when you visit the jobs board on the IBMQ website?
  2. The script runs up to a point but then exits out with an error?
  3. The script hangs/takes forever to run?

There are no errors, but when I get to this part, the script runs 1/39 iterations regardless of the number of epochs I input. Also, even though I am invoking the IBM quantum hardware, it never appears to run on the IBM hardware. I am in a position where it runs and I get no errors, but it doesn't seem to be running on any IBM Quantum hardware. Can you give me any advice about what I can do to get it to actually run on IBM Quantum hardware?

The following print statements produce no output:
print(
    "Results of the model testing on a real quantum processor.",
    file=open("results_" + backend + ".txt", "w"),
)
print("QPU backend: " + backend, file=open("results_" + backend + ".txt", "a"))

And neither does the next set of print statements:

print("\nTest Loss: {:.4f} Test Acc: {:.4f} ".format(epoch_loss, epoch_acc))

# Log to file
print(
    "\nTest Loss: {:.4f} Test Acc: {:.4f} ".format(epoch_loss, epoch_acc),
    file=open("results_" + backend + ".txt", "a"),
)

However, I do get this counter for iterations, which is oddly set at X/39, and I can't change the number of iterations. I am baffled.
Iter: 2/39

Hey @dancbeaulieu!

My recommendation would be to make a separate script that does a simple run on the quantum hardware and confirms everything is set up ok. You can then make sure the contents of the larger transfer learning script, where you load the device, match up with the script we know runs successfully.

You could try running the following to see if you get an output:

import pennylane as qml
from qiskit import IBMQ

IBMQ.load_account()
provider = IBMQ.get_provider(hub="HUB", group="GROUP", project="PROJECT")

backend = provider.backends()[0].name()
print(f"Running on backend {backend}")

dev = qml.device("qiskit.ibmq", backend=backend, wires=1, provider=provider, shots=2000)

@qml.qnode(dev)
def f(x):
    qml.RX(x, wires=0)
    return qml.expval(qml.PauliZ(0))

print(f(0.5))

We should hopefully see this code print an output result and be visible as a job on IBMQ.

Hi, the code you sent me runs on quantum hardware, as does some other simple code I ran on the actual quantum hardware "ibm_lagos", and it showed up in the IBM Q administration logs. It works; everything ran successfully. It's the transfer learning that I can't figure out; there are no errors, messages, or logs to analyze. So the issue is in the transfer learning code, and I'm not sure what's going wrong.

Output:
/Users/dabeaulieu/opt/anaconda3/envs/qnlp/lib/python3.8/site-packages/pennylane_qiskit/qiskit_device.py:315: UserWarning: verbose is not a recognized runtime option and may be ignored by the backend.
  self._current_job = self.backend.run(qcirc, shots=self.shots, **self.run_args)

0.809