Hi!
I am currently building a hybrid quantum-classical model in which the outputs of a variational circuit are fed into a neural network, from which a cost is calculated. The idea is to take a certain number of shots on the quantum computer, which produces a set of sampled binary bitstrings, use these as inputs to the neural network, and then compute the cost in a vectorized manner to obtain the gradients. I hope this schematic figure gives an overview of what I want to accomplish:
What I technically want to do is generate a certain number of samples from the circuit with the qml.sample() function on a device with, say, 100 shots, and use these essentially as a "batch" when performing "backpropagation" to update the variational parameters of the circuit. I have tried to use the sample function this way, but the gradient components are all zeros.
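For concreteness, the sample-returning QNode I have in mind looks roughly like the sketch below. This is only a sketch: the device, the shot count, and the name QAOAreturnSamples are placeholders, while QAOAcircuit and G come from the attached code.

import pennylane as qml

# Hypothetical sketch of the desired setup: a 100-shot device whose QNode returns
# raw Pauli-Z samples (one +/-1 outcome per wire per shot) instead of expectation values.
dev_samples = qml.device("default.qubit", wires=len(G.nodes), shots=100)

@qml.qnode(dev_samples, interface="torch", diff_method="parameter-shift")
def QAOAreturnSamples(gammas, betas, G):
    QAOAcircuit(gammas, betas, G)
    # One sample measurement per wire; over 100 shots this yields a batch of bitstrings.
    return [qml.sample(qml.PauliZ(wires=i)) for i in range(len(G.nodes))]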
A workaround I found was to define the cost with a for-loop that runs a "shots" number of times: in each iteration I sample from the circuit using a single-shot device, pass the result through the neural network, and average the cost over these single-shot instances. Sampling the single-shot device is essentially done by
def QAOAreturnPauliZExpectation(gammas, betas, G):
    QAOAcircuit(gammas, betas, G)
    return [qml.expval(qml.PauliZ(wires=i)) for i in range(len(G.nodes))]
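For reference, the corresponding single-shot QNode is wired up roughly as follows. This is a sketch under my assumptions (the torch interface and device name are inferred from how the attached code uses the QNode, not copied verbatim):

import pennylane as qml

# Single-shot device; each call to the QNode returns one +/-1 value per wire,
# i.e. effectively one sampled bitstring.
dev_single_shot = qml.device("default.qubit", wires=len(G.nodes), shots=1)

# The QNode that the customcost function below calls as qcircuit.
qcircuit = qml.QNode(QAOAreturnPauliZExpectation, dev_single_shot,
                     interface="torch", diff_method="parameter-shift")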
There are obvious reasons why this kind of workaround causes slow training. Firstly, the cost function has a for-loop in it (which I needed just to check whether my approach works; I'll see if I can somehow vectorize it later).
def customcost(gammas, betas, G, qcircuit, neuralNet, adjacencymatrix):
    cost = 0
    for i in range(100):
        x = (qcircuit(gammas, betas, G)).float()      # one shot of the circuit
        x = neuralNet(x)                              # pass it through the neural network
        cost += EvaluateCutValue(adjacencymatrix, x) / 100
    return cost                                       # returns a float
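For completeness, the batched behaviour I am after would look roughly like the sketch below (setting aside, for now, the zero-gradient issue mentioned above). It assumes the hypothetical QAOAreturnSamples QNode sketched earlier and that EvaluateCutValue works on a single bitstring, so it is applied row by row here.

import torch

# Hypothetical batched cost (sketch only): feed all sampled bitstrings through the
# network in one forward pass instead of looping over single shots.
def batchedcost(gammas, betas, G, neuralNet, adjacencymatrix):
    # Stack the per-wire sample arrays and transpose so each row is one sampled bitstring.
    per_wire = [torch.as_tensor(s, dtype=torch.float32)
                for s in QAOAreturnSamples(gammas, betas, G)]
    x = torch.stack(per_wire).T          # shape (shots, n_wires)
    x = neuralNet(x)                     # one forward pass for the whole batch
    costs = [EvaluateCutValue(adjacencymatrix, xi) for xi in x]
    return sum(costs) / len(costs)       # average cut value over the batch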
Here, qcircuit is a QNode on a single-shot device, built from the QAOAreturnPauliZExpectation function mentioned earlier.
Secondly, this approach hinges on running the entire circuit num_shots times and taking expval at the end on a single-shot device, when essentially the same behavior could be achieved with the qml.sample() function instead. Additionally, I use parameter-shift as the diff_method; I don't think I can get around that, since the approach is heavily shot-based. Note that I am not interested in defining a QNode that takes the Pauli-Z expectation value at each wire on an analytic device, since the resulting string is mostly zeros due to the symmetry properties of the MaxCut problem.
I am therefore curious whether there is a way to use the qml.sample() function as a batch, or something along those lines, to aid training, or whether there are other, smarter approaches that achieve the same behaviour on an analytic device.
Thanks!
AdressingSlowTraining.py (11.0 KB)
Here I have attached the code that I use. The training procedure where the variational circuit and neural network are combined happens in the code chunk between lines 219 and 260. In particular, the customcost function I defined is the bottleneck where I wish to implement the above logic of training on a batch of bitstring samples.