I read in a paper that certain optimization schemes can be improved by applying a sigmoid transformation f(H_c) to the cost Hamiltonian of the system,
where H_c is the cost Hamiltonian, E_\tau is the energy expectation value at the current optimization step, and \sigma_0 and \sigma_T are scalar parameters of the transformation that are not important for the purpose of my question. In this specific case, the cost Hamiltonian H_c is a sum of Pauli-Z strings and is thus diagonal in the computational basis. The idea is that applying the sigmoid “stretches” the energy spectrum around the current energy of the system, which can be beneficial for optimization problems where the goal is to find the ground state of H_c.
I would like to try out this idea in my own simulations, but I am not sure how to implement the transformation efficiently. At each optimization step of a VQE, I want to calculate the gradient and optimize the parameters of my ansatz \psi(\bar{\theta}) with respect to \langle f(H_c) \rangle rather than the original Hamiltonian \langle H_c \rangle. How can I incorporate this feature into a PennyLane optimization framework? Exactly calculating the eigenvalues of the Hamiltonian is not an option, since that would defeat the purpose of solving the optimization problem efficiently.
You could potentially implement a cost function that applies the sigmoid transformation, and use that in your optimization step.
For example, if you look at PennyLane’s demo on VQE, you’ll see that it follows this structure:
Build the Hamiltonian
Build a QNode
Build a cost function that takes only the parameters you want to optimize
Build an optimization routine that finds the gradients of the cost function
For what you want to do you could modify the cost function such that it doesn’t simply return the value of the circuit, but instead the value of the circuit after the transformation.
Here’s some pseudocode as an example:
def sigmoid(Hc, E_T):
    # Write the function for the sigmoid transformation
    return result

# Write a cost function that depends on the parameters you want to optimize
def cost_fn(params):
    # Calculate the output of your circuit
    E_T = circuit(params, wires=range(qubits))
    # Apply the sigmoid transformation
    cost = sigmoid(Hc, E_T)
    # Return the cost
    return cost
I’m not sure that this will work but it’s something you could try.
Thank you for your reply, and excuse me for my delayed response.
If I understand your suggestion correctly, you are proposing to apply a sigmoid function to the expectation value of the output of the circuit. However, this is not quite the result I am aiming for: I would like not the sigmoid of the expectation value, but rather the expectation value of the sigmoid-transformed Hamiltonian.
The Hamiltonian I am working with is diagonal in the computational basis, meaning that all measurements return eigenstates and eigenvalues of the Hamiltonian. In principle, it should therefore be possible to apply the sigmoid function to the sampled eigenvalues before taking the expectation value (when not calculating the gradient analytically, but with a finite number of shots). Would it be possible to implement a custom measurement in PennyLane with such functionality, i.e. a measurement which applies some function to the sampled energies before taking the expectation value?
In PennyLane you can perform operations on operators and add these within a circuit. So maybe what you could try is defining your Hamiltonian as an operator, and then use for example qml.exp() to generate your sigmoid as part of the circuit. You can see some examples on how to use qml.exp() in the documentation.
That being said, I don’t really know if this is feasible.
On the other hand, note that samples are stochastic, thus they cannot be differentiated. You can generate single-shot expectation values instead, which in practice is the same as taking a single sample.
My first thought was the same: I could try to modify the original Hamiltonian operator H_c using built-in functionality such as qml.exp() to perform the sigmoid. The problem is that the sigmoid transform requires an inverse, which is an operation that is not supported.
Regarding the samples being stochastic, I am not sure I see why this is an issue. I understand that analytic differentiation would not be possible in the protocol I propose as it relies on taking a finite number of shots, but hardware-compatible differentiation using the parameter-shift rule should still work fine, unless there is some flaw in my understanding. If I decide to apply a transformation to the sampled eigenvalues before taking the expectation values required for the parameter shift rule, I do not see how that would inhibit the differentiability.
For samples, even hardware-compatible differentiation wouldn’t be well defined. The code below shows how to obtain samples by using single-shot expectation values. You can then find gradients for this circuit.
import pennylane as qml
from pennylane import numpy as pnp

# Create a device with a single shot
dev = qml.device("default.qubit", wires=3, shots=1)

# Create the qnode returning an expval
@qml.qnode(dev)
def circ(x, y):
    qml.RX(x, wires=0)
    qml.RY(y, wires=1)
    return qml.expval(qml.Z(0))

# Compute the samples
n_samples = 5  # Number of samples
samples = [circ(1.0, 2.0) for _ in range(n_samples)]
print(samples)

# Print the gradient for both parameters
print(qml.grad(circ, argnum=[0, 1])(1.0, 2.0))
Regarding the inverse, that does pose a problem indeed. I’ll let you know if I find something you can do there but I can’t think of anything at the moment.
Thank you for your suggestion, and for continuing to help me look into the issue.
Regarding your last post, it is still not quite what I am looking for. The code you provided seems to calculate the sigmoid of the expectation value of the effective Hamiltonian H_{eff}, i.e. sigmoid(<H_{eff}>), whereas the paper I previously referred to calculates the expectation value of the sigmoid applied to the effective Hamiltonian, i.e. <sigmoid(H_{eff})>, where H_{eff} is known to be diagonal in the computational basis. In the paper, the authors explicitly say they “leverage the manifestly diagonal form of H” to be able to perform this sigmoid transformation, but they do not explicitly say how.
Furthermore, it seems they manage to perform this transformation both in simulation and when running on actual IonQ hardware (though of course they might not be using PennyLane). Since the paper reports state-of-the-art performance, it would be very interesting to see whether the results can be replicated.
I think in this case you might need to do some math. The inverse of a diagonal matrix is such that the elements of the diagonals are the reciprocals. So this might be why this works in this case. See here for example: Inverse of Diagonal Matrix - ProofWiki
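In other words, for a diagonal matrix both the inverse and the sigmoid itself act entrywise on the diagonal, which may be exactly the structure the authors are leveraging. A NumPy sketch with an arbitrary 3x3 diagonal H as a toy example:

```python
import numpy as np

# Toy diagonal Hamiltonian
H = np.diag([1.0, -1.0, 3.0])

# Inverse of a diagonal matrix: reciprocals on the diagonal
H_inv = np.diag(1.0 / np.diag(H))

# More generally, a matrix function of a diagonal matrix acts
# entrywise on the diagonal entries, e.g. the sigmoid:
sig_H = np.diag(1.0 / (1.0 + np.exp(-np.diag(H))))
```

So for a diagonal H the troublesome operator inverse in the sigmoid reduces to scalar reciprocals of the diagonal entries.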
I don’t know if you can code the sigmoid as a unitary gate, but if you manage to do so then you can use qml.adjoint to find the inverse (since the adjoint and inverse are identical for unitary gates, but not in general).
You could also take a look at the Intro to QSVT demo. It probably won’t help out of the box but it could give you some ideas.