Doubly Stochastic Gradient Descent

In the Doubly Stochastic Gradient Descent demo, the Hamiltonian was given as

H = 4 + 2 I⊗X + 4 I⊗Z − X⊗X + 5 Y⊗Y + 2 Z⊗X

And so the idea of “doubly” stochastic is that at each SGD iteration we only sample a subset of the Pauli terms from H and evaluate the expectations of those terms?

For instance, we could choose to sample 3 out of the 5 Pauli terms at each iteration. At the first iteration we might sample IX, XX, ZX; at the next step we might sample IX, XX, YY; and so on. Am I thinking of this right? If so, how many terms were sampled to get the results presented below?

Thanks!

@KAJ226, exactly right! The stochasticity comes from two sources:

  1. The finite number of shots
  2. The sampling of a random subset of the Hamiltonian terms

Hence ‘doubly stochastic’.
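
To make the two sources concrete, here is a rough sketch in PennyLane (not the demo’s exact code; the ansatz, shot count, and function names are placeholders):

import numpy as np
import pennylane as qml

# The 5 non-constant Pauli terms of H and their coefficients
# (the constant 4 is added back classically).
coeffs = [2, 4, -1, 5, 2]
obs = [
    qml.Identity(0) @ qml.PauliX(1),
    qml.Identity(0) @ qml.PauliZ(1),
    qml.PauliX(0) @ qml.PauliX(1),
    qml.PauliY(0) @ qml.PauliY(1),
    qml.PauliZ(0) @ qml.PauliX(1),
]

# Source 1 of stochasticity: a finite number of shots.
dev = qml.device("default.qubit", wires=2, shots=100)

def term_expval(params, observable):
    # Estimate <observable> with finite shots (placeholder ansatz).
    @qml.qnode(dev)
    def circuit():
        qml.RY(params[0], wires=0)
        qml.RY(params[1], wires=1)
        qml.CNOT(wires=[0, 1])
        return qml.expval(observable)
    return circuit()

def estimate_H(params, n=1):
    # Source 2 of stochasticity: sample n of the 5 terms uniformly.
    idx = np.random.choice(len(obs), size=n, replace=False)
    sampled = sum(coeffs[i] * term_expval(params, obs[i]) for i in idx)
    # Rescale by 5/n so the estimate is unbiased, then add the constant.
    return 4 + (len(obs) / n) * sampled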

In this particular example, the number of sampled Hamiltonian terms was set to 1, via n=1 in the snippet below:

def loss(params):
    # Constant term of H, plus the single sampled term's expectation
    # rescaled by 5/1 to keep the estimate of <H> unbiased.
    return 4 + (5 / 1) * circuit(params, n=1)
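
For context, this is roughly how such a loss is fed to an optimizer (a sketch; init_params stands in for the demo’s initial weights):

opt = qml.GradientDescentOptimizer(stepsize=0.1)
params = init_params  # placeholder for the demo's initial parameters
for it in range(100):
    params = opt.step(loss, params)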

I just have a question: what is this loss exactly? From what I understand, we sample terms from the Hamiltonian and look at the moving average to see if it converges. So what is this def loss(), and why is it returning 4 + (5 / 1) * circuit()? Shouldn’t it just be circuit()?

Hi @Kutubkhan_Bhatiya, this comes from the Hamiltonian, and the fact that we are only sampling 1 out of 5 terms:

H = 4 + 2 I⊗X + 4 I⊗Z − X⊗X + 5 Y⊗Y + 2 Z⊗X

Here, the 4 + comes from the constant 4 term in the Hamiltonian. The 5/1 comes from the fact that we are only sampling 1 term out of the remaining 5: the sampled term’s expectation has to be rescaled so that, on average, the loss still matches the full ⟨H⟩.
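
To spell out the reasoning: if the single sampled term (coefficient c_i, Pauli word P_i) is drawn uniformly from the 5 non-constant terms, the rescaled loss is an unbiased estimator of the full expectation value,

\mathbb{E}\left[4 + \tfrac{5}{1}\, c_i \langle P_i \rangle\right] = 4 + 5 \cdot \tfrac{1}{5} \sum_{i=1}^{5} c_i \langle P_i \rangle = \langle H \rangle.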

Oh okay, that makes a lot of sense. So if we sampled, let’s say, 2 out of 5 terms, would it be 5/2? Thanks for replying, I didn’t think I would get a reply on such an old post.

That’s right @Kutubkhan_Bhatiya, it would be 5/2.
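
As a minimal sketch, assuming (as in the snippet above) that circuit(params, n) returns the coefficient-weighted sum of the n sampled term expectations:

def loss(params, n=2):
    # Rescale by 5/n when sampling n of the 5 non-constant terms.
    return 4 + (5 / n) * circuit(params, n=n)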

There’s no problem that it’s an old post, we’re here to help!