I’m studying a paper called Quantum generative adversarial networks by Pierre-Luc Dallaire-Demers and Nathan Killoran that introduces a layered structure for the generator and discriminator circuits and defines the gradient ⟨Z⟩grad for training. I have several related questions:
Expectation value of the operator Z on Out D:
How can we prove that the expectation value of the operator Z is proportional to \Pr(D(\vec{\theta}_D, \vert\lambda\rangle, R(\vert\lambda\rangle)) = \vert real\rangle)?
How can we derive the cost function in Equation 11? I can understand why the trace appears in the formula, but how is the overall expression derived?
How can we establish that \Pr(\mathrm{Success}\ D(\vec{\theta}_D) \,\vert\, \vec{\theta}) is bounded by the purity function C(\vec{\theta}_G)?
Introduction of \langle Z\rangle_{grad}:
The paper establishes that \langle Z\rangle_{grad} corresponds to the gradient of the expectation value of the observable P.
How is this deduced from earlier discussions in the paper? Is it directly tied to how
parameterized quantum circuits calculate gradients (e.g., the parameter-shift rule), or is it a
result of the specific ansatz structure and observable measurements?
Generator and Discriminator Circuit Design:
The generator and discriminator circuits are composed of several layers of parameterized RX and RZ rotations and nearest-neighbor ZZ rotations. How is this structure deduced from the theoretical framework established earlier in the paper?
Does this choice of design (as shown in Figure 7 of the paper) maximize the ability to represent and distinguish probability distributions, or is it influenced by other constraints, such as hardware compatibility or training stability?
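For concreteness, here is how I currently picture one such layer. This is my own pure-NumPy sketch on two qubits; the parameter ordering and qubit count are my assumptions, not taken from the paper:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def rot(gen, theta):
    # exp(-i * theta * G / 2) for a Pauli generator G (uses G^2 = I)
    return np.cos(theta / 2) * np.eye(gen.shape[0]) - 1j * np.sin(theta / 2) * gen

def layer(params):
    # One ansatz layer on 2 qubits: RX and RZ on each wire, then a ZZ coupling.
    rx0, rx1, rz0, rz1, zz = params
    U = np.kron(rot(X, rx0), rot(X, rx1))          # single-qubit RX rotations
    U = np.kron(rot(Z, rz0), rot(Z, rz1)) @ U      # single-qubit RZ rotations
    U = rot(np.kron(Z, Z), zz) @ U                 # nearest-neighbor ZZ rotation
    return U

U = layer([0.1, 0.2, 0.3, 0.4, 0.5])
assert np.allclose(U.conj().T @ U, np.eye(4))  # the layer is unitary
```

Stacking several such layers (with independent parameters) gives the kind of circuit I understand Figure 7 to describe.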
I’m trying to understand how the authors logically derived both the gradient definition and the layered structure for the generator and discriminator from the principles laid out in the paper. Any guidance would be appreciated!
Hi @taha_hoseinpour , welcome to the Forum.
It’s nice to see that you’re interested in the paper!
You may be interested in taking a look at the demo on QGANs with Cirq and Tensorflow, written by Nathan Killoran himself. It might help to clarify some of your questions.
We also have another demo on QGANs, written by James Ellis, which shows some of the math that you’re asking about.
More generally I feel like some of your questions are more fundamentally about gradients in quantum circuits in general. Is that the case? If so I can share some resources specifically on that topic.
Let me know if you still have any specific questions after reading the resources I mentioned.
I hope this helps.
Thank you so much for helping me! I truly appreciate your assistance.
Here are my questions:
1:
Why is the statement true that the expectation value of the operator on the “Out D” register is proportional to the probability?
2:
How can one mathematically derive this cost function?
3:
How can we establish this inequality?
4:
Why does \langle Z\rangle_{grad} equal the gradient?
that is:
Why does the above circuit calculate the gradient of the expectation value of the observable?
and my last question:
Why are the ansatz components RX, RZ, and ZZ rotations?
I know I’ve asked quite a few questions, and I truly appreciate your help. If you could point me toward resources or provide guidance to help me understand these concepts better, I would be very grateful. I have also used the resources you gave me to finish my course project, so thank you.
I’m eager to contribute to PennyLane and want to thoroughly understand the underlying mathematical principles first.
Thank you so much for your time and support!
Hi @taha_hoseinpour ,
It’s great to see that you want to contribute to PennyLane and understand the math.
- The expectation value is the dot product between the probabilities of collapsing to the eigenstates of the operator and the corresponding eigenvalues. For Z the eigenstates are \vert real\rangle and \vert fake\rangle, with corresponding eigenvalues 1 and -1. Let P_{real} be the probability of measuring D in the \vert real\rangle state and P_{fake} the probability of measuring D in the \vert fake\rangle state. Then:

  \langle Z\rangle = 1\cdot P_{real} + (-1)\cdot P_{fake}.

  Since these are the probabilities of measuring our eigenstates, we know that P_{real} + P_{fake} = 1, so P_{fake} = 1 - P_{real}. Substituting this into the expression for \langle Z\rangle gives:

  \langle Z\rangle = P_{real} - (1 - P_{real}) = 2P_{real} - 1.

  Since we’re looking at an optimization problem, we can ignore the constant offset, and we have that \langle Z\rangle is proportional to P_{real}.
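If it helps to see this numerically, here's a quick pure-NumPy check of that identity, where I label \vert 0\rangle as \vert real\rangle and \vert 1\rangle as \vert fake\rangle (that basis labelling is my own convention for the sketch):

```python
import numpy as np

# Random single-qubit state |psi> = a|real> + b|fake>, in the (|0>, |1>) basis
rng = np.random.default_rng(0)
v = rng.normal(size=2) + 1j * rng.normal(size=2)
psi = v / np.linalg.norm(v)

Z = np.array([[1, 0], [0, -1]])
expval_Z = np.real(psi.conj() @ Z @ psi)

p_real = np.abs(psi[0]) ** 2  # probability of measuring the |real> = |0> outcome

# <Z> = 2 * P_real - 1 holds for any normalized state
assert np.isclose(expval_Z, 2 * p_real - 1)
```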
For your other questions, have you gotten stuck doing the math? If so, it would be ideal if you could show what you have tried so far.
I would also recommend checking pennylane.ai. We have a lot of resources on gradients and optimization. Have you looked at some of them? If so, which ones?
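Since a few of your questions come down to how gradients of expectation values of parameterized circuits are obtained, here is a minimal pure-NumPy illustration of the parameter-shift rule for a single RX rotation measured in Z. This is a textbook toy example, not the specific gradient construction from the paper:

```python
import numpy as np

def rx(theta):
    # RX(theta) = exp(-i * theta * X / 2)
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

Z = np.array([[1, 0], [0, -1]])
ket0 = np.array([1, 0], dtype=complex)

def expval_Z(theta):
    # <0| RX(theta)^dag Z RX(theta) |0>, which equals cos(theta)
    psi = rx(theta) @ ket0
    return np.real(psi.conj() @ Z @ psi)

theta = 0.7
shift = np.pi / 2
# Parameter-shift rule: exact gradient from two shifted circuit evaluations
grad_ps = 0.5 * (expval_Z(theta + shift) - expval_Z(theta - shift))

assert np.isclose(grad_ps, -np.sin(theta))  # analytic derivative of cos(theta)
```

The key point is that the gradient is obtained by running the same circuit twice at shifted parameter values, which is why quantities like \langle Z\rangle_{grad} can be evaluated with circuits of the same form as the original one.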