Learning a Binomial distribution

Hello,
The target distribution (blue bars) should closely match the learned distribution (orange bars) if the PQC has been trained successfully. However, it seems like the measured distribution is concentrated around the 1100 to 1111 range, which is not expected. Any idea why this is happening?

Here is the code:

import pennylane as qml
from pennylane import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt

n_wires = 4  # number of qubits
dev = qml.device("default.qubit", wires=n_wires, shots=1000)

@qml.qnode(dev)
def pqc(params):
    for i in range(n_wires):
        qml.RY(params[i], wires=i)
    for i in range(n_wires - 1):
        qml.CNOT(wires=[i, i + 1])
        qml.RX(params[i], wires=i)
    return [qml.sample(qml.PauliZ(i)) for i in range(n_wires)]

# This is a new quantum node for calculating probabilities for the loss function
@qml.qnode(dev)
def pqc_probs(params):
    for i in range(n_wires):
        qml.RY(params[i], wires=i)
    for i in range(n_wires - 1):
        qml.CNOT(wires=[i, i + 1])
        qml.RX(params[i], wires=i)
    return qml.probs(wires=range(n_wires))

def loss(params, target_distribution):
    return np.sum(np.abs(pqc_probs(params) - target_distribution))

n = n_wires
p = 0.5
target_distribution = np.array([binom.pmf(i, n, p) for i in range(2**n_wires)])

params = np.random.uniform(0, 2*np.pi, n_wires)

opt = qml.GradientDescentOptimizer(stepsize=0.4)
for i in range(500):
    params = opt.step(lambda params: loss(params, target_distribution), params)
    if i % 100 == 0:
        print(f"Iteration {i}, Loss: {loss(params, target_distribution)}")

print("Learned parameters:")
print(params)

# Now run the circuit with the learned parameters and get samples
samples = np.array(pqc(params))

# Convert from {-1, 1} to {0, 1} and calculate the decimal equivalent for each sample
samples = (samples + 1) // 2  # Convert from {-1, 1} to {0, 1}
samples = samples.astype(int)
samples_decimal = [int("".join(map(str, s)), 2) for s in samples.T]

# Prepare an array to hold the outcomes
outcomes = np.zeros(2**n_wires)

# Record the outcomes
for outcome in samples_decimal:
    outcomes[outcome] += 1

# Normalize the outcomes to get the measured distribution
measured_distribution = outcomes / dev.shots

# Plot the measured distribution
plt.figure(figsize=(9,6))
plt.bar(range(2**n_wires), target_distribution, alpha=0.5, label="Target (Binomial)")
plt.bar(range(2**n_wires), measured_distribution, alpha=0.5, label="Measured")
plt.title("Target vs Measured Distribution")
plt.xlabel("Outcome")
plt.ylabel("Frequency")
plt.xticks(range(2**n_wires), [format(i, f'0{n_wires}b') for i in range(2**n_wires)])
plt.legend()
plt.show()

Thanks.

Hey @Solomon!

It could be a Z2-symmetry thing. I.e., are 0 and 1 being treated differently in your measured distribution compared to the target? More specifically, does 0010, say, actually mean 1101 (all bits flipped) in your model?

@isaacdevlugt thanks for taking a look!
I thought about this and you may be right, but for 8 qubits it looks more like a uniform distribution, so I am not sure if this is the case.

Oh okay, that might rule it out then.

Just inspecting your cost function — if you’re trying to learn a probability distribution, there are better distance measures that you can (and should!) use that might yield better results. One great example is the KL divergence: Kullback–Leibler divergence - Wikipedia

There are many many more :slight_smile:. Although this article is centred around PyTorch, it’s a good summary of commonly used loss functions: PyTorch Loss Functions: The Ultimate Guide.

If that doesn’t work, then you probably should reconsider how your model is constructed, hyperparameters, etc.

Indeed I tried KL and BCE, same phenomena.

I tried this for 50K iterations:

def kl_divergence(p, q):
    mask = p != 0
    return torch.tensor (np.sum(p[mask] * np.log(p[mask] / q[mask])),requires_grad=True)

def loss(params, target_distribution):
    output_probs = CONVCircuit_probs(params)
    output_probs += 1e-8
    output_probs = torch.where(torch.isnan(output_probs), torch.tensor([0.0], requires_grad=True), output_probs)
    output_probs /= torch.sum(output_probs)
    return 0.9 * kl_divergence(target_distribution.detach().numpy(), output_probs.detach().numpy()) 

It seems there is some merit in your suspicion regarding the Z2 issue, however I couldn’t quite figure out how to correctly flip the qubits.

Thanks

I would use PyTorch’s builtin loss functions (e.g., torch.nn.KLDivLoss).

If that still doesn’t work, then something is the matter with your algorithm — could be that hyperparameters need to change or there’s a small bug in your code somewhere! It would be a good idea to scale down to the 4 qubit example you originally showed diagrams of and work through each step in your code, check dimensions, etc.