No Gradients in StrawberryFields

Hello,
I’m trying to build a neural network layer, similar to the paper [1806.06871] Continuous-variable quantum neural networks for the sine fitting example. But after getting rid of all problems, it seems there is no gradient, which i do not understand. I already looked up various examples from SF demos and applied them, but I#d be very grateful for help. Here is what I’ve done:

import numpy as np
import tensorflow as tf
import strawberryfields as sf
from strawberryfields.ops import *

batch_size = 10

#dataset
X = np.linspace(-1.0, 1.0, num=batch_size)
Y = np.sin(X)
X = tf.convert_to_tensor(X)
Y = tf.convert_to_tensor(Y)

prog = sf.Program(1)
cutoff = 10
# create symbolic parameter for data input displacement
alpha_d = prog.params("alpha_d")

# create symbolic ones for layer
theta_1a, theta_1b, s_1, alpha_1, kappa_1 = prog.params("theta_1a", "theta_1b", "s_1", "alpha_1", "kappa_1")

with prog.context as q:
    # States
    Dgate(alpha_d) | q[0]
    # Gates
    Rgate(theta_1a) | q[0]
    Sgate(s_1) | q[0]
    Rgate(theta_1b) | q[0]
    Dgate(alpha_1) | q[0]
    Kgate(kappa_1) | q[0]
    # Measurements
    MeasureHomodyne(0.0) | q[0]

    
eng = sf.Engine(backend="tf", backend_options={"cutoff_dim": cutoff, "batch_size": batch_size})
opt = tf.keras.optimizers.Adam(learning_rate=0.1)
steps = 50


theta_1a = tf.Variable(1.0)
theta_1b = tf.Variable(1.0)
s_1 = tf.Variable(1.0)
alpha_1 = tf.Variable(1.0)
kappa_1 = tf.Variable(0.1)


mapping = {"alpha_d": X,
           "theta_1a": theta_1a, "theta_1b": theta_1b, "s_1": s_1, 
           "alpha_1": alpha_1, "kappa_1": kappa_1}


def loss():
    # This is the Least-Squares-Loss
    # reset the engine if it has already been executed
    if eng.run_progs:
        eng.reset()

    # execute the engine
    results = eng.run(prog, args=mapping)
    results = tf.reshape(results.samples,[batch_size])
    print(results)
    loss = 0
    for l, p in zip(Y, results):
        print(l,p, loss)
        loss = loss + (l - p) ** 2

    loss = loss / len(Y)
    return loss

for step in range(steps):
    _ = opt.minimize(loss, [theta_1a, theta_1b, s_1, alpha_1, kappa_1])
    parameter_vals = [theta_1a.numpy(), theta_1b.numpy(), s_1.numpy(), alpha_1.numpy(), kappa_1.numpy()]
    print("Parameter values at step {}: {}".format(step, parameter_vals))
    

eng.reset()

The error message is:
ValueError: No gradients provided for any variable: (['Variable:0', 'Variable:0', 'Variable:0', 'Variable:0', 'Variable:0'],). Provided grads_and_vars is ((None, <tf.Variable 'Variable:0' shape=() dtype=float32, numpy=1.0>), (None, <tf.Variable 'Variable:0' shape=() dtype=float32, numpy=1.0>), (None, <tf.Variable 'Variable:0' shape=() dtype=float32, numpy=1.0>), (None, <tf.Variable 'Variable:0' shape=() dtype=float32, numpy=1.0>), (None, <tf.Variable 'Variable:0' shape=() dtype=float32, numpy=0.1>)).

Hey @zephir! I’m not sure that I completely understand why you’re getting this error. I simplified and modified your code to get to something that works. Hopefully you can adapt and change it to better suit your needs.

import numpy as np
import strawberryfields as sf
from strawberryfields.ops import *

import tensorflow as tf

cutoff = 10
eng = sf.Engine(backend="tf", backend_options={"cutoff_dim": cutoff})
prog = sf.Program(1)

theta_1a, theta_1b, s_1, alpha_1, kappa_1 = prog.params(
    "theta_1a", "theta_1b", "s_1", "alpha_1", "kappa_1"
)

with prog.context as q:
    # States
    Dgate(1.0) | q[0]
    # Gates
    Rgate(theta_1a) | q[0]
    Sgate(s_1) | q[0]
    Rgate(theta_1b) | q[0]
    Dgate(alpha_1) | q[0]
    Kgate(kappa_1) | q[0]

opt = tf.keras.optimizers.legacy.Adam(learning_rate=0.1)

def loss():
    # reset the engine if it has already been executed
    if eng.run_progs:
        eng.reset()

    # execute the engine
    results = eng.run(
        prog,
        args={
            "theta_1a": tf_theta_1a,
            "theta_1b": tf_theta_1b,
            "s_1": tf_s_1,
            "alpha_1": tf_alpha_1,
            "kappa_1": tf_kappa_1,
        },
    )
    prob = results.state.fock_prob([1])
    # negative sign to maximize prob
    return -prob

tf_theta_1a = tf.Variable(1.0)
tf_theta_1b = tf.Variable(1.0)
tf_s_1 = tf.Variable(1.0)
tf_alpha_1 = tf.Variable(1.0)
tf_kappa_1 = tf.Variable(0.1)

for step in range(50):
    _ = opt.minimize(loss, [tf_theta_1a, tf_theta_1b, tf_s_1, tf_alpha_1, tf_kappa_1])
    parameter_vals = [
        tf_theta_1a.numpy(),
        tf_theta_1b.numpy(),
        tf_s_1.numpy(),
        tf_alpha_1.numpy(),
        tf_kappa_1.numpy(),
    ]
    print("Parameter values at step {}: {}".format(step, parameter_vals)

A few things to note:

  • I found it less problematic to use a legacy optimizer instead of what you have in your code: tf.keras.optimizers.legacy.Adam. I can tell that you’re going off of our demo here, and I actually needed to use a legacy optimizer to get it to work for me! Aside: I’ll bring this up to the dev team.
  • I found your loss function to be problematic for some reason. I removed the homodyne measurement from your circuit, and my loss function instead returns the Fock probability.
  • I removed batching, which is generally a good idea when trying to debug machine learning code :slight_smile:

Let me know if this helps you!


Hey @isaacdevlugt , thanks a lot for this quick response!

Regarding the first point: I wanted to reproduce the PennyLane demo for CV-QNN function fitting (1). The goal is to optimize the x-quadrature values from the homodyne measurements w.r.t. the labelled data.

Yesterday, by analysing the tutorials again, I found the crucial spot in my code, which matches your solution and your second point: the measurement. It seems that measurements kill gradients somehow. Removing the measurement and working with the state, as you did here and as the tutorial you mentioned also does, worked for my purpose: I used the quad_expectation() function (I found it in the BaseState docs in the SF documentation: https://strawberryfields.readthedocs.io/en/stable/code/api/strawberryfields.backends.BaseState.html#strawberryfields.backends.BaseState.quad_expectation). With it, I was able to get gradients without changing the optimizer. And with your optimizer, I no longer need the loss to be vectorial, which is a nice improvement :slight_smile:
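
For concreteness, here is a minimal sketch of the loss I ended up with (assuming the same single-mode program and mapping as in my first post, just without the MeasureHomodyne at the end; the cast is only there to keep the dtypes consistent):

def loss():
    # reset the engine if it has already been executed
    if eng.run_progs:
        eng.reset()

    # without a measurement in the program, eng.run returns the full state
    results = eng.run(prog, args=mapping)

    # quad_expectation(mode) returns (<x>, Var(x)); the mean is differentiable
    x_mean = results.state.quad_expectation(0)[0]

    # least-squares loss between the predicted <x> values and the labels Y
    return tf.reduce_mean((tf.cast(Y, x_mean.dtype) - x_mean) ** 2)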

I’m not done yet, but maybe you could also check with the dev team regarding my measurement hypothesis, in case you don’t already know the answer.

Since my work is unfinished and I’m still inexperienced with ML in StrawberryFields, I hope I may ask for your help again with problems that will likely come up during the next steps!

Hey @zephir! Glad you figured it out :smiley:. Of course! Please feel free to ask questions here :rocket:.

So, it turns out that homodyne measurements are not differentiable since they’re a stochastic process. That’s why the error happened to begin with :).
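
If you want to see this directly, here’s a minimal sketch (a single displacement followed by a homodyne measurement; the names are made up, and on the TF backend the returned samples are tensors): asking the tape for the gradient of the sampled value gives None, which is exactly where your "No gradients provided" error came from.

import strawberryfields as sf
from strawberryfields.ops import Dgate, MeasureHomodyne
import tensorflow as tf

eng = sf.Engine(backend="tf", backend_options={"cutoff_dim": 10})
prog = sf.Program(1)
alpha = prog.params("alpha")

with prog.context as q:
    Dgate(alpha) | q[0]
    MeasureHomodyne(0.0) | q[0]

tf_alpha = tf.Variable(0.5)

with tf.GradientTape() as tape:
    result = eng.run(prog, args={"alpha": tf_alpha})
    # the sample is drawn stochastically, so nothing connects it to tf_alpha
    sample = result.samples

print(tape.gradient(sample, tf_alpha))  # None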


Hey @zephir! I have the same problem you had here. As you and @isaacdevlugt noticed, measurements are not differentiable. I am also using quad_expectation(), but I still get the same error. I am using some general functions from Quantum neural network and my code continues as follows:

# Create array of Strawberry Fields symbolic gate arguments, matching
# the size of the weights Variable.
sf_params = np.arange(num_params).reshape(weights.shape).astype(str)
sf_params = np.array([qnn.params(*i) for i in sf_params])

# Construct the symbolic Strawberry Fields program by looping over the layers

x = qnn.params("x")

with qnn.context as q:
    Coherent(x) | q[0] #Data encoding
    for k in range(layers):
        layer(sf_params[k], q)

opt = tf.keras.optimizers.Adam(learning_rate=0.1)

steps = 100
for step in range(steps):
    
    if eng.run_progs:
        eng.reset()

    # execute the engine
    with tf.GradientTape() as tape:
    
        mapping1 = {p.name: w for p, w in zip(sf_params.flatten(), tf.reshape(weights, [-1]))}
        predictions = [eng.run(qnn, args = {**mapping1 , **{"x": tf.Variable(i)}}).state.quad_expectation(0)[0] for i in noissy_X]

        loss = 0
        for l, p in zip(noissy_Y, np.array(predictions)):
            loss = loss + (l - p)**2

        loss = loss / len(noissy_Y)
        
        loss = tf.convert_to_tensor(loss)
        
        print(loss)
    
    gradients = tape.gradient(loss, weights)
    opt.apply_gradients(zip([gradients], [weights]))
    print("Loss at step {}: {}".format(step, loss))

ValueError: No gradients provided for any variable: (['Variable:0'],). Provided `grads_and_vars` is ((None, <tf.Variable 'Variable:0' shape=(4, 14) dtype=float32, numpy=
array([[-7.35549107e-02,  1.19483426e-01,  4.26002108e-02,
         4.84507218e-05,  7.35140929e-06, -1.12536542e-01,
         7.80233890e-02, -8.94486438e-03, -1.95912566e-04,
         1.51194981e-04, -1.03756212e-01,  7.11902604e-02,
        -4.77273643e-05, -1.54791531e-04],
       [ 5.63541241e-02,  2.05883514e-02, -9.75365415e-02,
         4.37998551e-06,  1.21389181e-04, -9.72189661e-03,
         2.47706082e-02, -4.90078405e-02, -1.00281010e-04,
         5.04668351e-05, -4.01519276e-02, -6.67408481e-03,
         6.09786657e-05, -4.38948664e-05],
       [ 2.80619003e-02, -1.68637946e-01, -1.14657871e-01,
        -7.72963119e-07, -9.18500791e-06,  8.03599507e-03,
         7.08330376e-03, -7.10179806e-02,  3.62094652e-05,
         1.59656964e-04,  8.47241208e-02, -1.52276158e-01,
         2.37092318e-04, -2.18253321e-04],
       [ 8.04337338e-02,  2.38722749e-02,  3.92198376e-03,
        -1.48602470e-04,  8.35286846e-05, -1.67933822e-01,
         1.83792368e-01, -8.80715176e-02,  3.67052553e-06,
        -1.34917762e-04, -2.86001004e-02,  6.50760010e-02,
        -2.53011356e-04, -5.59570435e-05]], dtype=float32)>),).

It would be great if you could show me how to solve this. Thanks in advance!
