Optimisation routine yields the same loss values

I've been using PennyLane for an optimization task and keep getting the following warning: UserWarning: Output seems independent of input. I have checked how the loss is being calculated and whether all examples are being passed in correctly, but I can't seem to find the issue. I have pasted my code below since I can't attach a file as a new user. Thanks a lot for taking a look at this.

import pennylane as qml                                                 
from pennylane import numpy as np
from sklearn.metrics import log_loss
from pennylane.optimize import AdamOptimizer

#set parameters of the model 

n_qubits = 8
n_features = 16
n_parameters = 15
feature_range = (0, 2*np.pi)

train_size = 150
test_size = 150

qubits = list(range(n_qubits))


def initialize_parameters(min_range, max_range, n_parameters):
    params = np.random.uniform(low = min_range, high = max_range, size = n_parameters)
    return params


#Import the data, which I have omitted; random placeholder data is used below

x_train = np.random.uniform(size = (150,16))
x_test = np.random.uniform(size = (150,16))
y_train = np.array([0]*75 + [1]*75)
y_test = np.array([0]*75 + [1]*75)


dev = qml.device("default.qubit", wires=8)

@qml.qnode(dev)
def qnn(data, theta):

    #data encoding 
    for i in range((n_features // 2)):
        qml.RX(data[i], wires = i)

    for i in range((n_features // 2)):
        qml.RY(data[i+n_qubits], wires = i)

    #variational
    theta_counter = 0 

    for i, q in enumerate(qubits, start = theta_counter):
        qml.RX(theta[i], wires = q)
        theta_counter = i

    for q1, q2 in zip(qubits[0::2], qubits[1::2]):
        qml.CZ(wires=[q1,q2])

    for i, q in enumerate(qubits[1::2], start = theta_counter+1):
        qml.RY(theta[i], wires = q)
        theta_counter = i 

    qml.CZ(wires = [1, 3])
    qml.CZ(wires = [5, 7])

    for i, q in enumerate(qubits[3::4], start = theta_counter+1):
        qml.RX(theta[i], wires = q)
        theta_counter = i 

    qml.CZ(wires=[3, 7])

    for i, q in enumerate(qubits[7::8], start = theta_counter+1):
        qml.RY(theta[i], wires=q)
        theta_counter = i 

    return qml.expval(qml.PauliZ(7))


def compute_cost(params, x, y):
    
    y_pred = [qnn(x[i], params) for i in range(x.shape[0])] 

    yhat = [1 if x > 0 else 0 for x in y_pred]

    return np.array(log_loss(y, yhat))

#training the model 
epochs = 10

opt = AdamOptimizer(stepsize= 0.01, beta1=0.9, beta2=0.999)

params = initialize_parameters(feature_range[0], feature_range[1], n_parameters)

loss = compute_cost(params, x_train, y_train)

print("Epoch: {:2d} | Cost: {:3f}".format( 0, loss ))

for it in range(epochs):

    params, loss = opt.step_and_cost(lambda v: compute_cost(v, x_train, y_train), params)

    res = [it + 1, loss]
    print("Epoch: {:2d} | Loss: {:3f}".format(*res))

Hi @Zohim_Chandani1, and welcome to the forum! :slightly_smiling_face:

I think I’ve determined the issue here. It is in your implementation of compute_cost:

def compute_cost(params, x, y):
  y_pred = [qnn(x[i], params) for i in range(x.shape[0])]

  yhat = [1 if x > 0 else 0 for x in y_pred]

  return np.array(log_loss(y, yhat))

The returned value of this function does not depend in a differentiable manner on params, which is why you are seeing the automatic differentiation system complaining that “Output seems independent of input.”

Specifically, you are thresholding y_pred (which should be differentiable with respect to params) inside a regular Python list comprehension, and the result (discrete zeros or ones) no longer has a derivative with respect to params.

(You’re also importing log_loss from scikit-learn, which might furthermore break differentiability, since its internal implementation uses “standard” NumPy rather than the autograd-aware numpy provided by PennyLane.)

Instead of converting predictions to binary 0 or 1, I would recommend working directly with the prediction probabilities y_pred themselves (which have the nice property that they equal 1 when the prediction is 100% confident). You can then put them through a cross-entropy or log-loss type cost function to compare predictions with labels. If you’re using numpy, you might have to hand-code it (I don’t believe it is built in there); the equivalent step in a library like TensorFlow would be (note: you’d have to write your model using the TF interface to use this):

probs = (tf.stack(y_pred) + 1) / 2 # map <Z> expectation values in [-1, 1] to probabilities in [0, 1]
logits = tf.math.log(probs) # this creates "logits", i.e., unnormalized log probabilities
cost = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits) # per-example losses; average them (e.g., with tf.reduce_mean) for a single scalar cost
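If you stay with PennyLane's autograd-interface numpy instead of TF, you would hand-code the same idea. A rough sketch (untested, and assuming qnn returns a single PauliZ expectation value in [-1, 1] and y holds 0/1 labels) might look like:

def compute_cost(params, x, y):
    eps = 1e-15
    cost = 0
    for i in range(len(x)):
        # map <Z> in [-1, 1] to the probability of measuring |1> (assuming class 1 <-> |1>)
        p = (1 - qnn(x[i], params)) / 2
        p = np.clip(p, eps, 1 - eps)  # avoid log(0)
        cost = cost - (y[i] * np.log(p) + (1 - y[i]) * np.log(1 - p))
    return cost / len(x)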

Thanks for your previous response @nathan

How could one determine if the output of the cost function depends on the tunable parameters in a ‘differentiable manner’?

I hand-coded the log-loss function with a for loop as per your advice, as shown below, and the code now works:

def compute_cost(params, x, y): 

    cost = 0 

    for i in range(len(x)):

        probs = qnn(x[i], params)
        p1 = probs[1]
        loss = (-1)*((y[i]*np.log(p1)) + ((1-y[i])*np.log(1-p1)))
        cost += loss 

    cost = cost/len(x)

    return cost

However, a list comprehension version of the same function (shown below) does not work.

def compute_cost(params, x, y):
    
    probs = [qnn(x[i], params) for i in range(x.shape[0])]
    
    p1 = [probs[i][1] for i in range(len(probs))]
    
    cost = ((np.sum((y*np.log(p1)) + ([1-x for x in y]*np.log([1-x for x in p1]))))/len(y))*(-1)

    return cost

Shall I just avoid list comprehensions moving forward, or is there a deeper reason why PennyLane does not support this?

Hey @Zohim_Chandani1! This is less a restriction of PennyLane, and more a restriction of the machine learning framework/autodifferentiation interface that is being used.

In general, there are restrictions that are common across all autodifferentiation libraries:

  • Cost functions must return a single floating-point scalar
  • The output of the cost function must be a piecewise differentiable function of the input.

In your case above, I believe that your original cost function did not satisfy the second restriction, since yhat was not a continuous transformation of the input.
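To answer your question about how to tell whether the cost depends on the tunable parameters in a differentiable manner: a quick practical check (just a sketch using the Autograd interface) is to request the gradient directly and inspect it:

# Probe the gradient of the cost with respect to params (argnum=0).
grad_fn = qml.grad(compute_cost, argnum=0)
gradients = grad_fn(params, x_train, y_train)
print(gradients)
# Gradients that are identically zero, or the "Output seems independent of
# input" warning, are a sign that the cost is not a differentiable function of params.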

Beyond these general restrictions, though, there might also be some restrictions that are specific to the autodifferentiation library.

For example, Autograd (which PennyLane uses by default when you from pennylane import numpy) has a couple of additional restrictions, including no differentiable support for array assignments (A[0, 0] = x). You can see the full list of restrictions in their documentation.
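As a small illustration of that last point (my own sketch, not taken from the Autograd docs), the first function below breaks differentiation through an in-place assignment, while the second builds the array functionally and keeps the gradient intact:

from pennylane import numpy as np

def not_differentiable(x):
    A = np.zeros(2)
    A[0] = x ** 2  # in-place array assignment: Autograd cannot trace this
    return np.sum(A)

def differentiable(x):
    A = np.stack([x ** 2, x ** 3])  # build the array in one functional step instead
    return np.sum(A)

Differentiating the second version with qml.grad works as expected, whereas the first either errors out or loses the dependence on x.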

I’m not 100% sure about TensorFlow and Torch, but JAX also has similar restrictions: https://jax.readthedocs.io/en/latest/notebooks/Common_Gotchas_in_JAX.html#in-place-updates

Hope that helps!

Thank you for your previous response @josh.

I have been studying the behaviour of various loss functions, which I have hand-coded while adhering to the restrictions you mentioned in your last post, and I am still somewhat confused about what breaks differentiability in the second example below compared to the first.

In the first case, where I measure the output probabilities of my qubit in my QNN, I define the following loss function, which is differentiable and is minimised:

def compute_cost(params, x, y): 

    eps = 1e-15

    cost = 0 

    for i in range(len(x)):

        probs = qnn(x[i], params)
        p0 = probs[0]
        p1 = probs[1]

        if y[i] == 0: 
            p = p0

        elif y[i] == 1: 
            p = p1

        p = np.clip(p, eps, 1 - eps)
        loss = (-1)*((y[i]*np.log(p)) + ((1-y[i])*np.log(1-p)))
        cost += loss 

    cost = cost/len(x)

    return cost

If I now alter this setup slightly so that I instead measure the Z expectation value of the qubit, and define the following:

def compute_cost(params, x, y): 

    eps = 1e-15

    cost = 0 

    for i in range(len(x)):

        exp_val = qnn(x[i], params)

        if exp_val >= 0.0: 
            prediction = 1 
        elif exp_val < 0.0: 
            prediction = 0 

        p = np.clip(prediction, eps, 1 - eps)
    
        loss = (-1)*((y[i]*np.log(p)) + ((1-y[i])*np.log(1-p)))
        cost = cost + loss 

    cost = cost/len(x)

    return cost

then I get the following warning: UserWarning: Output seems independent of input.

I am particularly fixated on this function as it seems to give me strong gradients when training on a different example in pyQuil, but I can't seem to spot what the fault may be here. Any help is much appreciated.

Hi @Zohim_Chandani1, I haven’t tested it myself locally, but my gut feeling would be that the if statement,

if exp_val >= 0.0: 
    prediction = 1 
elif exp_val < 0.0: 
    prediction = 0 

is what causes the non-differentiability issue. Assigning a constant 0 or 1 inside the if statement means prediction is no longer connected to the output of the QNode exp_val in a way that the autodifferentiation framework can trace back to params.

Compare this to the first example, where p is either p0 or p1, and is directly connected to the output of the QNode.

In the second example, you could try replacing the if statement with the np.heaviside function,

prediction = np.heaviside(exp_val, 1.)

however, I can't confirm whether heaviside itself is differentiable!
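Alternatively, a way to keep everything differentiable while still measuring the expectation value (just a rough sketch, untested) is to turn exp_val into a probability rather than a hard 0/1 prediction. For example, p = (1 + exp_val) / 2 is at least 0.5 exactly when your if statement would have predicted class 1, but it stays smoothly connected to the QNode output:

def compute_cost(params, x, y):

    eps = 1e-15

    cost = 0

    for i in range(len(x)):

        exp_val = qnn(x[i], params)

        # smooth stand-in for the hard prediction: >= 0.5 whenever exp_val >= 0
        p = (1 + exp_val) / 2
        p = np.clip(p, eps, 1 - eps)

        loss = (-1)*((y[i]*np.log(p)) + ((1-y[i])*np.log(1-p)))
        cost = cost + loss

    cost = cost/len(x)

    return cost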