Feeding a dataset into Strawberry Fields

I am trying to implement a binary classifier with Strawberry Fields, following this link: https://strawberryfields.ai/photonics/demos/run_state_learner.html
My query is: how can I feed my dataset's features into this model? My dataset contains 2D features. Please suggest.


Hi @Satanik_Mitra!

Right, this demo focuses on training for state preparation and is more of an optimization task than an ML task, which has input data and a prediction.

However, it’s certainly possible to do binary classification in Strawberry Fields. One way to encode d-dimensional input data into d modes is to have a single gate act on each mode with a parameter set by one dimension of the input data. For example, you could encode 4-dimensional data (x0, x1, x2, x3) into 4 modes by placing a Dgate on each mode. The Dgate on the first mode would encode x0, and so forth.

The best tutorial to follow is this one. That tutorial also has a section specific to batching. You can also check out an implementation of binary classification for fraud detection in our repo here, however note that the code there uses an older version of Strawberry Fields.

Finally, the above is focused on using Strawberry Fields. You can also check out the PennyLane-Strawberry Fields plugin which allows you to harness PennyLane for photonic systems. A relevant tutorial to check out would be function fitting.


Thank you so much @Tom_Bromley for your kind help. I will go through the links you suggested and implement it. In case of any further issues, I will post them here. :slight_smile:
Satanik Mitra

1 Like

Hi @Tom_Bromley, I have encountered the error "CircuitError: The Program is locked, no more Commands can be appended to it." while trying to implement the code below. My intention was to execute the engine 2 times in a loop with the same x1 and x2 values. Could you suggest where I am going wrong? I have tried eng.reset() as well.

for i in range(2):
    with prog.context as q:
        Squeezed(x1) | q[0]
        Squeezed(x2) | q[1]
        BSgate(0.4, 0.10) | (q[0], q[1])
        Rgate(0.4) | q[0]
        Rgate(0.4) | q[1]
        MeasureHomodyne(0.0) | q[0]
    results = eng.run(prog, args=mapping)
    if eng.run_progs:
        eng.reset()

Hi @Satanik_Mitra!

A program becomes locked after it is run, as you can see in the note here. The issue in your code is the second run of the for loop. There, the prog.context context manager is trying to add to the program, which is locked due to the first eng.run().

My intention was to execute the Engine for 2 times in a loop with same X1 and X2 values.

I’m curious about your use case. Is the idea of running twice with the same values to extract samples from the homodyne measurement? If so, sampling can be done in the following way:

import strawberryfields as sf
from strawberryfields.ops import *

shots = 100
prog = sf.Program(2)
x1 = 0.4
x2 = 0.5

with prog.context as q:
    Squeezed(x1) | q[0]
    Squeezed(x2) | q[1]
    BSgate(0.4, 0.10) | (q[0], q[1])
    Rgate(0.4) | q[0]
    Rgate(0.4) | q[1]
    MeasureHomodyne(0.0) | q[0]

eng = sf.Engine("fock", backend_options={"cutoff_dim": 4})
samples = []

for i in range(shots):
    result = eng.run(prog)
    samples.append(result.samples)  # collect this shot's homodyne sample

Alternatively, if you are looking for batching with the TF backend, check out our tutorial here.

1 Like

Thanks a lot @Tom_Bromley for the reply. You correctly mention that there's no point in executing the code with the same x1, x2 values twice; actually, I was exploring the possibility of executing a circuit multiple times, and as a trial run I chose the same x1, x2 for both runs. However, I want to feed new x1, x2 values every time so that I can run through all the data points in the training set. Now I will try to implement it with your modified code. Thanks a lot for your suggestions and modified code. I will come back to you in case of any further issues. You are a great help… :slight_smile: :slight_smile:

Thanks for the info @Satanik_Mitra! In that case, I’d definitely recommend checking out the batching section of the demo. Good luck and let us know if you have any other difficulties!

1 Like
steps = 10

for i in range(len(train_x)):
    _x1 = tf.Variable(train_x[i][0])
    _x2 = tf.Variable(train_x[i][1])
    _y = train_y[i]

    for step in range(steps):
        with tf.GradientTape() as tape:
            # execute the engine
            results = eng.run(circuit, args={**{"x1": _x1, "x2": _x2},
                                             **{f"theta{j}": _thetas[j] for j in range(8)}})
            prob1 = results.state.fock_prob([2, 0])
            prob2 = results.state.fock_prob([0, 2])
            loss1 = -tf.reduce_sum(prob1)
            loss2 = -tf.reduce_sum(prob2)

        if loss1 < loss2:
            gradients = tape.gradient(loss1, [_x1, _x2])
        else:
            gradients = tape.gradient(loss2, [_x1, _x2])

        opt.apply_gradients(zip(gradients, [_x1, _x2]))

Hi @Tom_Bromley, as I mentioned earlier, in my dataset I have two feature values for every data point. I am passing each data point through the circuit 10 times by setting steps = 10 and calculating the loss with loss1 and loss2. Likewise, I am feeding my entire training set through the circuit; this way I am getting the training accuracy as well. Next, I ran the validation set with the same approach; however, there I am not considering the loss part and predict the labels directly. In this classification approach I try to incorporate the method followed in "Quantum machine learning in feature Hilbert spaces".

Anyway, I am worried about my approach, as I am not feeding the entire training set in one go (that's why I am looping and feeding). Is it the right approach? Could you kindly suggest whether my approach to training and validation is correct or not, and if not, where I went wrong? I have tried the batching approach, but there I am getting predictions for only the first batch; the training is not iterated over the entire dataset. Hence, I followed the above approach.

Hi @Satanik_Mitra,

Thanks for sharing your updated approach! Feeding your data through the circuit 10 times sounds good. You can vary this number to see if it gives better predictions, certainly worth checking (as long as it doesn’t take too long to train!). Having a separate validation data set to check the trained model is also a good idea :slight_smile:

It may be useful to measure the validation loss too as this will give an idea of how well the trained model is generalising to unseen feature data. If you have enough data it can be useful to split a third “test” data set to check the direct predictions.
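As a tiny plain-Python illustration of that idea (the loss values below are made up), one quantity worth tracking is the gap between the mean validation loss and the mean training loss:

```python
def generalisation_gap(train_losses, val_losses):
    """Mean validation loss minus mean training loss; a gap that grows
    as training proceeds is a classic sign of overfitting."""
    mean_train = sum(train_losses) / len(train_losses)
    mean_val = sum(val_losses) / len(val_losses)
    return mean_val - mean_train

# hypothetical per-epoch losses
gap = generalisation_gap([0.40, 0.38, 0.35], [0.55, 0.57, 0.60])
```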

It’s difficult to comment on why your batching approach only predicted for the first batch without seeing your code (feel free to share!); however, the code you have provided is itself an example of batching :smiley: i.e. it breaks up a large data set and feeds smaller subsets of the data to the model bit by bit. Nice work!

Let us know if you have any other questions!

Thank you @Ant_Hayes for your reply. Yeah, you are correct that my approach is an extension of the batching approach, or you could say batching with batch_size=1 :slight_smile:. However, please have a look at the chunk below -

_x1 = tf.Variable(train_x[:,0])
_x2 = tf.Variable(train_x[:,1])
_y = train_y[:]

for step in range(steps):
    with tf.GradientTape() as tape:
        results = eng.run(circuit, args={**{"x1": _x1[:6], "x2": _x2[:6]},**{f"theta{i}": _thetas[i] for i in range(8)}})

The above is how I tried to execute the batching approach, taking batch_size=6.

Hi @Satanik_Mitra,

Thanks for sharing your batching code! It may be the case that you need to include the batch size in the backend options when defining the engine:

batch_size = 6
eng = sf.Engine("tf", backend_options={"batch_size": batch_size})

for step in range(steps):
    with tf.GradientTape() as tape:
        results = eng.run(circuit, args={**{"x1": _x1, "x2": _x2}, **{f"theta{i}": _thetas[i] for i in range(8)}})

This way you won’t need to index the features when running the engine, which may be why the model was only training on the first batch.

Alternatively, expanding on your method:

batches = np.arange(0, len(_x1) + 1, 6).tolist()  # batch boundaries: multiples of 6 up to and including len(_x1)
for step in range(steps):
    for i in range(len(batches) - 1):
        start, end = batches[i], batches[i + 1]  # slice indices for this batch of the feature data
        with tf.GradientTape() as tape:
            results = eng.run(circuit, args={**{"x1": _x1[start:end], "x2": _x2[start:end]},
                                             **{f"theta{j}": _thetas[j] for j in range(8)}})

This would iterate over the whole data set in batches of size 6. Note that this assumes len(_x1) % 6 == 0; otherwise some data points would be missed. Also note that these examples haven’t been tested (since we don’t have your data), so please take this as guidance rather than an exact solution :slight_smile:
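The batch-boundary logic can also be sanity-checked on its own in plain Python, using a hypothetical data length of 18:

```python
import numpy as np

batch_size = 6
n = 18  # hypothetical dataset length, assumed divisible by batch_size
bounds = np.arange(0, n + 1, batch_size).tolist()  # [0, 6, 12, 18]

# each consecutive pair of bounds gives the start/end slice indices of one batch
batches = [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]
covered = sum(end - start for start, end in batches)  # data points visited in total
```

A quick check like this confirms that every data point falls into exactly one batch before wiring the slices into the engine call.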

Hope this helps!

1 Like

Thanks a lot @Ant_Hayes. I will make the modifications in my code accordingly and let you know if any issues arise :slight_smile:

No problem @Satanik_Mitra , let us know if you have any more questions!

Hi @Ant_Hayes, please check the circuit and theta values I am using for classification of my 2D dataset. However, the final accuracies on the training and testing sets suggest the model is overfitting. Can you please suggest anything?

Thetas -

thetas = circuit.params(*[f"theta{i}" for i in range(4)])                    
_thetas = tf.Variable(0.4 * tf.ones((4, batch_size)))


with circuit.context as q:
    Squeezed(sq, x1) | q[0]
    Squeezed(sq, x2) | q[1]
    BSgate(thetas[0], thetas[1]) | (q[0], q[1])
    Pgate(thetas[2]) | q[0]
    Pgate(thetas[2]) | q[1]
    Vgate(thetas[3]) | q[0]
    Vgate(thetas[3]) | q[1]

Training execution

with tf.GradientTape() as tape:
  results = eng.run(circuit, args={**{"x1": _x1, "x2": _x2},**{f"theta{i}": _thetas[i] for i in range(4)}})

Testing execution (I am checking it without theta values)

for j in range(len(test_x)):
    x1_val = tf.Variable(test_x[j][0])
    x2_val = tf.Variable(test_x[j][1])
    _y = test_y[j]

    validation = eng.run(circuit, args={"x1": x1_val, "x2": x2_val})

    prob1 = validation.state.fock_prob([2, 0])
    prob2 = validation.state.fock_prob([0, 2])

    m1 = tf.reduce_mean(prob1)
    m2 = tf.reduce_mean(prob2)

    p0 = m1 / (m1 + m2)
    p1 = m2 / (m1 + m2)

    if p0 > p1:
        label = 0
    else:
        label = 1



print(accuracy(labels, predictions))

print(accuracy(val_true, val_label))

Training accuracy: 0.58
Testing accuracy: 0.70

Please let me know where I went wrong, or how I can handle the overfitting.

Hi @Satanik_Mitra, your implementation using Strawberry Fields looks good! When it comes to handling overfitting there is no set method, but there are a couple of general solutions:

  1. Use more data points (if possible).

  2. Reduce the complexity of the model. The aim is to reduce the number of trainable parameters.

  3. Early stopping. Once the model’s training accuracy plateaus, stop the training.

The key is to prevent the model from “memorising” the training data so it can generalise to unseen data.

Your implementation looks sound, so at this point it is a matter of adding small variations to your approach in order to increase the accuracies. There’s no guarantee that the model will be able to hit 99.99% accuracy, but it is worth trying! 70% accuracy is a pretty good starting point!
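For example, early stopping (point 3 above) could be sketched like this in plain Python; the accuracy history and thresholds here are hypothetical:

```python
def early_stop_epoch(accuracies, patience=2, min_delta=1e-3):
    """Return the epoch at which to stop: the first epoch where accuracy
    has failed to improve by at least min_delta for `patience` epochs."""
    best = float("-inf")
    stale = 0
    for epoch, acc in enumerate(accuracies):
        if acc > best + min_delta:
            best = acc
            stale = 0
        else:
            stale += 1
        if stale >= patience:
            return epoch
    return len(accuracies) - 1  # never plateaued: train to the end

# hypothetical training-accuracy history; the plateau starts at epoch 3
history = [0.50, 0.56, 0.60, 0.601, 0.6005, 0.6008, 0.62]
stop_at = early_stop_epoch(history)
```

In your loop you would record the accuracy after each pass and break out once this condition fires, rather than always running the full number of steps.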

Thank you so much @Ant_Hayes for the motivating comment. I will try to fine tune the model as per your suggestions. :slight_smile:

Hey @Satanik_Mitra, no problem! You’re doing some great work here :smiley: feel free to share your findings!

@Ant_Hayes I have tried with increased data points and received marginally higher training accuracy than validation accuracy. However, I was trying with state.fock_prob([0,2]) and state.fock_prob([2,0]).
Is there any documentation available that discusses the state probabilities in detail? In my work I keep the cutoff dimension at 3, which will capture the 3 states |0>, |1> and |2>. Having two modes, do I need to try out all combinations of states? It would be great if I could be enlightened a little bit about the state probabilities.

Hi @Satanik_Mitra, glad to hear you’re seeing improvements (even if they’re marginal).

Yes, here is the documentation on fock_prob(), which outlines that the returned probability is the squared overlap of a multimode Fock state with the state of interest. So in your example, prob1 is the overlap of your trained state with the multimode Fock state that has the second excited state |2> in the first mode and the vacuum state in the second mode, and similarly for prob2. As the overlap (the probability) approaches 1, the trained state approaches the target multimode Fock state.
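To make the overlap concrete, here is a small NumPy illustration of the quantity fock_prob computes. This is not Strawberry Fields’ implementation, just the definition applied to a hypothetical two-mode pure state:

```python
import numpy as np

cutoff = 3  # Fock basis {|0>, |1>, |2>} per mode, matching cutoff_dim = 3

# hypothetical two-mode pure state, amplitudes indexed as psi[n1, n2]
psi = np.zeros((cutoff, cutoff), dtype=complex)
psi[2, 0] = np.sqrt(0.7)  # amplitude on |2, 0>
psi[0, 2] = np.sqrt(0.3)  # amplitude on |0, 2>

def fock_prob(psi, pattern):
    """Squared overlap |<n1, n2 | psi>|^2 with a multimode Fock state."""
    return abs(psi[tuple(pattern)]) ** 2

p1 = fock_prob(psi, [2, 0])  # ~0.7: overlap with |2, 0>
p2 = fock_prob(psi, [0, 2])  # ~0.3: overlap with |0, 2>
```

With cutoff 3 and two modes there are 9 possible photon-number patterns, so in principle you could inspect all of them, but only the patterns your classifier conditions on (here [2,0] and [0,2]) need to enter the loss.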

The cutoff_dim is simply a truncation of the Fock-space dimension, which would ideally be infinite for CV states. Increasing it can improve the accuracy of computations, so it would be worth trying! But note that this comes at the cost of memory usage.
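As a quick illustration of what truncation costs, here is a plain-Python sketch using a hypothetical coherent state, whose Fock-basis probabilities are Poissonian:

```python
import math

alpha = 1.0  # hypothetical coherent-state amplitude

def captured_norm(cutoff):
    """Total probability retained when the Fock space is truncated at `cutoff`
    (a coherent state has Poissonian photon-number statistics)."""
    return sum(math.exp(-abs(alpha) ** 2) * abs(alpha) ** (2 * n) / math.factorial(n)
               for n in range(cutoff))

low = captured_norm(3)    # keeps only |0>, |1>, |2>: roughly 92% of the state
high = captured_norm(10)  # a much better approximation, at higher memory cost
```

The same trade-off applies to the squeezed states in your circuit: a larger cutoff_dim captures more of the state’s norm, at the price of the backend storing larger tensors.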

Let us know if you have any more questions :slight_smile: