CV Convolutional Neural Network

Hi,

I am trying to implement a Continuous-Variable version of a Convolutional Neural Network. I couldn’t find any example of this on the website or Github. I am using the same ansatz as in the article Killoran, 2018, which is a periodic 1D convolution that can be represented by a circulant matrix but am encountering some issues.

My idea is this: every circulant matrix can be diagonalized with the Fourier matrix like so: C=1/n F diag(Fc) F*, where c is the first row of the circulant matrix. Therefore, I should be able to calculate the convolution with two interferometers, one for each Fourier matrix and a complex scaling using the squeezing and rotation gates as below (where x is the input vector, r=-log(abs(Fc)) and phi_rot is the argument of Fc):

dev = device("strawberryfields.fock",wires=M, cutoff_dim=10)

@qnode(dev)
def convolution(x=None, phi_x=None, FH=None, F=None, r=None, phi_r=None, phi_rot=None):
    M = len(x)
    wires = [wire for wire in range(M)]

    broadcast(unitary=Displacement, pattern="single", wires=wires, parameters=list(zip(x, phi_x)))
    Interferometer(U1, wires=wires)
    broadcast(unitary=Squeezing, pattern="single", wires=wires, parameters=list(zip(r, phi_r)))
    broadcast(unitary=Rotation, pattern="single", wires=wires, parameters=list(phi_rot))
    Interferometer(U2, wires=wires)

    return (expval(X(0)),expval(X(1)),expval(X(2)),expval(X(3)),)

The problem with this method arises in the squeezing gate, when the real scaling with abs(Fc) are large and lead to what I think are numerical errors. Is there a way to minimize these errors in scaling or is there a more clever way to achieve this transformation?

Any help would be greatly appreciated!

Hey @martin,

Looks like an interesting idea!

I managed to run the code you shared - so is the problem that the output vector from the QNode is not what you are expecting? You’re right that increasing squeezing can lead to numerical errors - it is a trade-off between the cutoff and the amount of squeezing, since that increases the photon number. I’d be curious to see if you get the right answer if you try bumping up the cutoff, e.g., a cutoff of 100 for 2 modes or maybe 25 for 3 modes. In general such a high cutoff will be memory intensive, but it’s ok for a quick test.

If your problem does turn out to be due to squeezing causing truncation effects, perhaps you could place a regularizer on the squeezing strength during training? This is something we’ve done before as you can see for example here with the clipping (note that this is an old Strawberry Fields script).

Also, just so we’re on the same page, this is the code I was running:

import pennylane as qml
import numpy as np

M = 3
dev = qml.device("strawberryfields.fock", wires=M, cutoff_dim=25)

@qml.qnode(dev)
def convolution(x=None, phi_x=None, U1=None, U2=None, r=None, phi_r=None, phi_rot=None):
    M = len(x)
    wires = [wire for wire in range(M)]

    qml.broadcast(unitary=qml.Displacement, pattern="single", wires=wires, parameters=list(zip(x, phi_x)))

    qml.Interferometer(U1, wires=wires)

    qml.broadcast(unitary=qml.Squeezing, pattern="single", wires=wires, parameters=list(zip(r, phi_r)))

    qml.broadcast(unitary=qml.Rotation, pattern="single", wires=wires, parameters=list(phi_rot))

    qml.Interferometer(U2, wires=wires)

    return [qml.expval(qml.X(i)) for i in range(M)]

x = np.random.random(M)
phi_x = np.random.random(M)
r = np.random.random(M)
phi_r = np.random.random(M)
phi_rot = np.random.random(M)

U1 = np.eye(M)
U2 = np.eye(M)

convolution(x=x, phi_x=phi_x, U1=U1, U2=U2, r=r, phi_r=phi_r, phi_rot=phi_rot)

Thanks!

2 Likes

Hi Tom,

Thanks for your answer!

I tried to increase the cutoff dimension, but because in order for the CNN to be interesting it has to be beyond 4 wires, whereby the amount of memory needed quickly became infeasible. In the end I have taken a different approach.

The code I’m running is slightly different from yours in that it contains certain constraints in order to preserve the circulant property.

The problem with trying to bring scaling close to 1 is that the scaling amount in my case is the absolute value of the discrete Fourier transform of the first row of the circulant matrix, or s=abs(Fc), where c is the first row. But if it has a first row like c=[c1, c2, 0, … ,0] then I only want to vary c1 and c2. The question now becomes, how can I vary those values while keeping the resulting scaling close to 1 for all wires.

It turns out this is quite simple if |c1| is close to 0 and |c2| close to 1 and vice versa as per the following plot which shows the infinity norm of s-[1,1,1,1], where s is the scaling vector.
inf_error_convolution_1_squeeze

So in principle using this circuit could work for particular convolutions. This constraint might be too restrictive, however.

1 Like

UPDATE: the approach below didn’t work, it didn’t lead to less errors. However, fixing |c1| at 0 and |c2| at 1 did decrease the error.

Therefore a different approach is to split up the scaling into several squeezing gates. F.ex. 3, each scaling by the cubic root of the actual scaling. The next plot shows inf_norm(cbrt(s)-[1,1,1,1])^3, where cbrt is the cubic root.

inf_error_convolution_3_squeeze

This shows the propagation of the infinity norm. I am not entirely sure, this is a good way to measure the error propagation, but it seems that using several smaller squeezing gates lowers the error. This would also lead to a computational overhead but probably lower than raising the cutoff dimension.

1 Like

Thanks @martin, some interesting updates!

Just wanted to share some things that came to mind when reading:

  • The reason we use a Fock-based device is to allow for non-Gaussianity in the circuit such as photon-number input states, non-Gaussian gates (e.g. qml.Kerr) or photon-number counting measurements. The circuit you are currently using is all Gaussian, and should be simulable using dev = qml.device('strawberryfields.gaussian', wires=M). This device does not have a cutoff dimension and is in general much quicker to simulate (it simply transforms multidimensional Gaussian distributions). However, of course it has constraints on the states, gates, and observables that it can simulate.

  • One way we’ve used before to keep track of the truncation error is to calculate the norm of the truncated state. One can also add a constraint to the cost function to ensure that the norm remains close to one. This approach is a bit more implicit than controlling the amount of squeezing directly, but it might help!