Making quanvolutional neural net weights trainable

Hi, I am trying to make the weights of the network in this tutorial trainable: https://pennylane.ai/qml/demos/tutorial_quanvolution.html
using this layer: https://pennylane.readthedocs.io/en/stable/code/api/pennylane.qnn.KerasLayer.html?highlight=keraslayer#pennylane.qnn.KerasLayer
However, I am unsure what arguments I should pass for the weight shapes and output dimensions. Can I please get some help?

Thank you

Hey @vijpandaturtle!

Having a look at the tutorial, is this the part you are trying to convert into a KerasLayer:

dev = qml.device("default.qubit", wires=4)
# Random circuit parameters
rand_params = np.random.uniform(high=2 * np.pi, size=(n_layers, 4))

@qml.qnode(dev)
def circuit(phi=None):
    # Encoding of 4 classical input values
    for j in range(4):
        qml.RY(np.pi * phi[j], wires=j)

    # Random quantum circuit
    RandomLayers(rand_params, wires=list(range(4)))

    # Measurement producing 4 classical output values
    return [qml.expval(qml.PauliZ(j)) for j in range(4)]

(feel free to share any code you have and we can take a closer look)

For the above, you first need to update the signature of circuit() so that the phi argument is renamed to inputs, since KerasLayer requires an argument of this name for passing in the input data. You should also remove the =None part so that the gradient with respect to the input data is accessible. You can then add rand_params as a second argument to circuit.
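
Concretely, the updated QNode could look something like this (just a sketch, with n_layers defined as in the tutorial):

dev = qml.device("default.qubit", wires=4)

@qml.qnode(dev)
def circuit(inputs, rand_params):
    # Encoding of 4 classical input values
    for j in range(4):
        qml.RY(np.pi * inputs[j], wires=j)

    # The (formerly fixed) random layers; their parameters are now supplied and trained by KerasLayer
    RandomLayers(rand_params, wires=list(range(4)))

    # Measurement producing 4 classical output values
    return [qml.expval(qml.PauliZ(j)) for j in range(4)]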

Then, we need to tell KerasLayer the shape of rand_params so they can be initialized within KerasLayer. To do this, you can define:

weight_shapes = {"rand_params": (n_layers, 4)}

It is then simply a case of running

qml.qnn.KerasLayer(circuit, weight_shapes, output_dim=4)

to convert to a Keras-compatible layer.

Hope this helps!
Tom


Thank you for the reply @Tom_Bromley
The problem with using a Keras layer just for the circuit is that I cannot include the actual convolution function (code pasted below) in the network:

def quanv(image):
    """Convolves the input image with many applications of the same quantum circuit."""
    out = np.zeros((14, 14, 4))

    # Loop over the coordinates of the top-left pixel of 2X2 squares
    for j in range(0, 28, 2):
        for k in range(0, 28, 2):
            # Process a squared 2x2 region of the image with a quantum circuit
            q_results = circuit(
                phi=[image[j, k, 0], image[j, k + 1, 0], image[j + 1, k, 0], image[j + 1, k + 1, 0]]
            )
            # Assign expectation values to different channels of the output pixel (j/2, k/2)
            for c in range(4):
                out[j // 2, k // 2, c] = q_results[c]
    return out

I even tried including the circuit logic inside the convolution function to see if it would work:

def quanv(inputs, conv_params):
    out = np.zeros((14, 14, 4))

    # Loop over the coordinates of the top-left pixel of 2x2 squares
    for j in range(0, 28, 2):
        for k in range(0, 28, 2):
            # Process a squared 2x2 region of the image with a quantum circuit
            win = [inputs[j, k, 0], inputs[j, k + 1, 0], inputs[j + 1, k, 0], inputs[j + 1, k + 1, 0]]

            # Encoding of the 4 classical input values
            for i in range(4):
                qml.RY(np.pi * win[i], wires=i)
            RandomLayers(conv_params, wires=list(range(4)))
            q_results = [qml.expval(qml.PauliZ(i)) for i in range(4)]

            # Assign expectation values to different channels of the output pixel (j/2, k/2)
            for c in range(4):
                out[j // 2, k // 2, c] = q_results[c]
    return out

There may be some issues with the array shapes, but this is a rough sketch of what I’m trying to do. I’m probably going wrong in many places, so please do let me know :confused:

Hey @vijpandaturtle,

Ah ok I see, good question! Since the current KerasLayer doesn’t support applying this style of convolution, you could create your own Keras layer that does!

This layer could inherit from qml.qnn.KerasLayer and edit the call() method to apply the convolution. I had a quick go at doing this to give you an idea:

import tensorflow as tf

class ConvQLayer(qml.qnn.KerasLayer):
    
    def call(self, inputs):
        
        batches = inputs.shape[0]
        out = tf.Variable(tf.zeros((batches, 14, 14, 4)))
        
        # Loop over the coordinates of the top-left pixel of 2X2 squares
        for j in range(0, 28, 2):
            for k in range(0, 28, 2):
                # Process a squared 2x2 region of the image with a quantum circuit
                qnode_inputs = tf.stack([inputs[:, j, k, 0], inputs[:, j, k + 1, 0], inputs[:, j + 1, k, 0], inputs[:, j + 1, k + 1, 0]], axis=1)
                q_results = super().call(qnode_inputs)

                out[:, j // 2, k // 2].assign(q_results)

        return out

qlayer_conv = ConvQLayer(circuit, weight_shapes, output_dim=4)
batches = 2
inputs = np.random.random((batches, 28, 28, 3))
out = qlayer_conv(inputs)

This probably needs a lot more work to double check that it’s functioning as expected, but this would be the general idea!

Thanks,
Tom


Thank you so much for that !! :grinning:


@Tom_Bromley I am using the same code snippet and it gives me a warning: “gradients do not exist”. I have set the argument name to ‘inputs’ and removed the ‘None’ assigned to it. Is there anything else that needs to be done to make the gradients accessible?
Thanks

Hi @vijpandaturtle,

Could you help out by posting the code snippet that results in the warnings? From the PennyLane side, the previous suggestions by Tom should suffice for making the gradients accessible. These warnings seem to be specific to TensorFlow and, in certain cases, still have no specific resolution. Having said that, perhaps we could uncover something specific to the example. :slightly_smiling_face:

@antalszava the gradient error seems to go away once I updated the TensorFlow version. However, I am still getting errors regarding matrix compatibility:

InvalidArgumentError: Matrix size-incompatible: In[0]: [4,784], In[1]: [4,10] [Op:MatMul]

How does the Keras layer handle batch processing? I think that might be the reason for this issue.

batches = inputs.shape[0]
out = tf.Variable(tf.zeros((batches, 14, 14, 4)))
        

Hi @vijpandaturtle,

That’s great news!

A single KerasLayer object processes a single batch of inputs at a time. An example of integrating KerasLayer with a tf.keras.models.Sequential model, with the batch size specified for the model as a whole, can be found in the Additional example section of the KerasLayer documentation.
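
For reference, the standard (non-convolutional) pattern is roughly the following (a sketch; circuit and weight_shapes are the ones you defined, and the Dense layer is just a placeholder):

import tensorflow as tf

qlayer = qml.qnn.KerasLayer(circuit, weight_shapes, output_dim=4)

model = tf.keras.models.Sequential([
    qlayer,
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Dummy data with 4 features per sample, matching the 4 circuit inputs
features = np.random.random((20, 4))
labels = np.random.randint(0, 10, size=20)

# The batch size is specified when fitting the model as a whole
model.fit(features, labels, batch_size=4, epochs=2)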

For the InvalidArgumentError it indeed seems like there is something going on with the shapes, which could be related to batching (the second component of [4,784] for example seems to be the product of 14, 14, 4).

Unfortunately, it is challenging to see what exactly could be going wrong without the complete code snippet that yields the error. Could you please post it?

Thank you for the reply @antalszava here are the code snippets

class ConvQLayer(qml.qnn.KerasLayer):
    
    def call(self, inputs):
        
        batches = inputs.shape[0]
        out = tf.Variable(tf.zeros((batches, 14, 14, 4)))
        
        # Loop over the coordinates of the top-left pixel of 2X2 squares
        for j in range(0, 28, 2):
            for k in range(0, 28, 2):
                # Process a squared 2x2 region of the image with a quantum circuit
                qnode_inputs = tf.stack([inputs[:, j, k, 0], inputs[:, j, k + 1, 0], inputs[:, j + 1, k, 0], inputs[:, j + 1, k + 1, 0]], axis=1)
                q_results = super().call(qnode_inputs)
                out[:, j // 2, k // 2, :].assign(q_results)
        return out

weight_shapes = {'conv_params': (n_layers,4)}

def MyModel():
    """Initializes and returns a custom Keras model
    which is ready to be trained."""
    model = keras.models.Sequential([
        ConvQLayer(circuit, weight_shapes, output_dim=4),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax")
    ])

    model.compile(
        optimizer='adam',
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

q_model = MyModel()

q_history = q_model.fit(
    train_images,
    train_labels,
    validation_data=(test_images, test_labels),
    batch_size=4,
    epochs=n_epochs,
    verbose=2,
)


Hey @vijpandaturtle,

I had another look just now and also had a problem with accessing the gradient. Unfortunately, it wasn’t clear to me what the problem was :thinking: This will likely take some troubleshooting before we work out the issue. Remember that KerasLayer does not support convolution, so this is definitely a more “research”-level question. If you’re interested in looking deeper into this and providing some feedback, that would be great; otherwise I’d recommend following the established style in the tutorial.
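
If you do end up digging into it, one quick (untested) way to check whether any gradient reaches the ConvQLayer weights in isolation, outside of model.fit, would be something like:

import tensorflow as tf

qlayer_conv = ConvQLayer(circuit, weight_shapes, output_dim=4)

# Two random 28x28 images; only channel 0 is read by the layer
inputs = tf.random.uniform((2, 28, 28, 1))

with tf.GradientTape() as tape:
    out = qlayer_conv(inputs)
    loss = tf.reduce_sum(out)

# Any None entries here mean no gradient is flowing to those weights
grads = tape.gradient(loss, qlayer_conv.trainable_weights)
print(grads)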

@Tom_Bromley thank you for the reply. Yes, I will look into it a little deeper this time. However, do let me know if you find something :slight_smile:


Hi @antalszava,

I have faced the same issue. Have you found a fix for this? It would be very helpful. Thanks!

Hey @Emmanuel_OM, welcome to the forum!

Could you elaborate on the issue you’re facing? It can also be useful to share your complete code so that we can troubleshoot the issue more directly.

Thanks!

Thank you for the reply @Tom_Bromley. I am using the code snippets that @vijpandaturtle posted above. The only difference is the optimizer and loss function (used for a binary classification problem):

def MyModel():
    """Initializes and returns a custom Keras model
    which is ready to be trained."""
    model = keras.models.Sequential([
        ConvQLayer(circuit, weight_shapes, output_dim=4),
        keras.layers.Flatten(),
        keras.layers.Dense(1, activation="sigmoid")
    ])

    model.compile(
        optimizer='adam',
        loss="binary_crossentropy",
        metrics=["binary_accuracy"],
    )
    return model

Thanks @Emmanuel_OM. From my side I didn’t have any breakthrough with getting the gradient from ConvQLayer to be accessible. Not sure if @vijpandaturtle had any luck?

Unfortunately, this approach to a quantum convolutional layer does not work out of the box with KerasLayer and would require some careful work to get the gradients properly accessible. For now, I’d recommend using the core PennyLane approach (i.e., without KerasLayer) that is used in the tutorial.

Thanks!

Hi there!

We’ve implemented a trainable QuanvolutionLayer for PyTorch:

class QonvLayer(nn.Module):
    def __init__(self, stride=2, device="default.qubit", wires=4, circuit_layers=4, n_rotations=8, out_channels=4, seed=None):
        super(QonvLayer, self).__init__()

        # init device
        self.wires = wires
        self.dev = qml.device(device, wires=self.wires)

        self.stride = stride
        self.out_channels = min(out_channels, wires)

        if seed is None:
            seed = np.random.randint(low=0, high=10e6)

        print("Initializing Circuit with random seed", seed)

        # random circuits
        @qml.qnode(device=self.dev)
        def circuit(inputs, weights):
            n_inputs = 4
            # Encoding of 4 classical input values
            for j in range(n_inputs):
                qml.RY(inputs[j], wires=j)
            # Random quantum circuit
            RandomLayers(weights, wires=list(range(self.wires)), seed=seed)

            # Measurement producing 4 classical output values
            return [qml.expval(qml.PauliZ(j)) for j in range(self.out_channels)]

        weight_shapes = {"weights": [circuit_layers, n_rotations]}
        self.circuit = qml.qnn.TorchLayer(circuit, weight_shapes=weight_shapes)

    def draw(self):
        # build circuit by sending dummy data through it
        _ = self.circuit(inputs=torch.from_numpy(np.zeros(4)))
        print(self.circuit.qnode.draw())
        self.circuit.zero_grad()

    def forward(self, img):
        bs, h, w, ch = img.size()
        if ch > 1:
            img = img.mean(axis=-1).reshape(bs, h, w, 1)

        kernel_size = 2
        h_out = (h - kernel_size) // self.stride + 1
        w_out = (w - kernel_size) // self.stride + 1

        out = torch.zeros((bs, h_out, w_out, self.out_channels))

        # Loop over the coordinates of the top-left pixel of 2X2 squares
        for b in range(bs):
            for j in range(0, h_out, self.stride):
                for k in range(0, w_out, self.stride):
                    # Process a squared 2x2 region of the image with a quantum circuit
                    q_results = self.circuit(
                        inputs=torch.Tensor([
                            img[b, j, k, 0],
                            img[b, j, k + 1, 0],
                            img[b, j + 1, k, 0],
                            img[b, j + 1, k + 1, 0]
                        ])
                    )
                    # Assign expectation values to different channels of the output pixel (j/2, k/2)
                    for c in range(self.out_channels):
                        out[b, j // kernel_size, k // kernel_size, c] = q_results[c]

        return out

Experiment I: Training with 1 Quanvolutional Layer

Net:

model = torch.nn.Sequential(
    QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
    torch.nn.Flatten(),
    torch.nn.Linear(in_features=14*14*4, out_features=10)
)

Training output:

Epoch: 0 	Step: 0 	Accuracy: 0.25 	Loss: 2.353778839111328
Gradients Layer 0
tensor([[-4.6585e-03, -1.5023e-01,  1.0962e-17,  3.5731e-18],
	[-4.6677e-03,  2.9001e-02, -9.6852e-19,  0.0000e+00]])
Current Circuit:
 0: ──RY(0.0)──RZ(0.397)───RZ(1.178)───────────────────────────── ⟨Z⟩ 
 1: ──RY(0.0)──RZ(4.088)──╭X──────────RZ(3.61)───╭X────────────── ⟨Z⟩ 
 2: ──RY(0.0)─────────────╰C──────────RY(4.173)──╰C──RY(5.072)─── ⟨Z⟩ 
 3: ──RY(0.0)──RZ(1.785)───RZ(5.903)───────────────────────────── ⟨Z⟩ 

---------------------------------------
Epoch: 0 	Step: 1 	Accuracy: 0.0 	Loss: 3.284860610961914
Gradients Layer 0
tensor([[-2.2039e-02, -4.9558e-01,  2.5899e-16, -1.4661e-16],
	[-1.2097e-02,  2.3364e-01,  1.9031e-16,  0.0000e+00]])
---------------------------------------
Epoch: 0 	Step: 2 	Accuracy: 0.25 	Loss: 2.0575411319732666
Gradients Layer 0
tensor([[-1.3089e-02, -9.6094e-02,  8.8986e-17,  9.1110e-17],
	[-7.3473e-03,  6.8553e-02,  6.4072e-18,  0.0000e+00]])
---------------------------------------
Epoch: 0 	Step: 3 	Accuracy: 0.25 	Loss: 3.791848659515381
Gradients Layer 0
tensor([[-8.5180e-02, -6.7926e-01,  2.9336e-16, -6.0067e-16],
	[-5.4367e-02,  3.8367e-01, -2.3463e-16,  0.0000e+00]])
---------------------------------------
Epoch: 0 	Step: 4 	Accuracy: 0.0 	Loss: 4.429379463195801
Gradients Layer 0
tensor([[-5.6071e-02, -9.5350e-01, -4.0188e-16, -5.2387e-16],
	[-4.3445e-02,  4.8110e-01,  8.5049e-18,  0.0000e+00]])
---------------------------------------
Epoch: 0 	Step: 5 	Accuracy: 0.0 	Loss: 2.415179967880249
Gradients Layer 0
tensor([[-3.7586e-02, -3.1990e-01,  6.1641e-19, -7.2152e-17],
	[-2.7385e-02,  1.1129e-01,  2.5546e-18,  0.0000e+00]])
Current Circuit:
 0: ──RY(0.0)──RZ(0.397)───RZ(1.178)──────────────────────────── ⟨Z⟩ 
 1: ──RY(0.0)──RZ(4.13)───╭X──────────RZ(3.653)──╭X───────────── ⟨Z⟩ 
 2: ──RY(0.0)─────────────╰C──────────RY(4.216)──╰C──RY(5.03)─── ⟨Z⟩ 
 3: ──RY(0.0)──RZ(1.785)───RZ(5.903)──────────────────────────── ⟨Z⟩ 

---------------------------------------
Epoch: 0 	Step: 6 	Accuracy: 0.25 	Loss: 2.0272059440612793
Gradients Layer 0
tensor([[-1.3096e-03, -1.6318e-01, -5.6946e-18,  1.0381e-17],
	[-2.0847e-03,  3.5787e-02, -5.7634e-18,  0.0000e+00]])
---------------------------------------
Epoch: 0 	Step: 7 	Accuracy: 0.25 	Loss: 3.111910820007324
Gradients Layer 0
tensor([[-4.7392e-02, -5.0553e-01,  1.0673e-16,  3.0912e-17],
	[-3.8066e-02,  2.5993e-01, -1.4139e-16,  0.0000e+00]])
---------------------------------------
Epoch: 0 	Step: 8 	Accuracy: 0.0 	Loss: 2.9227261543273926
Gradients Layer 0
tensor([[-5.8086e-02, -3.5329e-01,  3.0340e-17,  1.1894e-16],
	[-4.1156e-02,  1.4573e-01,  1.4530e-16,  0.0000e+00]])
---------------------------------------
Epoch: 0 	Step: 9 	Accuracy: 0.0 	Loss: 2.6818065643310547
Gradients Layer 0
tensor([[-1.1859e-01, -2.2195e-01,  8.9172e-18, -3.7483e-17],
	[-8.8680e-02,  2.1465e-01,  4.7048e-18,  0.0000e+00]])
---------------------------------------
Epoch: 0 	Step: 10 	Accuracy: 0.0 	Loss: 2.707582950592041
Gradients Layer 0
tensor([[-6.6730e-03, -3.5080e-01,  9.0117e-18, -9.0980e-19],
	[ 4.0098e-03,  6.0043e-02,  1.0396e-17,  0.0000e+00]])
Current Circuit:
 0: ──RY(0.0)──RZ(0.397)───RZ(1.178)───────────────────────────── ⟨Z⟩ 
 1: ──RY(0.0)──RZ(4.172)──╭X──────────RZ(3.695)──╭X────────────── ⟨Z⟩ 
 2: ──RY(0.0)─────────────╰C──────────RY(4.258)──╰C──RY(4.991)─── ⟨Z⟩ 
 3: ──RY(0.0)──RZ(1.785)───RZ(5.903)───────────────────────────── ⟨Z⟩ 

Training takes really long, but at least the net achieves ca. 70%-80% accuracy on MNIST.

Experiment II: Training with 2 Quanvolutional Layers

Net:

model = torch.nn.Sequential(
    QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
    QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
    torch.nn.Flatten(),
    torch.nn.Linear(in_features=7*7*4, out_features=10)
)

Training output:

Epoch: 0 	Step: 0 	Accuracy: 0.25 	Loss: 2.3005757331848145
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-9.2107e-03, -3.4147e-02, -9.6166e-03, -1.6073e-02],
	[-9.2107e-03, -3.4147e-02, -3.1461e-03,  7.0380e-18]])
Current Circuit Layer 0:
 0: ──RY(0.0)───RX(3.225)───RX(3.592)──RX(5.593)───RX(5.953)───────────────────────── ⟨Z⟩ 
 1: ──RY(0.0)──╭C───────────RX(2.63)───RY(4.176)──╭C───────────RX(2.43)──RY(2.163)─── ⟨Z⟩ 
 2: ──RY(0.0)──╰X──────────╭C─────────────────────╰X──────────╭C───────────────────── ⟨Z⟩ 
 3: ──RY(0.0)──────────────╰X─────────────────────────────────╰X───────────────────── ⟨Z⟩ 

Current Circuit Layer 1:
 0: ──RY(0.0)───RY(1.79)──RY(1.847)────────────────── ⟨Z⟩ 
 1: ──RY(0.0)───RY(2.06)──RY(1.588)────────────────── ⟨Z⟩ 
 2: ──RY(0.0)──╭X─────────RZ(2.124)──╭X──RZ(4.867)─── ⟨Z⟩ 
 3: ──RY(0.0)──╰C─────────RY(1.193)──╰C──RY(3.918)─── ⟨Z⟩ 

---------------------------------------
Epoch: 0 	Step: 1 	Accuracy: 0.0 	Loss: 2.500396251678467
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-4.7077e-02, -1.2101e-01, -2.1126e-01,  1.4875e-02],
	[-4.7077e-02, -1.2101e-01, -1.1339e-02, -5.2808e-18]])
---------------------------------------
Epoch: 0 	Step: 2 	Accuracy: 0.25 	Loss: 2.1083250045776367
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[ 1.0099e-01, -6.7789e-03,  1.0940e-01, -3.1570e-02],
	[ 1.0099e-01, -6.7789e-03, -5.4767e-03, -1.1648e-17]])
---------------------------------------
Epoch: 0 	Step: 3 	Accuracy: 0.25 	Loss: 2.5666348934173584
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-1.6059e-01, -1.4263e-01, -3.8268e-01,  6.0618e-02],
	[-1.6059e-01, -1.4263e-01, -1.1166e-02, -9.7793e-19]])
---------------------------------------
Epoch: 0 	Step: 4 	Accuracy: 0.0 	Loss: 2.981722593307495
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-4.0437e-01, -3.6849e-01, -6.9833e-01,  1.3052e-01],
	[-4.0437e-01, -3.6849e-01, -6.2979e-03,  1.3407e-17]])
---------------------------------------
Epoch: 0 	Step: 5 	Accuracy: 0.25 	Loss: 2.1014046669006348
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[ 1.9919e-03,  1.2177e-02, -2.0729e-02, -1.3597e-02],
	[ 1.9919e-03,  1.2177e-02, -8.4981e-03, -2.5948e-19]])
Current Circuit Layer 0:
 0: ──RY(0.0)───RX(3.225)───RX(3.592)──RX(5.593)───RX(5.953)───────────────────────── ⟨Z⟩ 
 1: ──RY(0.0)──╭C───────────RX(2.63)───RY(4.176)──╭C───────────RX(2.43)──RY(2.163)─── ⟨Z⟩ 
 2: ──RY(0.0)──╰X──────────╭C─────────────────────╰X──────────╭C───────────────────── ⟨Z⟩ 
 3: ──RY(0.0)──────────────╰X─────────────────────────────────╰X───────────────────── ⟨Z⟩ 

Current Circuit Layer 1:
 0: ──RY(0.0)───RY(1.81)───RY(1.867)────────────────── ⟨Z⟩ 
 1: ──RY(0.0)───RY(2.099)──RY(1.627)────────────────── ⟨Z⟩ 
 2: ──RY(0.0)──╭X──────────RZ(2.116)──╭X──RZ(4.867)─── ⟨Z⟩ 
 3: ──RY(0.0)──╰C──────────RY(1.223)──╰C──RY(3.964)─── ⟨Z⟩ 

---------------------------------------
Epoch: 0 	Step: 6 	Accuracy: 0.25 	Loss: 2.009097099304199
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[ 5.2628e-02,  2.8042e-02,  1.7530e-01, -4.5965e-02],
	[ 5.2628e-02,  2.8042e-02, -2.1108e-03, -8.3491e-18]])
---------------------------------------
Epoch: 0 	Step: 7 	Accuracy: 0.0 	Loss: 2.7671358585357666
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-4.1022e-01, -2.2605e-01, -4.9493e-01,  1.0710e-01],
	[-4.1022e-01, -2.2605e-01,  1.3541e-02,  2.0550e-17]])
---------------------------------------
Epoch: 0 	Step: 8 	Accuracy: 0.0 	Loss: 2.595287799835205
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-2.7742e-01, -1.4687e-01, -4.3393e-01,  1.1318e-01],
	[-2.7742e-01, -1.4687e-01, -3.8294e-03,  1.2389e-17]])
---------------------------------------

As you can see, only Quanvolutional Layer 1 receives gradients. Layer 0 does not get any gradients and hence is not updated by the optimizer.

Now my question is: why? (i.e. What am I missing? What am I doing wrong? Or am I facing a bug?)
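
A minimal way to check this outside of the training loop would be something like the following (untested sketch; based on the logs above, I would expect the first print to show None and the second to show a tensor):

model = torch.nn.Sequential(
    QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
    QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
    torch.nn.Flatten(),
    torch.nn.Linear(in_features=7*7*4, out_features=10)
)

x = torch.rand(size=(2, 28, 28, 1))
y = torch.randint(low=0, high=10, size=(2,))

loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()

print(model[0].circuit.weights.grad)  # gradient of the first quanvolutional layer
print(model[1].circuit.weights.grad)  # gradient of the second quanvolutional layer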

Thanks in advance!
Denny

PS: we are using PyTorch 1.4.0 with PennyLane v0.12.0.


@dymat this is awesome! Thanks so much for posting this here.
Since you are training on the MNIST dataset, can you share a simple but complete snippet that I can run myself (if that’s okay with you)? I can have a look at it and see if I can find the source of your problem.
Again, thanks. I hit a roadblock on this, so I’m happy to see a way forward :slight_smile:

Hi @vijpandaturtle,

Thanks for your reply. Since new users (like me) cannot attach files to threads, I am posting it right here.

I hope you can help me find out why gradients are not propagated through the quantum layer.

Looking forward to hearing from you,
~dymat

UPDATED CODE (FIXES WRONG INDENTATIONS)

# coding: utf-8

# In[2]:


import torch
from torch import nn

import torchvision

import pennylane as qml
from pennylane import numpy as np
from pennylane.templates import RandomLayers

from sklearn.metrics import accuracy_score


# In[3]:


class QonvLayer(nn.Module):
    def __init__(self, stride=2, device="default.qubit", wires=4, circuit_layers=4, n_rotations=8, out_channels=4, seed=None):
        super(QonvLayer, self).__init__()

        # init device
        self.wires = wires
        self.dev = qml.device(device, wires=self.wires)

        self.stride = stride
        self.out_channels = min(out_channels, wires)

        if seed is None:
            seed = np.random.randint(low=0, high=10e6)

        print("Initializing Circuit with random seed", seed)

        # random circuits
        @qml.qnode(device=self.dev)
        def circuit(inputs, weights):
            n_inputs = 4
            # Encoding of 4 classical input values
            for j in range(n_inputs):
                qml.RY(inputs[j], wires=j)
            # Random quantum circuit
            RandomLayers(weights, wires=list(range(self.wires)), seed=seed)

            # Measurement producing 4 classical output values
            return [qml.expval(qml.PauliZ(j)) for j in range(self.out_channels)]

        weight_shapes = {"weights": [circuit_layers, n_rotations]}
        self.circuit = qml.qnn.TorchLayer(circuit, weight_shapes=weight_shapes)

    def draw(self):
        # build circuit by sending dummy data through it
        _ = self.circuit(inputs=torch.from_numpy(np.zeros(4)))
        print(self.circuit.qnode.draw())
        self.circuit.zero_grad()

    def forward(self, img):
        bs, h, w, ch = img.size()
        if ch > 1:
            img = img.mean(axis=-1).reshape(bs, h, w, 1)

        kernel_size = 2
        h_out = (h - kernel_size) // self.stride + 1
        w_out = (w - kernel_size) // self.stride + 1

        out = torch.zeros((bs, h_out, w_out, self.out_channels))

        # Loop over the coordinates of the top-left pixel of 2X2 squares
        for b in range(bs):
            for j in range(0, h_out, self.stride):
                for k in range(0, w_out, self.stride):
                    # Process a squared 2x2 region of the image with a quantum circuit
                    q_results = self.circuit(
                        inputs=torch.Tensor([
                            img[b, j, k, 0],
                            img[b, j, k + 1, 0],
                            img[b, j + 1, k, 0],
                            img[b, j + 1, k + 1, 0]
                        ])
                    )
                    # Assign expectation values to different channels of the output pixel (j/2, k/2)
                    for c in range(self.out_channels):
                        out[b, j // kernel_size, k // kernel_size, c] = q_results[c]

        return out


# In[16]:


qonv = QonvLayer(circuit_layers=2, n_rotations=4, out_channels=4, stride=2)
qonv.draw()
x = torch.rand(size=(10,28,28,1))
qonv(x).shape


# In[17]:


def transform(x):
    x = np.array(x)
    x = x/255.0
    
    return torch.from_numpy(x).float()


# In[18]:


train_set = torchvision.datasets.MNIST(root='./mnist', train=True, download=True, transform=transform)
test_set = torchvision.datasets.MNIST(root='./mnist', train=False, download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(dataset=train_set, batch_size=4)


# # Experiment I (one Quanvolutional Layer)

# In[49]:


def training_experiment_1():
    print("Starting Experiment I")

    model = torch.nn.Sequential(
        QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
        torch.nn.Flatten(),
        torch.nn.Linear(in_features=14*14*4, out_features=10)
    )

    model.train()

    optimizer = torch.optim.Adam(params=model.parameters(), lr=0.01)
    criterion = torch.nn.CrossEntropyLoss()

    for epoch in range(1):
        for i, (x, y) in enumerate(train_loader):

            # prepare inputs and labels
            x = x.view(-1, 28, 28, 1)
            y = y.long()

            # reset optimizer
            optimizer.zero_grad()

            # engage
            y_pred = model(x)

            # error, gradients and optimization
            loss = criterion(y_pred, y)
            loss.backward()
            optimizer.step()

            # output
            acc = accuracy_score(y, y_pred.argmax(-1).numpy())

            print("Epoch:", epoch, "\tStep:", i, "\tAccuracy:", acc, "\tLoss:", loss.item())
            print("Gradients Layer 0:")
            print(model[0].circuit.weights.grad)

            if i % 5 == 0:
                model[0].draw()

            print("---------------------------------------")

            # early break
            if i > 0 and i % 10 == 0:
                break

    return model


# # Experiment II (two stacked Quanvolutional Layers)

# In[48]:


def training_experiment_2():
    print("Starting Experiment II")

    model = torch.nn.Sequential(
        QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
        QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
        torch.nn.Flatten(),
        torch.nn.Linear(in_features=7*7*4, out_features=10)
    )

    model.train()

    optimizer = torch.optim.Adam(params=model.parameters(), lr=0.01)
    criterion = torch.nn.CrossEntropyLoss()

    for epoch in range(50):
        for i, (x, y) in enumerate(train_loader):

            # prepare inputs and labels
            x = x.view(-1, 28, 28, 1)
            y = y.long()

            # reset optimizer
            optimizer.zero_grad()

            # engage
            y_pred = model(x)

            # error, gradients and optimization
            loss = criterion(y_pred, y)
            loss.backward()
            optimizer.step()

            # output
            acc = accuracy_score(y, y_pred.argmax(-1).numpy())

            print("Epoch:", epoch, "\tStep:", i, "\tAccuracy:", acc, "\tLoss:", loss.item())
            print("Gradients Layer 0:")
            print(model[0].circuit.weights.grad)
            print("Gradients Layer 1:")
            print(model[1].circuit.weights.grad)

            if i % 5 == 0:
                print("Current Circuit Layer 0:")
                model[0].draw()
                print("Current Circuit Layer 1:")
                model[1].draw()

            print("---------------------------------------")

            # early break
            if i > 0 and i % 10 == 0:
                break

    return model


# In[ ]:


if __name__ == "__main__":
    training_experiment_1()
    training_experiment_2()

Hi @dymat, welcome, and thanks for your question!

I’d like to run your code and reproduce the results. The indentation doesn’t seem to have been preserved when it was pasted into the code block though. To make sure I am running it exactly as intended, could you please reformat it? Thanks very much!