Hi there!
We've implemented a trainable quanvolutional layer (QonvLayer) for PyTorch:
import numpy as np
import torch
import torch.nn as nn

import pennylane as qml
from pennylane.templates import RandomLayers


class QonvLayer(nn.Module):
    def __init__(self, stride=2, device="default.qubit", wires=4,
                 circuit_layers=4, n_rotations=8, out_channels=4, seed=None):
        super(QonvLayer, self).__init__()
        # init device
        self.wires = wires
        self.dev = qml.device(device, wires=self.wires)
        self.stride = stride
        self.out_channels = min(out_channels, wires)

        if seed is None:
            seed = np.random.randint(low=0, high=int(10e6))
        print("Initializing Circuit with random seed", seed)

        # random circuit
        @qml.qnode(device=self.dev)
        def circuit(inputs, weights):
            n_inputs = 4
            # Encoding of 4 classical input values
            for j in range(n_inputs):
                qml.RY(inputs[j], wires=j)
            # Random quantum circuit
            RandomLayers(weights, wires=list(range(self.wires)), seed=seed)
            # Measurement producing 4 classical output values
            return [qml.expval(qml.PauliZ(j)) for j in range(self.out_channels)]

        weight_shapes = {"weights": [circuit_layers, n_rotations]}
        self.circuit = qml.qnn.TorchLayer(circuit, weight_shapes=weight_shapes)

    def draw(self):
        # build the circuit by sending dummy data through it
        _ = self.circuit(inputs=torch.from_numpy(np.zeros(4)))
        print(self.circuit.qnode.draw())
        self.circuit.zero_grad()

    def forward(self, img):
        bs, h, w, ch = img.size()
        if ch > 1:
            img = img.mean(axis=-1).reshape(bs, h, w, 1)

        kernel_size = 2
        h_out = (h - kernel_size) // self.stride + 1
        w_out = (w - kernel_size) // self.stride + 1

        out = torch.zeros((bs, h_out, w_out, self.out_channels))

        # Loop over the coordinates of the top-left pixel of each 2x2 square
        for b in range(bs):
            for j in range(0, h - kernel_size + 1, self.stride):
                for k in range(0, w - kernel_size + 1, self.stride):
                    # Process a 2x2 region of the image with the quantum circuit
                    q_results = self.circuit(
                        inputs=torch.Tensor([
                            img[b, j, k, 0],
                            img[b, j, k + 1, 0],
                            img[b, j + 1, k, 0],
                            img[b, j + 1, k + 1, 0]
                        ])
                    )
                    # Assign the expectation values to the channels
                    # of output pixel (j // stride, k // stride)
                    for c in range(self.out_channels):
                        out[b, j // self.stride, k // self.stride, c] = q_results[c]

        return out
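As a quick standalone sanity check of the sliding-window arithmetic in forward() (pure Python, no torch or PennyLane required; the 28x28 input size is just the MNIST assumption):

```python
# Sliding-window arithmetic of forward() for a 28x28 image,
# 2x2 kernel, stride 2 (pure Python re-computation, no torch needed).
h = w = 28
kernel_size, stride, out_channels = 2, 2, 4

h_out = (h - kernel_size) // stride + 1
w_out = (w - kernel_size) // stride + 1

# top-left corner of every 2x2 patch and the output pixel it maps to
patches = [((j, k), (j // stride, k // stride))
           for j in range(0, h - kernel_size + 1, stride)
           for k in range(0, w - kernel_size + 1, stride)]

in_features = h_out * w_out * out_channels
print(h_out, w_out, len(patches), in_features)  # 14 14 196 784
```

The 784 here is where the in_features=14*14*4 of the Linear layer in Experiment I below comes from.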
Experiment I: Training with 1 Quanvolutional Layer
Net:
model = torch.nn.Sequential(
    QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
    torch.nn.Flatten(),
    torch.nn.Linear(in_features=14*14*4, out_features=10)
)
Training output:
Epoch: 0 Step: 0 Accuracy: 0.25 Loss: 2.353778839111328
Gradients Layer 0
tensor([[-4.6585e-03, -1.5023e-01, 1.0962e-17, 3.5731e-18],
[-4.6677e-03, 2.9001e-02, -9.6852e-19, 0.0000e+00]])
Current Circuit:
0: ──RY(0.0)──RZ(0.397)──RZ(1.178)───────────────────────────┤ ⟨Z⟩
1: ──RY(0.0)──RZ(4.088)──╭X─────────RZ(3.61)───╭X────────────┤ ⟨Z⟩
2: ──RY(0.0)─────────────╰C─────────RY(4.173)──╰C──RY(5.072)─┤ ⟨Z⟩
3: ──RY(0.0)──RZ(1.785)──RZ(5.903)───────────────────────────┤ ⟨Z⟩
---------------------------------------
Epoch: 0 Step: 1 Accuracy: 0.0 Loss: 3.284860610961914
Gradients Layer 0
tensor([[-2.2039e-02, -4.9558e-01, 2.5899e-16, -1.4661e-16],
[-1.2097e-02, 2.3364e-01, 1.9031e-16, 0.0000e+00]])
---------------------------------------
Epoch: 0 Step: 2 Accuracy: 0.25 Loss: 2.0575411319732666
Gradients Layer 0
tensor([[-1.3089e-02, -9.6094e-02, 8.8986e-17, 9.1110e-17],
[-7.3473e-03, 6.8553e-02, 6.4072e-18, 0.0000e+00]])
---------------------------------------
Epoch: 0 Step: 3 Accuracy: 0.25 Loss: 3.791848659515381
Gradients Layer 0
tensor([[-8.5180e-02, -6.7926e-01, 2.9336e-16, -6.0067e-16],
[-5.4367e-02, 3.8367e-01, -2.3463e-16, 0.0000e+00]])
---------------------------------------
Epoch: 0 Step: 4 Accuracy: 0.0 Loss: 4.429379463195801
Gradients Layer 0
tensor([[-5.6071e-02, -9.5350e-01, -4.0188e-16, -5.2387e-16],
[-4.3445e-02, 4.8110e-01, 8.5049e-18, 0.0000e+00]])
---------------------------------------
Epoch: 0 Step: 5 Accuracy: 0.0 Loss: 2.415179967880249
Gradients Layer 0
tensor([[-3.7586e-02, -3.1990e-01, 6.1641e-19, -7.2152e-17],
[-2.7385e-02, 1.1129e-01, 2.5546e-18, 0.0000e+00]])
Current Circuit:
0: ──RY(0.0)──RZ(0.397)──RZ(1.178)─────────────────────────┤ ⟨Z⟩
1: ──RY(0.0)──RZ(4.13)───╭X────────RZ(3.653)──╭X───────────┤ ⟨Z⟩
2: ──RY(0.0)─────────────╰C────────RY(4.216)──╰C──RY(5.03)─┤ ⟨Z⟩
3: ──RY(0.0)──RZ(1.785)──RZ(5.903)─────────────────────────┤ ⟨Z⟩
---------------------------------------
Epoch: 0 Step: 6 Accuracy: 0.25 Loss: 2.0272059440612793
Gradients Layer 0
tensor([[-1.3096e-03, -1.6318e-01, -5.6946e-18, 1.0381e-17],
[-2.0847e-03, 3.5787e-02, -5.7634e-18, 0.0000e+00]])
---------------------------------------
Epoch: 0 Step: 7 Accuracy: 0.25 Loss: 3.111910820007324
Gradients Layer 0
tensor([[-4.7392e-02, -5.0553e-01, 1.0673e-16, 3.0912e-17],
[-3.8066e-02, 2.5993e-01, -1.4139e-16, 0.0000e+00]])
---------------------------------------
Epoch: 0 Step: 8 Accuracy: 0.0 Loss: 2.9227261543273926
Gradients Layer 0
tensor([[-5.8086e-02, -3.5329e-01, 3.0340e-17, 1.1894e-16],
[-4.1156e-02, 1.4573e-01, 1.4530e-16, 0.0000e+00]])
---------------------------------------
Epoch: 0 Step: 9 Accuracy: 0.0 Loss: 2.6818065643310547
Gradients Layer 0
tensor([[-1.1859e-01, -2.2195e-01, 8.9172e-18, -3.7483e-17],
[-8.8680e-02, 2.1465e-01, 4.7048e-18, 0.0000e+00]])
---------------------------------------
Epoch: 0 Step: 10 Accuracy: 0.0 Loss: 2.707582950592041
Gradients Layer 0
tensor([[-6.6730e-03, -3.5080e-01, 9.0117e-18, -9.0980e-19],
[ 4.0098e-03, 6.0043e-02, 1.0396e-17, 0.0000e+00]])
Current Circuit:
0: ──RY(0.0)──RZ(0.397)──RZ(1.178)──────────────────────────┤ ⟨Z⟩
1: ──RY(0.0)──RZ(4.172)──╭X────────RZ(3.695)──╭X────────────┤ ⟨Z⟩
2: ──RY(0.0)─────────────╰C────────RY(4.258)──╰C──RY(4.991)─┤ ⟨Z⟩
3: ──RY(0.0)──RZ(1.785)──RZ(5.903)──────────────────────────┤ ⟨Z⟩
Training takes really long, but at least the net achieves roughly 70-80% accuracy on MNIST.
Experiment II: Training with 2 Quanvolutional Layers
Net:
model = torch.nn.Sequential(
    QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
    QonvLayer(stride=2, circuit_layers=2, n_rotations=4, out_channels=4),
    torch.nn.Flatten(),
    torch.nn.Linear(in_features=7*7*4, out_features=10)
)
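For reference, the 7*7*4 above can be derived by applying the output-size formula of QonvLayer twice (a small pure-Python check; the 28x28 input size is the MNIST assumption):

```python
# Spatial size after one QonvLayer: h -> (h - kernel_size) // stride + 1
def qonv_out(h, kernel_size=2, stride=2):
    return (h - kernel_size) // stride + 1

h0 = 28             # MNIST height/width
h1 = qonv_out(h0)   # after the first QonvLayer
h2 = qonv_out(h1)   # after the second QonvLayer
in_features = h2 * h2 * 4   # 4 output channels per layer
print(h1, h2, in_features)  # 14 7 196
```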
Training output:
Epoch: 0 Step: 0 Accuracy: 0.25 Loss: 2.3005757331848145
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-9.2107e-03, -3.4147e-02, -9.6166e-03, -1.6073e-02],
[-9.2107e-03, -3.4147e-02, -3.1461e-03, 7.0380e-18]])
Current Circuit Layer 0:
0: ──RY(0.0)──RX(3.225)──RX(3.592)──RX(5.593)──RX(5.953)─────────────────────┤ ⟨Z⟩
1: ──RY(0.0)──╭C─────────RX(2.63)───RY(4.176)──╭C─────────RX(2.43)──RY(2.163)┤ ⟨Z⟩
2: ──RY(0.0)──╰X──╭C───────────────────────────╰X──╭C────────────────────────┤ ⟨Z⟩
3: ──RY(0.0)──────╰X───────────────────────────────╰X────────────────────────┤ ⟨Z⟩
Current Circuit Layer 1:
0: ──RY(0.0)──RY(1.79)───RY(1.847)───────────────┤ ⟨Z⟩
1: ──RY(0.0)──RY(2.06)───RY(1.588)───────────────┤ ⟨Z⟩
2: ──RY(0.0)──╭X─────────RZ(2.124)──╭X──RZ(4.867)┤ ⟨Z⟩
3: ──RY(0.0)──╰C─────────RY(1.193)──╰C──RY(3.918)┤ ⟨Z⟩
---------------------------------------
Epoch: 0 Step: 1 Accuracy: 0.0 Loss: 2.500396251678467
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-4.7077e-02, -1.2101e-01, -2.1126e-01, 1.4875e-02],
[-4.7077e-02, -1.2101e-01, -1.1339e-02, -5.2808e-18]])
---------------------------------------
Epoch: 0 Step: 2 Accuracy: 0.25 Loss: 2.1083250045776367
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[ 1.0099e-01, -6.7789e-03, 1.0940e-01, -3.1570e-02],
[ 1.0099e-01, -6.7789e-03, -5.4767e-03, -1.1648e-17]])
---------------------------------------
Epoch: 0 Step: 3 Accuracy: 0.25 Loss: 2.5666348934173584
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-1.6059e-01, -1.4263e-01, -3.8268e-01, 6.0618e-02],
[-1.6059e-01, -1.4263e-01, -1.1166e-02, -9.7793e-19]])
---------------------------------------
Epoch: 0 Step: 4 Accuracy: 0.0 Loss: 2.981722593307495
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-4.0437e-01, -3.6849e-01, -6.9833e-01, 1.3052e-01],
[-4.0437e-01, -3.6849e-01, -6.2979e-03, 1.3407e-17]])
---------------------------------------
Epoch: 0 Step: 5 Accuracy: 0.25 Loss: 2.1014046669006348
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[ 1.9919e-03, 1.2177e-02, -2.0729e-02, -1.3597e-02],
[ 1.9919e-03, 1.2177e-02, -8.4981e-03, -2.5948e-19]])
Current Circuit Layer 0:
0: ──RY(0.0)──RX(3.225)──RX(3.592)──RX(5.593)──RX(5.953)─────────────────────┤ ⟨Z⟩
1: ──RY(0.0)──╭C─────────RX(2.63)───RY(4.176)──╭C─────────RX(2.43)──RY(2.163)┤ ⟨Z⟩
2: ──RY(0.0)──╰X──╭C───────────────────────────╰X──╭C────────────────────────┤ ⟨Z⟩
3: ──RY(0.0)──────╰X───────────────────────────────╰X────────────────────────┤ ⟨Z⟩
Current Circuit Layer 1:
0: ──RY(0.0)──RY(1.81)───RY(1.867)───────────────┤ ⟨Z⟩
1: ──RY(0.0)──RY(2.099)──RY(1.627)───────────────┤ ⟨Z⟩
2: ──RY(0.0)──╭X─────────RZ(2.116)──╭X──RZ(4.867)┤ ⟨Z⟩
3: ──RY(0.0)──╰C─────────RY(1.223)──╰C──RY(3.964)┤ ⟨Z⟩
---------------------------------------
Epoch: 0 Step: 6 Accuracy: 0.25 Loss: 2.009097099304199
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[ 5.2628e-02, 2.8042e-02, 1.7530e-01, -4.5965e-02],
[ 5.2628e-02, 2.8042e-02, -2.1108e-03, -8.3491e-18]])
---------------------------------------
Epoch: 0 Step: 7 Accuracy: 0.0 Loss: 2.7671358585357666
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-4.1022e-01, -2.2605e-01, -4.9493e-01, 1.0710e-01],
[-4.1022e-01, -2.2605e-01, 1.3541e-02, 2.0550e-17]])
---------------------------------------
Epoch: 0 Step: 8 Accuracy: 0.0 Loss: 2.595287799835205
Gradients Layer 0:
None
Gradients Layer 1:
tensor([[-2.7742e-01, -1.4687e-01, -4.3393e-01, 1.1318e-01],
[-2.7742e-01, -1.4687e-01, -3.8294e-03, 1.2389e-17]])
---------------------------------------
As you can see, only Quanvolutional Layer 1 receives gradients; Layer 0 never gets any and is therefore not updated by the optimizer.
Now my question is: why? What am I missing, what am I doing wrong, or am I facing a bug?
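One observation from our debugging that may or may not be related (this is just a guess on our side, not a confirmed diagnosis): inside forward() the four patch pixels are re-packed with torch.Tensor([...]). In a minimal torch-only experiment, that constructor returns a fresh leaf tensor with no autograd history, whereas torch.stack keeps the graph connected:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

# Re-packing the entries via the torch.Tensor constructor yields a
# brand-new leaf tensor: the autograd graph back to x is lost.
repacked = torch.Tensor([x[0, 0], x[0, 1], x[1, 0], x[1, 1]])
print(repacked.requires_grad)  # False

# torch.stack keeps the computation graph connected.
stacked = torch.stack([x[0, 0], x[0, 1], x[1, 0], x[1, 1]])
print(stacked.requires_grad)   # True

stacked.sum().backward()
print(x.grad)                  # a 2x2 tensor of ones
```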
Thanks in advance!
Denny
PS: we are using PyTorch 1.4.0 with PennyLane v0.12.0.