Hello! I ran into a peculiar issue while training a PennyLane circuit through Keras. I am building a simple QCNN-like circuit in which the two output qubits of the circuit act as the controls of Toffoli gates whose targets are four additional qubits. Here is my code:

```
import pennylane as qml
from pennylane import numpy as np
import logging
import tensorflow as tf
from pennylane.templates.embeddings import AmplitudeEmbedding

logging.getLogger('tensorflow').disabled = True
dev = qml.device("default.qubit", wires=8)


def kernel(params, wires):  # 4 params
    qml.RY(params[0], wires=wires[0])
    qml.RY(params[1], wires=wires[1])
    qml.CNOT(wires=[wires[1], wires[0]])
    qml.RY(params[2], wires=wires[0])
    qml.RY(params[3], wires=wires[1])
    qml.CNOT(wires=[wires[0], wires[1]])


def pooling(params, wires):  # 2 params
    qml.CRZ(params[0], wires=[wires[0], wires[1]])
    qml.PauliX(wires=wires[0])
    qml.CRX(params[1], wires=[wires[0], wires[1]])


@qml.qnode(dev, interface="tf")
def ancillary_qcnn_circuit(inputs, weights):
    kernel_size, pooling_size = 4, 2
    AmplitudeEmbedding(features=inputs, wires=range(4), normalize=True)
    # First convolution layer: the same 4 kernel weights are shared
    kernel(weights[:kernel_size], wires=[0, 1])
    kernel(weights[:kernel_size], wires=[2, 3])
    kernel(weights[:kernel_size], wires=[3, 0])
    kernel(weights[:kernel_size], wires=[1, 2])
    # Pooling layer: the same 2 pooling weights are shared
    pooling(weights[kernel_size:kernel_size + pooling_size], wires=[1, 0])
    pooling(weights[kernel_size:kernel_size + pooling_size], wires=[3, 2])
    # Second convolution layer on the two remaining qubits
    kernel(weights[kernel_size + pooling_size:kernel_size * 2 + pooling_size], wires=[0, 2])
    # Commenting out any one of these four Toffoli gates makes the code run
    qml.Toffoli(wires=[0, 2, 4])
    qml.Toffoli(wires=[0, 2, 5])
    qml.Toffoli(wires=[0, 2, 6])
    qml.Toffoli(wires=[0, 2, 7])
    return [qml.expval(qml.PauliZ(i)) for i in range(4, 8)]


x_train = np.random.rand(100, 16)
y_train = np.random.rand(100, 4)
qcnnlayer = qml.qnn.KerasLayer(ancillary_qcnn_circuit, {"weights": 10}, output_dim=4)
model = tf.keras.Sequential([qcnnlayer])
opt = tf.keras.optimizers.Adam(learning_rate=0.002)
model.compile(opt, loss=tf.keras.losses.MeanSquaredError())
model.fit(x_train, y_train, epochs=50, batch_size=50)
```
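To be explicit about what I expect each Toffoli to do: the target should flip only when both control qubits (wires 0 and 2) are in state 1. The sketch below is plain NumPy and independent of PennyLane. The helper names `toffoli_matrix` and `apply_to_basis_state` are my own, and it assumes the first wire is the most significant bit.

```python
import numpy as np


def toffoli_matrix():
    """8x8 Toffoli matrix in the basis |c1 c2 t>, with c1 as the most significant bit."""
    mat = np.eye(8)
    # Swap the rows for |110> (index 6) and |111> (index 7):
    # the target bit flips only when both controls are 1.
    mat[[6, 7]] = mat[[7, 6]]
    return mat


def apply_to_basis_state(bits):
    """Apply the Toffoli to a computational basis state given as a bit string, e.g. '110'."""
    state = np.zeros(8)
    state[int(bits, 2)] = 1.0
    out = toffoli_matrix() @ state
    # The output is again a basis state, so argmax recovers its index.
    return format(int(np.argmax(out)), "03b")


for bits in ["000", "010", "100", "110", "111"]:
    print(bits, "->", apply_to_basis_state(bits))
```

Only `110` and `111` are exchanged; every other basis state is left unchanged. In my circuit the same two controls drive four such gates, one per ancilla qubit.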

Running this gives the following error:

```
tensorflow.python.framework.errors_impl.InvalidArgumentError: Exception encountered when calling layer 'keras_layer' (type KerasLayer).
{{function_node __wrapped__MatMul_device_/job:localhost/replica:0/task:0/device:CPU:0}} Matrix size-incompatible: In[0]: [4,4], In[1]: [100,128] [Op:MatMul]
Call arguments received by layer 'keras_layer' (type KerasLayer):
• inputs=tf.Tensor(shape=(50, 16), dtype=float32)
```

I’m not sure what is causing this error. I have noticed that the circuit runs fine when the input has no batch dimension, for example:

```
input = np.random.rand(16)
# input = np.random.rand(20, 16) raises an error instead
weights = np.random.rand(10)
ancillary_qcnn_circuit(input, weights)
```
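By "batch dimension" I mean a leading axis over samples, so the input has shape `(batch, 16)` instead of `(16,)`. My understanding of what `normalize=True` should do in that case is that each row is scaled to unit L2 norm independently. The following is a plain NumPy sketch of that expectation, not PennyLane's actual implementation; `normalize_amplitudes` is a hypothetical helper name.

```python
import numpy as np


def normalize_amplitudes(features):
    """Scale each feature vector (last axis) to unit L2 norm.

    Works for a single vector of shape (16,) and for a batch of shape (batch, 16).
    """
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=-1, keepdims=True)
    return features / norms


rng = np.random.default_rng(0)
single = normalize_amplitudes(rng.random(16))       # no batch dimension
batch = normalize_amplitudes(rng.random((20, 16)))  # 20 samples, 16 amplitudes each

print(single.shape, batch.shape)
print(np.allclose(np.linalg.norm(batch, axis=-1), 1.0))
```

Under this assumption every sample becomes a valid 4-qubit state vector, so I would expect the batched call to return one set of four expectation values per sample.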

What confuses me further is that the code runs if I comment out any one (or more) of the four Toffoli gates.

My PennyLane version is 0.34.0.dev0 (I am on the development version because I want to use batch-mode `AmplitudeEmbedding` in TensorFlow).