Hi Isaac,
Thanks for providing the idea! But I'm not quite sure I understand the reasoning behind it. Originally, when I read Data re-uploading implementation in hybrid NN with keras layer - #4 by Kuma-quant, I thought it meant that all array/tensor manipulations inside the QNode should use TensorFlow rather than NumPy, because inside the QNode you are dealing with the training parameters, which should be differentiable tf.Tensor objects.
But why should the operations outside the QNode, in my case inside the Keras call function, also be restricted to TensorFlow arithmetic/operations? We have already wrapped the QNode in a KerasLayer object, and I have seen people use non-TensorFlow arithmetic/operations in the call function of ordinary Keras layers. I would appreciate it if you could help me with this question!
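To check my own understanding of the differentiability part, here is a small standalone sketch (toy values, not my actual model): TensorFlow can only differentiate through operations recorded on its GradientTape, and a round-trip through NumPy produces a plain array that is disconnected from the trainable variable.

```python
import numpy as np
import tensorflow as tf

x = tf.Variable(2.0)
with tf.GradientTape(persistent=True) as tape:
    y_tf = x * x  # TensorFlow op: recorded on the tape
    # round-trip through NumPy: the result is a constant, disconnected from x
    y_np = tf.constant(np.square(x.numpy()))

g_tf = tape.gradient(y_tf, x)  # 4.0
g_np = tape.gradient(y_np, x)  # None, the NumPy detour broke the chain
```

If this is the right mental model, then any NumPy step that sits between the trainable weights and the loss would silently kill the gradient.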
Still, I have changed the operations to use TensorFlow only, including:
- Instead of appending to a list, I use tf.concat
- output is initialized using tf.zeros instead of np.zeros
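As a small standalone check of these two replacements (toy shapes, not my actual filter code):

```python
import tensorflow as tf

# tf.tensor_scatter_nd_update replaces item assignment like output[i][j][f] = value,
# which is not allowed on an immutable tf.Tensor
out = tf.zeros((2, 2, 1), dtype=tf.float32)
out = tf.tensor_scatter_nd_update(out, [[0, 1, 0]], [5.0])

# tf.concat stacks per-sample outputs along the batch axis
a = tf.expand_dims(out, axis=0)    # shape (1, 2, 2, 1)
batch = tf.concat([a, a], axis=0)  # shape (2, 2, 2, 1)
```

As far as I can tell, appending tensors to a Python list and calling tf.concat once at the end would also keep gradients intact; what actually breaks them is converting tensors to NumPy arrays in between.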
Here is the modified call function:
def call(self, inputs):
    # The output length of one sample after the convolution operation with stride = 1
    output_length = inputs.shape[1] - self.filter_size + 1
    num_sample = inputs.shape[0]
    output_all = None
    quantum_filter_list = [self.quantum_filter_1, self.quantum_filter_2]
    count = 0
    for a in range(num_sample):
        print(f"Running {a}th data sample...")
        # output shape for one image
        # output = np.zeros((output_length, output_length, self.num_filters))
        output = tf.zeros((output_length, output_length, self.num_filters), dtype=tf.dtypes.float32)
        for i in range(output_length):
            for j in range(output_length):
                for f in range(self.num_filters):
                    # convolution window, currently applied to a single-channel image only
                    sub_input = inputs[a, i:i + self.filter_size, j:j + self.filter_size, :]
                    # flatten into 1-D
                    sub_input = tf.reshape(sub_input, [-1])
                    quantum_filter = quantum_filter_list[f]
                    # output[i][j][f] = quantum_filter(sub_input)
                    output = tf.tensor_scatter_nd_update(output, [[i, j, f]], [quantum_filter(sub_input)])
        output = tf.expand_dims(output, axis=0)
        if count == 0:
            output_all = output
        else:
            output_all = tf.concat([output_all, output], 0)
        count += 1
    print("All input data for one batch have been convolved!")
    x = self.flatten(output_all)
    print("after flatten")
    x = self.hidden(x)
    print("after hidden")
    x = self.dense(x)
    print("after dense")
    print(x.dtype)
    return x
But now I get a dtype error when I call model.fit() to compute the loss, so I'm not sure whether the gradient problem is solved or not:
InvalidArgumentError: cannot compute Mul as input #1(zero-based) was expected to be a float tensor but is a double tensor [Op:Mul]
I'm still working on this dtype error and will share the result after I solve it. Any suggestion about this dtype error is welcome.
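For what it's worth, the error itself is just a float32 vs float64 mismatch in a multiplication; it can be reproduced and fixed with a cast, though I'm not yet sure whether in my model the cast belongs on the QNode output or somewhere else in the pipeline:

```python
import tensorflow as tf

a = tf.constant([1.0, 2.0], dtype=tf.float32)
b = tf.constant([3.0, 4.0], dtype=tf.float64)  # "double" in the error message

# a * b would raise: cannot compute Mul as input #1 was expected to be a
# float tensor but is a double tensor [Op:Mul]
c = a * tf.cast(b, tf.float32)  # cast to a common dtype
```

Another option I have seen is setting a global default with tf.keras.backend.set_floatx("float64") so every layer uses the same precision, but I haven't tried that yet.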
Again, thanks @isaacdevlugt for helping me!