Hi @JesusBG, welcome to the Forum!
default.qubit utilizes NumPy broadcasting under the hood:
Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python.
So it makes sense that the benefit of using default.qubit with batched inputs scales well.
Since lightning.qubit and lightning.gpu are already built on C++, I can see why the benefit from batching with those devices is lower.
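To make the broadcasting point concrete, here is a minimal NumPy-only sketch. It is not how default.qubit is actually implemented; the helper names (rx_matrices, expval_z_loop, expval_z_batched) are made up for illustration. It computes the expectation value of Z after an RX rotation on a single qubit, once with a Python loop over the parameters and once with a single broadcasted call:

```python
import numpy as np

def rx_matrices(thetas):
    """Return RX(theta) matrices; a batch of angles gives shape (batch, 2, 2)."""
    thetas = np.asarray(thetas, dtype=float)
    c = np.cos(thetas / 2)
    s = np.sin(thetas / 2)
    mats = np.empty(thetas.shape + (2, 2), dtype=complex)
    mats[..., 0, 0] = c
    mats[..., 0, 1] = -1j * s
    mats[..., 1, 0] = -1j * s
    mats[..., 1, 1] = c
    return mats

def expval_z_loop(thetas):
    """One 'simulation' per parameter: the loop runs in Python."""
    zero = np.array([1, 0], dtype=complex)
    results = []
    for theta in thetas:
        state = rx_matrices(theta) @ zero
        probs = np.abs(state) ** 2
        results.append(probs[0] - probs[1])
    return np.array(results)

def expval_z_batched(thetas):
    """All parameters at once: the loop over the batch happens inside NumPy."""
    zero = np.array([1, 0], dtype=complex)
    states = rx_matrices(thetas) @ zero  # (batch, 2, 2) @ (2,) -> (batch, 2)
    probs = np.abs(states) ** 2
    return probs[:, 0] - probs[:, 1]

thetas = np.linspace(0, np.pi, 5)
loop_result = expval_z_loop(thetas)
batched_result = expval_z_batched(thetas)
```

Both functions return the same values, but the batched version replaces the per-parameter Python loop with array operations. A device whose core is already compiled has less Python overhead per execution to remove, which fits with the smaller gain from batching there.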
Feel free to take a look at Forum post #8403, you might find it useful too.
I hope this helps!