Can PennyLane support parallel simulation on multiple CPUs simultaneously?

Hi @Yang and welcome to the forum!

You can use the QNodeCollection class to create a collection of independent QNodes that can be evaluated simultaneously. The collection can be created as:

qnode = qml.QNodeCollection([qnode1, qnode2])

By default, the QNodes within a QNodeCollection are executed sequentially, but you can pass the parallel=True keyword argument when evaluating the collection to activate asynchronous evaluation. Note, however, that the best speedup is achieved with external hardware devices or external simulators, as explained in further detail here under the “Asynchronous Evaluation” section. You may also find this previous discussion helpful. Please feel free to let us know if you have any further questions.
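
To give some intuition for why asynchronous evaluation tends to pay off mostly when each evaluation waits on something external, here is a minimal sketch that uses only the Python standard library. It is not PennyLane code and not how QNodeCollection is implemented: CallableCollection and make_remote_like are hypothetical names made up for this illustration, and time.sleep simply stands in for the latency of a remote device or simulator.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class CallableCollection:
    """Hypothetical stand-in for a collection of independent QNodes."""

    def __init__(self, funcs):
        self.funcs = list(funcs)

    def __call__(self, *args, parallel=False):
        if not parallel:
            # Default: evaluate each member one after the other.
            return [f(*args) for f in self.funcs]
        # Asynchronous mode: submit every member, then gather results in order.
        with ThreadPoolExecutor(max_workers=len(self.funcs)) as pool:
            futures = [pool.submit(f, *args) for f in self.funcs]
            return [fut.result() for fut in futures]


def make_remote_like(scale, delay=0.2):
    """Build a function that mimics a slow call to an external device."""
    def evaluate(x):
        time.sleep(delay)  # assumption: stands in for network/hardware latency
        return scale * x
    return evaluate


collection = CallableCollection([make_remote_like(1.0), make_remote_like(2.0)])

start = time.perf_counter()
sequential = collection(0.5)
t_seq = time.perf_counter() - start

start = time.perf_counter()
parallel = collection(0.5, parallel=True)
t_par = time.perf_counter() - start

print(f"sequential: {t_seq:.2f} s, parallel: {t_par:.2f} s")
```

In this toy setup both modes return the same results, but the parallel call overlaps the waiting time of the two members instead of adding it up. When the work is pure local computation rather than waiting, the gain from this kind of overlap is usually much smaller.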