I think it would be nice if PL included a “meta-plugin” to run the actual computations of simulator-based plugins on a remote machine, such as an HPC cluster or supercomputer. I imagine this to be especially useful when PL is used from a notebook interface such as a Jupyter notebook.
As there already is a nice plugin interface, this should be rather easy to implement. The meta-plugin would offer a device that receives the server address, login credentials, and the “real” target plugin as parameters, and then delegates all `execute()`, … operations to an instance of PL running on the remote machine.
Initiating the job on the remote machine could be done in several ways:
In the simplest approach, the meta-plugin could simply ignore all calls to `apply()` and instead, in `pre_expval()`, peek into the `queue`, send the whole description of the computation off to the remote server, and retrieve and store the results to answer later calls to `expval()`. This could be done without any changes to core PL, as the remote job execution could be handled via ssh. This would be very flexible and would even play well with HPC job-queuing systems: essentially, the user could tell the plugin which commands to execute on the remote server to put a job in the queue there, and how to check on and retrieve the result. Default commands for common HPC queuing systems could be added over time. One could even take this a step further: in principle, such a meta-plugin could fire up a whole virtual machine, say on AWS, run the computation, and retrieve the result…
A more advanced implementation would add a “server” to PL core. This would be a Python program that loads PL and listens for instructions from the “meta-plugin”. This server would have to be started before the meta-plugin could work, and both the server and the plugin would have to handle the network traffic themselves. The only real advantage would be not having to delay `apply()` operations, which enables “on the fly” circuit optimization by the framework of the “real” plugin running on the remote machine. However, this approach seems less flexible and might need a lot of manual coding to work well with different HPC queuing systems.
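A minimal sketch of what such a server might look like, to illustrate why `apply()` would no longer need to be delayed. Everything here is invented for illustration (PL core has no such component, and the newline-delimited JSON wire format is just one possible choice):

```python
import json
import socket
import socketserver
import threading

class PLServerHandler(socketserver.StreamRequestHandler):
    """Hypothetical PL server: each incoming line is one JSON message,
    handled as soon as it arrives, so the client-side meta-plugin can
    forward apply() calls immediately instead of queuing them."""

    def handle(self):
        while True:
            line = self.rfile.readline()
            if not line:
                break  # client disconnected
            msg = json.loads(line)
            if msg["op"] == "apply":
                # Forward to the real plugin right away; an on-the-fly
                # circuit optimizer could rewrite the partial circuit here.
                self.server.ops.append(msg)
                reply = {"ok": True}
            elif msg["op"] == "expval":
                # Placeholder result; a real server would query the
                # target plugin running on this machine.
                reply = {"value": 0.0, "n_ops": len(self.server.ops)}
            else:
                reply = {"error": "unknown op"}
            self.wfile.write((json.dumps(reply) + "\n").encode())

# Start the server and let a "meta-plugin" client talk to it:
server = socketserver.TCPServer(("127.0.0.1", 0), PLServerHandler)
server.ops = []  # operations received so far
threading.Thread(target=server.serve_forever, daemon=True).start()

with socket.create_connection(server.server_address) as s:
    f = s.makefile("rw")
    f.write(json.dumps({"op": "apply", "gate": "RX", "wires": [0]}) + "\n")
    f.flush()
    print(f.readline().strip())  # → {"ok": true}
    f.write(json.dumps({"op": "expval"}) + "\n")
    f.flush()
    print(f.readline().strip())  # → {"value": 0.0, "n_ops": 1}
server.shutdown()
```

This also makes the drawback visible: the server process, the wire protocol, and the connection handling all have to be written and maintained by hand, and none of it fits naturally into an HPC batch queue.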
Just a suggestion to keep in mind for the future…