Using PennyLane for Q-Reinforcement Learning

Hello everyone,

I have a question about using PennyLane for a QRL framework.
Specifically, I am testing the Variational Soft Actor-Critic method implemented by Lan et al. (2021). The tests take place in the Pendulum-v0 environment. The quantum part is a 3-qubit, 41-parameter, hardware-efficient ansatz PQC, followed by a classical linear layer that outputs the mean and the standard deviation of a Gaussian policy.

What I find is that the tests take extremely long. For example, after 5 epochs of the exploration phase (each of which takes about 3 seconds of wall-clock time), the 6th epoch takes around 30 minutes. This is a gigantic jump, and not something I’d expect for 41 parameters.

The differentiation method is ‘best’, and I’ve tried both the built-in qubit simulator and Qulacs; the results are the same.

So my question is: does anybody have an idea why this is happening, and whether there is an intuitive reason to expect it?

Best,

Hi @Bartu_Bisgin, welcome to the forum!

If you share your code with us we may be able to dig deeper.

You could also try using the lightning.qubit and lightning.gpu devices.

Also please share the output of qml.about()

Hi @CatalinaAlbornoz, thank you for the warm welcome.

I believe I’ve already tried the lightning devices as well.
This is the repository I’m running tests from: https://github.com/qlan3/QuantumExplorer

The algorithm replaces the classical Actor with a hybrid one, where the first layer is quantum, with 3 qubits, and the output layer is a classical linear layer.

The settings for the tests can also be found in the .json files; however, if you’d like, I could also write up a summary of the important parts of the code and the hyperparameters :)

Hi @Bartu_Bisgin!

Thank you for sharing this information. It seems that the lightning.qubit device has a bug where using diff_method=best will result in using parameter-shift, which is quite slow. You should get a speedup if you try lightning.qubit with diff_method=adjoint.

Please let me know if you get a speedup with this change!

Dear @CatalinaAlbornoz ,

Thank you very much for this feedback! It is already running significantly faster, and I expect it to finish in a tolerable amount of time (on the order of hours instead of weeks). Still, I believe it could become a lot faster. I think the code, PennyLane, or PyTorch is not making full use of my machine.

I am running the code through Visual Studio Code on an M1x Mac Pro (the base 14 inch model). However, I see only a little CPU load in Activity Monitor. The classical tests, for comparison, finish in a matter of minutes (and they do not produce much CPU load in Activity Monitor either). I have set ‘export OMP_NUM_THREADS=4’ as specified in the lightning.qubit doc, but I’m unsure whether it had any effect, or whether I can set it higher.

What can I do in the code to take full advantage of my machine for PennyLane? Thank you a bunch in advance once more!
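
One way to check whether the OMP_NUM_THREADS setting mentioned above actually reaches the Python process is to print it from inside the script. This is a minimal sketch using only the standard library; the helper name thread_settings is hypothetical. It only reports what the process sees and does not confirm how many threads lightning.qubit ends up using.

```python
import os

def thread_settings():
    """Report the CPU count and the OpenMP thread setting seen by this process."""
    return {
        # Number of logical CPUs reported by the operating system
        "logical_cpus": os.cpu_count(),
        # None means the variable was not inherited from the launching shell
        "OMP_NUM_THREADS": os.environ.get("OMP_NUM_THREADS"),
    }

settings = thread_settings()
print(settings)
```

If OMP_NUM_THREADS shows up as None, the export was probably made in a different shell than the one that launched Python; running the script from the same terminal in which the variable was exported is one way to rule that out.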

Hi @Bartu_Bisgin,

I’m glad it’s significantly faster already!

It’s possible that you’re using a minimizer that is failing to find a good minimum on the 6th epoch. Maybe you can try a different minimizer or different initial conditions.

It may also be that the entire workflow is composed using Torch and you don’t have a compatible GPU to make best use of the Torch backend.

Please let me know if changing any of this helps you find some extra speed!