If we use lightning.gpu, can we make the sub-generators process in parallel? Please explain how to use multi-GPU support.
Hi @mass_of_15, how many qubits are you using at the moment? Using GPUs only starts being effective if you have over 20 qubits. Using multiple GPUs might be effective if you have multiple observables.
I believe in this case using GPUs might slow you down instead of helping since you’re using about 12 or 13 qubits.
If you want to try using multiple GPUs you can install lightning.gpu with pip install pennylane-lightning-gpu custatevec-cu11. Then you can use batch_obs=True to allow multiple GPUs to be used.
dev = qml.device("lightning.gpu", wires=X, batch_obs=True)
If the above command runs you out of memory, you can tell the GPUs to only keep n copies of the statevector at any given time with batch_obs=n. Tuning this can help with certain workloads (e.g. gradients over large Hamiltonians, or when you have 4-8 GPUs on a single node).
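A minimal sketch of both options, assuming pennylane-lightning-gpu is installed and the circuit is differentiated with the adjoint method (where the observable batching is used); the wire count is just an illustrative value:

import pennylane as qml

n_wires = 22  # illustrative size; GPU simulation tends to pay off above ~20 qubits

# Let all available GPUs share the batch of observables during the gradient computation
dev = qml.device("lightning.gpu", wires=n_wires, batch_obs=True)

# ...or cap the number of concurrent statevector copies to reduce memory pressure
# dev = qml.device("lightning.gpu", wires=n_wires, batch_obs=2)

@qml.qnode(dev, diff_method="adjoint")
def circuit(params):
    for w in range(n_wires):
        qml.RY(params[w], wires=w)
    # One observable per wire, so there is a batch of observables to distribute across GPUs
    return [qml.expval(qml.PauliZ(w)) for w in range(n_wires)]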
I hope this helps!
At the moment, unfortunately, I haven't found a great solution and I don't have other ideas. I thought that using a Scheduler to manage the learning rates during training would lead to good results… the learning-rate trends are stable and more realistic, but the images did not change.
Now, I am trying to develop a different generator circuit, replacing some linear layers with quantum layers:
q_depth=2
n_qubits=12
weight_shapes = {"weights": (q_depth, n_qubits)}
@qml.qnode(dev, interface="torch")
def qnode(inputs, weights):
qml.AngleEmbedding(inputs, wires=range(n_qubits))
qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
return [qml.expval(qml.PauliZ(wires=i)) for i in range(n_qubits)]
class Generator(nn.Module):
    def __init__(self, ngpu):
        super().__init__()
        self.ngpu = ngpu
        self.model = nn.Sequential(
            nn.Linear(n_qubits, 16),  # the input 'n_qubits' is correct??
            nn.ReLU(),
            qml.qnn.TorchLayer(qnode, weight_shapes),
            nn.ReLU(),
            qml.qnn.TorchLayer(qnode, weight_shapes),
            nn.ReLU(),
            nn.Linear(64, 64 * 64 * 3),
            nn.Tanh(),
        )

    def forward(self, x):
        return self.model(x)
The training code is the same as for the previous model (batch size is 16), but when I start the training I run into this error:
noise = torch.rand(b_size, n_qubits, device=device) * math.pi
fake_red = generator(noise)
fake_green = generator(noise)
fake_blue = generator(noise)
ValueError Traceback (most recent call last)
<ipython-input-63-7330ec666c01> in <cell line: 7>()
25 # Noise follwing a uniform distribution in range [0,pi/2)
26 noise = torch.rand(12, n_qubits, device=device) * math.pi / 2 #noise=(32,13)
---> 27 fake_red = generator(noise)
28 fake_green = generator(noise)
29 fake_blue = generator(noise)
Hey @Eleonora_Panini, looks like the error traceback is cut off. Can you attach the whole thing?
I have understood the cause of the error. In an architecture that mixes linear and quantum layers, if the input layer is, for instance, nn.Linear(input, 16), followed by some quantum layers, and the output layer is nn.Linear(64, output), then the quantum layers do not change the dimension at all: they take 16 inputs (which must also equal the number of qubits) and return 16 outputs, so they cannot convert 16 to 64; every quantum layer just keeps the dimension of 16.
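For example, a minimal check (using the 12-qubit qnode and weight_shapes defined above) shows that the TorchLayer's output width always stays at n_qubits:

import torch

qlayer = qml.qnn.TorchLayer(qnode, weight_shapes)
x = torch.rand(16, n_qubits)   # a batch of 16 samples, one feature per qubit
print(qlayer(x).shape)         # torch.Size([16, 12]) -- the width stays at n_qubits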
How can I use quantum layers to change the dimension from 16 to 32 or 64?
class Generator(nn.Module):
    def __init__(self, ngpu):
        super().__init__()
        self.ngpu = ngpu
        self.model = nn.Sequential(
            nn.Linear(latent_vector_dim, 16),            # input layer
            nn.ReLU(),
            qml.qnn.TorchLayer(qnode, weight_shapes),    # 1st hidden layer -> I would like the same effect as nn.Linear(16, 32), but the layer keeps 16
            nn.ReLU(),
            qml.qnn.TorchLayer(qnode, weight_shapes),    # 2nd hidden layer -> I would like the same effect as nn.Linear(32, 64), but the layer keeps 16
            nn.ReLU(),
            nn.Linear(64, image_size * image_size * 3),  # output layer
            nn.Tanh(),
        )

    def forward(self, x):
        return self.model(x)
Moreover, if I want to use convolutional layers (Conv2d(...)), how can I replace them with quantum layers that have the same effect?
ngf=64
nz=100
nc=3
class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),  # I would like to replace this layer with a quantum layer that converts from ngf*8 to ngf*4 (512->256)
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

    def forward(self, input):
        return self.main(input)
@mass_of_15 Did you train the model with my dataset converted to greyscale? I am trying to do it, but the original demo model does not work for my dataset even with greyscale images.
So I changed the size to 64x64 (instead of 8x8) and the batch size to 16 (instead of 1), and also changed the generator to 11 qubits, 1 ancilla, depth 4 and 4 sub-generators, but the output images did not evolve. I also tested a discriminator with more layers (the code below), but there are no improvements after 300 epochs. Should I train for more iterations? Anyway, it is strange that after 300 epochs there are no signs of similarity with the real images.
import torchvision as tv
import torchvision.datasets as dset  # needed for dset.ImageFolder below
import numpy as np
import torch.utils.data as data
batch_size = 16
ngpu=1
workers=2
image_size = 64
#dataDir = '/content/drive/MyDrive/Colab Notebooks/Subset_Dil_Bos/'
dataDir ="C:/Users/elyon/OneDrive/Desktop/Tesi/dataset/"
trainTransform = tv.transforms.Compose([tv.transforms.Grayscale(num_output_channels=1),
tv.transforms.ToTensor(),
tv.transforms.Resize(image_size),
tv.transforms.CenterCrop(image_size),
tv.transforms.Normalize((0.5), (0.5))])
trainSet = dset.ImageFolder(dataDir, transform=trainTransform)
dataloader = torch.utils.data.DataLoader(trainSet, batch_size=batch_size,shuffle=True, num_workers=workers)
class Discriminator(nn.Module):
"""Fully connected classical discriminator"""
def __init__(self):
super().__init__()
self.model = nn.Sequential(
# Input layer (image_size * image_size -> 2048)
nn.Linear(image_size * image_size, 2048),
nn.ReLU(),
nn.Linear(2048, 1024),
nn.ReLU(),
nn.Linear(1024, 512),
nn.ReLU(),
nn.Linear(512, 256),
nn.ReLU(),
nn.Linear( 256,128),
nn.ReLU(),
nn.Linear( 128,64),
nn.ReLU(),
# (64 -> 32)
nn.Linear(64, 32),
nn.ReLU(),
nn.Linear(32, 16),
nn.ReLU(),
# Second hidden layer (16 -> output)
nn.Linear(16, 1),
nn.Sigmoid(),
)
def forward(self, x):
return self.model(x)
# Quantum variables
n_qubits = 11 # Total number of qubits / N
n_a_qubits = 1 # Number of ancillary qubits / N_A
q_depth = 4 # Depth of the parameterised quantum circuit / D
n_generators = 4 # Number of subgenerators for the patch method / N_G
# Quantum simulator
dev = qml.device("lightning.qubit", wires=n_qubits)
# Enable CUDA device if available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
@qml.qnode(dev, interface="torch", diff_method="parameter-shift")
def quantum_circuit(noise, weights):
weights = weights.reshape(q_depth, n_qubits)
# Initialise latent vectors
for i in range(n_qubits):
qml.RY(noise[i], wires=i)
# Repeated layer
for i in range(q_depth):
# Parameterised layer
for y in range(n_qubits):
qml.RY(weights[i][y], wires=y)
# Control Z gates
for y in range(n_qubits - 1):
qml.CZ(wires=[y, y + 1])
return qml.probs(wires=list(range(n_qubits)))
def partial_measure(noise, weights):
# Non-linear Transform
probs = quantum_circuit(noise, weights)
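    # The first 2**(n_qubits - n_a_qubits) entries of probs correspond to the ancillary
    # qubit(s) being measured in |0>. Keeping only these entries and renormalising below
    # is the conditional measurement that provides the non-linear transform.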
probsgiven0 = probs[: (2 ** (n_qubits - n_a_qubits))]
probsgiven0 /= torch.sum(probs)
# Post-Processing
probsgiven = probsgiven0 / torch.max(probsgiven0)
return probsgiven
class PatchQuantumGenerator(nn.Module):
"""Quantum generator class for the patch method"""
def __init__(self, n_generators, q_delta=1):
"""
Args:
n_generators (int): Number of sub-generators to be used in the patch method.
q_delta (float, optional): Spread of the random distribution for parameter initialisation.
"""
super().__init__()
self.q_params = nn.ParameterList(
[
nn.Parameter(q_delta * torch.rand(q_depth * n_qubits), requires_grad=True)
for _ in range(n_generators)
]
)
self.n_generators = n_generators
def forward(self, x):
# Size of each sub-generator output
patch_size = 2 ** (n_qubits - n_a_qubits)
# Create a Tensor to 'catch' a batch of images from the for loop. x.size(0) is the batch size.
images = torch.Tensor(x.size(0), 0).to(device)
# Iterate over all sub-generators
for params in self.q_params:
# Create a Tensor to 'catch' a batch of the patches from a single sub-generator
patches = torch.Tensor(0, patch_size).to(device)
for elem in x:
q_out = partial_measure(elem, params).float().unsqueeze(0)
patches = torch.cat((patches, q_out))
# Each batch of patches is concatenated with each other to create a batch of images
images = torch.cat((images, patches), 1)
return images
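# (Dimension check: each sub-generator outputs 2**(n_qubits - n_a_qubits) = 2**10 = 1024 values,
#  so 4 sub-generators give 4 * 1024 = 4096 = 64 * 64 pixels per image, matching the
#  image_size * image_size inputs expected by the discriminator above.)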
lrG = 0.3 # Learning rate for the generator
lrD = 0.01 # Learning rate for the discriminator
num_iter = 100 # Number of training iterations
discriminator = Discriminator().to(device)
generator = PatchQuantumGenerator(n_generators).to(device)
# Binary cross entropy
criterion = nn.BCELoss()
# Optimisers
optD = optim.Adam(discriminator.parameters(), lr=lrD)
optG = optim.Adam(generator.parameters(), lr=lrG)
#optD = optim.SGD(discriminator.parameters(), lr=lrD)
#optG = optim.SGD(generator.parameters(), lr=lrG)
# Fixed noise allows us to visually track the generated images throughout training
fixed_noise = torch.rand(batch_size, n_qubits, device=device) * math.pi / 2
# Iteration counter
counter = 0
# Collect images for plotting later
results = []
G_losses = []
D_losses = []
while True:
for i, (data, _) in enumerate(dataloader):
# Data for training the discriminator
data = data.reshape(-1, image_size * image_size)
real_data = data.to(device)
b_size = real_data.size(0)
real_labels = torch.full((b_size,), 1.0, dtype=torch.float, device=device)
fake_labels = torch.full((b_size,), 0.0, dtype=torch.float, device=device)
# Noise following a uniform distribution in range [0,pi/2)
noise = torch.rand(b_size, n_qubits, device=device) * math.pi / 2
fake_data = generator(noise)
# Training the discriminator
discriminator.zero_grad()
outD_real = discriminator(real_data).view(-1)
outD_fake = discriminator(fake_data.detach()).view(-1)
errD_real = criterion(outD_real, real_labels)
errD_fake = criterion(outD_fake, fake_labels)
# Propagate gradients
errD_real.backward()
errD_fake.backward()
errD = errD_real + errD_fake
optD.step()
# Training the generator
generator.zero_grad()
outD_fake = discriminator(fake_data).view(-1)
errG = criterion(outD_fake, real_labels)
errG.backward()
optG.step()
counter += 1
# Show loss values
if counter % 10 == 0:
print(f'Iteration: {counter}, Discriminator Loss: {errD:0.3f}, Generator Loss: {errG:0.3f}')
test_images = generator(fixed_noise).normal_().view(b_size,1,image_size,image_size).cpu().detach()
#test_images = generator(fixed_noise).view(b_size,1,image_size,image_size).cpu().detach()
G_losses.append(errG.item())
D_losses.append(errD.item())
# Save images every 50 iterations
if counter % 50 == 0:
results.append(test_images)
if counter == num_iter:
break
if counter == num_iter:
break
I think the small size of the dataset may be the issue. A GAN pretty much requires a large dataset to train. Try checking Kaggle or some other database to get RGB images. I am adding one for metasurface images:
It contains at least 18,770 metasurface images. Check with this; if you want, you can remove some images to ease memory constraints.
@mass_of_15
The original dataset of Dilbert comics has 15,000 images (I have it), but I use only a part of it (300 images) because with a standard GAN the model works really well even with 300 images. However, the quantum GAN model does not work with 300 images, either in greyscale or RGB.
I can try to run our quantum GAN model with the entire dataset of 15,000 64x64 images and batch size 16 or 32… I can try both with RGB and greyscale images.
Did you try our RGB quantum model with a larger dataset of 64x64 images? Does it work?
Did you also try the greyscale quantum model with a different dataset (not handwritten digits) of 64x64 images? Does it work?
Yes. I used the dataset from the GitHub link below with the quantum GAN, and it worked perfectly fine. I did it with both grayscale and color images, and the results were good. Why don't you try it with that dataset?
@isaacdevlugt, in the PennyLane tutorial for Quantum GANs (Quantum GANs | PennyLane Demos), can you give me the reference for the nonlinear transform? Are there any research articles on it?
Ok, super! What kind of discriminator did you use? And 12 qubits with the three colours separated? 64x64 images? How many epochs?
Can you post the entire code here?
I can try with this dataset, thank you.
I tried the dataset from your link, but it does not work: the output image is always an ensemble of coloured points. Can you share all the code of your GAN for both RGB and grayscale images, please? That way I can compare the parameters and architecture with mine and find the difference. Thank you.
Hi @mass_of_15,
I found this paper on nonlinear transformations in quantum computation. It’s not specifically about GANs but it can give you some context on the topic.
I hope this helps you!
@mass_of_15 These are my results after 1000 epochs with the dataset you suggested (metasurface_inverseDesign link) and the quantum GAN: the demo for greyscale and the model modified for RGB. The model does not work with this dataset either.
If you reach a good result, can you please share your code (both for RGB and greyscale) and the output images? Maybe I am using different parameters, and I would like to know why I can't reproduce your results. Thanks.
Greyscale:
RGB:
# Quantum variables
n_qubits = 9 # Total number of qubits / N
n_a_qubits = 1 # Number of ancillary qubits / N_A
q_depth = 5 # Depth of the parameterised quantum circuit / D
n_generators = 48 # Number of subgenerators for the patch method / N_G
# Quantum simulator
dev = qml.device("lightning.gpu", wires=n_qubits)
@qml.qnode(dev, interface="torch", diff_method="parameter-shift")
def quantum_circuit(noise, weights):
weights = weights.reshape(q_depth, n_qubits)
for i in range(n_qubits):
qml.RY(noise[i], wires=i)
# Repeated layer
for j in range(q_depth):
# Parameterised layer
for y in range(n_qubits):
qml.RY(weights[j][y], wires=y)
# Control Z gates
for y in range(n_qubits - 1):
qml.CZ(wires=[y, y + 1])
return qml.probs(wires=list(range(n_qubits)))
# For further info on how the non-linear transform is implemented in Pennylane
# https://discuss.pennylane.ai/t/ancillary-subsystem-measurement-then-trace-out/1532
def partial_measure(noise, weights):
# Non-linear Transform
probs = quantum_circuit(noise, weights)
probsgiven0 = probs[: (2 ** (n_qubits - n_a_qubits))]
probsgiven0 /= torch.sum(probs)
# Post-Processing
probsgiven = probsgiven0 / torch.max(probsgiven0)
return probsgiven
class PatchQuantumGenerator(nn.Module):
"""Quantum generator class for the patch method"""
def __init__(self, n_generators, ngpu, q_delta=1):
"""
Args:
n_generators (int): Number of sub-generators to be used in the patch method.
q_delta (float, optional): Spread of the random distribution for parameter initialisation.
"""
super().__init__()
self.ngpu = ngpu
self.q_params = nn.ParameterList(
[
nn.Parameter(q_delta * torch.rand(q_depth * n_qubits), requires_grad=True)
for _ in range(n_generators)
]
)
self.n_generators = n_generators
def forward(self, x):
# Size of each sub-generator output
patch_size = 2 ** (n_qubits - n_a_qubits)
# Create a Tensor to 'catch' a batch of images from the for loop. x.size(0) is the batch size.
images = torch.Tensor(x.size(0), 0).to(device)
# Iterate over all sub-generators
for params in self.q_params:
# Create a Tensor to 'catch' a batch of the patches from a single sub-generator
patches = torch.Tensor(0, patch_size).to(device)
for elem in x:
q_out = partial_measure(elem, params).float().unsqueeze(0)
patches = torch.cat((patches, q_out))
# Each batch of patches is concatenated with each other to create a batch of images
images = torch.cat((images, patches), 1)
return images
netG = PatchQuantumGenerator(n_generators, ngpu).to(device)
#Print the model
print(netG)
class Discriminator(nn.Module):
"""Fully connected classical discriminator"""
def __init__(self, ngpu):
super().__init__()
self.ngpu = ngpu
self.l1 = nn.Linear(4, image_size*image_size*nc, bias=False)
self.model = nn.Sequential(
# Inputs to first hidden layer (num_input_features -> 64)
nn.Linear(image_size * image_size * nc, 64),
nn.ReLU(),
# First hidden layer (64 -> 16)
nn.Linear(64, 16),
nn.ReLU(),
# Second hidden layer (16 -> output)
nn.Linear(16, 1),
nn.Sigmoid(),
)
def forward(self, x):
return self.model(x)
#Create the Discriminator
netD = Discriminator(ngpu).to(device)
#Print the model
print(netD)
# Initialize the ``BCELoss`` function
criterion = nn.BCELoss()
# Create batch of latent vectors that we will use to visualize
# the progression of the generator
fixed_noise = torch.randn(64, n_qubits, device=device)
# Establish convention for real and fake labels during training
real_label = 1.
fake_label = 0.
# Setup Adam optimizers for both G and D
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))
# Training Loop
# Lists to keep track of progress
num_epochs = 500
img_list = []
G_losses = []
D_losses = []
iters = 0
print("Starting Training Loop...")
# For each epoch
for epoch in range(num_epochs):
# For each batch in the dataloader
for i, data in enumerate(dataloader, 0):
############################
# (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
###########################
## Train with all-real batch
netD.zero_grad()
# Format batch
real_cpu = data[0].to(device)
b_size = real_cpu.size(0)
label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
# Forward pass real batch through D
output = netD(real_cpu).view(-1)
# Calculate loss on all-real batch
errD_real = criterion(output, label)
# Calculate gradients for D in backward pass
errD_real.backward()
D_x = output.mean().item()
## Train with all-fake batch
# Generate batch of latent vectors
noise = torch.randn(b_size, n_qubits, device=device)
# Generate fake image batch with G
fake = netG(noise)
label.fill_(fake_label)
# Classify all fake batch with D
output = netD(fake.detach()).view(-1)
# Calculate D's loss on the all-fake batch
errD_fake = criterion(output, label)
# Calculate the gradients for this batch, accumulated (summed) with previous gradients
errD_fake.backward()
D_G_z1 = output.mean().item()
# Compute error of D as sum over the fake and the real batches
errD = errD_real + errD_fake
# Update D
optimizerD.step()
############################
# (2) Update G network: maximize log(D(G(z)))
###########################
netG.zero_grad()
label.fill_(real_label) # fake labels are real for generator cost
# Since we just updated D, perform another forward pass of all-fake batch through D
output = netD(fake).view(-1)
# Calculate G's loss based on this output
errG = criterion(output, label)
# Calculate gradients for G
errG.backward()
D_G_z2 = output.mean().item()
# Update G
optimizerG.step()
# Output training stats
if i % 50 == 0:
print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f\tD(x): %.4f\tD(G(z)): %.4f / %.4f'
% (epoch, num_epochs, i, len(dataloader),
errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))
# Save Losses for plotting later
G_losses.append(errG.item())
D_losses.append(errD.item())
# Check how the generator is doing by saving G's output on fixed_noise
if (iters % 500 == 0) or ((epoch == num_epochs-1) and (i == len(dataloader)-1)):
with torch.no_grad():
fake = netG(fixed_noise).view(64,image_size,image_size,3).detach().cpu()
img_list.append(vutils.make_grid(fake, padding=2, normalize=True))
iters += 1
Try the above code. This code is for 64x64x3 image size. If it's not working, try with 8x8x3 images; for that, set n_qubits = 5, n_a_qubits = 1, n_generators = 12.
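The counts work out because each sub-generator produces 2**(n_qubits - n_a_qubits) pixels; a quick check in plain Python, using the variable names from the code above:

def patch_outputs(n_qubits, n_a_qubits, n_generators):
    # total number of pixels the patch generator produces per image
    return n_generators * 2 ** (n_qubits - n_a_qubits)

print(patch_outputs(9, 1, 48))   # 48 * 256 = 12288 = 64 * 64 * 3
print(patch_outputs(5, 1, 12))   # 12 * 16  =   192 =  8 *  8 * 3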
What is the batch size?? I set 64 because it is the only dimension that does not give any errors. The training cell keeps running indefinitely without showing the progress of the iterations… I think 48 sub-generators and a circuit depth of 5 are too much for my hardware, which has 32 GB of RAM and an 8 GB GPU.
8x8x3 images do not make much sense because they are too small and there are not enough pixels; anyway, I also tried 8x8x3 with 12 sub-generators, but it has the same problem. Maybe the number of sub-generators and the depth make the training too slow, so it gets stuck.
If this GAN works on your computer, can you please send me your output images?
I am running this on an NVIDIA DGX A100; it's an HPC system. Also, are you using lightning.gpu?
You can use 13 qubits, 1 ancillary qubit and 3 sub-generators.
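(With the same counting as above, 3 sub-generators with 13 qubits and 1 ancilla give 3 * 2**(13 - 1) = 12288 = 64 * 64 * 3 outputs, i.e. exactly one 64x64 RGB image.)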
Original:
QGAN:
Ok, I don't have an HPC, just a laptop with an NVIDIA GeForce RTX. I am using lightning.qubit because I have Windows, while lightning.gpu works only on Linux. I can try with VirtualBox or Google Colab, but the virtual system has worse performance (less RAM and GPU) than the local one.
Should I try 13 qubits, 3 sub-generators and depth 4, for 64x64x3 images?
Did you use batch size 64 for all?
Does this code also work with my dataset on your HPC?
The above results are for
batch_size = 16
fixed_noise = torch.randn(8, n_qubits, device=device)
fake = netG(fixed_noise).view(8,image_size,image_size,3).detach().cpu()
Btw, I did it with a conditional QGAN.
Ok, does the code with 64x64x3 images work for you on the HPC? Or does it work only for 8x8?
What is a conditional QGAN? Is it different?