Data encoding for a real dataset

def get_angles(x):
    beta0 = 2 * np.arcsin(np.sqrt(x[1] ** 2) / np.sqrt(x[0] ** 2 + x[1] ** 2 + 1e-12))
    beta1 = 2 * np.arcsin(np.sqrt(x[3] ** 2) / np.sqrt(x[2] ** 2 + x[3] ** 2 + 1e-12))
    beta2 = 2 * np.arcsin(
        np.sqrt(x[2] ** 2 + x[3] ** 2)
        / np.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2 + x[3] ** 2)
    )

    return np.array([beta2, -beta1 / 2, beta1 / 2, -beta0 / 2, beta0 / 2])

May I know why we choose np.sqrt(x[1] ** 2) / np.sqrt(x[0] ** 2 + x[1] ** 2 + 1e-12), and not np.sqrt(x[0] ** 2) / np.sqrt(x[0] ** 2 + x[1] ** 2 + 1e-12)? Similarly for beta1, x[3] -> x[4].

For beta2, can we use np.sqrt(x[0] ** 2 + x[4] ** 2) / np.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2 + x[3] ** 2), and so on?

Any mathematical explanation here?

This encodes a vector of 4 values using two qubits. Say I have a dataset whose vectors have 8 elements. That means I can use the method above to represent the data. What if I use 3 qubits, with 2^3 states, to represent my data instead? If hardware is not the issue, what is the merit between these two techniques? Thank you very much.

Hi @SuFong_Chien,

Thanks for the question! 🙂

Any mathematical explanation here?

The techniques used here come from the Möttönen state preparation [1] and from its modified version, for positive vectors only, by Schuld and Petruccione (2018).

It’s worth noting that the vector x describes the state vector:

\boldsymbol{x} = x_0|00\rangle+x_1|01\rangle+x_2|10\rangle+x_3|11\rangle

For example, as noted in the State preparation part of [1] (below equation (5)), we would like to zero out the values for the |1\rangle state on qubit n. To achieve this, we specifically pick the 2j-1 components of \boldsymbol{x}, where j=1,2. Note that the paper uses 2j because the indexing in a_{2j} starts from 1, whereas our code indexes from 0.
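To make the role of the numerators concrete, here is a minimal sketch that rebuilds the amplitudes from the angles. It assumes each angle is used in a Y rotation, so that RY(beta)|0> = cos(beta/2)|0> + sin(beta/2)|1>, with beta2 acting on the first qubit and beta0 / beta1 acting on the second qubit conditioned on the first being |0> / |1>. The helper amplitudes_from_angles is hypothetical (it is not part of the demo), and the standard-library math module stands in for NumPy. Under these assumptions, the arcsin argument is the share of the amplitude that ends up in the |1> branch, which is why x[1] and x[3] appear in the numerators rather than x[0] and x[2].

```python
import math


def get_angles(x):
    # Same formulas as the get_angles quoted in the question, written with the
    # standard-library math module (abs(v) equals sqrt(v ** 2)).
    beta0 = 2 * math.asin(abs(x[1]) / math.sqrt(x[0] ** 2 + x[1] ** 2 + 1e-12))
    beta1 = 2 * math.asin(abs(x[3]) / math.sqrt(x[2] ** 2 + x[3] ** 2 + 1e-12))
    beta2 = 2 * math.asin(
        math.sqrt(x[2] ** 2 + x[3] ** 2)
        / math.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2 + x[3] ** 2)
    )
    return beta0, beta1, beta2


def amplitudes_from_angles(beta0, beta1, beta2):
    # Hypothetical helper. Assumed model: RY(beta)|0> = cos(beta/2)|0> + sin(beta/2)|1>.
    # beta2 splits the first qubit; beta0 (beta1) splits the second qubit
    # when the first qubit is |0> (|1>).
    c2, s2 = math.cos(beta2 / 2), math.sin(beta2 / 2)
    c0, s0 = math.cos(beta0 / 2), math.sin(beta0 / 2)
    c1, s1 = math.cos(beta1 / 2), math.sin(beta1 / 2)
    return [c2 * c0, c2 * s0, s2 * c1, s2 * s1]  # order: |00>, |01>, |10>, |11>


x = [0.1, 0.5, 0.7, 0.5]  # non-negative and normalised: the squares sum to 1
reconstructed = amplitudes_from_angles(*get_angles(x))
print(reconstructed)  # values close to x, up to the 1e-12 regulariser
```

Putting x[0] in the numerator instead would make sin(beta0/2) the |0> share, so the same rotations would prepare the state with x_0 and x_1 swapped, unless the circuit is changed to match.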

For two qubits, the uniformly controlled Y rotations correspond to controlled Y rotation gates. These are used in the tutorial. For more qubits, we would apply the pattern described for uniformly controlled Y rotations on n qubits.
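For a vector of length 2^n, the angle calculation can be sketched as a binary tree: at each level, every block of amplitudes is split in half, and the angle records how much of the block's norm sits in the second half. This is only an illustration of the angle pattern for non-negative vectors, not the gate decomposition from [1]; tree_angles and amplitudes_from_tree are hypothetical names.

```python
import math


def tree_angles(x):
    # Hypothetical helper: angles level by level for a non-negative vector
    # whose length is a power of two. Level k holds 2**k angles.
    levels = []
    block = len(x)
    while block > 1:
        half = block // 2
        level = []
        for start in range(0, len(x), block):
            whole = sum(v * v for v in x[start:start + block])
            right = sum(v * v for v in x[start + half:start + block])
            # sin(beta/2) is the share of the block's norm in its second half.
            ratio = math.sqrt(right / whole) if whole > 0 else 0.0
            level.append(2 * math.asin(min(1.0, ratio)))
        levels.append(level)
        block = half
    return levels


def amplitudes_from_tree(levels):
    # Rebuild the amplitudes: each angle splits its parent amplitude into a
    # cos(beta/2) branch (bit 0) and a sin(beta/2) branch (bit 1).
    amps = [1.0]
    for level in levels:
        next_amps = []
        for amp, beta in zip(amps, level):
            next_amps.append(amp * math.cos(beta / 2))
            next_amps.append(amp * math.sin(beta / 2))
        amps = next_amps
    return amps


x8 = [1, 2, 3, 4, 5, 6, 7, 8]  # 8 values, so 3 qubits
levels = tree_angles(x8)
print([len(level) for level in levels])  # number of angles per level
print(amplitudes_from_tree(levels))      # x8 divided by its norm
```

For a length-4 vector, the levels correspond to [beta2] and [beta0, beta1] from get_angles (up to the 1e-12 regulariser). For 8 elements there are 1 + 2 + 4 = 7 angles.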

If hardware is not the issue, what is the merit between these two techniques?

I’m not sure I grasp this. Could you elaborate on the question?

Just to add my two cents to @antalszava’s great reply (since I wrote the demo), if I remember correctly this was just how the equation worked out in this particular case. It seems to work, but let us know if you suspect a bug…

In general, we could just use the AmplitudeEmbedding template here, but this was more explicit. If you use the template, you should be able to encode any normalised vector, even if it has negative entries.

I’m also not sure what you mean by “two techniques”. What is the second one? In simulators, amplitude embedding can be implemented more efficiently than by computing angles for a lengthy circuit, since we can just set the initial state vector to the desired values. AmplitudeEmbedding does exactly that when run on a simulator device like default.qubit, and it falls back to a circuit decomposition (similar to the one in this example), called MottonenStatePreparation, when used on hardware.

Hope this helps!

Dear all

I am sorry, I shouldn't have used "two techniques"; that caused the confusion.
My question is this: suppose I have input data [a1, a2, a3, a4, a5, a6, a7, a8] and I want to do state preparation on a real quantum computer, using these data for the input neurons. I have two options. The first solution is to create two different states, a1|00> + a2|01> + a3|10> + a4|11> and a5|00> + a6|01> + a7|10> + a8|11>.

The second solution is very straightforward, i.e., a1|000> + a2|001> + a3|010> + a4|011> + a5|100> + a6|101> + a7|110> + a8|111>.

Looking at these two solutions, what are the advantages and disadvantages of these data encodings? (Supposing we have no problem with the hardware.)

I am thinking of coding my data for the IBM composer. If you can give an amplitude encoding example for the composer, it will help a lot. Thank you.

Hi @SuFong_Chien,

I see, that’s a good question! Frankly, I’m not sure which one would be more beneficial in your case. Maybe it’s worth checking on a simulator beforehand? There are many aspects to consider, e.g., the accuracy, the depth of the circuits to simulate, queuing with the real hardware, etc.
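One difference can be checked without any hardware. Amplitude encoding needs each prepared state to be normalised, so splitting the data into two 2-qubit states normalises each half on its own, while a single 3-qubit state is normalised once over all 8 values. The sketch below uses made-up data and a hypothetical normalize helper to show the effect: with separate states, the relative scale between the two halves is lost unless the two norms are kept track of separately.

```python
import math


def normalize(v):
    # Hypothetical helper: rescale a vector to unit Euclidean norm,
    # as amplitude encoding requires.
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


data = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]  # made-up example values

# Option 1: two separate 2-qubit states, each normalised on its own.
first_half = normalize(data[:4])
second_half = normalize(data[4:])

# Option 2: one 3-qubit state, normalised once over all 8 values.
joint = normalize(data)

print(first_half, second_half)  # the two halves become indistinguishable
print(joint[4] / joint[0])      # the ratio between the halves is preserved
```

In terms of resources, the first option needs two qubits per state and three angles each, and the second needs three qubits and seven angles, so it is worth comparing both on a simulator for your data.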

I am thinking of coding my data for the IBM composer. If you can give an amplitude encoding example for the composer, it will help a lot.

I’m not entirely sure about this. Could the documentation that Maria linked perhaps help here? It contains an example of using qml.AmplitudeEmbedding. Once the circuit looks good, the device can then be switched by specifying the IBMQ device from PennyLane-Qiskit. Alternatively, let us know where a more detailed example would be useful.