Possible attribute swap in Hidden Manifold dataset (train vs diff_train)?

Hello! I have been working with the Hidden Manifold dataset from qml.data and noticed a discrepancy between the documentation and the actual data structure I am receiving.

According to the documentation, train should vary the dimension (d) of the final vector space while keeping the dimension (m) of the original vector space fixed. However, my results suggest the opposite: diff_train is the one that keeps the original dimension fixed and varies the final one, while train varies the original.

Here is the self-contained code to reproduce the behavior:

import pennylane as qml

# Load the dataset
# Note: force=True ensures we aren't using a cached/corrupted version
datasets = qml.data.load("other", name="hidden-manifold", force=True)
ds = datasets[0]

# Access specific keys to check dimensions
# According to docs: ds.train varies input dimension 'd' (final space)
d4 = ds.train['4']['inputs'] 

# According to docs: ds.diff_train keeps input fixed (d=10) and varies manifold 'm'
d5_diff = ds.diff_train['5']['inputs']

# Plain strings where nothing is interpolated; f-strings only where needed
print("--- Dimension Check ---")
print("Key '4' in .train['4'] should mean 4 dimensions in the final space.")
print(f"Actual shape: {len(d4[0])} vector dimensions")
print("-" * 20)
print("Key '5' in .diff_train['5'] should mean 10 dimensions (fixed).")
print(f"Actual shape: {len(d5_diff[0])} vector dimensions")

Output demonstrating the discrepancy:

--- Dimension Check ---
Key '4' in .train['4'] should mean 4 dimensions in the final space.
Actual shape: 10 vector dimensions
--------------------
Key '5' in .diff_train['5'] should mean 10 dimensions (fixed).
Actual shape: 5 vector dimensions

It appears the attributes might be swapped compared to the description in the documentation.
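
To check every key at once rather than one entry per attribute, something like the following could help. It is a minimal sketch: the helper name feature_dims is hypothetical, and it assumes each attribute behaves like a mapping from key to an entry with an 'inputs' array, as in the snippet above. The synthetic fake_split only stands in for real data.

```python
import numpy as np

def feature_dims(split):
    """Map each key of a dict-like split to the length of its first input vector."""
    return {key: len(entry["inputs"][0]) for key, entry in split.items()}

# Synthetic stand-in for a dataset attribute (hypothetical data, not the real dataset):
# key 'k' holds 3 samples with k features each.
fake_split = {str(k): {"inputs": np.zeros((3, k))} for k in (2, 4, 6)}

print(feature_dims(fake_split))
```

On the real dataset, feature_dims(ds.train) and feature_dims(ds.diff_train) should show which attribute has a constant input width, assuming both attributes support .items().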

Could you confirm if this is the intended behavior or a documentation error? I want to ensure I am correctly using the subset where the manifold complexity varies.

Package versions:

Name: pennylane
Version: 0.42.3
Platform info: Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python version: 3.10.12
Numpy version: 2.2.6
Scipy version: 1.15.3

Hi @Spartoons, thank you for pointing this out.

I can confirm that the output doesn’t match the documentation.
I’m checking with our team to see what’s going on. In the meantime, you could generate the data yourself from the original source code.

The repo is XanaduAI/qml-benchmarks on GitHub (code to benchmark quantum machine learning models).

There you will find the original code to generate the data for the hidden manifold model, and you will also find this function being used within the paper/benchmarks folder to generate datasets for the ‘hidden manifold’ and ‘hidden manifold diff’ benchmarks.
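
For intuition about the two dimensions, here is a rough NumPy sketch of a hidden-manifold-style generator. It is not the code from the repo: the function name, the tanh nonlinearity, and the labelling rule are illustrative assumptions. The point is only that m sets the size of the latent (original) space while d sets the size of the vectors a model actually sees.

```python
import numpy as np

def sketch_hidden_manifold(n_samples, d, m, seed=0):
    """Illustrative only: embed m-dimensional latent points into d dimensions."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, m))      # points in the original space (dimension m)
    projection = rng.normal(size=(m, d))          # random linear map from m to d dimensions
    inputs = np.tanh(latent @ projection / np.sqrt(m))  # final vectors (dimension d)
    weights = rng.normal(size=m)
    # Labels depend only on the latent points, not on the embedded vectors
    labels = np.where(np.tanh(latent) @ weights >= 0, 1, -1)
    return inputs, labels

X, y = sketch_hidden_manifold(n_samples=8, d=10, m=5)
print(X.shape, y.shape)
```

With d fixed and m varied, the input width stays the same across subsets; with m fixed and d varied, the input width changes. That is the property the dimension check earlier in the thread looks at.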

I hope this helps!

Hi @Spartoons,

Our team has validated the issue and we will be fixing it in the next few weeks. I’ve created an issue for this, which you can follow for updates.

Thanks again for surfacing this issue!

Hi @Spartoons, we uploaded a new version of the dataset that correctly swaps the attributes. It may take a while for this change to propagate through CloudFront and appear in future downloads.

Let us know if you run into any more problems.

I can confirm that the fix is working. Thanks @Diego !
And thanks @Spartoons for flagging this issue.

Thank you @Diego!
And thanks @CatalinaAlbornoz for the support.
