Summary
I want to confirm whether the current version combination in this repository is expected to work as-is.
From my run, it looks like there may be an API mismatch:
uv.lock pins lerobot to git rev 0cf864870cf29f4738d3ade893e6fd13fbd7cdb5 (version 0.1.0)
src/openpi/training/mixture_dataset.py calls LeRobotDataset(..., load_video=False)
In pinned lerobot revision 0cf864..., LeRobotDataset.__init__ does not appear to accept load_video (it uses download_videos instead).
The norm-stats script fails at dataset construction with:
TypeError: LeRobotDataset.__init__() got an unexpected keyword argument 'load_video'
Also, scripts/compute_norm_stats_sim.py currently uses a bare except: and prints only the path, which hides the traceback unless the script is modified locally.
Evidence
From this repo (policy/openpi-InternData-A1/uv.lock):
name = "lerobot" at line ~2002
- git source contains rev 0cf864870cf29f4738d3ade893e6fd13fbd7cdb5
name = "datasets" version 3.6.0 at line ~603
name = "pyarrow" version 20.0.0 at line ~3638
From this repo (policy/openpi-InternData-A1/src/openpi/training/mixture_dataset.py):
load_video=False is passed to LeRobotDataset (around line ~672)
From pinned upstream lerobot revision (0cf864...):
LeRobotDataset.__init__(..., download_videos=True, ...)
- no load_video parameter
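One way to confirm the mismatch without reading the upstream source is to inspect the constructor signature in the synced environment. The sketch below is a minimal illustration, not the repository's code: accepted_kwargs is a hypothetical helper, and FakeLeRobotDataset is a stand-in whose parameters only mirror what is described above (download_videos present, load_video absent). In a real environment you would pass the actual LeRobotDataset class instead.

```python
import inspect


def accepted_kwargs(callable_obj, **kwargs):
    """Split kwargs into those the callable accepts and those it would reject."""
    params = inspect.signature(callable_obj).parameters
    # A callable that declares **kwargs accepts any keyword argument.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs), {}
    accepted = {k: v for k, v in kwargs.items() if k in params}
    rejected = {k: v for k, v in kwargs.items() if k not in params}
    return accepted, rejected


class FakeLeRobotDataset:
    """Hypothetical stand-in; the parameter list is an assumption for illustration."""

    def __init__(self, repo_id, root=None, download_videos=True):
        self.repo_id = repo_id
        self.root = root
        self.download_videos = download_videos


accepted, rejected = accepted_kwargs(
    FakeLeRobotDataset, root="/data", load_video=False
)
print("accepted:", accepted)
print("rejected:", rejected)
```

If load_video lands in the rejected dictionary for the real class, the call in mixture_dataset.py cannot work against that installed revision.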
From this repo (policy/openpi-InternData-A1/scripts/compute_norm_stats_sim.py):
- bare except: in two loops (around lines ~294 and ~313)
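For reference, this is roughly the local change that made the traceback visible. It is a sketch only: process_all and process_one are hypothetical names and do not correspond to the actual structure of compute_norm_stats_sim.py. The idea is to catch Exception rather than use a bare except:, print the full traceback, and keep track of which paths failed.

```python
import traceback


def process_all(paths, process_one):
    """Run process_one on each path, reporting full tracebacks for failures."""
    failures = []
    for path in paths:
        try:
            process_one(path)
        except Exception:
            # Unlike a bare `except:`, this lets KeyboardInterrupt and
            # SystemExit propagate, and the traceback is no longer hidden.
            print(f"failed: {path}")
            traceback.print_exc()
            failures.append(path)
    return failures


def broken_loader(path):
    # Hypothetical loader that always fails, to demonstrate the reporting.
    raise TypeError(f"cannot construct dataset at {path}")


failed = process_all(["/data/a", "/data/b"], broken_loader)
print("failed paths:", failed)
```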
Reproduction
- Create/sync the environment from this repo's lock file (policy/openpi-InternData-A1/uv.lock).
- Run the norm-stats script with a valid dataset root.
- Observe the crash at the dataset construction stage.
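To make reports comparable, it may help to record which versions are actually installed in the environment the script runs in. The snippet below is a small sketch using only the standard library; installed_versions is a hypothetical helper, and the package list is my choice based on the packages mentioned in this issue. Note that the version string alone does not identify the git revision of lerobot.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["lerobot", "datasets", "pyarrow", "torch"])
for name, version in report.items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```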
Troubleshooting Timeline (What was tried and what happened)
- The initial run showed 0it and no effective processing.
- After correcting the invocation/path usage, the script started iterating but still did not expose the root cause because compute_norm_stats_sim.py had a bare except:.
- After making the traceback visible locally, the first concrete failure was:
TypeError: LeRobotDataset.__init__() got an unexpected keyword argument 'load_video'
- After bypassing that argument mismatch locally for debugging, the next failure was:
TypeError: stack(): argument 'tensors' (position 1) must be tuple of Tensors, not Column
- The environment was then aligned back toward the repository's original lock combination (datasets==3.6.0, pyarrow==20.0.0) to compare behavior, and a new schema-level error appeared:
ValueError: Feature type 'List' not found. Available feature types: ['Value', 'ClassLabel', 'Translation', 'TranslationVariableLanguages', 'LargeList', 'Sequence', 'Array2D', 'Array3D', 'Array4D', 'Array5D', 'Audio', 'Image', 'Video', 'Pdf', 'VideoFrame']
This made it unclear whether the current dataset schema, script assumptions, and pinned dependency set are expected to be mutually compatible.
Questions
- Is load_video=False expected to be valid with the currently pinned lerobot revision in uv.lock?
- If yes, am I using the script in a wrong way, or is there an environment/version assumption not documented yet?
- Is the current behavior of swallowing exceptions in compute_norm_stats_sim.py intentional?
- For the same data, is ValueError: Feature type 'List' not found ... expected under the repository's original dependency stack, or does it indicate a known schema-version mismatch?