# menily/toolkit
A Python library that converts heterogeneous raw data sources — first-person video, VR hand-tracking, motion capture, and teleoperation traces — into task-level demonstration data conforming to menily/schema v1. Open source under Apache-2.0.
Repository: github.com/MenilyIntelligence/toolkit · Status: Internal alpha · PyPI release: Planned in coming weeks · License: Apache-2.0
## What it does
The embodied AI community today works with four structurally incompatible raw data sources:
| Source | Typical device | Raw format | Sample rate |
|---|---|---|---|
| POV video | iPhone, GoPro, Vision Pro recording | .mp4 / .mov | 24-60 fps |
| VR hand-tracking | Meta Quest Pro, Vision Pro, PICO 4U | Custom binary / JSON frames | 60-90 Hz |
| Motion capture | OptiTrack, Vicon, Xsens | .bvh / .fbx / .c3d | 120-240 Hz |
| Teleoperation | URDF + teleop SDK | HDF5 / pickle / RLDS | 10-30 Hz |
menily/toolkit provides a unified Python API that accepts any of these four sources and outputs a Task object conforming to menily/schema v1. Downstream consumers get a consistent interface regardless of where the data originated.
## Architecture
Three adapters, one core:
- `toolkit.pov` — First-person video → task-level demonstration data
- `toolkit.vr` — VR hand-tracking → end-effector trajectories
- `toolkit.mocap` — Motion capture (BVH/FBX) → full-body action sequences with retargeting
- `toolkit.core` — `Task` object, schema validation, serialization, RLDS / HF Datasets export
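A rough sketch of what the core `Task` container might look like. The field names below are inferred from the quick-start examples further down, not taken from the actual `toolkit.core` source, so treat every name and default as an assumption:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of toolkit.core's Task object. Field names are
# guessed from the quick-start keyword arguments; the real class will
# differ in shape and surface area.
@dataclass
class Task:
    language: str                          # canonical task instruction
    language_variants: List[str] = field(default_factory=list)
    viewpoint: str = "ego"                 # recording viewpoint
    body_morphology: str = "bimanual"      # embodiment of the demonstrator
    fps: float = 30.0                      # sample rate of the source stream
    schema_version: str = "menily.task-demo/1"

task = Task(language="Pour water from the blue cup into the kettle.")
```

The point of a single container like this is that downstream consumers never branch on where the data came from; all four adapters emit the same type.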
## Installation

```bash
# Planned PyPI release
pip install menily-toolkit

# Development install (current)
git clone https://github.com/MenilyIntelligence/toolkit
cd toolkit
pip install -e .
```
## Quick start

### From first-person video

```python
from menily.toolkit import pov, schema

tasks = pov.segment(
    video_path="./demo_pour_water.mp4",
    language="Pour water from the blue cup into the kettle.",
    language_variants=[
        "把蓝色杯子里的水倒进水壶里",
        "Fill the kettle with water from the blue cup",
    ],
    fps=30,
    viewpoint="ego",
    body_morphology="bimanual_humanoid",
    collection_region="SEA",
)

for task in tasks:
    task.save_as(
        schema="menily.task-demo/1",
        out_dir="./processed/",
    )
```
### From VR hand-tracking

```python
from menily.toolkit import vr

tasks = vr.from_quest_log(
    log_path="./raw/quest_session_20260414.json",
    language="Assemble the blue widget onto the base plate.",
    fps=60,
    viewpoint="ego",
    body_morphology="bimanual",
    calibration={
        "origin": "room_center",
        "scale_to_robot": 0.9,
    },
)
```
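The `scale_to_robot` calibration factor presumably maps tracked human hand positions into the robot's workspace. As an illustration only: the real adapter may apply a full rigid-body transform, and `scale_trajectory` below is a hypothetical helper, not a toolkit function:

```python
# Hypothetical sketch: uniformly scale (x, y, z) waypoints about the
# calibration origin, the scalar analogue of scale_to_robot=0.9 above.
def scale_trajectory(points, scale):
    """Return waypoints scaled by a scalar factor about the origin."""
    return [(x * scale, y * scale, z * scale) for x, y, z in points]

human_waypoints = [(0.0, 0.0, 0.0), (0.5, 0.2, 1.0)]
robot_waypoints = scale_trajectory(human_waypoints, scale=0.9)
```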
### From motion capture

```python
from menily.toolkit import mocap

tasks = mocap.from_bvh(
    bvh_path="./raw/optitrack_session.bvh",
    segmentation_file="./raw/task_segments.json",
    body_morphology="whole_body_humanoid",
    retarget_to="unitree_g1",
    retarget_backend="adamorph",
    physics_filter=True,
)
```
## Schema validation

```python
task.validate()
# => ValidationReport(
#      schema_version='menily.task-demo/1',
#      passed=True,
#      warnings=[
#          "language.variants is recommended but empty",
#      ],
#      errors=[]
#    )
```
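In a processing pipeline, a report like this would typically gate export: fail hard on errors, surface warnings but continue. The `ValidationReport` stand-in and `gate` helper below are hypothetical; only the field names come from the example output above:

```python
from dataclasses import dataclass, field
from typing import List

# Stand-in mirroring the fields shown in the example report; the real
# class lives in toolkit.core.
@dataclass
class ValidationReport:
    schema_version: str
    passed: bool
    warnings: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)

def gate(report: ValidationReport) -> bool:
    """Raise on schema errors; log warnings but keep going."""
    for w in report.warnings:
        print(f"warning: {w}")
    if report.errors:
        raise ValueError(f"schema validation failed: {report.errors}")
    return report.passed

ok = gate(ValidationReport(
    schema_version="menily.task-demo/1",
    passed=True,
    warnings=["language.variants is recommended but empty"],
))
```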
## Export to RLDS / HuggingFace

```python
# Export as an RLDS bundle (Open X-Embodiment compatible)
rlds_episode = task.to_rlds()

# Export as a HuggingFace Dataset
hf_dataset = task.to_hf_dataset()
hf_dataset.push_to_hub("YOUR_ORG/your-dataset-name")
```
## Retargeting backends
The toolkit.mocap adapter integrates existing open-source retargeting research as pluggable backends. menily/toolkit does not reimplement retargeting — it composes existing work.
- AdaMorph — neural retargeting across 12 humanoid morphologies, zero-shot
- OmniRetarget — interaction-preserving data generation with interaction-mesh constraints
- SPARK — skeleton-parameter alignment with three-stage kinodynamic optimization
- KDMR — multi-contact kinodynamic trajectory optimization
- custom — user-provided retargeting function
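The callback signature for the `custom` backend is not documented here, so the shape below is purely an assumption: a function that maps per-frame joint-angle dicts from a source skeleton to a target skeleton:

```python
# Hypothetical custom backend: the identity mapping, which copies joint
# angles through unchanged. A real backend would remap joints and solve
# for the target morphology's kinematics.
def identity_retarget(frames, source_skeleton, target_skeleton):
    """frames: list of {joint_name: angle_radians} dicts, one per sample."""
    return [dict(frame) for frame in frames]

frames = [{"shoulder_l": 0.1, "elbow_l": 1.2}]
retargeted = identity_retarget(frames, "human", "unitree_g1")
```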
## Roadmap

| Component | Status | PyPI release |
|---|---|---|
| `toolkit.core` (Task, validation, I/O) | Stable | 2-3 weeks |
| `toolkit.pov` | Internal alpha | 4-6 weeks |
| `toolkit.vr` | Internal alpha | 4-6 weeks |
| `toolkit.mocap` | Design finalized | 8-10 weeks |
## Why open source
Data processing toolkits of this kind can be kept closed as a competitive moat. We chose not to, for two reasons:
- A schema has value only if it is adopted. If only Menily uses the format, it is not a schema — it is an internal file format. Adoption requires usable open tooling, not just a spec document.
- The moat in the data business is not the toolkit — it is the data collection network (workforce, quality control, geographic distribution, client relationships). Toolkits are copied in months; distributed data operations take years to build.
## Contributing
- Issues and pull requests: github.com/MenilyIntelligence/toolkit
- API design discussion: [email protected]
- Schema design discussion (upstream dependency): schema/issues
## Related

- menily/schema v1 — the specification that `toolkit` outputs
- research notes — technical design rationale
- About Menily Intelligence