menily/toolkit

Python library that converts heterogeneous raw data sources — first-person video, VR hand-tracking, motion capture, teleoperation traces — into task-level demonstration data conforming to menily/schema v1. Apache-2.0 open source.

Repository: github.com/MenilyIntelligence/toolkit · Status: Internal alpha · PyPI release: Planned in coming weeks · License: Apache-2.0

What it does

The embodied AI community today works with four structurally incompatible raw data sources:

| Source | Typical device | Raw format | Sample rate |
| --- | --- | --- | --- |
| POV video | iPhone, GoPro, Vision Pro recording | .mp4 / .mov | 24-60 fps |
| VR hand-tracking | Meta Quest Pro, Vision Pro, PICO 4U | Custom binary / JSON frames | 60-90 Hz |
| Motion capture | OptiTrack, Vicon, Xsens | .bvh / .fbx / .c3d | 120-240 Hz |
| Teleoperation | URDF + teleop SDK | HDF5 / pickle / RLDS | 10-30 Hz |

menily/toolkit provides a unified Python API that accepts any of these four sources and outputs a Task object conforming to menily/schema v1. Downstream consumers get a consistent interface regardless of where the data originated.
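As an illustration of that "many sources, one output" idea, routing a raw recording to the right adapter could be as simple as a suffix lookup. The routing table and `pick_adapter` below are illustrative only, not toolkit API:

```python
from pathlib import Path

# Hypothetical routing table: raw-file suffix -> adapter name.
# The real toolkit exposes per-source modules (pov, vr, mocap);
# this sketch only shows the "four sources, one Task" dispatch.
ADAPTER_BY_SUFFIX = {
    ".mp4": "pov", ".mov": "pov",                        # first-person video
    ".json": "vr",                                       # VR hand-tracking logs
    ".bvh": "mocap", ".fbx": "mocap", ".c3d": "mocap",   # motion capture
    ".hdf5": "teleop", ".pkl": "teleop",                 # teleoperation traces
}

def pick_adapter(raw_path: str) -> str:
    """Return the adapter responsible for a raw recording."""
    suffix = Path(raw_path).suffix.lower()
    try:
        return ADAPTER_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"no adapter for suffix {suffix!r}") from None

print(pick_adapter("./raw/optitrack_session.bvh"))  # -> mocap
```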

Architecture

Three adapters, one core: the source-specific adapters (toolkit.pov, toolkit.vr, toolkit.mocap) normalize raw recordings, and the shared toolkit.core layer owns the Task object, schema validation, and I/O.
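The split can be sketched in a few lines of plain Python. `Task` here is a stand-in with assumed fields, not the real menily/schema v1 definition:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """Stand-in for the shared core Task; field names are assumptions."""
    schema_version: str
    language: str
    frames: list = field(default_factory=list)

def pov_adapter(video_frames: list, language: str) -> Task:
    """Adapter: normalize one raw source into the shared Task shape."""
    return Task("menily.task-demo/1", language, list(video_frames))

def vr_adapter(hand_frames: list, language: str) -> Task:
    return Task("menily.task-demo/1", language, list(hand_frames))

def frame_count(task: Task) -> int:
    """Core-side consumers are adapter-agnostic: they only see Task."""
    return len(task.frames)

t_pov = pov_adapter([{"t": 0.0}, {"t": 0.033}], "Pour water.")
t_vr = vr_adapter([{"t": 0.0}], "Assemble the widget.")
assert t_pov.schema_version == t_vr.schema_version
```

The point of the pattern is that everything below the adapters (validation, export, storage) is written once against `Task`.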

Installation

```bash
# Planned PyPI release
pip install menily-toolkit

# Development install (current)
git clone https://github.com/MenilyIntelligence/toolkit
cd toolkit
pip install -e .
```

Quick start

From first-person video

```python
from menily.toolkit import pov

tasks = pov.segment(
    video_path="./demo_pour_water.mp4",
    language="Pour water from the blue cup into the kettle.",
    language_variants=[
        "把蓝色杯子里的水倒进水壶里",
        "Fill the kettle with water from the blue cup",
    ],
    fps=30,
    viewpoint="ego",
    body_morphology="bimanual_humanoid",
    collection_region="SEA",
)

for task in tasks:
    task.save_as(
        schema="menily.task-demo/1",
        out_dir="./processed/",
    )
```

From VR hand-tracking

```python
from menily.toolkit import vr

tasks = vr.from_quest_log(
    log_path="./raw/quest_session_20260414.json",
    language="Assemble the blue widget onto the base plate.",
    fps=60,
    viewpoint="ego",
    body_morphology="bimanual",
    calibration={
        "origin": "room_center",
        "scale_to_robot": 0.9,
    },
)
```

From motion capture

```python
from menily.toolkit import mocap

tasks = mocap.from_bvh(
    bvh_path="./raw/optitrack_session.bvh",
    segmentation_file="./raw/task_segments.json",
    body_morphology="whole_body_humanoid",
    retarget_to="unitree_g1",
    retarget_backend="adamorph",
    physics_filter=True,
)
```

Schema validation

```python
task.validate()
# => ValidationReport(
#      schema_version='menily.task-demo/1',
#      passed=True,
#      warnings=[
#        "language.variants is recommended but empty",
#      ],
#      errors=[]
#    )
```
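In a pipeline you typically want to fail hard on errors but let warnings through. The report shape below mirrors the printed example above; the `ValidationReport` stand-in and `gate_on_validation` helper are assumptions, not toolkit API:

```python
from dataclasses import dataclass, field

@dataclass
class ValidationReport:
    """Stand-in mirroring the report shape shown above."""
    schema_version: str
    passed: bool
    warnings: list = field(default_factory=list)
    errors: list = field(default_factory=list)

def gate_on_validation(report: ValidationReport) -> None:
    """Raise on errors; surface warnings without blocking export."""
    if report.errors or not report.passed:
        raise ValueError(f"schema validation failed: {report.errors}")
    for w in report.warnings:
        print(f"warning: {w}")

report = ValidationReport(
    schema_version="menily.task-demo/1",
    passed=True,
    warnings=["language.variants is recommended but empty"],
)
gate_on_validation(report)  # prints the warning, raises nothing
```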

Export to RLDS / HuggingFace

```python
# Export as RLDS bundle (Open X-Embodiment compatible)
rlds_episode = task.to_rlds()

# Export as HuggingFace Dataset
hf_dataset = task.to_hf_dataset()
hf_dataset.push_to_hub("YOUR_ORG/your-dataset-name")
```

Retargeting backends

The toolkit.mocap adapter integrates existing open-source retargeting research as pluggable backends. menily/toolkit does not reimplement retargeting — it composes existing work.
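A pluggable-backend design usually reduces to a small registry plus name-based dispatch. The sketch below is an assumption about how such a registry could look; only the backend name "adamorph" comes from the mocap example above, and the pose format is invented for illustration:

```python
# Registry mapping backend name -> retargeting function.
RETARGET_BACKENDS = {}

def register_backend(name: str):
    """Decorator that makes a retargeting function selectable by name."""
    def wrap(fn):
        RETARGET_BACKENDS[name] = fn
        return fn
    return wrap

@register_backend("adamorph")
def adamorph_retarget(human_pose, robot="unitree_g1"):
    # A real backend maps human joint angles to robot joints;
    # here we just tag the pose to demonstrate the dispatch path.
    return {"robot": robot, "pose": human_pose}

def retarget(pose, backend="adamorph", **kwargs):
    """Dispatch to a registered backend, as a retarget_backend= flag might."""
    try:
        fn = RETARGET_BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown retarget backend {backend!r}") from None
    return fn(pose, **kwargs)

out = retarget([0.1, 0.2], backend="adamorph", robot="unitree_g1")
```

Keeping backends behind a registry is what lets new retargeting research be dropped in without touching adapter code.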

Roadmap

| Component | Status | PyPI release |
| --- | --- | --- |
| toolkit.core (Task, validation, I/O) | Stable | 2-3 weeks |
| toolkit.pov | Internal alpha | 4-6 weeks |
| toolkit.vr | Internal alpha | 4-6 weeks |
| toolkit.mocap | Design finalized | 8-10 weeks |

Why open source

Data processing toolkits of this kind can be kept closed as a competitive moat. We chose not to, for two reasons:

  1. A schema has value only if it is adopted. If only Menily uses the format, it is not a schema — it is an internal file format. Adoption requires usable open tooling, not just a spec document.
  2. The moat in the data business is not the toolkit — it is the data collection network (workforce, quality control, geographic distribution, client relationships). Toolkits are copied in months; distributed data operations take years to build.

Contributing

Related