TELEOPERATION DATA

Robot Teleoperation

Name: Robot Teleoperation
Creator: Gerra

Success-labeled robot manipulation episodes collected via human teleoperation across diverse tasks and embodiments.

400K+ EPISODES · H.264 · JSONL · NPZ · 35K+ new episodes per month

Accepted episodes: 794
Named joints: 23-DOF
Top sensor rate: 915 Hz
Open format: ORTF


01 Sample

See the data

Representative records in the exact shape we deliver. Real provenance and full slices are shared under license.

Booster T1 episode, light-box pick-and-place (23-DOF)

Representative of the real metadata shape. Audio is 16 kHz wav; counts re-verified on disk (925 depth frames, 2,338 joint samples).

metadata.json (representative)

{
  "episode": { "id": "episode_f0fe8050-...", "version": "2.0", "duration_seconds": 29.51,
    "task": { "name": "Light Box Table To Chair Behind V0",
              "instruction": "Pick up the light box, turn around, and place it on the chair behind you." } },
  "timing": { "start_unix": 1766526819.30, "fps_target": 30, "fps_actual": 40.06 },
  "sensors": {
    "visual": { "rgb_frames": 1182, "depth_frames": 925, "resolution": { "width": 1280, "height": 720 } },
    "proprioception": { "joint_samples": 2338, "joint_sample_rate_hz": 79.23 },
    "audio": { "format": "wav", "sample_rate": 16000, "channels": 1 } },
  "stats": { "recording_quality": "good", "active_streams_count": 5 },
  "provenance": { "watermarked": true, "signature": "TW7wMQxo...A8=" }
}
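
The reported rates in this record can be cross-checked against the frame and sample counts. The following is a minimal sketch, not part of any delivered tooling: it assumes `fps_actual` describes the RGB stream, and the 1% tolerance (`rel_tol`) is a hypothetical value chosen for illustration.

```python
import json

# Trimmed copy of the representative metadata.json shown above.
METADATA = json.loads("""
{
  "episode": {"id": "episode_f0fe8050-...", "duration_seconds": 29.51},
  "timing": {"fps_target": 30, "fps_actual": 40.06},
  "sensors": {
    "visual": {"rgb_frames": 1182, "depth_frames": 925},
    "proprioception": {"joint_samples": 2338, "joint_sample_rate_hz": 79.23}
  }
}
""")


def implied_rate_hz(sample_count: int, duration_seconds: float) -> float:
    """Average rate implied by a sample count over the episode duration."""
    return sample_count / duration_seconds


def check_rates(meta: dict, rel_tol: float = 0.01) -> dict:
    """Compare each reported rate with count / duration.

    rel_tol is a hypothetical tolerance, not something the format defines.
    """
    duration = meta["episode"]["duration_seconds"]
    visual = meta["sensors"]["visual"]
    proprio = meta["sensors"]["proprioception"]
    # Assumption: fps_actual is the measured rate of the RGB stream.
    pairs = {
        "rgb": (visual["rgb_frames"], meta["timing"]["fps_actual"]),
        "joints": (proprio["joint_samples"], proprio["joint_sample_rate_hz"]),
    }
    report = {}
    for name, (count, reported) in pairs.items():
        implied = implied_rate_hz(count, duration)
        report[name] = abs(implied - reported) <= rel_tol * reported
    return report


print(check_rates(METADATA))
```

A failed check here would point at a truncated stream or a stale count rather than at the episode content itself.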

Per-step proprioception frame (23 named joints)

Representative. One timestep, showing a subset of the 23 joints; positions in rad, velocities in rad/s, efforts in Nm.

joints.jsonl (representative)

{
  "timestamp": 3.0976,
  "unix_timestamp": 1766526822.3984,
  "positions":  { "Left_Knee_Pitch": 0.4141, "Right_Knee_Pitch": 0.4229, "Waist": -0.0048, "Head_pitch": 0.3603 },
  "velocities": { "Left_Knee_Pitch": -0.00929, "Right_Knee_Pitch": 0.00759 },
  "efforts":    { "Left_Knee_Pitch": -4.2377, "Right_Knee_Pitch": -5.0484, "Left_Hip_Yaw": 2.9137 }
}
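
One way to read such frames is to parse each JSONL line and project the named-joint dictionaries onto a fixed joint order. This is a sketch under assumptions: the joint order in `JOINT_ORDER` is hypothetical (the delivered files name all 23 joints), and filling absent joints with NaN is an illustrative choice, not documented behavior.

```python
import json
import math

# One line of joints.jsonl, copied from the representative frame above.
LINE = (
    '{"timestamp": 3.0976, "unix_timestamp": 1766526822.3984, '
    '"positions": {"Left_Knee_Pitch": 0.4141, "Right_Knee_Pitch": 0.4229, '
    '"Waist": -0.0048, "Head_pitch": 0.3603}, '
    '"velocities": {"Left_Knee_Pitch": -0.00929, "Right_Knee_Pitch": 0.00759}, '
    '"efforts": {"Left_Knee_Pitch": -4.2377, "Right_Knee_Pitch": -5.0484, '
    '"Left_Hip_Yaw": 2.9137}}'
)

# Hypothetical ordering, used here only to build fixed-length vectors.
JOINT_ORDER = ["Left_Knee_Pitch", "Right_Knee_Pitch", "Waist", "Left_Hip_Yaw"]


def parse_joint_lines(lines):
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def joint_vector(frame: dict, field: str, joint_names, fill: float = math.nan):
    """Project a named-joint dict (positions, velocities, efforts) onto a fixed order."""
    values = frame.get(field, {})
    return [values.get(name, fill) for name in joint_names]


frames = list(parse_joint_lines([LINE, ""]))
positions = joint_vector(frames[0], "positions", JOINT_ORDER)
efforts = joint_vector(frames[0], "efforts", JOINT_ORDER)
print(positions)
print(efforts)
```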

LimX Tron quadruped, fall-recovery, success-labeled

Representative, cross-embodiment. Shows the success label and per-stream rates including 915 Hz motors.

tron_metadata.json (representative)

{
  "episode": { "id": "tron_success_001", "duration_seconds": 15.6 },
  "machine": { "type": "limx_tron1", "class": "quadruped_robot" },
  "task": { "type": "fall_recovery_v0", "success": true, "failure_reason": null },
  "sensors": { "feed": { "proprioception": {
    "motors":   { "sample_count": 14277, "avg_rate_hz": 915.2 },
    "imu":      { "sample_count": 1518,  "avg_rate_hz": 97.3 },
    "odometry": { "sample_count": 1501,  "avg_rate_hz": 96.2 } } } },
  "files": { "total_size_bytes": 64957399 }
}
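
The success label and per-stream rates in this record lend themselves to simple filtering. The sketch below assumes a three-way reading of `task.success` (true, false, or null for unevaluated); the function names are hypothetical and are not part of any delivered library.

```python
from typing import Optional

# Trimmed copy of the representative tron_metadata.json shown above.
TRON_METADATA = {
    "episode": {"id": "tron_success_001", "duration_seconds": 15.6},
    "task": {"type": "fall_recovery_v0", "success": True, "failure_reason": None},
    "sensors": {"feed": {"proprioception": {
        "motors": {"sample_count": 14277, "avg_rate_hz": 915.2},
        "imu": {"sample_count": 1518, "avg_rate_hz": 97.3},
        "odometry": {"sample_count": 1501, "avg_rate_hz": 96.2},
    }}},
}


def outcome_bucket(task: dict) -> str:
    """Map the nullable success flag to one of three buckets."""
    success: Optional[bool] = task.get("success")
    if success is None:
        return "unevaluated"
    return "success" if success else "failure"


def stream_rates(meta: dict) -> dict:
    """Collect avg_rate_hz for every proprioception stream in the record."""
    streams = meta["sensors"]["feed"]["proprioception"]
    return {name: stream["avg_rate_hz"] for name, stream in streams.items()}


def successful_only(episodes: list) -> list:
    """Keep episodes whose outcome was verified as a success."""
    return [ep for ep in episodes if outcome_bucket(ep["task"]) == "success"]


rates = stream_rates(TRON_METADATA)
fastest = max(rates, key=rates.get)
print(outcome_bucket(TRON_METADATA["task"]), fastest, rates[fastest])
```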

02 Schema

Record shape

Every field, its type, whether it can be null, and a representative value.

Field	Type	Constraint	Description	Example
episode.id	string	required	Unique episode identifier.	episode_f0fe8050-...
episode.duration_seconds	float64 · s	required	Episode wall-clock duration.	29.51
task.instruction	string	required	Language-conditioned instruction given to the operator.	Pick up the light box, turn around, and place it on the chair behind you.
task.success	bool	nullable	Human-verified task outcome (null if unevaluated).	true
task.failure_reason	string	nullable	Reason on failure; null on success.	null
timing.fps_target	int · fps	required	Target video frame rate.	30
timing.fps_actual	float · fps	required	Measured achieved frame rate.	40.06
sensors.visual.rgb_frames	int · frames	required	RGB frame count (mp4, H.264).	1182
sensors.visual.depth_frames	int · frames	required	Depth frame count (16-bit PNG).	925
sensors.visual.resolution	{width,height} · px	required	Image resolution.	1280 x 720
sensors.proprioception.joint_samples	int · samples	required	Joint-state sample count.	2338
sensors.proprioception.joint_sample_rate_hz	float64 · Hz	required	Measured joint sampling rate.	79.23
sensors.audio.sample_rate	int · Hz	required	Audio sample rate (wav, mono).	16000
license.license_id	string	required	Per-delivery license id.	GRR-20251223-A135CB89
provenance.signature	string	required	Cryptographic integrity signature; episodes are watermarked.	TW7wMQxo...A8=
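
A few rows of this table can be turned into a lightweight record check. The sketch below is an illustration, not the ORTF validator: the `SCHEMA` list covers only a subset of fields, `duration_seconds` is looked up under `episode` as in the sample record, and treating an absent nullable field the same as null is an assumption.

```python
import copy

# (dotted path, expected Python type, nullable), transcribed from a subset of the table.
SCHEMA = [
    ("episode.id", str, False),
    ("episode.duration_seconds", float, False),
    ("task.instruction", str, False),
    ("task.success", bool, True),
    ("task.failure_reason", str, True),
    ("timing.fps_target", int, False),
    ("timing.fps_actual", float, False),
]

_MISSING = object()


def get_path(record: dict, dotted: str):
    """Walk a dotted path through nested dicts; return _MISSING if any key is absent."""
    node = record
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def type_ok(value, expected: type) -> bool:
    """Type check that keeps bool out of int/float fields and lets ints pass as floats."""
    if isinstance(value, bool):
        return expected is bool
    if expected is float:
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def validate(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means the record passed."""
    errors = []
    for path, expected, nullable in SCHEMA:
        value = get_path(record, path)
        if value is _MISSING or value is None:
            if not nullable:
                errors.append(f"{path}: required but missing or null")
        elif not type_ok(value, expected):
            errors.append(f"{path}: expected {expected.__name__}, got {type(value).__name__}")
    return errors


GOOD = {
    "episode": {"id": "episode_f0fe8050-...", "duration_seconds": 29.51},
    "task": {"instruction": "Pick up the light box.", "success": True, "failure_reason": None},
    "timing": {"fps_target": 30, "fps_actual": 40.06},
}
BAD = copy.deepcopy(GOOD)
BAD["episode"]["id"] = None
BAD["timing"]["fps_target"] = "30"

print(validate(GOOD))
print(validate(BAD))
```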

03 What's included

Manipulation Trajectories

Complete teleoperated task demonstrations with synchronized vision and robot state across diverse manipulation tasks.

State-Action Pairs

Frame-aligned observation and action tuples ready for behavior cloning and inverse RL.

Success Labels

Human-verified task outcomes and metadata for filtering, curriculum design, and reward modeling.

04 Methodology

How it is built

01
Human teleoperation capture
Episodes are recorded by human teleoperators driving real robots (Booster T1, LimX Tron, Unitree). Each episode is a single contiguous recording with a task name, description, and language instruction.
02
Multi-modal synchronized recording
RGB (mp4/H.264), depth (PNG), proprioception (joint position, velocity, effort), robot state, and audio (wav) are captured concurrently. Each stream is sampled at its native rate, and every sample carries both an episode-relative timestamp and an absolute Unix timestamp.
03
Timestamp normalization
All streams carry float64 Unix and episode-relative timestamps for cross-stream alignment. Native measurement rates are preserved rather than resampled, with a sync-tolerance budget recorded in the format.
04
Quality and success labeling
Each episode gets a recording-quality tag and an active-stream count, and each task carries a boolean success label plus a failure reason. Accepted episodes are organized by outcome, and admin review gates customer visibility.
05
Licensing and watermark
On delivery each episode is stamped with a license block, a multi-layer watermark, a cryptographic signature, and per-file checksums.
06
ORTF packaging
Episodes are packaged in the Open Robot Training Format: a manifest, an episode index, chunked step records, videos, and optional annotations and robot description, convertible to LeRobot and RLDS / Open-X-Embodiment.
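
Because native rates are preserved rather than resampled (step 03), consumers typically align streams by timestamp themselves. The following is a minimal nearest-neighbor sketch, assuming sorted episode-relative timestamps; the 5 ms tolerance is a hypothetical value, since the actual sync-tolerance budget is recorded in the format and not stated here.

```python
from bisect import bisect_left


def nearest_index(timestamps, t):
    """Index of the timestamp closest to t; timestamps must be sorted ascending and non-empty."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbor is closer; ties go to the earlier sample.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def align(reference, other, tolerance_s):
    """Pair each reference timestamp with the nearest sample in `other`.

    Returns (reference_index, other_index) tuples; other_index is None when
    the nearest sample falls outside the tolerance.
    """
    if not other:
        return [(ref_idx, None) for ref_idx in range(len(reference))]
    pairs = []
    for ref_idx, t in enumerate(reference):
        j = nearest_index(other, t)
        pairs.append((ref_idx, j if abs(other[j] - t) <= tolerance_s else None))
    return pairs


# Toy timestamps in seconds: three video frames and five joint samples.
video_ts = [0.0, 0.025, 0.05]
joint_ts = [0.001, 0.0136, 0.0262, 0.0388, 0.09]
pairs = align(video_ts, joint_ts, tolerance_s=0.005)
print(pairs)
```

Returning None instead of the nearest out-of-tolerance sample keeps gaps visible, so a training pipeline can drop or interpolate those steps explicitly.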

05 Evals

How we validate

What each evaluation measures and how it is run. Where no benchmark is published, we show the methodology and say so.

Sim2Real Gap Characterization

Measures

How faithfully a MuJoCo simulation reproduces the real teleoperated trajectories, with the captured real data serving as the ground-truth reference.

Method

Position-controlled trajectory replay in MuJoCo over 123 episodes, 3,687 seconds, and 284,794 joint samples, scored per joint with MAE, RMSE, and Pearson correlation plus a composite gap score.

Result

Measured results: aggregate position MAE of 5.56 degrees, with knee pitch highest at 12.2 degrees; velocity MAE of 0.41 rad/s; leg joints at 6.7 degrees versus arm joints at 5.5 degrees (about 1.2x); overall gap score of 0.36. See the sim2real paper.
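
The per-joint metrics named above (MAE, RMSE, Pearson correlation) can be sketched as follows. This is an illustration on toy numbers, not the study's code: the input series are invented, and the composite gap score is omitted because its formula is not given here.

```python
import math


def mae(real, sim):
    """Mean absolute error between two equal-length series."""
    return sum(abs(r - s) for r, s in zip(real, sim)) / len(real)


def rmse(real, sim):
    """Root-mean-square error between two equal-length series."""
    return math.sqrt(sum((r - s) ** 2 for r, s in zip(real, sim)) / len(real))


def pearson(real, sim):
    """Pearson correlation; NaN if either series is constant."""
    n = len(real)
    mean_r = sum(real) / n
    mean_s = sum(sim) / n
    cov = sum((r - mean_r) * (s - mean_s) for r, s in zip(real, sim))
    var_r = sum((r - mean_r) ** 2 for r in real)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    if var_r == 0 or var_s == 0:
        return math.nan
    return cov / math.sqrt(var_r * var_s)


def score_joints(real: dict, sim: dict) -> dict:
    """Per-joint metrics; positions are in rad, MAE is also reported in degrees."""
    report = {}
    for joint, real_series in real.items():
        sim_series = sim[joint]
        joint_mae = mae(real_series, sim_series)
        report[joint] = {
            "mae_rad": joint_mae,
            "mae_deg": math.degrees(joint_mae),
            "rmse_rad": rmse(real_series, sim_series),
            "pearson": pearson(real_series, sim_series),
        }
    return report


# Toy trajectories (rad), invented for illustration only.
real = {"Left_Knee_Pitch": [0.40, 0.42, 0.45, 0.43]}
sim = {"Left_Knee_Pitch": [0.38, 0.43, 0.47, 0.40]}
report = score_joints(real, sim)
print(report)
```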

Episode acceptance / QA gate

Measures

Whether an episode is clean enough to enter the saleable catalog.

Method

Automated format, stream-presence, and quality checks plus human QA, with a recording-quality tag, an active-stream count, and success/fail bucketing.

Result

Methodology-stage. Pass/fail per episode; no published aggregate acceptance-rate metric.

ORTF Schema Validation

Measures

Whether a packaged dataset conforms to the format.

Method

A validator checks manifest schema, required files, parquet-to-manifest match, video frame-count integrity, timestamp monotonicity, and action/observation dimension match.

Result

Methodology-stage. Validator pass/fail, not a score.
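
Two of the checks listed above, timestamp monotonicity and frame-count integrity, are simple enough to sketch. This is not the validator's implementation: reading "monotonicity" as strictly increasing is an assumption, and the function names are hypothetical.

```python
from typing import Optional, Sequence


def first_non_increasing(timestamps: Sequence[float]) -> Optional[int]:
    """Index of the first timestamp that fails to increase, or None if all do.

    Assumption: timestamps must be strictly increasing (no repeats).
    """
    for i in range(1, len(timestamps)):
        if timestamps[i] <= timestamps[i - 1]:
            return i
    return None


def frame_count_ok(declared: int, decoded: int) -> bool:
    """True when the manifest's frame count matches the frames actually decoded."""
    return declared == decoded


def check_stream(timestamps: Sequence[float], declared_frames: int) -> dict:
    """Pass/fail report for one stream, mirroring the pass/fail nature of the gate."""
    bad_index = first_non_increasing(timestamps)
    return {
        "monotonic": bad_index is None,
        "first_bad_index": bad_index,
        "frame_count": frame_count_ok(declared_frames, len(timestamps)),
    }


clean = check_stream([0.0, 0.025, 0.05], declared_frames=3)
broken = check_stream([0.0, 0.025, 0.025, 0.07], declared_frames=5)
print(clean)
print(broken)
```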

06 Graders

Ground truth

What correct means for this data, and how it is established.

Ground truth

The trajectory executed by the human teleoperator is the demonstration ground truth, and the human-applied success boolean and failure reason are the task-outcome ground truth. For simulation fidelity, the real robot trajectory is the reference the simulation is scored against.

How it is established

Success labels are applied by humans and encoded per episode. Sim2real validation is an execution-based comparison of simulated vs real joint trajectories via per-joint MAE, RMSE, and correlation, with a normalized composite gap score. Integrity is verified with per-file SHA-256 checksums and a signature.

Agreement

No inter-rater agreement statistic is published. Correctness for the simulation study is anchored to the recorded real trajectory itself.
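
The per-file SHA-256 check mentioned above can be reproduced with the Python standard library. The sketch assumes the expected digests arrive as a mapping from relative path to hex string, which is a hypothetical layout; it does not cover signature verification, since the signature scheme is not described here.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, read in chunks so large videos do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(root: Path, expected: dict) -> dict:
    """Map each relative path to True/False; a missing file counts as a failure."""
    results = {}
    for relative_path, expected_hex in expected.items():
        target = root / relative_path
        if not target.is_file():
            results[relative_path] = False
            continue
        results[relative_path] = hmac.compare_digest(
            sha256_of(target), expected_hex.lower()
        )
    return results


# Demo on a throwaway directory with one present and one missing file.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    payload = b'{"timestamp": 3.0976}\n'
    (demo_root / "joints.jsonl").write_bytes(payload)
    good_digest = hashlib.sha256(payload).hexdigest()
    demo = verify_files(
        demo_root, {"joints.jsonl": good_digest, "missing.mp4": good_digest}
    )
print(demo)
```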

07 Application

Imitation Learning

Behavior cloning from success-labeled human demonstrations on real hardware, with full state-action coverage.

VLA Pretraining

Vision-language-action triplets across tasks and embodiments for generalist manipulation policy pretraining.

RL Fine-Tuning

Seed real-world trajectories and success signals for offline RL and policy refinement before deployment.

08 Environment & integration

How you load it

Delivery

S3 / pre-signed URL, REST API, CDN

Formats

ORTF (manifest + parquet + MP4 + PNG depth + WAV), LeRobot v3.0, RLDS / Open-X-Embodiment

Auth

Org-scoped API keys, tenant isolation, per-customer storage paths, and mutual TLS. Artifacts are signed, watermarked, and access-logged under a restricted per-delivery license.

Cadence

The back catalog is a one-time licensable archive of 794 accepted episodes, roughly 6.7 hours and 63 GiB; ongoing direct-collection programs are supported.
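
As a back-of-envelope reading of those archive figures, the averages per episode work out to roughly 30 seconds and roughly 81 MiB. The snippet below only restates the numbers quoted above; the averages are derived, not published.

```python
# Figures quoted for the back catalog.
EPISODES = 794
TOTAL_HOURS = 6.7
TOTAL_GIB = 63

# Derived averages (not published values).
avg_seconds_per_episode = TOTAL_HOURS * 3600 / EPISODES
avg_mib_per_episode = TOTAL_GIB * 1024 / EPISODES

print(f"{avg_seconds_per_episode:.1f} s per episode")
print(f"{avg_mib_per_episode:.1f} MiB per episode")
```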

quickstart.sh

# Load and convert an ORTF episode set

# 1. REST API (authenticated with your org-scoped API key)
GET /datasets                      # list available datasets
GET /datasets/{id}                 # metadata + episode index
GET /datasets/{id}/download        # pre-signed URL to the ORTF bundle

# 2. Command line
ortf validate ./dataset            # manifest + frame-count + timestamp checks

# 3. Python
to_lerobot(load_dataset("./ortf"), "out/lerobot")   # convert to LeRobot v3.0

09 Related research

Open Robot Training Format (ORTF)
Sim2Real Gap in Bipedal Humanoids

Request access.

Restricted-scope evaluation access for qualified teams. We share real samples, full schema, and provenance under a mutual NDA.

Talk to us: team@gerra.com
