Data Collection in Real#

Record demonstrations on a physical SO-101 into a LeRobot dataset, driving the follower with either teleop device (see Supported Teleop devices). The example scripts live in examples/isaac_teleop_to_so101/ in the LeRobot repository: teleoperate.py drives the arm live, and record.py does the same while saving a dataset. Both take the same --robot.* / --teleop.* flags; --teleop.type selects the device (xr_controller | so101_leader).

Before you start#

  1. A working SO-101 follower — assembled, motors set up, and calibrated. See SO-101 support in LeRobot.

  2. The isaac-teleop extra installed (isaacteleop ships on public PyPI; its [cloudxr,retargeters] extras pull the CloudXR runtime bindings and the retargeter library):

    uv pip install -e '.[isaac-teleop]'
    
  3. Run the scripts from the example directory, and log in to the Hugging Face Hub — recorded datasets are pushed to the Hub by default (pass --dataset.push_to_hub=false to keep them local):

    cd examples/isaac_teleop_to_so101
    huggingface-cli login
    

Then follow the steps for your teleop device:

The controller pose drives the follower’s end-effector through the clutch + IK pipeline, streamed over CloudXR.

  1. Fetch the robot model. The XR path solves inverse kinematics, so it needs the SO-101 URDF and meshes (downloaded into ./SO101/):

    python download_assets.py
    
  2. Connect a headset. Bring up CloudXR and connect your XR headset — follow the Quick Start.

  3. (Optional) Try teleoperation without recording. A good way to check the setup before committing to a dataset:

    python teleoperate.py \
        --robot.type=so101_follower \
        --robot.port=/dev/ttyACM0 \
        --robot.id=so101_follower_arm \
        --teleop.type=xr_controller
    

    Squeeze and hold the grip to engage the clutch and move the arm; the trigger controls the gripper. Release the grip to pause.

  4. Record a dataset. Add cameras and the dataset parameters:

    python record.py \
        --robot.type=so101_follower \
        --robot.port=/dev/ttyACM0 \
        --robot.id=so101_follower_arm \
        --teleop.type=xr_controller \
        --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
        --dataset.repo_id=<hf_user>/<dataset_name> \
        --dataset.single_task="Pick up vial from rack on the left side" \
        --dataset.num_episodes=3 \
        --dataset.episode_time_s=20 \
        --dataset.reset_time_s=5
    

Note

Customizing the reset pose. On startup the XR path slews the arm to a built-in default reset pose (a comfortable mid-range pose) before handing control to the clutch — you do not need to record anything. To tailor it to your setup, back-drive the arm to the pose you want and run python override_reset_pose.py; it writes reset_pose.json (git-ignored, user-local), which takes priority over the default on the next run. Pass --reset_to_origin=false to skip the slew and keep the arm where it is.

A back-drivable SO-101 leader arm mirrored 1:1 to the follower. Its joints are streamed by Isaac Teleop’s so101_leader plugin, which the script launches for you.

  1. Build the so101_leader plugin. It is part of Isaac Teleop’s C++ source, not the isaacteleop pip package, so build it from an Isaac Teleop checkout:

    cmake -B build && cmake --build build --parallel && cmake --install build
    

    The binary lands at install/plugins/so101_leader/so101_leader_plugin. For details see The SO-101 leader plugin and Build from Source.

  2. Calibrate the leader so the leader and follower agree on each joint’s zero and range. This reuses the serial SO-101 leader’s calibration (stored under so_leader/<id>.json and reused on every run):

    lerobot-calibrate \
        --teleop.type=so101_leader \
        --teleop.port=/dev/ttyACM1 \
        --teleop.id=so101_leader_arm
    
  3. (Optional) Try teleoperation without recording. --launch_plugin spawns the plugin after CloudXR is up; --teleop.port is the leader’s serial port:

    python teleoperate.py \
        --robot.type=so101_follower \
        --robot.port=/dev/ttyACM0 \
        --robot.id=so101_follower_arm \
        --teleop.type=so101_leader \
        --teleop.port=/dev/ttyACM1 \
        --teleop.id=so101_leader_arm \
        --launch_plugin=/path/to/IsaacTeleop/install/plugins/so101_leader/so101_leader_plugin
    

    Back-drive the leader arm by hand to move the follower.

  4. Record a dataset. Same flags as teleoperation, plus the cameras and dataset parameters (keep --launch_plugin so the plugin is started):

    python record.py \
        --robot.type=so101_follower \
        --robot.port=/dev/ttyACM0 \
        --robot.id=so101_follower_arm \
        --teleop.type=so101_leader \
        --teleop.port=/dev/ttyACM1 \
        --teleop.id=so101_leader_arm \
        --launch_plugin=/path/to/IsaacTeleop/install/plugins/so101_leader/so101_leader_plugin \
        --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
        --dataset.repo_id=<hf_user>/<dataset_name> \
        --dataset.single_task="Pick up vial from rack on the left side" \
        --dataset.num_episodes=3 \
        --dataset.episode_time_s=20 \
        --dataset.reset_time_s=5
    

Recording controls#

record.py records --dataset.num_episodes episodes of --dataset.episode_time_s seconds each, with a --dataset.reset_time_s window between episodes to reposition the scene. While it is running, press these keys in the terminal where record.py is running — the example reads them from that terminal, so they work over SSH and in a plain terminal (Linux/macOS), with no desktop session required:

Key

Action

Right arrow →

End the current episode early and save it.

Left arrow ←

Discard the current take and re-record it.

Escape

Stop after the current episode (already-saved episodes are kept).

Set LEROBOT_KEYBOARD_BACKEND to override how keys are read — auto (the default; uses the terminal when one is attached, otherwise a global listener), stdin, pynput, or none. The dataset is written under $HF_LEROBOT_HOME/<repo_id> and pushed to the Hub when recording finishes (unless --dataset.push_to_hub=false). Next, train a policy on it: Model Training with GR00T.