== Applying ==

'''Training a robot locomotion policy with Stable-Baselines3 + Isaac Gym:'''

<syntaxhighlight lang="python">
# Conceptual code: a production implementation would use Isaac Gym / MuJoCo
import gymnasium as gym
import torch
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import SubprocVecEnv

def make_env(env_id, seed):
    def _init():
        env = gym.make(env_id)
        env.reset(seed=seed)
        return env
    return _init

# Vectorized environments for parallel rollout collection.
# Isaac Gym can run 4096 robot simulations simultaneously on GPU;
# SubprocVecEnv spawns one CPU process per environment, so the count
# stays modest here even when a GPU is available for the policy.
n_envs = 16 if torch.cuda.is_available() else 8
env = SubprocVecEnv([make_env("HalfCheetah-v4", seed=i) for i in range(n_envs)])

# PPO for locomotion training
model = PPO(
    "MlpPolicy",
    env,
    learning_rate=3e-4,
    n_steps=2048,
    batch_size=64 * n_envs,
    gamma=0.99,
    gae_lambda=0.95,
    clip_range=0.2,
    ent_coef=0.0,
    verbose=1,
    tensorboard_log="./locomotion_log/",
    policy_kwargs={"net_arch": [512, 256, 128]},  # larger network for complex control
)
model.learn(total_timesteps=10_000_000)
model.save("cheetah_locomotion_policy")
</syntaxhighlight>

'''Grasp pose estimation:'''

<syntaxhighlight lang="python">
# Conceptual code: predicting grasp points from an RGB-D image with a
# GR-ConvNet-style model. The grconvnet package and load_rgbd_image
# helper are illustrative placeholders, not a published API.
import math
import torch
from grconvnet import GRConvNet

model = GRConvNet.from_pretrained("cornell-grasp")
model.eval()

# Input: RGB-D image (4 channels: R, G, B, Depth)
rgbd = load_rgbd_image("scene.png")  # shape: (4, H, W)

with torch.no_grad():
    q, angle, width = model(rgbd.unsqueeze(0))
    # q:     grasp quality map
    # angle: grasp rotation map (radians)
    # width: gripper width map

# Find the best grasp: argmax over the flattened (H, W) quality map
best_grasp_idx = q.argmax()
grasp_y, grasp_x = divmod(best_grasp_idx.item(), q.shape[-1])
grasp_angle = angle[0, 0, grasp_y, grasp_x].item()
gripper_width = width[0, 0, grasp_y, grasp_x].item()

print(f"Grasp at ({grasp_x}, {grasp_y}), "
      f"angle={math.degrees(grasp_angle):.1f}°, width={gripper_width:.3f} m")
</syntaxhighlight>

; Key robot learning paradigms and tooling
: '''Physics simulation''' – Isaac Gym/Isaac Lab (NVIDIA), MuJoCo, PyBullet, Gazebo (see the PyBullet sketch after this list)
: '''Grasping''' – GraspNet, GR-ConvNet, AnyGrasp; 6-DoF pose estimation
: '''Locomotion RL''' – PPO/SAC in Isaac Gym; Boston Dynamics Spot uses RL
: '''Imitation learning''' – ACT (Action Chunking Transformer), Diffusion Policy, DROID dataset (see the behavior-cloning sketch below)
: '''Foundation model policies''' – RT-2 (Google), π0 (Physical Intelligence), OpenVLA
: '''Robot middleware''' – ROS 2 (Robot Operating System); the industry standard for integration (see the rclpy sketch below)
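For physics simulation, the engines listed above all expose some form of the same load–step–query loop. Below is a minimal sketch of that loop in PyBullet, using its bundled example assets; the choice of the r2d2.urdf robot and the one-second step count are arbitrary illustrations, not part of any particular workflow.

<syntaxhighlight lang="python">
import pybullet as p
import pybullet_data

# Headless physics server; use p.GUI instead for a visual debugger window
p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())  # bundled example URDFs
p.setGravity(0, 0, -9.81)

plane = p.loadURDF("plane.urdf")
robot = p.loadURDF("r2d2.urdf", basePosition=[0, 0, 0.5])

# Step the simulation for one second at the default 240 Hz timestep
for _ in range(240):
    p.stepSimulation()

pos, orn = p.getBasePositionAndOrientation(robot)
print(f"Robot base after 1 s: position={pos}, orientation={orn}")
p.disconnect()
</syntaxhighlight>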
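Imitation-learning methods such as ACT and Diffusion Policy elaborate on the same core idea as plain behavior cloning: regress expert actions from observations. The sketch below shows only that baseline in PyTorch, with randomly generated placeholder "demonstrations"; the dimensions, architecture, and training schedule are illustrative assumptions, not those of any published model.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

obs_dim, act_dim = 39, 12  # hypothetical: proprioception in, joint targets out
policy = nn.Sequential(
    nn.Linear(obs_dim, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, act_dim),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

# Placeholder demonstration data: (observation, expert action) pairs
demos_obs = torch.randn(10_000, obs_dim)
demos_act = torch.randn(10_000, act_dim)

for epoch in range(10):
    perm = torch.randperm(len(demos_obs))
    for i in range(0, len(perm), 256):
        idx = perm[i:i + 256]
        pred = policy(demos_obs[idx])
        loss = nn.functional.mse_loss(pred, demos_act[idx])  # match expert actions
        opt.zero_grad()
        loss.backward()
        opt.step()
</syntaxhighlight>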
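On the middleware side, a minimal ROS 2 node shows the timer-driven publish loop commonly used to send velocity commands to a robot base. This sketch assumes a standard ROS 2 installation with rclpy; the <code>cmd_vel</code> topic name is conventional but robot-specific, and the node name and rate are illustrative.

<syntaxhighlight lang="python">
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import Twist

class VelocityCommander(Node):
    def __init__(self):
        super().__init__("velocity_commander")
        self.pub = self.create_publisher(Twist, "cmd_vel", 10)
        self.timer = self.create_timer(0.1, self.tick)  # 10 Hz control loop

    def tick(self):
        msg = Twist()
        msg.linear.x = 0.2   # forward velocity in m/s
        msg.angular.z = 0.0  # no rotation
        self.pub.publish(msg)

def main():
    rclpy.init()
    node = VelocityCommander()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()
</syntaxhighlight>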