Update README.md
README.md
verified: false
---

# SAC + HER Agent for FetchPickAndPlace-v4

## Model Overview

This repository contains a Soft Actor-Critic (SAC) agent trained with Hindsight Experience Replay (HER) on the `FetchPickAndPlace-v4` environment from `gymnasium-robotics`. The agent learns to pick and place objects using sparse or dense rewards, and is suitable for robotic manipulation research.

- **Algorithm:** Soft Actor-Critic (SAC)
- **Replay Buffer:** Hindsight Experience Replay (HER)
- **Environment:** FetchPickAndPlace-v4 (`gymnasium-robotics`)
- **Framework:** Stable Baselines3
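
The Fetch environments expose a goal-conditioned `Dict` observation (`observation`, `achieved_goal`, `desired_goal`), which is why the agent uses a `MultiInputPolicy` and why HER can relabel goals. A minimal sketch for inspecting the observation structure (illustrative only, not part of the training code):

```python
import gymnasium as gym
import gymnasium_robotics  # registers the Fetch environments

env = gym.make("FetchPickAndPlace-v4")
obs, info = env.reset(seed=42)

# HER relabels `desired_goal` with goals actually achieved later in the episode.
print(obs["observation"].shape, obs["achieved_goal"].shape, obs["desired_goal"].shape)
env.close()
```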

## Training Details

- **Total Timesteps:** 500,000
- **Evaluation Frequency:** Every 2,000 steps (15 episodes per eval)
- **Checkpoint Frequency:** Every 50,000 steps (model + replay buffer)
- **Seed:** 42
- **Dense Shaping:** `False` (can be enabled with a wrapper; see the sketch after this list)
- **Device:** CUDA if available, otherwise auto
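
The reported run used the default sparse reward (`Dense Shaping: False`), and the dense-shaping wrapper itself is not shipped in this repository. The sketch below only illustrates what such a wrapper could look like; the class name `DenseShapingWrapper` and the distance-based shaping are assumptions, not the original training code:

```python
import numpy as np
import gymnasium as gym


class DenseShapingWrapper(gym.Wrapper):
    """Hypothetical wrapper: replace the sparse goal reward with the negative
    Euclidean distance between the achieved and desired goal."""

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        dense_reward = -float(np.linalg.norm(obs["achieved_goal"] - obs["desired_goal"]))
        return obs, dense_reward, terminated, truncated, info
```

Wrapping the environment with something like this before passing it to SAC would switch training to the dense signal.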

### Hyperparameters

| Parameter               | Value                |
|-------------------------|----------------------|
| Algorithm               | SAC                  |
| Policy                  | MultiInputPolicy     |
| Replay Buffer           | HER                  |
| n_sampled_goal          | 4                    |
| goal_selection_strategy | future               |
| Batch Size              | 512                  |
| Buffer Size             | 1,000,000            |
| Learning Rate           | 1e-3                 |
| Gamma                   | 0.95                 |
| Tau                     | 0.05                 |
| Entropy Coefficient     | auto                 |
| Train Frequency         | 1 step               |
| Gradient Steps          | 1                    |
| Tensorboard Log         | logs_pnp_sac_her/tb  |
| Seed                    | 42                   |
| Device                  | CUDA/Auto            |
| Dense Shaping           | False (default)      |
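
The training script itself is not included in this card. The sketch below shows how the configuration in this table, together with the evaluation and checkpoint frequencies above, could be assembled with Stable Baselines3; the evaluation environment, save path, and checkpoint name prefix are assumptions inferred from the logged paths and file names:

```python
import gymnasium as gym
import gymnasium_robotics
from stable_baselines3 import SAC, HerReplayBuffer
from stable_baselines3.common.callbacks import CheckpointCallback, EvalCallback

env = gym.make("FetchPickAndPlace-v4")
eval_env = gym.make("FetchPickAndPlace-v4")

model = SAC(
    "MultiInputPolicy",
    env,
    replay_buffer_class=HerReplayBuffer,
    replay_buffer_kwargs=dict(n_sampled_goal=4, goal_selection_strategy="future"),
    batch_size=512,
    buffer_size=1_000_000,
    learning_rate=1e-3,
    gamma=0.95,
    tau=0.05,
    ent_coef="auto",
    train_freq=1,
    gradient_steps=1,
    tensorboard_log="logs_pnp_sac_her/tb",
    seed=42,
    device="auto",
)

callbacks = [
    EvalCallback(eval_env, eval_freq=2_000, n_eval_episodes=15),
    CheckpointCallback(
        save_freq=50_000,
        save_path="logs_pnp_sac_her",  # assumed output directory
        name_prefix="ckpt_sac_her",    # yields ckpt_sac_her_<steps>_steps.zip
        save_replay_buffer=True,
    ),
]

model.learn(total_timesteps=500_000, callback=callbacks)
model.save("sac_her_pnp")
model.save_replay_buffer("replay_buffer.pkl")
```

With this name prefix, `CheckpointCallback` produces files such as `ckpt_sac_her_250000_steps.zip`, matching the checkpoint listed in the Files section below.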

## Files

- `sac_her_pnp.zip`: Final trained SAC model
- `ckpt_sac_her_250000_steps.zip`: Latest checkpoint
- `replay_buffer.pkl`: Replay buffer for continued training
- `replay.mp4`: Replay video of agent performance (manual generation recommended)
- `README.md`: This model card

## Usage

To load and use the model for inference:

```python
from stable_baselines3 import SAC
import gymnasium as gym
import gymnasium_robotics

env = gym.make("FetchPickAndPlace-v4", render_mode="rgb_array")
model = SAC.load("path/to/sac_her_pnp.zip", env=env)

obs, info = env.reset()
done = False
truncated = False
while not (done or truncated):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, done, truncated, info = env.step(action)
    env.render()
env.close()
```
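
Note that with the default sparse reward, `reward` is `-1.0` on every step until the object is at the goal and `0.0` once it is; the per-step `info` dictionary also carries an `is_success` flag, which is usually the more meaningful metric for this task.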

## Evaluation

To evaluate the agent over multiple episodes:

```python
from stable_baselines3 import SAC
import gymnasium as gym
import gymnasium_robotics

env = gym.make("FetchPickAndPlace-v4", render_mode="human")
model = SAC.load("path/to/sac_her_pnp.zip", env=env)

num_episodes = 10
for ep in range(num_episodes):
    obs, info = env.reset()
    done = False
    truncated = False
    episode_reward = 0
    while not (done or truncated):
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, done, truncated, info = env.step(action)
        env.render()
        episode_reward += reward
    print(f"Episode {ep+1} reward: {episode_reward}")
env.close()
```
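
As an alternative to the manual loop, Stable Baselines3 ships an `evaluate_policy` helper that reports the mean and standard deviation of the episode return. A short sketch, reusing `model` and `env` from the snippet above:

```python
from stable_baselines3.common.evaluation import evaluate_policy

mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=15, deterministic=True)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
```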

## Replay Video

If `replay.mp4` is not present, you can manually generate it:

```python
import gymnasium as gym
import gymnasium_robotics
from stable_baselines3 import SAC
import moviepy.editor as mpy  # moviepy < 2.0 API; in 2.x, use "from moviepy import ImageSequenceClip"

env = gym.make("FetchPickAndPlace-v4", render_mode="rgb_array")
model = SAC.load("path/to/sac_her_pnp.zip", env=env)

frames = []
obs, info = env.reset()
done = False
truncated = False
step = 0
max_steps = 1000

while not (done or truncated) and step < max_steps:
    frame = env.render()
    frames.append(frame)
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, done, truncated, info = env.step(action)
    step += 1

env.close()
clip = mpy.ImageSequenceClip(frames, fps=30)
clip.write_videofile("replay.mp4", codec="libx264")
```

## Continued Training

To continue training from a checkpoint:

```python
from stable_baselines3 import SAC
import gymnasium as gym
import gymnasium_robotics

env = gym.make("FetchPickAndPlace-v4", render_mode=None)
model = SAC.load("logs_pnp_sac_her/ckpt_sac_her_250000_steps.zip", env=env)
model.learn(total_timesteps=500_000, reset_num_timesteps=False)
```
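
Because SAC is off-policy, it is usually worth restoring the saved replay buffer as well so HER does not resume from empty transitions. `load_replay_buffer` is the standard Stable Baselines3 method for this; the path below is illustrative and `model` is reused from the snippet above:

```python
# Restore saved transitions before resuming training.
model.load_replay_buffer("path/to/replay_buffer.pkl")
model.learn(total_timesteps=500_000, reset_num_timesteps=False)
```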

## Citation

If you use this model, please cite:

```
@misc{IntelliGrow_FetchPickAndPlace_SAC_HER,
  title={SAC + HER Agent for FetchPickAndPlace-v4},
  author={IntelliGrow},
  year={2025},
  howpublished={Hugging Face Hub},
}
```

## License

MIT License

---

**Contact:** For questions or issues, open an issue on the [Hugging Face repository](https://huggingface.co/IntelliGrow/FetchPickAndPlace-v4).