# Pipeline Parallelism Emulation
This project provides tools for emulating and visualizing pipeline parallelism strategies used in large language model training.
## Overview
Pipeline parallelism is a technique used to train large models by partitioning the model across multiple devices and processing data in a pipelined fashion. This project allows you to:
- Simulate different pipeline parallelism strategies (1F1B, Interleaved, Zero-Bubble, etc.)
- Visualize the execution schedule on multiple devices
- Compare different strategies for efficiency
## Features
**Supported Pipeline Strategies:**
- 1F1B (One-Forward-One-Backward)
- Interleaved 1F1B
- Zero-Bubble 1F1B (ZB-1P)
- 1F1B with computation-communication overlap
- Interleaved 1F1B with computation-communication overlap
- DualPipe (Bidirectional pipeline parallelism with full forward-backward overlap)
**Visualization:**
- Interactive visualization dashboard using Plotly/Dash
**Configuration:**
- Configurable simulation parameters through Hydra
- Customizable stage latency and communication costs
## Installation
This project uses uv for dependency management.
### Set up uv

If uv is not already installed on your machine:

```bash
# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
```
## Usage
Run the 1F1B strategy:

```bash
uv run python main.py strategy=1f1b num_devices=4 num_stages=4 num_batches=8
```

Run the interleaved 1F1B strategy:

```bash
uv run python main.py strategy=interleave num_devices=4 num_stages=8 num_batches=8
```

Run the ZB-1P strategy:

```bash
uv run python main.py strategy=zb1p num_devices=4 num_stages=4 num_batches=8
```

Run the DualPipe strategy:

```bash
uv run python main.py strategy=dualpipe num_devices=8 num_stages=8 num_batches=20
```

Run the 1F1B-batch-overlap strategy:

```bash
uv run python main.py strategy=1f1b_overlap num_devices=4 num_stages=4 num_batches=8
```

Run the 1F1B-interleave-overlap strategy:

```bash
uv run python main.py strategy=1f1b_interleave_overlap num_devices=4 num_stages=8 num_batches=8
```
## Configuration
The default configuration is in `conf/config.yaml`. You can override any parameter on the command line or create configuration groups for different scenarios.
### Override Specific Parameters

You can override specific parameters at runtime:

```bash
uv run python main.py op_times.forward=0.5 op_times.backward=1.0 num_batches=6
```
Using DualPipe as an example, you can set different times for `forward`, `backward`, `backward_D`, `backward_W`, and `overlapped_forward_backward`:

```bash
uv run python main.py strategy=dualpipe num_devices=8 num_stages=8 num_batches=32 op_times.forward=1.0 op_times.backward=2.0 op_times.backward_D=1.0 op_times.backward_W=1.0 op_times.overlapped_forward_backward=2.5
```
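For orientation, the parameters exercised by these commands map onto a config shaped roughly like the sketch below. This is a hypothetical illustration, not the shipped file; the comments follow the usual zero-bubble naming, where the backward pass is split into an input-gradient phase (`backward_D`) and a weight-gradient phase (`backward_W`). Check `conf/config.yaml` for the actual keys and defaults.

```yaml
# Hypothetical sketch of the parameters used above -- see conf/config.yaml
# for the real structure. Times are in arbitrary units.
strategy: dualpipe
num_devices: 8
num_stages: 8
num_batches: 32
op_times:
  forward: 1.0                      # forward pass per microbatch
  backward: 2.0                     # full backward pass
  backward_D: 1.0                   # backward w.r.t. inputs (activation gradients)
  backward_W: 1.0                   # backward w.r.t. weights
  overlapped_forward_backward: 2.5  # fused forward+backward block used by DualPipe
```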
### Using Different Configuration Files

You can use different configuration files with Hydra in several ways. The recommended approach:

1. Create multiple configuration files in the `conf` directory for different use cases:

   ```
   conf/
   ├── config.yaml   # Default configuration
   └── model_A.yaml  # Your own config with stage-specific latency for performance projection
   ```

2. Run with your desired configuration using the `--config-name` flag:

   ```bash
   uv run python main.py --config-name=model_A
   ```
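As a concrete (hypothetical) example, `model_A.yaml` could look like the sketch below. The values are placeholders, and the exact way to express stage-specific latencies should mirror the structure of the shipped `conf/config.yaml`.

```yaml
# model_A.yaml -- hypothetical sketch; copy the key layout from conf/config.yaml.
strategy: 1f1b
num_devices: 4
num_stages: 4
num_batches: 8
op_times:
  forward: 0.8    # assumed latencies measured or projected for model A
  backward: 1.6
```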
## Project Structure
```
PP-Emulation/
├── conf/                    # Hydra configuration files
│   └── config.yaml          # Default configuration
├── src/                     # Source code
│   ├── __init__.py          # Package initialization
│   ├── execution_model.py   # Schedule execution models
│   ├── strategies.py        # Pipeline parallelism strategies
│   └── visualizer.py        # Visualization utilities
├── main.py                  # Main entry point
├── pyproject.toml           # Project metadata and dependencies
└── README.md                # This file
```
## References
- PipeDream: Fast and Efficient Pipeline Parallel DNN Training (arXiv)
- Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM (arXiv)
- Zero Bubble Pipeline Parallelism (arXiv)
- Communication-Computation Overlap in MoE Training with 1F1B Pipeline Parallelism (blog)
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.