# SFT Trainer Configuration Usage Guide

## Overview

This guide describes how the SFT (Supervised Fine-tuning) trainer uses the premade configuration files and how the `trainer_type` field is passed through the system.

## How SFT Trainer Uses Premade Configs

### 1. Configuration Loading Process

The SFT trainer uses premade configs through the following process:

1. **Config File Selection**: Users specify a config file via command line or launch script
2. **Config Loading**: The system loads the config using the `get_config()` function
3. **Config Inheritance**: All configs inherit from the `SmolLM3Config` base class
4. **Trainer Type Detection**: The system checks for the `trainer_type` field in the config
5. **Training Arguments Creation**: Config parameters are used to create `TrainingArguments`
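
A minimal sketch of this flow, assuming `get_config()` is importable from the repo's `config` package (the import path is an assumption, not verified repo code):

```python
# Sketch of steps 1-4 above; adjust the import to wherever get_config lives.
from config import get_config

config = get_config("config/train_smollm3.py")         # steps 1-3: load the config file
trainer_type = getattr(config, "trainer_type", "sft")  # step 4: detect the trainer type
print(f"Using trainer type: {trainer_type}")
```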
### 2. Configuration Parameters Used by SFT Trainer

The SFT trainer uses the following config parameters:

#### Model Configuration

- `model_name`: Model to load (e.g., "HuggingFaceTB/SmolLM3-3B")
- `max_seq_length`: Maximum sequence length for tokenization
- `use_flash_attention`: Whether to use flash attention
- `use_gradient_checkpointing`: Whether to use gradient checkpointing
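
For context, these model fields might be consumed roughly as follows. This is a hedged sketch using the standard `transformers` API, not code from this repo, and it assumes a `config` object loaded as shown earlier:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(config.model_name)
model = AutoModelForCausalLM.from_pretrained(
    config.model_name,
    # flash attention requires a compatible GPU and the flash-attn package
    attn_implementation="flash_attention_2" if config.use_flash_attention else "eager",
)
if config.use_gradient_checkpointing:
    model.gradient_checkpointing_enable()  # trades extra compute for lower memory
```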
#### Training Configuration

- `batch_size`: Per-device batch size
- `gradient_accumulation_steps`: Gradient accumulation steps
- `learning_rate`: Learning rate for optimization
- `weight_decay`: Weight decay for optimizer
- `warmup_steps`: Number of warmup steps
- `max_iters`: Maximum training iterations
- `save_steps`: Save checkpoint every N steps
- `eval_steps`: Evaluate every N steps
- `logging_steps`: Log every N steps

#### Optimizer Configuration

- `optimizer`: Optimizer type (e.g., "adamw_torch")
- `beta1`, `beta2`, `eps`: Optimizer parameters

#### Scheduler Configuration

- `scheduler`: Learning rate scheduler type
- `min_lr`: Minimum learning rate
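
As a rough illustration, the optimizer and scheduler fields could map onto `TrainingArguments` kwargs like this (the mapping is an assumption, not repo code):

```python
# Hypothetical mapping of config fields to transformers.TrainingArguments kwargs.
training_args_extras = {
    "optim": config.optimizer,              # e.g. "adamw_torch"
    "adam_beta1": config.beta1,
    "adam_beta2": config.beta2,
    "adam_epsilon": config.eps,
    "lr_scheduler_type": config.scheduler,  # e.g. "cosine"
    # min_lr has no first-class TrainingArguments field; newer transformers
    # versions accept scheduler-specific values via lr_scheduler_kwargs.
}
```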
#### Mixed Precision

- `fp16`: Whether to use fp16 precision
- `bf16`: Whether to use bf16 precision

#### Data Configuration

- `dataset_name`: Hugging Face dataset name
- `data_dir`: Local dataset directory
- `train_file`: Training file name
- `validation_file`: Validation file name
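
These data fields suggest a loading pattern along these lines (a sketch using the `datasets` library; the local file format and the exact call are assumptions):

```python
from datasets import load_dataset

if config.dataset_name:
    # load a dataset from the Hugging Face Hub
    dataset = load_dataset(config.dataset_name)
else:
    # load local files; "json" is an assumed format
    dataset = load_dataset(
        "json",
        data_dir=config.data_dir,
        data_files={"train": config.train_file, "validation": config.validation_file},
    )
```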
#### Monitoring Configuration

- `enable_tracking`: Whether to enable Trackio tracking
- `trackio_url`: Trackio server URL
- `experiment_name`: Experiment name for tracking
### 3. Training Arguments Creation

The SFT trainer creates `TrainingArguments` from config parameters (note that the config's `max_iters` is mapped to the `max_steps` argument):

```python
# Method on the SFT trainer class; TrainingArguments is transformers.TrainingArguments.
def get_training_arguments(self, output_dir: str, **kwargs) -> TrainingArguments:
    training_args = {
        "output_dir": output_dir,
        "per_device_train_batch_size": self.config.batch_size,
        "per_device_eval_batch_size": self.config.batch_size,
        "gradient_accumulation_steps": self.config.gradient_accumulation_steps,
        "learning_rate": self.config.learning_rate,
        "weight_decay": self.config.weight_decay,
        "warmup_steps": self.config.warmup_steps,
        "max_steps": self.config.max_iters,  # config's max_iters -> max_steps
        "save_steps": self.config.save_steps,
        "eval_steps": self.config.eval_steps,
        "logging_steps": self.config.logging_steps,
        "fp16": self.config.fp16,
        "bf16": self.config.bf16,
        # ... additional parameters
    }
    return TrainingArguments(**training_args)
```
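
Assuming a constructed trainer instance, the method might be invoked like this (the instance name and output path are illustrative):

```python
training_args = trainer.get_training_arguments(output_dir="./output/smollm3-sft")
```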
### 4. Trainer Selection Logic

The system determines which trainer to use based on the `trainer_type` field:

```python
# Determine trainer type (command line overrides config)
trainer_type = args.trainer_type or getattr(config, 'trainer_type', 'sft')

# Initialize trainer based on type
if trainer_type.lower() == 'dpo':
    trainer = SmolLM3DPOTrainer(...)
else:
    trainer = SmolLM3Trainer(...)  # SFT trainer
```
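
The `args.trainer_type` reference above implies a CLI flag roughly like the following. This is a sketch using `argparse`; the real parser in `src/train.py` may differ:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("config", help="Path to a config file, e.g. config/train_smollm3.py")
parser.add_argument("--trainer_type", choices=["sft", "dpo"], default=None,
                    help="Override the trainer_type set in the config file")
args = parser.parse_args()
```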
## Configuration Files Structure

### Base Config (`config/train_smollm3.py`)

```python
from dataclasses import dataclass

@dataclass
class SmolLM3Config:
    # Trainer type selection
    trainer_type: str = "sft"  # "sft" or "dpo"

    # Model configuration
    model_name: str = "HuggingFaceTB/SmolLM3-3B"
    max_seq_length: int = 4096
    # ... other fields
```

### DPO Config (`config/train_smollm3_dpo.py`)

```python
@dataclass
class SmolLM3DPOConfig(SmolLM3Config):
    # Trainer type selection
    trainer_type: str = "dpo"  # Override default to use DPO trainer

    # DPO-specific configuration
    beta: float = 0.1
    # ... DPO-specific fields
```

### Specialized Configs (e.g., `config/train_smollm3_openhermes_fr_a100_multiple_passes.py`)

```python
@dataclass
class SmolLM3ConfigOpenHermesFRMultiplePasses(SmolLM3Config):
    # Inherits trainer_type = "sft" from base config

    # Specialized configuration for multiple passes
    batch_size: int = 6
    gradient_accumulation_steps: int = 20
    learning_rate: float = 3e-6
    max_iters: int = 25000
    # ... other specialized fields
```

## Trainer Type Priority

The trainer type is determined in the following order of priority:

1. **Command line argument** (`--trainer_type`) - Highest priority
2. **Config file** (`trainer_type` field) - Medium priority
3. **Default value** (`"sft"`) - Lowest priority
## Usage Examples

### Using SFT Trainer with Different Configs

```bash
# Basic SFT training (uses base config)
python src/train.py config/train_smollm3.py

# SFT training with specialized config
python src/train.py config/train_smollm3_openhermes_fr_a100_multiple_passes.py

# SFT training with override
python src/train.py config/train_smollm3.py --trainer_type sft

# DPO training (uses DPO config)
python src/train.py config/train_smollm3_dpo.py

# Override config's trainer type
python src/train.py config/train_smollm3.py --trainer_type dpo
```

### Launch Script Usage

```bash
./launch.sh
# Select "SFT" when prompted for trainer type
# The system will use the appropriate config based on selection
```
## Configuration Inheritance

All specialized configs inherit from `SmolLM3Config` and automatically get:

- `trainer_type = "sft"` (default)
- All base training parameters
- All monitoring configuration
- All data configuration

Specialized configs can override any of these parameters for their specific use case.
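
A quick way to see the inheritance in practice, assuming the config dataclasses can be instantiated with their defaults:

```python
from config.train_smollm3_dpo import SmolLM3DPOConfig

cfg = SmolLM3DPOConfig()
print(cfg.trainer_type)  # "dpo" -- overridden by the DPO config
print(cfg.model_name)    # "HuggingFaceTB/SmolLM3-3B" -- inherited from the base
```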
## SFT Trainer Features

The SFT trainer provides:

1. **SFTTrainer Backend**: Uses Hugging Face's `SFTTrainer` for instruction tuning
2. **Fallback Support**: Falls back to standard `Trainer` if `SFTTrainer` fails (see the sketch after this list)
3. **Config Integration**: Uses all config parameters for training setup
4. **Monitoring**: Integrates with Trackio for experiment tracking
5. **Checkpointing**: Supports model checkpointing and resuming
6. **Mixed Precision**: Supports fp16 and bf16 training
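
A hedged sketch of what the fallback in point 2 could look like. Here `model`, `training_args`, and the datasets are assumed to already exist, and the exact `SFTTrainer` constructor arguments vary across `trl` versions:

```python
import logging

from transformers import Trainer
from trl import SFTTrainer

logger = logging.getLogger(__name__)

try:
    # preferred path: trl's SFTTrainer for instruction tuning
    trainer = SFTTrainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
except Exception as exc:
    # fallback path: the plain transformers Trainer
    logger.warning("SFTTrainer failed (%s); falling back to Trainer", exc)
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
```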
## Troubleshooting

### Common Issues

1. **Missing trainer_type field**: Ensure all configs define the `trainer_type` field
2. **Config inheritance issues**: Check that specialized configs properly inherit from the base class
3. **Parameter conflicts**: Ensure command line arguments don't conflict with config values

### Debugging

Run a training command and inspect the startup logs to confirm which trainer and config are in use:

```bash
python src/train.py config/train_smollm3.py --trainer_type sft
```
Look for these log messages:

```
Using trainer type: sft
Initializing SFT trainer...
Creating SFTTrainer with training arguments...
```

## Related Documentation

- [Trainer Selection Guide](TRAINER_SELECTION_GUIDE.md)
- [Training Configuration Guide](TRAINING_CONFIGURATION_GUIDE.md)
- [Monitoring Integration Guide](MONITORING_INTEGRATION_GUIDE.md)