---
license: mit
datasets:
- Iker/GTAV-Driving-Dataset
base_model:
- Etched/oasis-500m
tags:
- video
---

# AI Generated GTA V

A deep learning project that uses Diffusion Transformers (DiT) to generate Grand Theft Auto V driving footage. This project is based on the [Open-Oasis project](https://github.com/etched-ai/open-oasis).

Please see the GitHub repo for more info: https://github.com/ikergarcia1996/AI-Generated-GTAV

### dit.safetensors

- Trained on 4x NVIDIA A100 80GB GPUs in bfloat16
- Batch size: 64
- Learning rate: 1e-4 with a constant scheduler and 5% warmup
- 1,610,000 steps
- `ddim_noise_steps`: 50
- ctx noise increased from 0 to 40 during the first 50% of training, then held at 40 for the remaining steps
- No action conditioning

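The ctx noise schedule listed for `dit.safetensors` can be sketched as a small function. This is only an illustration: the function name `ctx_noise_level` and its parameters are hypothetical, and the linear shape of the ramp is an assumption, since the list above only states the start value, the end value, and the ramp length.

```python
def ctx_noise_level(
    step: int,
    total_steps: int = 1_610_000,  # training steps listed above
    max_noise: int = 40,           # final ctx noise level listed above
    ramp_fraction: float = 0.5,    # ramp covers the first 50% of training
) -> int:
    """Return the ctx noise level at a given training step.

    Assumption: the increase from 0 to max_noise is linear in the step count.
    """
    ramp_steps = round(total_steps * ramp_fraction)
    if step >= ramp_steps:
        # After the ramp, the level is held at its maximum.
        return max_noise
    # Integer ramp from 0 up to max_noise.
    return max_noise * step // ramp_steps


if __name__ == "__main__":
    for s in (0, 402_500, 805_000, 1_610_000):
        print(s, ctx_noise_level(s))
```

Under these assumptions the level reaches 40 at step 805,000 and stays there until the end of training.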
### dit_action.safetensors

- Continued training from `dit.safetensors` with `action_conditioning` for 210,000 steps
- Trained on 4x NVIDIA A100 80GB GPUs in bfloat16
- Batch size: 64
- Learning rate: 1e-4 with a cosine scheduler decaying to 1e-5 and 5% warmup
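As a rough illustration of the learning rate settings listed for `dit_action.safetensors`, the sketch below computes a warmup-then-cosine schedule in plain Python. The function name `lr_at_step` is hypothetical. It assumes that warmup is linear from 0, that the 5% warmup is measured against the 210,000 continued-training steps, and that the cosine decay ends exactly at the final step. The actual training code may differ on any of these points.

```python
import math


def lr_at_step(
    step: int,
    total_steps: int = 210_000,    # continued-training steps listed above
    peak_lr: float = 1e-4,         # learning rate listed above
    min_lr: float = 1e-5,          # cosine decay target listed above
    warmup_fraction: float = 0.05, # 5% warmup
) -> float:
    """Learning rate at a given step (assumed linear warmup, then cosine decay)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Assumed linear warmup from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


if __name__ == "__main__":
    for s in (0, 10_500, 110_250, 210_000):
        print(s, lr_at_step(s))
```

The constant schedule used for `dit.safetensors` corresponds to the same warmup followed by a flat `peak_lr`, that is, with no decay phase.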