---
datasets:
- commaai/commabody
pipeline_tag: robotics
---

This model was trained on a larger version of the commabody dataset.  
It includes a [VQGAN](https://github.com/CompVis/taming-transformers) encoder/decoder fine-tuned from ImageNet-pretrained weights. The encoder compresses each 250x160 image to a 16x10 grid of tokens (160 tokens per frame).  
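
As a rough illustration of what that compression means, the short sketch below works out the token count per frame and the number of pixels each token stands for. It uses only the two sizes quoted above; the variable names are illustrative and not part of any released code.

```python
# Sizes quoted in this model card.
IMAGE_SIZE = (250, 160)  # input image size, in pixels
TOKEN_GRID = (16, 10)    # encoder output size, in tokens

pixels_per_frame = IMAGE_SIZE[0] * IMAGE_SIZE[1]
tokens_per_frame = TOKEN_GRID[0] * TOKEN_GRID[1]

# Average number of pixel positions summarized by a single token.
pixels_per_token = pixels_per_frame / tokens_per_frame

print(tokens_per_frame)  # 160
print(pixels_per_token)  # 250.0
```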

It also includes a GPT-2 model trained to predict the next frame, the wheel speeds, and the actions. It can be used either as a simulator or as a policy.
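
The sketch below illustrates how one autoregressive model could serve both roles. It is not the released inference code: the per-step token layout (frame tokens, then wheel-speed tokens, then action tokens), the counts `WHEEL_TOKENS` and `ACTION_TOKENS`, and the `dummy_next_token` stand-in for the GPT-2 model are all assumptions made for illustration. Only the 160 frame tokens come from this card.

```python
from typing import Callable, List, Sequence, Tuple

FRAME_TOKENS = 16 * 10  # from this card: 16x10 tokens per frame
WHEEL_TOKENS = 2        # assumption: one token per wheel speed
ACTION_TOKENS = 1       # assumption: one token per action

# Hypothetical interface: maps the token context to the next token id.
NextToken = Callable[[Sequence[int]], int]


def generate(next_token: NextToken, context: Sequence[int], n: int) -> List[int]:
    """Autoregressively produce n tokens that continue the context."""
    seq = list(context)
    for _ in range(n):
        seq.append(next_token(seq))
    return seq[len(context):]


def act(next_token: NextToken, context: Sequence[int]) -> List[int]:
    """Policy mode: the context ends with an observation; predict the action."""
    return generate(next_token, context, ACTION_TOKENS)


def simulate(next_token: NextToken, context: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Simulator mode: the context ends with an action; predict what follows."""
    out = generate(next_token, context, FRAME_TOKENS + WHEEL_TOKENS)
    return out[:FRAME_TOKENS], out[FRAME_TOKENS:]


def dummy_next_token(seq: Sequence[int]) -> int:
    """Stand-in for the GPT-2 model so the sketch runs without weights."""
    return len(seq) % 7


# One closed-loop step: observe, act, then imagine the next observation.
tokens = [0] * (FRAME_TOKENS + WHEEL_TOKENS)  # placeholder first observation
action = act(dummy_next_token, tokens)
tokens += action
frame, wheels = simulate(dummy_next_token, tokens)
tokens += frame + wheels
```

In policy mode the frame and wheel-speed tokens would come from the real robot and only the action is generated. In simulator mode the action is supplied externally (or by the policy, as here) and the model generates the next observation.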