---
datasets:
- commaai/commabody
pipeline_tag: robotics
---
This model has been trained on a larger version of the commabody dataset.
It includes a [VQGAN](https://github.com/CompVis/taming-transformers) encoder/decoder fine-tuned from ImageNet. It compresses images of size 250x160 into a 16x10 grid of tokens.
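To make the compression concrete, here is a small arithmetic sketch of what the stated sizes imply. It only uses the numbers given above; the function names are illustrative and not part of any released code.

```python
def tokens_per_frame(grid_w: int, grid_h: int) -> int:
    """Number of discrete tokens the encoder emits for one frame."""
    return grid_w * grid_h


def pixels_per_token(img_w: int, img_h: int, grid_w: int, grid_h: int) -> float:
    """Average number of pixel positions summarized by a single token."""
    return (img_w * img_h) / tokens_per_frame(grid_w, grid_h)


# Sizes as stated in the model card: 250x160 image, 16x10 token grid.
n_tokens = tokens_per_frame(16, 10)
ratio = pixels_per_token(250, 160, 16, 10)
print(n_tokens, ratio)
```

Each frame therefore becomes a short sequence of 160 tokens, which is what makes frame-by-frame sequence modeling tractable.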
It also includes a GPT-2 model trained to predict the next frame, wheel speeds, and actions. It can be used either as a simulator or as a policy.
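The difference between the two uses comes down to which tokens are supplied and which are generated. The sketch below illustrates this with a stand-in predictor. Everything in it is an assumption made for illustration: the token ordering, the vocabulary size of 1024, the number of action tokens, and the `dummy_next_token` function, which replaces the real GPT-2 forward pass. Wheel-speed tokens are omitted for brevity.

```python
from typing import Callable, List, Sequence

FRAME_TOKENS = 16 * 10  # tokens per frame, from the 16x10 grid
VOCAB_SIZE = 1024       # hypothetical vocabulary size, not from the model card

NextToken = Callable[[Sequence[int]], int]


def generate(next_token: NextToken, context: Sequence[int], n: int) -> List[int]:
    """Greedy autoregressive decoding: predict n tokens, feeding each one back in."""
    seq = list(context)
    for _ in range(n):
        seq.append(next_token(seq))
    return seq[len(context):]


def simulate_step(next_token: NextToken, history: Sequence[int],
                  action_tokens: Sequence[int]) -> List[int]:
    """Simulator use: actions are given, the model generates the next frame."""
    return generate(next_token, list(history) + list(action_tokens), FRAME_TOKENS)


def policy_step(next_token: NextToken, history: Sequence[int],
                frame_tokens: Sequence[int], n_action_tokens: int) -> List[int]:
    """Policy use: the frame is observed, the model generates the action tokens."""
    return generate(next_token, list(history) + list(frame_tokens), n_action_tokens)


def dummy_next_token(seq: Sequence[int]) -> int:
    """Stand-in for a real model call (forward pass plus argmax). NOT the real model."""
    return (sum(seq) + len(seq)) % VOCAB_SIZE


# Simulator: supply a hypothetical 2-token action, get a predicted frame back.
next_frame = simulate_step(dummy_next_token, history=[], action_tokens=[3, 7])

# Policy: supply the observed frame, get a hypothetical 2-token action back.
action = policy_step(dummy_next_token, history=[], frame_tokens=next_frame,
                     n_action_tokens=2)
```

In a real setup, `dummy_next_token` would be replaced by a call into the trained model, and the sequence layout would have to match the one used during training.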