# One-Shot Free-View Neural Talking Head Synthesis
Unofficial PyTorch implementation of the paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing".

`Python 3.6` and `PyTorch 1.7` are used.

Updates:
--------
`2021.11.05`:
* <s>Replace Jacobian with the rotation matrix (assuming J = R) to avoid estimating Jacobian.</s>
* Correct the rotation matrix.

`2021.11.17`:
* Better Generator, better performance (models and checkpoints have been released).

Driving | Beta Version | FOMM | New Version:

https://user-images.githubusercontent.com/17874285/142828000-db7b324e-c2fd-4fdc-a272-04fb8adbc88a.mp4

--------
Driving | FOMM | Ours:

Free-View:

Train:
--------
``` | |
python run.py --config config/vox-256.yaml --device_ids 0,1,2,3,4,5,6,7
``` | |
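The `--device_ids` flag in the training command takes a comma-separated list of GPU indices. As a rough illustration of how such a value can be turned into a list of integers, here is a minimal sketch using `argparse`. The helper `parse_device_ids` is hypothetical and the default of `[0]` is an assumption; `run.py` may parse the flag differently.

```python
import argparse


def parse_device_ids(text):
    """Turn a comma-separated string such as "0,1,2" into [0, 1, 2]."""
    return [int(part) for part in text.split(",") if part.strip()]


# Hypothetical parser that mirrors only the two flags shown in the training command.
parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True)
parser.add_argument("--device_ids", type=parse_device_ids, default=[0])

# Parse an explicit argument list so the sketch runs without a real command line.
args = parser.parse_args(["--config", "config/vox-256.yaml", "--device_ids", "0,1,2,3"])
print(args.config, args.device_ids)
```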
Demo:
--------
``` | |
python demo.py --config config/vox-256.yaml --checkpoint path/to/checkpoint --source_image path/to/source --driving_video path/to/driving --relative --adapt_scale --find_best_frame
``` | |
Free-view (e.g. yaw=20, pitch=roll=0):
``` | |
python demo.py --config config/vox-256.yaml --checkpoint path/to/checkpoint --source_image path/to/source --driving_video path/to/driving --relative --adapt_scale --find_best_frame --free_view --yaw 20 --pitch 0 --roll 0
``` | |
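When running the demo on many source/driving pairs, it can help to assemble the two commands above programmatically. The following is a minimal sketch that uses only the flags shown in this README; the function name `build_demo_command` and its keyword arguments are hypothetical and not part of the repository.

```python
import shlex


def build_demo_command(checkpoint, source_image, driving_video,
                       config="config/vox-256.yaml",
                       free_view=False, yaw=0, pitch=0, roll=0):
    """Return the demo.py invocation as an argument list."""
    cmd = [
        "python", "demo.py",
        "--config", config,
        "--checkpoint", checkpoint,
        "--source_image", source_image,
        "--driving_video", driving_video,
        "--relative", "--adapt_scale", "--find_best_frame",
    ]
    if free_view:
        # Free-view mode adds the three head-pose angles, given in degrees.
        cmd += ["--free_view",
                "--yaw", str(yaw), "--pitch", str(pitch), "--roll", str(roll)]
    return cmd


cmd = build_demo_command("path/to/checkpoint", "path/to/source",
                         "path/to/driving", free_view=True, yaw=20)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.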
Note: run `crop-video.py --inp driving_video.mp4` first to get the cropping suggestion, then crop the raw video accordingly.

Pretrained Model:
--------
Model | Train Set | Baidu Netdisk | MediaFire |
------- |------------ |----------- |-------- |
Vox-256-Beta| VoxCeleb-v1 | [Baidu](https://pan.baidu.com/s/1lLS4ArbK2yWelsL-EtwU8g) (PW: c0tc)| [MF](https://www.mediafire.com/folder/rw51an7tk7bh2/TalkingHead) |
Vox-256-New | VoxCeleb-v1 | - | [MF](https://www.mediafire.com/folder/fcvtkn21j57bb/TalkingHead_Update) |
Vox-512 | VoxCeleb-v2 | soon | soon |

Note:
1. <s>For now, the Beta Version is not well tuned.</s>
2. For free-view synthesis, it is recommended that yaw, pitch and roll stay within ±45°, ±20° and ±20° respectively.
3. Face restoration algorithms (e.g. [GPEN](https://github.com/yangxy/GPEN)) can be used for post-processing to significantly improve the resolution.
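To make the recommended free-view ranges concrete, the sketch below checks a set of angles against the ±45°, ±20° and ±20° limits and builds the corresponding head rotation matrix. It is an illustration only: the helper names are hypothetical, and the axis convention (pitch about x, yaw about y, roll about z, composed as pitch · yaw · roll) is an assumption that may not match the convention used in this repository.

```python
import math

# Recommended limits in degrees, taken from the note above.
RECOMMENDED_LIMITS = {"yaw": 45.0, "pitch": 20.0, "roll": 20.0}


def out_of_range(yaw, pitch, roll):
    """Return the names of the angles that exceed the recommended limits."""
    angles = {"yaw": yaw, "pitch": pitch, "roll": roll}
    return [name for name, value in angles.items()
            if abs(value) > RECOMMENDED_LIMITS[name]]


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(yaw, pitch, roll):
    """Build a 3x3 rotation matrix from angles in degrees (assumed convention)."""
    y, p, r = (math.radians(v) for v in (yaw, pitch, roll))
    pitch_mat = [[1.0, 0.0, 0.0],
                 [0.0, math.cos(p), -math.sin(p)],
                 [0.0, math.sin(p), math.cos(p)]]
    yaw_mat = [[math.cos(y), 0.0, math.sin(y)],
               [0.0, 1.0, 0.0],
               [-math.sin(y), 0.0, math.cos(y)]]
    roll_mat = [[math.cos(r), -math.sin(r), 0.0],
                [math.sin(r), math.cos(r), 0.0],
                [0.0, 0.0, 1.0]]
    return matmul3(matmul3(pitch_mat, yaw_mat), roll_mat)


R = rotation_matrix(yaw=20, pitch=0, roll=0)
print(out_of_range(20, 0, 0))   # angles from the free-view example
print(out_of_range(60, 0, 30))  # yaw and roll exceed the recommendation
```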
Acknowledgement:
--------
Thanks to [NV](https://github.com/NVlabs/face-vid2vid), [AliaksandrSiarohin](https://github.com/AliaksandrSiarohin/first-order-model) and [DeepHeadPose](https://github.com/DriverDistraction/DeepHeadPose).