# One-Shot Free-View Neural Talking Head Synthesis
Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing".
```Python 3.6``` and ```Pytorch 1.7``` are used.
Updates:
--------
```2021.11.05``` :
* Replace Jacobian with the rotation matrix (Assuming J = R) to avoid estimating Jacobian.
* Correct the rotation matrix.
```2021.11.17``` :
* Better Generator, better performance (models and checkpoints have been released).
Driving | Beta Version | FOMM | New Version:
https://user-images.githubusercontent.com/17874285/142828000-db7b324e-c2fd-4fdc-a272-04fb8adbc88a.mp4
--------
Driving | FOMM | Ours:

Free-View:

Train:
--------
```
python run.py --config config/vox-256.yaml --device_ids 0,1,2,3,4,5,6,7
```
Demo:
--------
```
python demo.py --config config/vox-256.yaml --checkpoint path/to/checkpoint --source_image path/to/source --driving_video path/to/driving --relative --adapt_scale --find_best_frame
```
free-view (e.g. yaw=20, pitch=roll=0):
```
python demo.py --config config/vox-256.yaml --checkpoint path/to/checkpoint --source_image path/to/source --driving_video path/to/driving --relative --adapt_scale --find_best_frame --free_view --yaw 20 --pitch 0 --roll 0
```
Note: run ```crop-video.py --inp driving_video.mp4``` first to get the cropping suggestion and crop the raw video.
Pretrained Model:
--------
Model | Train Set | Baidu Netdisk | Media Fire |
------- |------------ |----------- |-------- |
Vox-256-Beta| VoxCeleb-v1 | [Baidu](https://pan.baidu.com/s/1lLS4ArbK2yWelsL-EtwU8g) (PW: c0tc)| [MF](https://www.mediafire.com/folder/rw51an7tk7bh2/TalkingHead) |
Vox-256-New | VoxCeleb-v1 | - | [MF](https://www.mediafire.com/folder/fcvtkn21j57bb/TalkingHead_Update) |
Vox-512 | VoxCeleb-v2 | soon | soon |
Note:
1. For now, the Beta Version is not well tuned.
2. For free-view synthesis, it is recommended that Yaw, Pitch and Roll are within ±45°, ±20° and ±20° respectively.
3. Face Restoration algorithms ([GPEN](https://github.com/yangxy/GPEN)) can be used for post-processing to significantly improve the resolution.

Acknowlegement:
--------
Thanks to [NV](https://github.com/NVlabs/face-vid2vid), [AliaksandrSiarohin](https://github.com/AliaksandrSiarohin/first-order-model) and [DeepHeadPose](https://github.com/DriverDistraction/DeepHeadPose).