…into main
ChenyangQiQi committed Apr 21, 2023
2 parents 4c259cd + 87b13b3 commit c9bcf11
Showing 11 changed files with 1,206 additions and 24 deletions.
59 changes: 41 additions & 18 deletions README.md
@@ -14,9 +14,8 @@ By [Bowen Zhang](http://home.ustc.edu.cn/~zhangbowen)\*, [Chenyang Qi](https://c

## Todo

- [x] Release the inference code of base model and temporal super-resolution mode
- [ ] Code refactoring and update README for super-resolution model
- [ ] Release the training code of base model
- [x] Release the inference code of base model and temporal super-resolution model
- [x] Release the training code of base model
- [ ] Release the training code of super-resolution model

## Setup Environment
@@ -28,7 +27,9 @@ conda env create -f environment.yml
conda activate meta_portrait_base
```

## Inference Base Model
## Base Model

### Inference Base Model

Download the [checkpoint of base model](https://drive.google.com/file/d/1Kmdv3w6N_we7W7lIt6LBzqRHwwy1dBxD/view?usp=share_link) and put it in `base_model/checkpoint`. We provide [preprocessed example data for inference](https://drive.google.com/file/d/166eNbabM6TeJVy7hxol2gL1kUGKHi3Do/view?usp=share_link); download the data, unzip it, and put it in `data`. The directory structure should look like this:

@@ -57,26 +58,35 @@ cd base_model
python inference.py --save_dir /path/to/output --config config/meta_portrait_256_eval.yaml --ckpt checkpoint/ckpt_base.pth.tar
```

## Citing MetaPortrait
### Train Base Model from Scratch

Train the warping network first using the following command:
```bash
cd base_model
python main.py --config config/meta_portrait_256_pretrain_warp.yaml --fp16 --stage Warp --task Pretrain
```
@misc{zhang2022metaportrait,
title={MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation},
author={Bowen Zhang and Chenyang Qi and Pan Zhang and Bo Zhang and HsiangTao Wu and Dong Chen and Qifeng Chen and Yong Wang and Fang Wen},
year={2022},
eprint={2212.08062},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

Then, set the path of `warp_ckpt` in `config/meta_portrait_256_pretrain_full.yaml` and jointly train the warping network and the refinement network using the following command:
```bash
python main.py --config config/meta_portrait_256_pretrain_full.yaml --fp16 --stage Full --task Pretrain
```
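
As a minimal sketch of the `warp_ckpt` edit described above, the snippet below rewrites that one entry in the config text with a regular expression instead of a YAML library. The helper name `set_warp_ckpt` and the example checkpoint path are hypothetical and not part of the repository; editing the file by hand works just as well.

```python
import re


def set_warp_ckpt(config_text: str, ckpt_path: str) -> str:
    """Return config_text with the active `warp_ckpt:` entry set to ckpt_path.

    Lines where `warp_ckpt` is commented out with '#' are left untouched.
    """
    # Match an uncommented `warp_ckpt:` line, keeping its indentation.
    pattern = re.compile(r"^(?P<indent>[ \t]*)warp_ckpt:[ \t]*.*$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group('indent')}warp_ckpt: {ckpt_path}",
        config_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no active warp_ckpt entry found")
    return new_text


# Hypothetical usage: the checkpoint path below is only a placeholder.
example = "model:\n  arch: 'SPADEID'\n  warp_ckpt: /path/to/warp_ckpt\n"
updated = set_warp_ckpt(example, "results/ckpt/warp_stage.pth.tar")
print(updated)
```

Using a callable as the replacement avoids surprises when the checkpoint path contains backslashes, which `re` would otherwise treat as escape sequences.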

### Meta Training for Faster Personalization of Base Model

Starting from the standard pretrained checkpoint, you can further optimize the model's personalized adaptation speed with meta-learning using the following command:
```bash
python main.py --config config/meta_portrait_256_pretrain_meta_train.yaml --fp16 --stage Full --task Meta --remove_sn --ckpt /path/to/standard_pretrain_ckpt
```
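
The three training commands above differ only in the config file, the `--stage` and `--task` values, and the extra flags of the meta-training run. The sketch below collects them in one table and prints each command in order. The flags and paths are copied from the commands above; the `STAGES` table and the `build_command` helper are illustrative assumptions, and the script only prints the commands rather than running them.

```python
import shlex

# (config, stage, task, extra flags), copied from the README commands.
STAGES = [
    ("config/meta_portrait_256_pretrain_warp.yaml", "Warp", "Pretrain", []),
    ("config/meta_portrait_256_pretrain_full.yaml", "Full", "Pretrain", []),
    (
        "config/meta_portrait_256_pretrain_meta_train.yaml",
        "Full",
        "Meta",
        ["--remove_sn", "--ckpt", "/path/to/standard_pretrain_ckpt"],
    ),
]


def build_command(config, stage, task, extra):
    """Assemble the argument list for one training stage."""
    return [
        "python", "main.py", "--config", config, "--fp16",
        "--stage", stage, "--task", task, *extra,
    ]


commands = [build_command(*stage) for stage in STAGES]
for cmd in commands:
    # Print only; pass `cmd` to subprocess.run(cmd, cwd="base_model") to execute.
    print(shlex.join(cmd))
```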
## Inference Temporal Super-resolution Model

### Base Environment
## Temporal Super-resolution Model

### Base Environment

- Python >= 3.7 (we recommend using [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
- [PyTorch >= 1.7](https://pytorch.org/)
- System: Linux + NVIDIA GPU + [CUDA](https://developer.nvidia.com/cuda-downloads)
- Set the root path to [sr_model](sr_model)

### Data and checkpoint

Download the [dataset](
@@ -102,7 +112,6 @@ options

### Installation Bash command


```bash
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116
# Install a modified basicsr - https://github.com/xinntao/BasicSR
@@ -120,14 +129,28 @@ python setup.py develop
```

### Quick Inference

Checkpoint for inference: `pretrained_ckpt/temporal_gfpgan.pth`

Example code to conduct face temporal super-resolution:

```bash
CUDA_VISIBLE_DEVICES=7 python -m torch.distributed.launch --nproc_per_node=1 --master_port=4321 Experimental_root/test.py -opt options/test/same_id.yml --launcher pytorch
python -m torch.distributed.launch --nproc_per_node=1 --master_port=4321 Experimental_root/test.py -opt options/test/same_id.yml --launcher pytorch
```
You may set `nproc_per_node` to the number of GPUs on your own machine.
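
As a rough sketch of that adjustment, the snippet below derives `nproc_per_node` from a `CUDA_VISIBLE_DEVICES`-style string and rebuilds the launch command shown above. The helper names `count_visible_gpus` and `build_launch_command` are hypothetical; the script path, options file, and port are copied from the command above.

```python
import shlex


def count_visible_gpus(cuda_visible_devices: str) -> int:
    """Count the device ids in a string such as "0,1,2"."""
    return len([d for d in cuda_visible_devices.split(",") if d.strip()])


def build_launch_command(num_gpus: int, master_port: int = 4321) -> list:
    """Assemble the torch.distributed.launch command for num_gpus processes."""
    if num_gpus < 1:
        raise ValueError("need at least one GPU")
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}", f"--master_port={master_port}",
        "Experimental_root/test.py", "-opt", "options/test/same_id.yml",
        "--launcher", "pytorch",
    ]


# Example: two visible devices give two processes.
cmd = build_launch_command(count_visible_gpus("0,1"))
print(shlex.join(cmd))
```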

## Citing MetaPortrait

```
@misc{zhang2022metaportrait,
title={MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation},
author={Bowen Zhang and Chenyang Qi and Pan Zhang and Bo Zhang and HsiangTao Wu and Dong Chen and Qifeng Chen and Yong Wang and Fang Wen},
year={2022},
eprint={2212.08062},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
You may adjust `CUDA_VISIBLE_DEVICES` and `nproc_per_node` to match the GPUs on your own machine.

## Acknowledgements

97 changes: 97 additions & 0 deletions base_model/config/meta_portrait_256_meta_train.yaml
@@ -0,0 +1,97 @@
general:
exp_name: meta_portrait_meta_train
random_seed: 365

dataset:
frame_shape: [256, 256, 3]
eye_enhance: True
mouth_enhance: True
ldmkimg: True
ldmk_idx: [521, 505, 338, 398, 347, 35, 191, 30, 32, 207, 630, 629, 319, 4, 541, 61, 637, 660, 638, 587, 273, 590, 269, 432, 118,327, 12, 373, 58, 619, 466, 469, 464, 308, 152, 305, 150, 411, 635, 634, 564, 250, 443, 129, 364, 322, 49, 7, 361, 105, 434, 120, 500, 186, 575, 261, 636, 74]

train_data: [meta]
train_data_weight: [1]

meta:
root: ../data/
crop_expand: 1.3
crop_offset_y: 0.2

model:
arch: 'SPADEID'
# warp_ckpt: /mnt/blob/projects/IMmeeting/amlt-results/Meeting_exp_25/Orig_RegMotion_Ladder256_VoxCeleb2_Warp_SPADEInit_FeatureNorm_Bs48_Baseline_15eps_256/results/ckpt/spade/ckpt_15_2022-08-27-00-10-08.pth.tar
common:
num_channels: 3

kp_detector:
temperature: 0.1
block_expansion: 32
max_features: 1024
scale_factor: 0.25
num_blocks: 5
generator:
block_expansion: 64
max_features: 512
with_gaze_htmap: True
with_mouth_line: True
with_ldmk_line: True
use_IN: True
ladder:
need_feat: False
use_mask: False
label_nc: 0
z_dim: 512
dense_motion_params:
label_nc: 0
ldmkimg: True
occlusion: True
block_expansion: 64
max_features: 1024
num_blocks: 5
dec_lease: 2
Lwarp: True
AdaINc: 512
discriminator:
scales: [1]
block_expansion: 32
max_features: 512
num_blocks: 4
use_kp: False

train:
epochs: 5
batch_size: 0
dataset_repeats: 1

epoch_milestones: [2]
lr_generator: 2.0e-5
lr_discriminator: 2.0e-5
lr_kp_detector: 2.0e-5
warplr_tune: 0.1
outer_beta_1: 0.5
outer_beta_2: 0.999
inner_lr_generator: 2.0e-4
inner_lr_discriminator: 2.0e-4
inner_warplr_tune: 0.1
inner_beta_1: 0.5
inner_beta_2: 0.999

scales: [1, 0.5, 0.25, 0.125]

loss_weights:
generator_gan: 0
discriminator_gan: 1
feature_matching: [10, 10, 10, 10]
perceptual: [10, 10, 10, 10, 10]
id: 20
eye_enhance: 50
mouth_enhance: 50

tensorboard: True
event_save_path: ./results/events/
event_save_freq: 50

ckpt_save_path: ./results/ckpt/
ckpt_save_iter_freq: 500
ckpt_save_freq: 1
print_freq: 50
90 changes: 90 additions & 0 deletions base_model/config/meta_portrait_256_pretrain_full.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
general:
exp_name: meta_portrait_base
random_seed: 365

dataset:
frame_shape: [256, 256, 3]
eye_enhance: True
mouth_enhance: True
ldmkimg: True
ldmk_idx: [521, 505, 338, 398, 347, 35, 191, 30, 32, 207, 630, 629, 319, 4, 541, 61, 637, 660, 638, 587, 273, 590, 269, 432, 118,327, 12, 373, 58, 619, 466, 469, 464, 308, 152, 305, 150, 411, 635, 634, 564, 250, 443, 129, 364, 322, 49, 7, 361, 105, 434, 120, 500, 186, 575, 261, 636, 74]

train_data: [personalized]
train_data_weight: [1]

personalized:
root: ../data/
crop_expand: 1.3
crop_offset_y: 0.2
static_bbox: True

model:
arch: 'SPADEID'
warp_ckpt: /path/to/warp_ckpt
common:
num_channels: 3

kp_detector:
temperature: 0.1
block_expansion: 32
max_features: 1024
scale_factor: 0.25
num_blocks: 5
generator:
block_expansion: 64
max_features: 512
with_gaze_htmap: True
with_mouth_line: True
with_ldmk_line: True
use_IN: True
ladder:
need_feat: False
use_mask: False
label_nc: 0
z_dim: 512
dense_motion_params:
label_nc: 0
ldmkimg: True
occlusion: True
block_expansion: 64
max_features: 1024
num_blocks: 5
dec_lease: 2
Lwarp: True
AdaINc: 512
discriminator:
scales: [1]
block_expansion: 32
max_features: 512
num_blocks: 4
use_kp: False

train:
epochs: 60
batch_size: 2
dataset_repeats: 1

epoch_milestones: [45]
lr_generator: 2.0e-4
lr_discriminator: 2.0e-4
warplr_tune: 0.1

scales: [1, 0.5, 0.25, 0.125]

loss_weights:
generator_gan: 1
discriminator_gan: 1
feature_matching: [10, 10, 10, 10]
perceptual: [10, 10, 10, 10, 10]
id: 20
eye_enhance: 50
mouth_enhance: 50

tensorboard: True
event_save_path: ./results/events/
event_save_freq: 500

ckpt_save_path: ./results/ckpt/
ckpt_save_iter_freq: 5000
ckpt_save_freq: 1
print_freq: 1000
91 changes: 91 additions & 0 deletions base_model/config/meta_portrait_256_pretrain_warp.yaml
@@ -0,0 +1,91 @@
general:
exp_name: meta_portrait_base
random_seed: 365

dataset:
frame_shape: [256, 256, 3]
eye_enhance: True
mouth_enhance: True
ldmkimg: True
ldmk_idx: [521, 505, 338, 398, 347, 35, 191, 30, 32, 207, 630, 629, 319, 4, 541, 61, 637, 660, 638, 587, 273, 590, 269, 432, 118,327, 12, 373, 58, 619, 466, 469, 464, 308, 152, 305, 150, 411, 635, 634, 564, 250, 443, 129, 364, 322, 49, 7, 361, 105, 434, 120, 500, 186, 575, 261, 636, 74]

train_data: [personalized]
train_data_weight: [1]

personalized:
root: ../data/
crop_expand: 1.3
crop_offset_y: 0.2
static_bbox: True

model:
arch: 'SPADEID'
common:
num_channels: 3

kp_detector:
temperature: 0.1
block_expansion: 32
max_features: 1024
scale_factor: 0.25
num_blocks: 5
generator:
block_expansion: 64
max_features: 512
with_gaze_htmap: True
with_mouth_line: True
with_ldmk_line: True
use_IN: True
ladder:
need_feat: False
use_mask: False
label_nc: 0
z_dim: 512
dense_motion_params:
label_nc: 0
ldmkimg: True
occlusion: True
block_expansion: 64
max_features: 1024
num_blocks: 5
dec_lease: 2
Lwarp: True
AdaINc: 512
discriminator:
scales: [1]
block_expansion: 32
max_features: 512
num_blocks: 4
use_kp: False

train:
epochs: 60
batch_size: 2
dataset_repeats: 1

epoch_milestones: [45]
lr_generator: 2.0e-4
lr_discriminator: 2.0e-4
warplr_tune: 0.1

scales: [1, 0.5, 0.25, 0.125]

loss_weights:
generator_gan: 1
discriminator_gan: 1
feature_matching: [10, 10, 10, 10]
perceptual: [10, 10, 10, 10, 10]
id: 20
eye_enhance: 50
mouth_enhance: 50

tensorboard: True
event_save_path: ./results/events/
event_save_freq: 500

ckpt_save_path: ./results/ckpt/
ckpt_save_iter_freq: 5000
ckpt_save_freq: 1
print_freq: 1000

eval_freq: 10000
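
The two pretraining configs share their training values, while the meta-training config changes several of them. The sketch below tabulates a subset of those values and prints the ones that differ. The numbers are copied by hand from the configs in this commit; the flat dictionary layout and the `diff_configs` helper are illustrative assumptions and do not reflect the files' actual nesting.

```python
# Values shared by meta_portrait_256_pretrain_warp.yaml and
# meta_portrait_256_pretrain_full.yaml (subset, copied by hand).
PRETRAIN = {
    "epochs": 60,
    "batch_size": 2,
    "epoch_milestones": [45],
    "lr_generator": 2.0e-4,
    "lr_discriminator": 2.0e-4,
    "warplr_tune": 0.1,
    "generator_gan": 1,
    "event_save_freq": 500,
    "ckpt_save_iter_freq": 5000,
    "print_freq": 1000,
}

# Values from meta_portrait_256_meta_train.yaml (subset, copied by hand).
META = {
    "epochs": 5,
    "batch_size": 0,
    "epoch_milestones": [2],
    "lr_generator": 2.0e-5,
    "lr_discriminator": 2.0e-5,
    "warplr_tune": 0.1,
    "inner_lr_generator": 2.0e-4,
    "inner_lr_discriminator": 2.0e-4,
    "generator_gan": 0,
    "event_save_freq": 50,
    "ckpt_save_iter_freq": 500,
    "print_freq": 50,
}


def diff_configs(a: dict, b: dict) -> dict:
    """Map each key whose value differs to its (a, b) pair; None marks a missing key."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


differences = diff_configs(PRETRAIN, META)
for key, (pretrain_value, meta_value) in differences.items():
    print(f"{key}: pretrain={pretrain_value} meta={meta_value}")
```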