Skip to content

kjid1999/ccthesis

Repository files navigation

Setup:

The code is based on PyTorch and HuggingFace transformers.

pip install -r requirements.txt 

DiffuSeq Training

cd scripts
bash train.sh

Arguments explanation:

  • --dataset: the name of datasets, just for notation
  • --data_dir: the path to the saved datasets folder, containing train.jsonl,test.jsonl,valid.jsonl
  • --seq_len: the max length of sequence $z$ ($x\oplus y$)
  • --resume_checkpoint: if not none, restore this checkpoint and continue training
  • --vocab: the tokenizer is initialized using bert or load your own preprocessed vocab dictionary (e.g. using BPE)
  • --learned_mean_embed: set whether to use the learned soft absorbing state.
  • --denoise: set whether to add discrete noise
  • --use_fp16: set whether to use mixed precision training
  • --denoise_rate: set the denoise rate, with 0.5 as the default, no effect in this version

Model Weight

We provide our training weight on Google Drive.

Decoding

Perform full 2000 steps diffusion process. Achieve higher performance compare with Speed-up Decoding

cd scripts
bash run_decode.sh

Speed-up Decoding

We customize the implementation of DPM-Solver++ to DiffuSeq to accelerate its sampling speed.

cd scripts
bash run_decode_solver.sh

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published