The code is based on PyTorch and HuggingFace Transformers.
```bash
pip install -r requirements.txt
```

```bash
cd scripts
bash train.sh
```
Arguments explanation:

- `--dataset`: the name of the dataset, used only for notation
- `--data_dir`: the path to the saved dataset folder, containing `train.jsonl`, `test.jsonl`, `valid.jsonl`
- `--seq_len`: the max length of the sequence $z$ ($x \oplus y$)
- `--resume_checkpoint`: if not none, restore this checkpoint and continue training
- `--vocab`: the tokenizer is initialized from BERT, or loads your own preprocessed vocab dictionary (e.g. built with BPE)
- `--learned_mean_embed`: whether to use the learned soft absorbing state
- `--denoise`: whether to add discrete noise
- `--use_fp16`: whether to use mixed-precision training
- `--denoise_rate`: the denoise rate, 0.5 by default; no effect in this version
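To make the `--data_dir` layout concrete, here is a minimal sketch that builds a toy dataset folder with the three expected files. The `src`/`trg` field names are an assumption for illustration; check the repo's data-loading code for the actual schema.

```python
import json
import os
import tempfile

# Toy examples; the "src"/"trg" keys are an assumed schema, not from this README.
pairs = [
    {"src": "what is the capital of france ?", "trg": "the capital of france is paris ."},
    {"src": "hello there", "trg": "hi , how can i help ?"},
]

# --data_dir points at a folder containing train.jsonl, valid.jsonl, test.jsonl,
# one JSON object per line.
data_dir = tempfile.mkdtemp()
for split in ("train", "valid", "test"):
    with open(os.path.join(data_dir, f"{split}.jsonl"), "w", encoding="utf-8") as f:
        for ex in pairs:
            f.write(json.dumps(ex) + "\n")

# Sanity-check: every line parses back into a source/target pair.
with open(os.path.join(data_dir, "train.jsonl"), encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]
print(len(rows))
```

You would then pass `--data_dir` pointing at this folder when launching training.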
We provide our trained model weights on Google Drive.
Perform the full 2000-step diffusion process. This achieves higher performance compared with speed-up decoding.
```bash
cd scripts
bash run_decode.sh
```
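The cost of full decoding comes from calling the denoiser once per timestep. As a rough illustration only (not DiffuSeq's actual sampler), a full-length reverse loop with a stand-in denoiser looks like this; the linear beta schedule is an assumption:

```python
import random

T = 2000  # full number of diffusion steps, as in the decoding above
# Linear noise schedule (an illustrative assumption, not the repo's schedule).
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]

def denoise_step(x, t):
    # Stand-in for the trained network: each call would be one forward pass.
    return x * (1.0 - betas[t]) ** 0.5

x = random.gauss(0.0, 1.0)    # start from pure noise
calls = 0
for t in reversed(range(T)):  # t = 1999, 1998, ..., 0
    x = denoise_step(x, t)
    calls += 1
print(calls)
```

One network call per step means 2000 forward passes per sample, which is why full decoding is slow but tends to be more accurate than shortened schedules.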
We customize the implementation of DPM-Solver++ for DiffuSeq to accelerate its sampling.
```bash
cd scripts
bash run_decode_solver.sh
```
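The core idea behind fast samplers such as DPM-Solver++ is to visit only a small subset of the 2000 training timesteps, taking larger solver steps between them. The subset-selection rule below is an illustrative sketch, not the repo's actual schedule:

```python
def respace(num_train_steps: int, num_sample_steps: int) -> list:
    """Pick a roughly evenly spaced, descending subset of timesteps.

    Illustrative only: real solvers choose nodes based on the noise
    schedule, not plain integer strides.
    """
    stride = num_train_steps / num_sample_steps
    return sorted({round(i * stride) for i in range(num_sample_steps)}, reverse=True)

schedule = respace(2000, 20)  # 20 denoiser calls instead of 2000
print(len(schedule), schedule[0], schedule[-1])
```

With 20 solver steps the sampler makes 100x fewer network calls than the full 2000-step loop, which is the source of the speed-up, at some cost in quality.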