[MoE/ZeRO] fix .github conflict with main branch. #5827

Closed · wants to merge 70 commits

Changes from 1 commit

Commits (70)
677cbfa
[Fix/Example] Fix Llama Inference Loading Data Type (#5763)
yuanheng-zhao May 30, 2024
68359ed
[release] update version (#5752)
ver217 May 31, 2024
3f2be80
fix (#5765)
flybird11111 Jun 3, 2024
1b76564
[test] Fix/fix testcase (#5770)
duanjunwen Jun 3, 2024
4064432
[Hotfix] Add missing init file in inference.executor (#5774)
yuanheng-zhao Jun 3, 2024
e22b827
[CI/tests] simplify some test case to reduce testing time (#5755)
Hz188 Jun 4, 2024
32f4187
[misc] update dockerfile (#5776)
ver217 Jun 4, 2024
ee6fd38
[devops] fix docker ci (#5780)
ver217 Jun 4, 2024
b45000f
[Inference]Add Streaming LLM (#5745)
isky-cd Jun 5, 2024
50b4c8e
[hotfix] fix llama flash attention forward (#5777)
flybird11111 Jun 5, 2024
79f7a7b
[misc] Accelerate CI for zero and dist optim (#5758)
Edenzzzz Jun 5, 2024
80c3c87
[Test/CI] remove test cases to reduce CI duration (#5753)
botbw Jun 5, 2024
10a19e2
[hotfix] fix testcase in test_fx/test_tracer (#5779)
duanjunwen Jun 5, 2024
3f7e313
[gemini] optimize reduce scatter d2h copy (#5760)
botbw Jun 5, 2024
c46e097
Allow building cuda extension without a device. (#5535)
ccoulombe Jun 5, 2024
b9d646f
[misc] fix dist logger (#5782)
ver217 Jun 5, 2024
a1e39f4
[install]fix setup (#5786)
flybird11111 Jun 6, 2024
5ead00f
[misc] update requirements (#5787)
ver217 Jun 6, 2024
73e88a5
[shardformer] fix import (#5788)
ver217 Jun 6, 2024
7a7e869
upgrade colossal-chat support tp_group>1, add sp for sft
YeAnbang May 27, 2024
929e1e3
upgrade ppo dpo rm script
YeAnbang May 28, 2024
7e65b71
run pre-commit
YeAnbang May 28, 2024
0b4a335
moupdate ci tests, st ci test cases passed, tp failed in generation f…
YeAnbang May 28, 2024
7ae87b3
fix training script
YeAnbang May 28, 2024
b1031f7
fix ci
YeAnbang May 28, 2024
1b880ce
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 28, 2024
b8b5cac
fix transformers version
YeAnbang May 29, 2024
62eb28b
remove duplicated test
YeAnbang May 29, 2024
0bbac15
fix datasets version
YeAnbang May 29, 2024
bf57b13
remove models that require huggingface auth from ci
YeAnbang May 29, 2024
45195ac
remove local data path
YeAnbang May 29, 2024
e16ccc2
update ci
YeAnbang May 29, 2024
ac1520c
remove baichuan from template test due to transformer version conflict
YeAnbang Jun 3, 2024
790e136
merge
YeAnbang Jun 7, 2024
04386d9
Refactor modeling by adding attention backend
char-1ee Jun 3, 2024
eec77e5
Fix tests and naming
char-1ee Jun 3, 2024
5f398fc
Pass inference model shard configs for module init
char-1ee Jun 7, 2024
ceba662
Clean up
char-1ee Jun 7, 2024
0d7ff10
replace the customized dataloader setup with the build-in one
YeAnbang Jun 7, 2024
77db216
replace the customized dataloader setup with the build-in one
YeAnbang Jun 7, 2024
f5981e8
Remove flash attention backend
char-1ee Jun 7, 2024
2abdede
fix readme
YeAnbang Jun 10, 2024
b303976
Fix test import
char-1ee Jun 10, 2024
77a219a
Merge pull request #5771 from char-1ee/refactor/modeling
char-1ee Jun 10, 2024
84eab13
update sft trainning script
YeAnbang Jun 11, 2024
c0948af
[Inference]refactor baichuan (#5791)
LRY89757 Jun 11, 2024
74f4a29
Merge pull request #5759 from hpcaitech/colossalchat_upgrade
YeAnbang Jun 11, 2024
587bbf4
[test] fix chatglm test kit (#5793)
ver217 Jun 11, 2024
aa125bc
[shardformer] fix modeling of bloom and falcon (#5796)
ver217 Jun 11, 2024
aac941e
[test] fix qwen2 pytest distLarge (#5797)
GuangyaoZhang Jun 12, 2024
b6ea9e7
[moe refactor] update unit test with the refactored ZeRO and remove u…
Hz188 Jun 12, 2024
79d63ec
sync with upstream
Hz188 Jun 12, 2024
8554585
[Inference] Fix flash-attn import and add model test (#5794)
char-1ee Jun 12, 2024
ec99700
move moe checkpoint to checkpoint folder and exchange global axis to …
Hz188 Jun 12, 2024
d9dddf5
[Gemini] Use async stream to prefetch and h2d data moving (#5781)
Hz188 Jun 12, 2024
3bcbba9
[gemini] quick fix on possible async operation (#5803)
botbw Jun 13, 2024
2ddf624
[shardformer] upgrade transformers to 4.39.3 (#5815)
flybird11111 Jun 14, 2024
be92747
Merge branch 'hpcaitech:feature/moe' into feature/moe
Hz188 Jun 14, 2024
76aeec3
Merge branches 'feature/moe' and 'feature/moe' of https://github.com/…
Hz188 Jun 14, 2024
64fc0f7
update moe hybrid parallel plugin with newest version of zero & fix z…
Hz188 Jun 14, 2024
8b277cc
fix zero unit test
Hz188 Jun 14, 2024
ed42193
Add an assertion to prevent users from using it incorrectly
Hz188 Jun 14, 2024
88b78fa
Merge remote-tracking branch 'upstream/feature/moe' into feature/moe
Hz188 Jun 14, 2024
419d25e
Modify function parameter names to resolve compatibility issues
Hz188 Jun 17, 2024
3364ac9
remove useless code: MoECheckpoint
Hz188 Jun 17, 2024
f7298bc
update github workflow config file
Hz188 Jun 17, 2024
e6839fb
fix typo
Hz188 Jun 17, 2024
cc9d0bb
Merge branch 'hpcaitech:feature/moe' into feature/moe
Hz188 Jun 17, 2024
8795bb2
Support 4d parallel + flash attention (#5789)
Edenzzzz Jun 17, 2024
1405cf1
fix .github worfflow conflict with main branch
Hz188 Jun 18, 2024
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
pre-commit-ci[bot] authored and YeAnbang committed Jun 7, 2024
commit 1b880ce09560ab65a55ccf0a6ae1864724651069
4 changes: 3 additions & 1 deletion applications/ColossalChat/coati/dataset/conversation.py
@@ -25,7 +25,9 @@ def from_config(cls, tokenizer: PreTrainedTokenizer, config: Dict):
         Setup the conversation template from config
         """
         tokenizer.chat_template = config["chat_template"]
-        conv = cls(tokenizer, config["system_message"], config["chat_template"], config["stop_ids"], config["end_of_assistant"])
+        conv = cls(
+            tokenizer, config["system_message"], config["chat_template"], config["stop_ids"], config["end_of_assistant"]
+        )
         conv.clear()
         return conv
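For orientation: the call re-wrapped in this hunk builds a conversation template from a JSON config. A minimal sketch of the config shape, inferred only from the keys read above (the concrete values are illustrative, not taken from the repository):

# Hypothetical example mirroring the keys consumed by from_config above:
# "chat_template", "system_message", "stop_ids", "end_of_assistant".
example_config = {
    "chat_template": "{% for m in messages %}{{ m['content'] }}{% endfor %}",  # Jinja template
    "system_message": "You are a helpful assistant.",
    "stop_ids": [2],  # token ids that should stop generation
    "end_of_assistant": "</s>",  # marker that closes an assistant turn
}
# conv = Conversation.from_config(tokenizer, example_config)  # as in the hunk above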

27 changes: 15 additions & 12 deletions applications/ColossalChat/coati/dataset/tokenization_utils.py
@@ -97,16 +97,17 @@ def supervised_tokenize_sft(

     target_turn = turns[target_turn_index - 1]
     prompt = template.get_prompt(2 * target_turn)
-    chunks, require_loss = split_templated_prompt_into_chunks(template.messages[: 2 * target_turn], prompt,
-                                                              conversation_template.end_of_assistant)
+    chunks, require_loss = split_templated_prompt_into_chunks(
+        template.messages[: 2 * target_turn], prompt, conversation_template.end_of_assistant
+    )
     tokenized, starts, ends = tokenize_and_concatenate(tokenizer, chunks, require_loss)

     labels = [ignore_index] * len(tokenized)
     for start, end in zip(starts, ends):
         if end == len(tokenized):
             tokenized = tokenized + [tokenizer.eos_token_id]
             labels = labels + [ignore_index]
-        labels[start : end] = tokenized[start : end]
+        labels[start:end] = tokenized[start:end]

     # truncate the sequence at the last token that requires loss calculation
     to_truncate_len = 0
@@ -139,14 +140,14 @@ def supervised_tokenize_sft(
         label_decode = []
         for i in range(len(labels)):
             if labels[i] == ignore_index:
-                if start!=end:
-                    label_decode.append(tokenizer.decode(labels[start+1:i], skip_special_tokens=False))
+                if start != end:
+                    label_decode.append(tokenizer.decode(labels[start + 1 : i], skip_special_tokens=False))
                 start = i
                 end = i
             else:
                 end = i
                 if i == len(labels) - 1:
-                    label_decode.append(tokenizer.decode(labels[start+1:], skip_special_tokens=False))
+                    label_decode.append(tokenizer.decode(labels[start + 1 :], skip_special_tokens=False))

     except TypeError as e:
         raise TypeError(str(e) + f"\nUnable to decode input_ids: {tokenized}")
@@ -216,8 +217,9 @@ def tokenize_prompt_dataset(

     # Prepare data
     prompt = template.get_prompt(target_turn, add_generation_prompt=True)
-    chunks, require_loss = split_templated_prompt_into_chunks(template.messages[: target_turn], prompt,
-                                                              conversation_template.end_of_assistant)
+    chunks, require_loss = split_templated_prompt_into_chunks(
+        template.messages[:target_turn], prompt, conversation_template.end_of_assistant
+    )
     tokenized, starts, ends = tokenize_and_concatenate(tokenizer, chunks, require_loss)
     if tokenizer.bos_token_id is not None:
         if tokenized[0] != tokenizer.bos_token_id:
@@ -246,8 +248,9 @@ def apply_rlhf_data_format(
 ):
     target_turn = int(len(template.messages) / 2)
     prompt = template.get_prompt(target_turn * 2)
-    chunks, require_loss = split_templated_prompt_into_chunks(template.messages[: 2 * target_turn], prompt,
-                                                              template.end_of_assistant)
+    chunks, require_loss = split_templated_prompt_into_chunks(
+        template.messages[: 2 * target_turn], prompt, template.end_of_assistant
+    )
     tokenized, starts, ends = tokenize_and_concatenate(tokenizer, chunks, require_loss)
     loss_mask = [0] * len(tokenized)
     mask_token = tokenizer.eos_token_id or tokenizer.pad_token_id
@@ -260,8 +263,8 @@ def apply_rlhf_data_format(
         if end == len(tokenized):
             tokenized = tokenized + [tokenizer.eos_token_id]
             loss_mask = loss_mask + [1]
-        loss_mask[start : end] = [1] * len(loss_mask[start : end])
-        label_decode.append(tokenizer.decode(tokenized[start : end], skip_special_tokens=False))
+        loss_mask[start:end] = [1] * len(loss_mask[start:end])
+        label_decode.append(tokenizer.decode(tokenized[start:end], skip_special_tokens=False))
     if tokenizer.bos_token_id is not None:
         if tokenized[0] != tokenizer.bos_token_id:
             tokenized = [tokenizer.bos_token_id] + tokenized
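The slice fixes in this file all touch the same supervised fine-tuning pattern: labels start out as ignore_index everywhere, and only the spans that should contribute to the loss (the assistant replies) are copied back from the token ids. A minimal self-contained sketch of that masking technique, with illustrative names rather than the repository's API:

IGNORE_INDEX = -100  # value skipped by torch.nn.CrossEntropyLoss by default

def mask_labels(tokenized: list[int], spans: list[tuple[int, int]]) -> list[int]:
    """Keep loss only on [start, end) spans, e.g. assistant replies."""
    labels = [IGNORE_INDEX] * len(tokenized)
    for start, end in spans:
        labels[start:end] = tokenized[start:end]
    return labels

# Tokens 2..4 are the assistant's reply; everything else is ignored by the loss.
print(mask_labels([101, 7592, 2088, 102, 0], [(2, 4)]))
# -> [-100, -100, 2088, 102, -100]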
6 changes: 4 additions & 2 deletions applications/ColossalChat/coati/dataset/utils.py
@@ -121,8 +121,10 @@ def split_templated_prompt_into_chunks(messages: List[Dict[str, str]], prompt: s
     for line in messages:
         content_length = len(line["content"])
         first_occur = prompt.find(line["content"], start_idx)
-        if line["role"].lower() == "assistant" and end_of_assistant in prompt[first_occur + content_length:]:
-            content_length = prompt.find(end_of_assistant, first_occur + content_length) + len(end_of_assistant) - first_occur
+        if line["role"].lower() == "assistant" and end_of_assistant in prompt[first_occur + content_length :]:
+            content_length = (
+                prompt.find(end_of_assistant, first_occur + content_length) + len(end_of_assistant) - first_occur
+            )
         if prompt[first_occur - 1] != " ":
             chunks.append(prompt[start_idx:first_occur])
         chunks.append(prompt[first_occur : first_occur + content_length])
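The function patched here cuts a rendered prompt into chunks, stretching each assistant chunk through the end_of_assistant marker so the closing token is trained on as well. A stripped-down sketch of that idea (simplified, not the repository's implementation):

def assistant_spans(prompt: str, replies: list[str], end_of_assistant: str) -> list[tuple[int, int]]:
    """Locate each assistant reply in the rendered prompt, extended through its end marker."""
    spans, search_from = [], 0
    for reply in replies:
        start = prompt.find(reply, search_from)
        # Extend the span past the reply text to include the closing marker.
        end = prompt.find(end_of_assistant, start + len(reply)) + len(end_of_assistant)
        spans.append((start, end))
        search_from = end
    return spans

prompt = "<user>hi</user><assistant>hello</assistant>"
print(assistant_spans(prompt, ["hello"], "</assistant>"))  # [(26, 43)]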
2 changes: 1 addition & 1 deletion applications/ColossalChat/coati/models/critic.py
@@ -37,4 +37,4 @@ def get_input_embeddings(self):
         return self.model.get_input_embeddings()

     def get_output_embeddings(self):
-        return self.model.get_output_embeddings()
\ No newline at end of file
+        return self.model.get_output_embeddings()
@@ -5,4 +5,4 @@
     7
   ],
   "end_of_assistant": "<|im_end|>"
-}
\ No newline at end of file
+}
@@ -6,4 +6,4 @@
     151643
   ],
   "end_of_assistant": "<|im_end|>"
-}
\ No newline at end of file
+}
@@ -5,4 +5,4 @@
     2
   ],
   "end_of_assistant": "<|im_end|>"
-}
\ No newline at end of file
+}
@@ -5,4 +5,4 @@
     2
   ],
   "end_of_assistant": "<|user|>"
-}
\ No newline at end of file
+}
@@ -5,4 +5,4 @@
     2
   ],
   "end_of_assistant": "<|im_end|>"
-}
\ No newline at end of file
+}
@@ -5,4 +5,4 @@
     2
   ],
   "end_of_assistant": "</s>"
-}
\ No newline at end of file
+}
@@ -5,4 +5,4 @@
     100001
   ],
   "end_of_assistant": "<|end▁of▁sentence|>"
-}
\ No newline at end of file
+}
@@ -5,4 +5,4 @@
     2
   ],
   "end_of_assistant": "</s>"
-}
\ No newline at end of file
+}
@@ -5,4 +5,4 @@
     50256
   ],
   "end_of_assistant": "<|im_end|>"
-}
\ No newline at end of file
+}
@@ -5,4 +5,4 @@
     2
   ],
   "end_of_assistant": "</s>"
-}
\ No newline at end of file
+}
@@ -226,7 +226,7 @@ def main():
             "max_length": args.max_length,
         },
         keep_in_memory=False,
-        num_proc= min(len(dataset), cpu_count()),
+        num_proc=min(len(dataset), cpu_count()),
     )

     dataset = dataset.filter(
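The one-character fix above sits inside what appears to be a Hugging Face datasets.map call; capping num_proc at the dataset length avoids spawning worker processes with nothing to do. A minimal sketch of the same pattern (the mapped function is illustrative):

from multiprocessing import cpu_count

from datasets import Dataset  # pip install datasets

dataset = Dataset.from_dict({"text": ["hello world", "foo bar", "baz"]})

# Never request more worker processes than there are examples to share out.
dataset = dataset.map(
    lambda example: {"n_chars": len(example["text"])},
    num_proc=min(len(dataset), cpu_count()),
)
print(dataset[0])  # {'text': 'hello world', 'n_chars': 11}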
2 changes: 1 addition & 1 deletion applications/ColossalChat/tests/llama.json
@@ -6,4 +6,4 @@
     2
   ],
   "end_of_assistant": "</s>"
-}
\ No newline at end of file
+}
43 changes: 24 additions & 19 deletions applications/ColossalChat/tests/test_chat_template.py
@@ -1,36 +1,41 @@
+import json
+import os
+
 from coati.dataset import setup_conversation_template
 from coati.dataset.conversation import Conversation
 from coati.dataset.tokenization_utils import supervised_tokenize_sft
 from transformers import AutoTokenizer
-import json
-import os

 model_data_mapping = {
-    'THUDM/chatglm2-6b': 'THUDM_chatglm2-6b.json',
-    'THUDM/chatglm3-6b': 'THUDM_chatglm3-6b.json',
-    'baichuan-inc/Baichuan2-13B-Chat': 'baichuan-inc_Baichuan2-13B-Chat.json',
-    '01-ai/Yi-1.5-9B-Chat': '01-ai_Yi-1.5-9B-Chat.json',
-    '01-ai/Yi-34B': '01-ai_Yi-34B.json',
-    'deepseek-ai/DeepSeek-V2-Lite': 'deepseek-ai_DeepSeek-V2-Lite.json',
-    'microsoft/phi-2': 'microsoft_phi-2.json',
-    'mistralai/Mixtral-8x7B-Instruct-v0.1': 'mistralai_Mixtral-8x7B-Instruct-v0.1.json'
-}
-chat_template_config_path = './config/conversation_template'
+    "THUDM/chatglm2-6b": "THUDM_chatglm2-6b.json",
+    "THUDM/chatglm3-6b": "THUDM_chatglm3-6b.json",
+    "baichuan-inc/Baichuan2-13B-Chat": "baichuan-inc_Baichuan2-13B-Chat.json",
+    "01-ai/Yi-1.5-9B-Chat": "01-ai_Yi-1.5-9B-Chat.json",
+    "01-ai/Yi-34B": "01-ai_Yi-34B.json",
+    "deepseek-ai/DeepSeek-V2-Lite": "deepseek-ai_DeepSeek-V2-Lite.json",
+    "microsoft/phi-2": "microsoft_phi-2.json",
+    "mistralai/Mixtral-8x7B-Instruct-v0.1": "mistralai_Mixtral-8x7B-Instruct-v0.1.json",
+}
+chat_template_config_path = "./config/conversation_template"


 def test_tokenization_sft():
     for model in model_data_mapping:
         print(f"#############{model}#############")
-        conversation_template_config = os.path.join(chat_template_config_path, model_data_mapping[model])
-        messages = [{"from": "human", "content": "What are the three primary colors?"},
+        conversation_template_config = os.path.join(chat_template_config_path, model_data_mapping[model])
+        messages = [
+            {"from": "human", "content": "What are the three primary colors?"},
             {"from": "assistant", "content": "The three primary colors are red, blue, and yellow."},
-            {"from": "human", "content": "解释个人电脑和服务器之间的区别。"},
-            {"from": "assistant", "content": "个人电脑和服务器是两种不同类型的计算机系统,它们的主要区别在于用途、硬件配置和性能。 个人电脑,顾名思义,是为个人使用而设计的计算机。它们通常用于日常的工作、娱乐和学习,可以运行各种各样的应用程序和游戏。个人电脑的硬件配置一般是按照标准配置来设计的,不过也可以根据个人需求进行定制。 而服务器是为了满足大量用户的需求而设计的计算机系统,它们通常用于为用户提供各种网络服务,如网站、电子邮件和文件传输等。服务器通常需要高性能的硬件配置,并且可以承受高负载和长时间的运行。由于服务器需要支持大量用户的访问,它们通常配备多核处理器、大容量内存和大容量硬盘驱动器,以提高系统的运行速度和稳定性。 总之,个人电脑和服务器之间的主要区别在于它们的用途、硬件配置和性能。个人电脑用于个人使用,而服务器用于支持大量用户的访问。服务器的硬件配置通常比个人电脑更高,以保证系统的性能和稳定性。"}]
+            {"from": "human", "content": "解释个人电脑和服务器之间的区别。"},
+            {
+                "from": "assistant",
+                "content": "个人电脑和服务器是两种不同类型的计算机系统,它们的主要区别在于用途、硬件配置和性能。 个人电脑,顾名思义,是为个人使用而设计的计算机。它们通常用于日常的工作、娱乐和学习,可以运行各种各样的应用程序和游戏。个人电脑的硬件配置一般是按照标准配置来设计的,不过也可以根据个人需求进行定制。 而服务器是为了满足大量用户的需求而设计的计算机系统,它们通常用于为用户提供各种网络服务,如网站、电子邮件和文件传输等。服务器通常需要高性能的硬件配置,并且可以承受高负载和长时间的运行。由于服务器需要支持大量用户的访问,它们通常配备多核处理器、大容量内存和大容量硬盘驱动器,以提高系统的运行速度和稳定性。 总之,个人电脑和服务器之间的主要区别在于它们的用途、硬件配置和性能。个人电脑用于个人使用,而服务器用于支持大量用户的访问。服务器的硬件配置通常比个人电脑更高,以保证系统的性能和稳定性。",
+            },
+        ]
         chat_template_config = json.load(open(conversation_template_config, "r", encoding="utf8"))
         tokenizer = AutoTokenizer.from_pretrained(model, use_fast=False, trust_remote_code=True)
         conversation_template = setup_conversation_template(
-            tokenizer, chat_template_config=chat_template_config, save_path=conversation_template_config
-        )
+            tokenizer, chat_template_config=chat_template_config, save_path=conversation_template_config
+        )

         output = supervised_tokenize_sft({"messages": messages}, tokenizer, conversation_template)
         with open(f"./tests/test_data/chat_template/{model_data_mapping[model]}", "r", encoding="utf8") as f:
@@ -582,4 +582,4 @@
   ],
   "seq_length": 286,
   "seq_category": "None"
-}
\ No newline at end of file
+}
@@ -604,4 +604,4 @@
   ],
   "seq_length": 297,
   "seq_category": "None"
-}
\ No newline at end of file
+}
@@ -600,4 +600,4 @@
   ],
   "seq_length": 295,
   "seq_category": "None"
-}
\ No newline at end of file
+}
@@ -712,4 +712,4 @@
   ],
   "seq_length": 351,
   "seq_category": "None"
-}
\ No newline at end of file
+}
@@ -582,4 +582,4 @@
   ],
   "seq_length": 286,
   "seq_category": "None"
-}
\ No newline at end of file
+}
@@ -694,4 +694,4 @@
   ],
   "seq_length": 342,
   "seq_category": "None"
-}
\ No newline at end of file
+}
@@ -578,4 +578,4 @@
   ],
   "seq_length": 284,
   "seq_category": "None"
-}
\ No newline at end of file
+}
@@ -2006,4 +2006,4 @@
   ],
   "seq_length": 998,
   "seq_category": "None"
-}
\ No newline at end of file
+}
@@ -916,4 +916,4 @@
   ],
   "seq_length": 453,
   "seq_category": "None"
-}
\ No newline at end of file
+}