[MoE/ZeRO] fix .github conflict with main branch. #5827

Closed
wants to merge 70 commits

Changes from 1 commit

Commits (70)
677cbfa
[Fix/Example] Fix Llama Inference Loading Data Type (#5763)
yuanheng-zhao May 30, 2024
68359ed
[release] update version (#5752)
ver217 May 31, 2024
3f2be80
fix (#5765)
flybird11111 Jun 3, 2024
1b76564
[test] Fix/fix testcase (#5770)
duanjunwen Jun 3, 2024
4064432
[Hotfix] Add missing init file in inference.executor (#5774)
yuanheng-zhao Jun 3, 2024
e22b827
[CI/tests] simplify some test case to reduce testing time (#5755)
Hz188 Jun 4, 2024
32f4187
[misc] update dockerfile (#5776)
ver217 Jun 4, 2024
ee6fd38
[devops] fix docker ci (#5780)
ver217 Jun 4, 2024
b45000f
[Inference]Add Streaming LLM (#5745)
isky-cd Jun 5, 2024
50b4c8e
[hotfix] fix llama flash attention forward (#5777)
flybird11111 Jun 5, 2024
79f7a7b
[misc] Accelerate CI for zero and dist optim (#5758)
Edenzzzz Jun 5, 2024
80c3c87
[Test/CI] remove test cases to reduce CI duration (#5753)
botbw Jun 5, 2024
10a19e2
[hotfix] fix testcase in test_fx/test_tracer (#5779)
duanjunwen Jun 5, 2024
3f7e313
[gemini] optimize reduce scatter d2h copy (#5760)
botbw Jun 5, 2024
c46e097
Allow building cuda extension without a device. (#5535)
ccoulombe Jun 5, 2024
b9d646f
[misc] fix dist logger (#5782)
ver217 Jun 5, 2024
a1e39f4
[install]fix setup (#5786)
flybird11111 Jun 6, 2024
5ead00f
[misc] update requirements (#5787)
ver217 Jun 6, 2024
73e88a5
[shardformer] fix import (#5788)
ver217 Jun 6, 2024
7a7e869
upgrade colossal-chat support tp_group>1, add sp for sft
YeAnbang May 27, 2024
929e1e3
upgrade ppo dpo rm script
YeAnbang May 28, 2024
7e65b71
run pre-commit
YeAnbang May 28, 2024
0b4a335
moupdate ci tests, st ci test cases passed, tp failed in generation f…
YeAnbang May 28, 2024
7ae87b3
fix training script
YeAnbang May 28, 2024
b1031f7
fix ci
YeAnbang May 28, 2024
1b880ce
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 28, 2024
b8b5cac
fix transformers version
YeAnbang May 29, 2024
62eb28b
remove duplicated test
YeAnbang May 29, 2024
0bbac15
fix datasets version
YeAnbang May 29, 2024
bf57b13
remove models that require huggingface auth from ci
YeAnbang May 29, 2024
45195ac
remove local data path
YeAnbang May 29, 2024
e16ccc2
update ci
YeAnbang May 29, 2024
ac1520c
remove baichuan from template test due to transformer version conflict
YeAnbang Jun 3, 2024
790e136
merge
YeAnbang Jun 7, 2024
04386d9
Refactor modeling by adding attention backend
char-1ee Jun 3, 2024
eec77e5
Fix tests and naming
char-1ee Jun 3, 2024
5f398fc
Pass inference model shard configs for module init
char-1ee Jun 7, 2024
ceba662
Clean up
char-1ee Jun 7, 2024
0d7ff10
replace the customized dataloader setup with the build-in one
YeAnbang Jun 7, 2024
77db216
replace the customized dataloader setup with the build-in one
YeAnbang Jun 7, 2024
f5981e8
Remove flash attention backend
char-1ee Jun 7, 2024
2abdede
fix readme
YeAnbang Jun 10, 2024
b303976
Fix test import
char-1ee Jun 10, 2024
77a219a
Merge pull request #5771 from char-1ee/refactor/modeling
char-1ee Jun 10, 2024
84eab13
update sft trainning script
YeAnbang Jun 11, 2024
c0948af
[Inference]refactor baichuan (#5791)
LRY89757 Jun 11, 2024
74f4a29
Merge pull request #5759 from hpcaitech/colossalchat_upgrade
YeAnbang Jun 11, 2024
587bbf4
[test] fix chatglm test kit (#5793)
ver217 Jun 11, 2024
aa125bc
[shardformer] fix modeling of bloom and falcon (#5796)
ver217 Jun 11, 2024
aac941e
[test] fix qwen2 pytest distLarge (#5797)
GuangyaoZhang Jun 12, 2024
b6ea9e7
[moe refactor] update unit test with the refactored ZeRO and remove u…
Hz188 Jun 12, 2024
79d63ec
sync with upstream
Hz188 Jun 12, 2024
8554585
[Inference] Fix flash-attn import and add model test (#5794)
char-1ee Jun 12, 2024
ec99700
move moe checkpoint to checkpoint folder and exchange global axis to …
Hz188 Jun 12, 2024
d9dddf5
[Gemini] Use async stream to prefetch and h2d data moving (#5781)
Hz188 Jun 12, 2024
3bcbba9
[gemini] quick fix on possible async operation (#5803)
botbw Jun 13, 2024
2ddf624
[shardformer] upgrade transformers to 4.39.3 (#5815)
flybird11111 Jun 14, 2024
be92747
Merge branch 'hpcaitech:feature/moe' into feature/moe
Hz188 Jun 14, 2024
76aeec3
Merge branches 'feature/moe' and 'feature/moe' of https://github.com/…
Hz188 Jun 14, 2024
64fc0f7
update moe hybrid parallel plugin with newest version of zero & fix z…
Hz188 Jun 14, 2024
8b277cc
fix zero unit test
Hz188 Jun 14, 2024
ed42193
Add an assertion to prevent users from using it incorrectly
Hz188 Jun 14, 2024
88b78fa
Merge remote-tracking branch 'upstream/feature/moe' into feature/moe
Hz188 Jun 14, 2024
419d25e
Modify function parameter names to resolve compatibility issues
Hz188 Jun 17, 2024
3364ac9
remove useless code: MoECheckpoint
Hz188 Jun 17, 2024
f7298bc
update github workflow config file
Hz188 Jun 17, 2024
e6839fb
fix typo
Hz188 Jun 17, 2024
cc9d0bb
Merge branch 'hpcaitech:feature/moe' into feature/moe
Hz188 Jun 17, 2024
8795bb2
Support 4d parallel + flash attention (#5789)
Edenzzzz Jun 17, 2024
1405cf1
fix .github worfflow conflict with main branch
Hz188 Jun 18, 2024
Commit shown: 0bbac158ed5226948d259a4e29c9a9f1da952d23 ("fix datasets version", YeAnbang, committed Jun 7, 2024)
applications/ColossalChat/requirements.txt (1 addition, 1 deletion)

@@ -1,6 +1,6 @@
 transformers>=4.36.2
 tqdm
-datasets
+datasets==2.14.7
 loralib
 colossalai>=0.3.7
 torch>=1.12.1
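
The change pins `datasets` to 2.14.7 instead of floating on the latest release, presumably so the ColossalChat CI resolves a dependency set that stays compatible with the `transformers>=4.36.2` floor and the test fixtures. Below is a minimal sketch, not part of the PR, for checking that an installed environment matches the pin; it assumes nothing beyond the package names that appear in requirements.txt above.

```python
# Sketch only (not part of this PR): confirm the local environment matches the
# pinned requirement before running the ColossalChat scripts. Uses only the
# package names listed in requirements.txt above.
from importlib.metadata import version

assert version("datasets") == "2.14.7", "expected the pinned datasets==2.14.7"
print("transformers:", version("transformers"))  # requirements.txt requires >=4.36.2
print("datasets:", version("datasets"))
```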