Actions: deepspeedai/DeepSpeed

nv-lightning-v100

4,719 workflow runs

Update workflows that use cuda 12.1 to use runners with 12.4
nv-lightning-v100 #14326: Pull request #7000 synchronize by loadams
February 12, 2025 01:16 12m 49s loadams/update-runners-124
Update workflows that use cuda 12.1 to use runners with 12.4
nv-lightning-v100 #14325: Pull request #7000 synchronize by loadams
February 12, 2025 00:53 22m 54s loadams/update-runners-124
nv-lightning-v100
nv-lightning-v100 #14324: Scheduled
February 12, 2025 00:20 1h 48m 41s master
AIO on ROCM
nv-lightning-v100 #14323: Pull request #7023 synchronize by tjruwase
February 11, 2025 23:43 7m 33s jomayeri/aio-on-rocm
nv-lightning-v100
nv-lightning-v100 #14322: Merge group checks requested
February 11, 2025 23:39 10m 23s
AIO on ROCM
nv-lightning-v100 #14321: Pull request #7023 opened by jomayeri
February 11, 2025 21:54 4m 10s jomayeri/aio-on-rocm
Training multiple models
nv-lightning-v100 #14320: Pull request #7018 synchronize by tjruwase
February 11, 2025 10:41 4m 19s olruwase/zero_multi_models
Tecorigin sdaa accelerator
nv-lightning-v100 #14319: Pull request #6903 synchronize by tjruwase
February 11, 2025 10:01 4m 22s siqi654321:Tecorigin-SDAA-accelerator
Variable batch size and LR scheduler
nv-lightning-v100 #14318: Pull request #7020 synchronize by bm-synth
February 11, 2025 02:22 4m 20s bm-synth:variable_batch_size_and_lr
Variable batch size and LR scheduler
nv-lightning-v100 #14317: Pull request #7020 synchronize by bm-synth
February 11, 2025 02:20 Action required bm-synth:variable_batch_size_and_lr
nv-lightning-v100
nv-lightning-v100 #14316: Scheduled
February 11, 2025 00:20 1h 0m 59s master
Variable batch size and LR scheduler
nv-lightning-v100 #14315: Pull request #7020 synchronize by bm-synth
February 10, 2025 22:59 Action required bm-synth:variable_batch_size_and_lr
config torch to avoid graph breaks caused by logger
nv-lightning-v100 #14314: Pull request #6999 synchronize by loadams
February 10, 2025 22:51 Action required ShellyNR:disable_logger_for_PT2.6
Fix assert on Lamb optimizers with BF16
nv-lightning-v100 #14313: Pull request #4451 synchronize by loadams
February 10, 2025 22:45 36m 30s loadams/lamb-bf16
fix hostname -I for macOS #6497
nv-lightning-v100 #14312: Pull request #6990 synchronize by loadams
February 10, 2025 22:38 Action required fitzjalen:master
Improve overflow handling in ZeRO
nv-lightning-v100 #14311: Pull request #6976 synchronize by loadams
February 10, 2025 22:32 4m 12s olruwase/ds_5241
Update workflows that use cuda 12.1 to use runners with 12.4
nv-lightning-v100 #14310: Pull request #7000 synchronize by loadams
February 10, 2025 22:30 4h 1m 20s loadams/update-runners-124
Update workflows that use cuda 12.1 to use runners with 12.4
nv-lightning-v100 #14309: Pull request #7000 synchronize by loadams
February 10, 2025 22:14 16m 36s loadams/update-runners-124
nv-lightning-v100
nv-lightning-v100 #14308: Merge group checks requested
February 10, 2025 20:55 10m 25s
Update sharded_moe.py to support top2 gate with Tutel
nv-lightning-v100 #14307: Pull request #6948 synchronize by xenshinu
February 10, 2025 20:13 Action required xenshinu:patch-1
Update sharded_moe.py to support top2 gate with Tutel
nv-lightning-v100 #14306: Pull request #6948 synchronize by xenshinu
February 10, 2025 20:13 Action required xenshinu:patch-1
Update sharded_moe.py to support top2 gate with Tutel
nv-lightning-v100 #14305: Pull request #6948 synchronize by loadams
February 10, 2025 19:44 4m 11s xenshinu:patch-1
nv-lightning-v100
nv-lightning-v100 #14304: Merge group checks requested
February 10, 2025 19:22 4m 14s
fix hostname -I for macOS #6497
nv-lightning-v100 #14303: Pull request #6990 synchronize by loadams
February 10, 2025 17:03 4m 15s fitzjalen:master
Update workflows that use cuda 12.1 to use runners with 12.4
nv-lightning-v100 #14302: Pull request #7000 synchronize by loadams
February 10, 2025 16:21 2h 45m 28s loadams/update-runners-124