The current multi-GPU version only supports the single-node scenario. We could add multi-node multi-GPU support to the new CUDA version, to handle big data that cannot fit on one node.
We can also improve efficiency in that scenario. In the current multi-GPU version, each device loads a complete histogram and performs the follow-up work based on it. For datasets with very many features, the full histogram may be too large to fit on one device; and even when it does fit, each device repeats the same follow-up work.
We could use a mixed data- and feature-parallel strategy in this scenario, like in the distributed data-parallel scenario, to relieve the memory and compute pressure on each device. A rough sketch of the feature-parallel half of the idea is below.
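To make the idea concrete, here is a minimal CUDA sketch of feature-parallel histogram construction: each device owns a contiguous slice of the features and builds only that slice of the gradient histogram, so the complete histogram never has to fit on a single device. This is a hypothetical illustration under assumed names and layouts (`SliceHistogramKernel`, one byte per bin index, a static contiguous feature partition), not LightGBM's actual kernels or data structures.

```cpp
// Hypothetical sketch, NOT LightGBM's real implementation:
// feature-parallel histogram build, one device per feature slice.
#include <cuda_runtime.h>

constexpr int kNumBins = 256;

// One thread per row; each thread adds its row's gradient into the
// bins of this device's local feature slice only.
__global__ void SliceHistogramKernel(const unsigned char* bins,  // [num_rows x num_local_features]
                                     const float* gradients,     // [num_rows]
                                     int num_rows,
                                     int num_local_features,
                                     float* hist) {              // [num_local_features x kNumBins]
  int row = blockIdx.x * blockDim.x + threadIdx.x;
  if (row >= num_rows) return;
  float g = gradients[row];
  for (int f = 0; f < num_local_features; ++f) {
    int bin = bins[row * num_local_features + f];
    atomicAdd(&hist[f * kNumBins + bin], g);
  }
}

int main() {
  int num_devices = 0;
  cudaGetDeviceCount(&num_devices);
  const int num_rows = 1 << 16;
  const int num_features = 1024;
  // Static feature partition: device d gets a contiguous slice, so its
  // local histogram is only (num_features / num_devices) * kNumBins wide.
  for (int d = 0; d < num_devices; ++d) {
    cudaSetDevice(d);
    int f_begin = num_features * d / num_devices;
    int f_end = num_features * (d + 1) / num_devices;
    int num_local = f_end - f_begin;
    unsigned char* d_bins;
    float *d_grad, *d_hist;
    cudaMalloc(&d_bins, (size_t)num_rows * num_local);
    cudaMalloc(&d_grad, num_rows * sizeof(float));
    cudaMalloc(&d_hist, (size_t)num_local * kNumBins * sizeof(float));
    cudaMemset(d_hist, 0, (size_t)num_local * kNumBins * sizeof(float));
    // ... copy this device's binned feature slice and the gradients here ...
    SliceHistogramKernel<<<(num_rows + 255) / 256, 256>>>(
        d_bins, d_grad, num_rows, num_local, d_hist);
    cudaFree(d_bins); cudaFree(d_grad); cudaFree(d_hist);
  }
  cudaDeviceSynchronize();
  return 0;
}
```

In the full mixed strategy, rows would additionally be sharded across nodes (the data-parallel half), and each device would then reduce only its own feature slice of the histogram, e.g. with an NCCL reduce-scatter across nodes, so no device ever materializes or post-processes the complete histogram.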
Xuweijia-buaa changed the title from "[CUDA] support multi-node multi-GPU scenario for new CUDA version" to "[CUDA] add multi-node multi-GPU support for new CUDA version and imporve effiency" on Jul 20, 2023.
Xuweijia-buaa changed the title from "[CUDA] add multi-node multi-GPU support for new CUDA version and imporve effiency" to "[CUDA] add multi-node multi-GPU support for new CUDA version and imporve efficiency" on Jul 20, 2023.
Xuweijia-buaa changed the title from "[CUDA] add multi-node multi-GPU support for new CUDA version and imporve efficiency" to "[CUDA] add multi-node multi-GPU support for new CUDA version and improve efficiency" on Jul 20, 2023.
Added this to #2302 with the other feature requests. Thanks for documenting it!
Per this repo's policy described there, I'm closing this discussion for now. If anyone wishes to implement this, please leave a comment saying that you're working on it and we can re-open this discussion.