Request to integrate Structure Sparsity-based PEFT (S2FT) #2329

Open
Hanyuezhuohua opened this issue Jan 14, 2025 · 1 comment

Comments

@Hanyuezhuohua

Feature request

This request proposes to integrate S2FT, a purely structure-sparsity-based PEFT method that concurrently achieves state-of-the-art fine-tuning performance, training efficiency, and inference scalability. More information about our NeurIPS paper can be found here: https://infini-ai-lab.github.io/S2FT-Page/, of which I'm the first author. Our implementation is available here: https://github.com/Infini-AI-Lab/S2FT.

Motivation

As far as I know, S2FT is the first method to offer efficient and flexible sparsity-based PEFT for LLMs (previous work only adds sparsity on top of LoRA or uses layer-wise freezing). We'd like to highlight several important features of S2FT:

  • Model Versatility: Our structured sparsity is designed around coupled structures, which commonly exist in LLMs, VLMs, CNNs, and GNNs. Therefore, our method should work across many different architectures.

  • Generalization Ability: When evaluated on more recent models such as LLaMA-3-8B, our method outperforms both LoRA and full fine-tuning. Because we only modify a small fraction of the original parameters, most of the abilities acquired during pre-training are preserved.

  • Training Efficiency: Rather than focusing only on parameter efficiency, S2FT provides practical acceleration for model training. In our experiments, S2FT improves on LoRA by about 10% in both training memory and training time, which matters in resource-limited settings.
  • Scalable Serving: Finally, S2FT also serves well compared with LoRA when we consider adapter fusion, switching, and parallelism. In all of these settings, S2FT outperforms LoRA in both efficiency and performance.
  • Controllability: The model parameters to be updated in S2FT can be selected with user-specified functions, something LoRA cannot offer (a minimal sketch of this idea follows below).

Based on this, although S2FT has just been released, we think it is a new kind of PEFT method with strong potential, and integrating it should also benefit future sparsity-based PEFT methods.
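To give a rough sense of the idea in plain PyTorch, here is a minimal sketch (not our actual S2FT implementation; the helper names `apply_structured_sparsity` and `select_rows` are only illustrative): all parameters are frozen, and a user-specified function picks which rows of the chosen linear layers remain trainable, enforced here via a gradient mask.

```python
import torch
import torch.nn as nn

def apply_structured_sparsity(model, target_module_names, select_rows):
    """Freeze all parameters, then let gradients flow only to the rows of the
    targeted linear layers chosen by the user-specified `select_rows` function."""
    # Freeze everything first.
    for p in model.parameters():
        p.requires_grad = False

    for name, module in model.named_modules():
        if name in target_module_names and isinstance(module, nn.Linear):
            # User-specific choice of which output rows to fine-tune.
            rows = select_rows(name, module.weight.shape[0])
            mask = torch.zeros_like(module.weight)
            mask[rows, :] = 1.0
            module.weight.requires_grad = True
            # Zero the gradient for all rows except the selected ones.
            module.weight.register_hook(lambda grad, m=mask: grad * m)

# Toy usage: fine-tune a small random subset of rows in two linear layers.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
apply_structured_sparsity(
    model,
    target_module_names={"0", "2"},  # names as reported by model.named_modules()
    select_rows=lambda name, n_rows: torch.randperm(n_rows)[: max(1, n_rows // 8)],
)

# Only the selected rows receive non-zero gradients during training.
loss = model(torch.randn(4, 16)).sum()
loss.backward()
```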

Your contribution

I will try to write most of the code for this new PEFT method based on the current PEFT codebase.

@BenjaminBossan
Member

Thank you for presenting this novel PEFT technique. I skimmed the paper and code, and this could indeed be a nice addition to PEFT. Feel free to open a PR with your contribution. A few tips:

  • You can start with a draft PR with only the core implementation to get quick feedback. We can then iterate on adding tests, docs, examples.
  • Check out previous PRs that add new PEFT methods (e.g. "support HRA" #1864) for guidance. However, note that we made some changes in "Refactor: PEFT method registration function" #2282 to simplify this process.
  • Ideally, we can manage to replicate some paper findings with the PEFT implementation to show that it works as expected.
