Incompatibility of X-LoRA and MistralForSequenceClassification #2281

Open
2 of 4 tasks
cyx96 opened this issue Dec 13, 2024 · 2 comments

cyx96 commented Dec 13, 2024

System Info

peft version: 0.13.2
accelerate version: 1.1.1
transformers version: 4.46.3

Python version: 3.10.15

Platform: Linux-5.10.0-33-cloud-amd64-x86_64-with-glibc2.31

Who can help?

@BenjaminBossan @EricLBuehler

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

The adapters were fine-tuned from Mistral 7B v0.1 on the XNLI dataset.

I used the following script to load an X-LoRA version of Mistral 7B with 3 pre-trained adapters:

import torch
from transformers import AutoModelForSequenceClassification, AutoConfig
from peft import XLoraConfig, get_peft_model

# Load model configuration
model_config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")

# XLora Configuration
lora_config = XLoraConfig(
    task_type="SEQ_CLS",
    hidden_size=model_config.hidden_size,
    xlora_depth=2,
    adapters={
        "0": "./mistral_xnli_ckpt/de",
        "1": "./mistral_xnli_ckpt/en",
        "2": "./mistral_xnli_ckpt/fr",
    }
)

# Load and configure model
model = AutoModelForSequenceClassification.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    num_labels=3,  # XNLI has 3 labels: entailment, neutral, contradiction
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    use_cache=False,
)

# Explicitly move the model to GPU
device = torch.device("cuda:0")
model = model.to(device)

# Apply XLora
model = get_peft_model(model, lora_config).to(device)

Executing the script above results in the following error:

Some weights of MistralForSequenceClassification were not initialized from the model checkpoint at mistralai/Mistral-7B-v0.1 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/home/chenyuxu/XLMoE/mistral_xlora_ft.py", line 51, in <module>
    model = get_peft_model(model, lora_config).to(device)
  File "/opt/conda/envs/handbook/lib/python3.10/site-packages/peft/mapping.py", line 193, in get_peft_model
    return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](
  File "/opt/conda/envs/handbook/lib/python3.10/site-packages/peft/peft_model.py", line 1378, in __init__
    super().__init__(model, peft_config, adapter_name, **kwargs)
  File "/opt/conda/envs/handbook/lib/python3.10/site-packages/peft/peft_model.py", line 171, in __init__
    self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
  File "/opt/conda/envs/handbook/lib/python3.10/site-packages/peft/tuners/xlora/model.py", line 279, in __init__
    _load_adapter_into_lora_model(
  File "/opt/conda/envs/handbook/lib/python3.10/site-packages/peft/tuners/xlora/model.py", line 148, in _load_adapter_into_lora_model
    raise ValueError(
ValueError: Got unexpected keys! Please raise an issue and tag @EricLBuehler.

unexpected_keys=['model.model.score.modules_to_save.0.weight']

Expected behavior

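From the error message above, it seems that MistralForSequenceClassification creates and initializes an extra weight (score.weight, the classification head) that is not part of the "mistralai/Mistral-7B-v0.1" checkpoint. The adapter checkpoints appear to carry this head as well (the unexpected key is model.model.score.modules_to_save.0.weight), and X-LoRA does not expect it when loading the adapters. Would registering the newly added weights with X-LoRA solve the issue? Any advice or feedback on this is greatly appreciated, thanks!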
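
To make the diagnosis concrete, here is a minimal sketch that sorts the keys reported in the ValueError into classification-head keys and everything else. The helper name split_unexpected_keys is hypothetical and not part of PEFT, and the key layout (<prefix>.<module>.modules_to_save.<adapter>.<param>) is an assumption inferred only from the single key in the traceback above, so treat it as a debugging aid rather than a description of how X-LoRA handles these keys.

```python
import re

# Assumed key layout, inferred from the one key shown in the traceback:
#   <prefix>.<module>.modules_to_save.<adapter_name>.<param>
_MODULES_TO_SAVE = re.compile(
    r"^(?P<prefix>.+)\.(?P<module>[^.]+)"
    r"\.modules_to_save\.(?P<adapter>[^.]+)\.(?P<param>.+)$"
)


def split_unexpected_keys(unexpected_keys):
    """Hypothetical helper: separate keys wrapped in `modules_to_save`
    (e.g. a classification head) from all other unexpected keys."""
    head_keys, other_keys = [], []
    for key in unexpected_keys:
        match = _MODULES_TO_SAVE.match(key)
        if match:
            # Keep (module name, adapter name, parameter name).
            head_keys.append((match["module"], match["adapter"], match["param"]))
        else:
            other_keys.append(key)
    return head_keys, other_keys


# The key list copied from the error message in this issue.
head_keys, other_keys = split_unexpected_keys(
    ["model.model.score.modules_to_save.0.weight"]
)
print(head_keys)   # [('score', '0', 'weight')]
print(other_keys)  # []
```

For the key in this report, everything lands in the first list: the only unexpected entry is the score head stored for adapter "0", which matches the "newly initialized: ['score.weight']" warning printed before the traceback.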

@BenjaminBossan

@EricLBuehler

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
