question about the arguments within the get_pretrained_model() function #76

Open · hongruhu opened this issue Jul 31, 2024 · 0 comments
Hi,

When looking at some examples of loading the pretrained models, I saw:

from nucleotide_transformer.pretrained import get_pretrained_model

parameters, forward_fn, tokenizer, config = get_pretrained_model(
    model_name="500M_human_ref",
    embeddings_layers_to_save=(20,),
    max_positions=32,
)
parameters, forward_fn, tokenizer, config = get_pretrained_model(
    model_name="500M_1000G",
    # Get embeddings at layers 5 and 20
    embeddings_layers_to_save=(5, 20),
    # Get attention map number 4 at layer 1 and attention map number 14
    # at layer 12
    attention_maps_to_save=((1, 4), (12, 14)),
    max_positions=128,
)
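For reference, here is how I am consuming the outputs downstream, adapted from the README's usage example. My assumption is that embeddings_layers_to_save=(20,) makes the layer-20 embeddings available under the "embeddings_20" key of the output dictionary:

import haiku as hk
import jax
import jax.numpy as jnp
from nucleotide_transformer.pretrained import get_pretrained_model

parameters, forward_fn, tokenizer, config = get_pretrained_model(
    model_name="500M_human_ref",
    embeddings_layers_to_save=(20,),
    max_positions=32,
)
# Haiku-transform the forward function before applying it
forward_fn = hk.transform(forward_fn)

sequences = ["ATTCCGATTCCG", "ATTTCTCTCTCTCTCTGAGATCGATCGATCGAT"]
# batch_tokenize returns (sequence, token_ids) pairs; keep the ids
tokens_ids = [b[1] for b in tokenizer.batch_tokenize(sequences)]
tokens = jnp.asarray(tokens_ids, dtype=jnp.int32)

random_key = jax.random.PRNGKey(0)
outs = forward_fn.apply(parameters, random_key, tokens)
# Embeddings at layer 20, as requested via embeddings_layers_to_save
embeddings = outs["embeddings_20"]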

Here it seems that different pretrained models take different configurations? I was wondering if you could add a more detailed explanation of how to choose embeddings_layers_to_save and max_positions. I also saw some issues mentioning that we might need to set max_positions to 1000? I'm a bit confused, and it would be great if the authors could provide the suggested configuration for each pretrained model somewhere in the tutorial or README files.
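For context, here is the rough rule I have been using to pick max_positions for my own sequences. It assumes the tokenizer splits each sequence into 6-mers, tokenizes any leftover nucleotides one at a time, and prepends a single <CLS> token; please correct me if I am misreading the tokenization:

def suggested_max_positions(sequences, kmer=6, num_special_tokens=1):
    # Hypothetical helper (not part of the library): token count for the
    # longest sequence, assuming k-mer tokens plus per-nucleotide leftovers
    # and one <CLS> token.
    longest = max(len(s) for s in sequences)
    return longest // kmer + longest % kmer + num_special_tokens

# e.g. a 5994 bp sequence would need 999 6-mer tokens + 1 <CLS> = 1000,
# which may be where the max_positions=1000 suggestion comes from
print(suggested_max_positions(["A" * 5994]))  # 1000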
