question about the arguments within the get_pretrained_model() function #76

Open · hongruhu opened this issue Jul 31, 2024 · 0 comments
Hi,

When looking at some examples of loading the pretrained models, I saw:

from nucleotide_transformer.pretrained import get_pretrained_model

parameters, forward_fn, tokenizer, config = get_pretrained_model(
    model_name="500M_human_ref",
    embeddings_layers_to_save=(20,),
    max_positions=32,
)
parameters, forward_fn, tokenizer, config = get_pretrained_model(
    model_name="500M_1000G",
    # Get embeddings at layers 5 and 20
    embeddings_layers_to_save=(5, 20),
    # Get attention map number 4 at layer 1 and attention map number 14
    # at layer 12
    attention_maps_to_save=((1, 4), (12, 14)),
    max_positions=128,
)
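For reference, here is how I am consuming the outputs downstream, adapted from the README's usage example. My assumption is that embeddings_layers_to_save=(20,) makes the layer-20 embeddings available under the "embeddings_20" key of the output dictionary:

import haiku as hk
import jax
import jax.numpy as jnp
from nucleotide_transformer.pretrained import get_pretrained_model

parameters, forward_fn, tokenizer, config = get_pretrained_model(
    model_name="500M_human_ref",
    embeddings_layers_to_save=(20,),
    max_positions=32,
)
# Haiku-transform the forward function before applying it
forward_fn = hk.transform(forward_fn)

sequences = ["ATTCCGATTCCG", "ATTTCTCTCTCTCTCTGAGATCGATCGATCGAT"]
# batch_tokenize returns (sequence, token_ids) pairs; keep the ids
tokens_ids = [b[1] for b in tokenizer.batch_tokenize(sequences)]
tokens = jnp.asarray(tokens_ids, dtype=jnp.int32)

random_key = jax.random.PRNGKey(0)
outs = forward_fn.apply(parameters, random_key, tokens)
# Embeddings at layer 20, as requested via embeddings_layers_to_save
embeddings = outs["embeddings_20"]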

Here it seems that different pretrained models take different configurations? I was wondering if you could add a more detailed explanation of how to choose embeddings_layers_to_save and max_positions. I also saw some issues mentioning that we might need to set max_positions to 1000? I'm a bit confused, and it would be great if the authors could provide the suggested configuration for each pretrained model somewhere in the tutorial or README files.
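For context, here is the rough rule I have been using to pick max_positions for my own sequences. It assumes the tokenizer splits each sequence into 6-mers, tokenizes any leftover nucleotides one at a time, and prepends a single <CLS> token; please correct me if I am misreading the tokenization:

def suggested_max_positions(sequences, kmer=6, num_special_tokens=1):
    # Hypothetical helper (not part of the library): token count for the
    # longest sequence, assuming k-mer tokens plus per-nucleotide leftovers
    # and one <CLS> token.
    longest = max(len(s) for s in sequences)
    return longest // kmer + longest % kmer + num_special_tokens

# e.g. a 5994 bp sequence would need 999 6-mer tokens + 1 <CLS> = 1000,
# which may be where the max_positions=1000 suggestion comes from
print(suggested_max_positions(["A" * 5994]))  # 1000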
