
Cannot reproduce few-shot results (Table 2) && Out of memory while creating train_comm_pair.jsonl #3

Open
belizgunel opened this issue Oct 26, 2022 · 3 comments


belizgunel commented Oct 26, 2022

Thank you so much for open-sourcing your implementation and for your great work! I have two questions, and your help would be much appreciated!

  1. I followed the decoding (using the pre-trained models) and evaluation steps as described in your repo, but I was not able to reproduce the few-shot results in Table 2. There is up to a ~5-point difference in BERTScore for both the contrastive and common summaries, and a significant ROUGE difference of up to ~4 points. Are there any other hyperparameters I should fix? For comparison, the self-supervised results are much closer to what is reported in your paper.
  2. My process keeps getting killed while trying to create train_comm_pair.jsonl at cocosum/prep.py, line 154 (commit 2a94132):

    sim = np.argsort((tgt_vec @ tgt_vec.T).toarray(), axis=1)

    My machine has 192 GB of RAM. I was wondering if you could share the file, or help me figure out how to optimize that line?
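For what it's worth, the line above densifies the full N x N similarity matrix and then argsorts every row, which is what exhausts memory. A chunked sketch that only ever densifies a small slice at a time, assuming only the top few most-similar indices are needed downstream (that assumption, and the names `topk_similar`, `k`, and `chunk_size`, are mine, not the repo's):

```python
import numpy as np
import scipy.sparse as sp

def topk_similar(tgt_vec, k=10, chunk_size=1000):
    """Indices of the k most similar rows for each row (most similar first),
    computed block-by-block so the dense N x N matrix is never materialized."""
    n = tgt_vec.shape[0]
    top = np.empty((n, k), dtype=np.int64)
    for start in range(0, n, chunk_size):
        end = min(start + chunk_size, n)
        # Densify only a (chunk_size x N) slice of the similarity matrix.
        block = (tgt_vec[start:end] @ tgt_vec.T).toarray()
        # argpartition finds the top k per row in O(N); then sort just those k.
        idx = np.argpartition(block, -k, axis=1)[:, -k:]
        order = np.argsort(np.take_along_axis(block, idx, axis=1), axis=1)[:, ::-1]
        top[start:end] = np.take_along_axis(idx, order, axis=1)
    return top
```

Peak memory is then roughly chunk_size x N floats for one block, plus the N x k output, instead of two full N x N arrays.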

Thank you so much.

@belizgunel belizgunel changed the title Out of memory while creating train_comm_pair.jsonl Cannot reproduce few-shot results (Table 2) && Out of memory while creating train_comm_pair.jsonl Oct 26, 2022
@lovodkin93

Hey,
I also keep getting a similar error:

numpy.core._exceptions.MemoryError: Unable to allocate 84.6 GiB for an array with shape (11355303385,) and data type int64
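As a sanity check on the number in that traceback, an int64 array with 11,355,303,385 elements at 8 bytes each works out to exactly the 84.6 GiB NumPy reports:

```python
# The failing allocation: 11,355,303,385 int64 elements, 8 bytes each.
n_elements = 11_355_303_385
gib = n_elements * 8 / 2**30
print(f"{gib:.1f} GiB")  # → 84.6 GiB
```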

Would appreciate any help.
Thanks!

isomap (Collaborator) commented Jul 25, 2023

Hi @lovodkin93, I modified the prep script to run with less memory (see https://github.com/megagonlabs/cocosum/blob/main/prep.py#L153-L162).
I hope you can now run the prep script without memory errors.

Thanks!

isomap (Collaborator) commented Jul 25, 2023

Regarding the few-shot result reproduction, I've fixed the README to show the correct hyperparameter settings.
