
Cannot reproduce few-shot results (Table 2) && Out of memory while creating train_comm_pair.jsonl #3

Open
belizgunel opened this issue Oct 26, 2022 · 3 comments


belizgunel commented Oct 26, 2022

Thank you so much for open-sourcing your implementation and for your great work! I have two questions, and your help would be much appreciated!

  1. I followed the decoding (using the pre-trained models) and evaluation steps as described in your repo, but I was not able to reproduce the few-shot results in Table 2. There is up to a ~5-point difference in BERTScore for both the contrastive and common summaries, and a significant ROUGE difference of up to ~4 points. Are there any other hyperparameters I should fix? For comparison, the self-supervised results are much closer to what is reported in your paper.
  2. My process keeps getting killed while trying to create train_comm_pair.jsonl at cocosum/prep.py, line 154 (commit 2a94132):

    sim = np.argsort((tgt_vec @ tgt_vec.T).toarray(), axis=1)

    My machine has 192 GB of RAM. I was wondering if you could share the file, or help me figure out how to optimize that line?
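For what it's worth, the line above densifies the full N x N similarity matrix and then argsorts every row, which is what exhausts memory. A chunked sketch that only ever densifies a small slice at a time, assuming only the top few most-similar indices are needed downstream (that assumption, and the names `topk_similar`, `k`, and `chunk_size`, are mine, not the repo's):

```python
import numpy as np
import scipy.sparse as sp

def topk_similar(tgt_vec, k=10, chunk_size=1000):
    """Indices of the k most similar rows for each row (most similar first),
    computed block-by-block so the dense N x N matrix is never materialized."""
    n = tgt_vec.shape[0]
    top = np.empty((n, k), dtype=np.int64)
    for start in range(0, n, chunk_size):
        end = min(start + chunk_size, n)
        # Densify only a (chunk_size x N) slice of the similarity matrix.
        block = (tgt_vec[start:end] @ tgt_vec.T).toarray()
        # argpartition finds the top k per row in O(N); then sort just those k.
        idx = np.argpartition(block, -k, axis=1)[:, -k:]
        order = np.argsort(np.take_along_axis(block, idx, axis=1), axis=1)[:, ::-1]
        top[start:end] = np.take_along_axis(idx, order, axis=1)
    return top
```

Peak memory is then roughly chunk_size x N floats for one block, plus the N x k output, instead of two full N x N arrays.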

Thank you so much.

@belizgunel belizgunel changed the title Out of memory while creating train_comm_pair.jsonl Cannot reproduce few-shot results (Table 2) && Out of memory while creating train_comm_pair.jsonl Oct 26, 2022
@lovodkin93

Hey,
I also keep getting a similar error:

numpy.core._exceptions.MemoryError: Unable to allocate 84.6 GiB for an array with shape (11355303385,) and data type int64
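As a sanity check on the number in that traceback, an int64 array with 11,355,303,385 elements at 8 bytes each works out to exactly the 84.6 GiB NumPy reports:

```python
# The failing allocation: 11,355,303,385 int64 elements, 8 bytes each.
n_elements = 11_355_303_385
gib = n_elements * 8 / 2**30
print(f"{gib:.1f} GiB")  # → 84.6 GiB
```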

Would appreciate any help.
Thanks!

isomap (Collaborator) commented Jul 25, 2023

Hi @lovodkin93, I modified the prep script to run with less memory (see https://github.com/megagonlabs/cocosum/blob/main/prep.py#L153-L162).
I hope you can now run the prep script without memory errors.

Thanks!

isomap (Collaborator) commented Jul 25, 2023

Regarding the few-shot result reproduction, I've fixed the README to show the correct hyperparameter settings.
