Size of the test set (annotations.json) #3

Open
qishenghu opened this issue Apr 15, 2024 · 1 comment

Comments

@qishenghu

Thanks for the good work.

According to the arXiv paper, there should be approximately 1,000 annotated responses for the FAVABENCH detection task. However, the 'annotations.json' file at this link (https://huggingface.co/datasets/fava-uw/fava-data/tree/main) seems to contain only 460 records. Could you kindly help me understand which file contains the correct annotated responses for FAVABENCH?
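
For anyone who wants to reproduce the record count, here is a minimal sketch. It assumes 'annotations.json' has already been downloaded locally and that its top level is either a JSON array (one element per record) or a JSON object (one key per record); the function name `count_records` and the local path are hypothetical, not part of the FAVA repository.

```python
import json
from pathlib import Path


def count_records(path):
    """Return the number of top-level records in a JSON file.

    Assumption: the file holds a JSON array (one element per record)
    or a JSON object (one key per record). Anything else raises.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, (list, dict)):
        return len(data)
    raise ValueError(f"unexpected top-level JSON type: {type(data).__name__}")


if __name__ == "__main__":
    # Hypothetical local copy of the file from the dataset page.
    local_copy = Path("annotations.json")
    if local_copy.exists():
        print(count_records(local_copy))
```

If the file turns out to be in a different layout (for example, one JSON object per line), the count would need to be taken differently.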

Thanks!


khunkin commented Apr 18, 2024

The 'annotations.json' file at this link (submitted on Jan 15, 2024) should correspond to version 2 of the arXiv submission (submitted on Jan 17, 2024), which states: "Our benchmark consists of about 400 responses of ChatGPT and Llama2-Chat 70B." Version 3 (submitted on Feb 21, 2024) instead mentions: "...annotating approximately 1,000 responses of three widely used LMs." As of Apr 18, 2024, the authors have not updated the data to match the latest version. Hopefully they will update it.

Thanks!
