Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kbs: update report-nbumbers.kb #66

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

fschwenn
Copy link

  • Removes duplicate entries.

  • Removes dashes from seek-terms.

  • Adds new report numbers.

Signed-off-by: fschwenn [email protected]

Copy link
Contributor

@tsgit tsgit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple patterns containing hyphens on the left side. I flagged only a few. Please revise. There is a duplicate pattern with ambiguous replacement.
I think we should have test cases along with this major update of patterns.

CDF PUB PLUG UPGR PUBLIC ---CDF-PUB-PLUG-UPGR-PUBLIC
CDF PUB PUBLIC ---CDF-PUB-PUBLIC
CDF PUB SEC VTX PUBLIC ---CDF-PUB-SEC-VTX-PUBLIC
CDF PUB SEC VTX PUBLIC ---CDF-PUB-SEC_VTX-PUBLIC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the same seek term now has 2 different replacement strings. I don't think that is well defined. probably the later one overwrites the earlier one

@@ -362,18 +372,63 @@ SLAC PUB ---SLAC-PUB
SLAC R ---SLAC-R
SLAC TN ---SLAC-TN
SLAC WP ---SLAC-WP
SLAC-CN ---SLAC-CN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hyphens should not be in the match pattern

SLAC-PEP-II-EE-NOTE ---SLAC-PEP-II-EE-NOTE
SLAC-BABAR-THESIS ---SLAC-BABAR-THESIS
SLAC-BABAR-TALK ---SLAC-BABAR-TALK
SLAC-SSRL ---SLAC-SSRL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lots of unwanted hyphens


BNL ---BNL
BNL AADD ---BNL-AADD
BNL AD-RHIC ---BNL-AD-RHIC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hyphen

 * Removes duplicate entries.

 * Removes dashes from seek-terms.

 * Adds new report numbers.

Signed-off-by: fschwenn [email protected]

*****Tokyo Inst. Tech.*****
<s999?>
<s999?-NP>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about the hyphen here, I don't think it'll match

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think either, as it should have been stripped

@tsgit
Copy link
Contributor

tsgit commented May 24, 2019

I think it would be good to have a sample original reference line for each pattern, or at least each pattern group as a list of test cases. This would confirm the pattern does what is intended to do and no other pattern matches. At least would show when hyphens are incorrect.
e.g. similar to
https://github.com/inspirehep/refextract/blob/master/tests/test_tag.py#L132-L135
with tag_report (which you'd have to define) instead of tag_arxiv
or patterned after
https://github.com/inspirehep/refextract/blob/master/tests/test_engine.py#L102-L118 and elsewhere in that test file.
instead of multiple report_numbers I would focus on one at a time, or one pattern type at a time

@tsgit
Copy link
Contributor

tsgit commented Jun 11, 2019

@fschwenn take a look at https://github.com/inspirehep/refextract/pull/67/files
there you see test-cases Melissa added for the Fermilab report number parsing

also, since that PR was merged, you need to rebase your PR

@fschwenn
Copy link
Author

fschwenn commented Jun 13, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants