Lenient BedParser #3

Merged
merged 8 commits into from
Jun 11, 2019

Conversation

dievsky
Contributor

@dievsky dievsky commented May 30, 2019

Related to JetBrains-Research/jbr#62.

BedParser now has a stringency property. In lenient mode (the default), the parser skips any line it can't parse and logs it at debug level. In strict mode, the parser throws an exception on any such line. The new feature is covered by tests.
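
The difference between the two modes can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the project's code: `Stringency`, `parse_line`, `parse_bed_lines`, and the "at least three tab-separated fields" rule are all assumptions. Only the behavior (skip and log at debug level versus throw) is taken from the description above.

```python
import logging
from enum import Enum

logger = logging.getLogger("bed_sketch")


class Stringency(Enum):
    LENIENT = "lenient"
    STRICT = "strict"


class BedFormatException(Exception):
    """Raised when a line cannot be parsed as a BED entry."""


def parse_line(line):
    # Assumption: a valid entry has at least chrom, start, end, tab-separated.
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise BedFormatException(f"Expected at least 3 fields: {line!r}")
    try:
        start, end = int(fields[1]), int(fields[2])
    except ValueError as e:
        raise BedFormatException(f"Non-integer coordinates: {line!r}") from e
    return fields[0], start, end


def parse_bed_lines(lines, stringency=Stringency.LENIENT):
    """Return parsed entries; on a bad line, skip it (lenient) or re-raise (strict)."""
    entries = []
    for line in lines:
        try:
            entries.append(parse_line(line))
        except BedFormatException as e:
            if stringency is Stringency.STRICT:
                raise
            # Lenient mode: the error is only visible if debug logging is enabled.
            logger.debug("Skipping unparseable line: %s", e)
    return entries
```

In this sketch, the lenient default returns whatever could be parsed, while passing `Stringency.STRICT` makes the first malformed line abort the whole parse.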

Note: It's actually arguable whether we need strict mode at all. Thoughts are welcome.

dievsky added 4 commits May 30, 2019 11:14
Lenient parser logs the parsing errors and keeps going, strict parser
throws an exception.
Also added comments to parser's methods.
As per pull request review suggestion
@iromeo
Contributor

iromeo commented May 31, 2019

At the moment the default mode is LENIENT and all errors are hidden (debug logging is usually off), so a developer cannot tell what fraction of the file was actually parsed: it could be 2 lines out of 100500 or 100% of the lines. Would it be better to add an error counter to BedParser, or to change the default mode to strict?

@dievsky
Contributor Author

dievsky commented May 31, 2019

Each line can:

  1. be successfully parsed as a BedEntry;
  2. fail to parse due to a BedFormatException;
  3. be skipped since it matches NON_DATA_LINE_PATTERN.

We can offer those three numbers as properties of BedParser, so that the user can query them if necessary.
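
A rough sketch of how the three outcomes could be tallied and exposed is shown here. It is hypothetical Python, not the project's code: `CountingBedParser` and the counter names are invented for illustration, and the regular expression is only a guess at what `NON_DATA_LINE_PATTERN` might match, since the real pattern does not appear in this thread.

```python
import re

# Assumption: the real NON_DATA_LINE_PATTERN is not shown in this thread.
# This guess treats blank lines, '#' comments, and track/browser headers as non-data.
NON_DATA_LINE_PATTERN = re.compile(r"^\s*$|^#|^(track|browser)\b")


class BedFormatException(Exception):
    """Raised when a data line cannot be parsed as a BED entry."""


class CountingBedParser:
    """Lenient parser that counts each of the three possible line outcomes."""

    def __init__(self):
        self.parsed_count = 0   # outcome 1: parsed as an entry
        self.failed_count = 0   # outcome 2: BedFormatException
        self.skipped_count = 0  # outcome 3: matched NON_DATA_LINE_PATTERN

    def parse(self, lines):
        entries = []
        for line in lines:
            if NON_DATA_LINE_PATTERN.search(line):
                self.skipped_count += 1
                continue
            try:
                entries.append(self._parse_entry(line))
                self.parsed_count += 1
            except BedFormatException:
                self.failed_count += 1
        return entries

    @staticmethod
    def _parse_entry(line):
        # Assumption: chrom, start, end are the first three tab-separated fields.
        fields = line.rstrip("\n").split("\t")
        try:
            return fields[0], int(fields[1]), int(fields[2])
        except (IndexError, ValueError) as e:
            raise BedFormatException(line) from e
```

After a call to `parse`, a caller can compare `failed_count` with `parsed_count` to judge how much of the file was actually read, even when debug logging is off.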

@iromeo
Contributor

iromeo commented May 31, 2019

> We can offer those three numbers as properties of BedParser, so that the user can query them if necessary.

I think one is enough: the number of lines that fail to parse due to a BedFormatException.

Following the pull request review
@dievsky
Contributor Author

dievsky commented Jun 4, 2019

Added the respective property and some tests for it.

@dievsky dievsky merged commit 3552b83 into master Jun 11, 2019
@dievsky dievsky deleted the jbr-issue-62 branch June 11, 2019 12:26

3 participants