-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
keep track of the longest failed match to improve error messages #269
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing this much requested improvement, Sebastiaan. And welcome to the Pegged club!
pegged/peg.d
Outdated
@@ -1333,6 +1348,29 @@ template and(rules...) if (rules.length > 0) | |||
} | |||
} | |||
|
|||
// This function rewrites the ParseTree to move the failedChild into its parents children | |||
// this is done whenever an 'and' rule fails and there are such a child further down |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"there are such a child" = "there is such a child"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a nested function inside and(), to clarify that it is only called from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially misunderstood what this function does. The way this is documented, it seemed like a descendant was being pulled up the ancestry tree to be inserted higher up. That would be very strange, of course.
pegged/peg.d
Outdated
auto firstLongestFailedMatch = result.children.countUntil!(c => c.failEnd > temp.end); | ||
if (firstLongestFailedMatch == -1) { | ||
result.children ~= temp;// add the failed node, to indicate which failed | ||
// result.end = temp.end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you don't mean to commit this line that is commented out.
pegged/peg.d
Outdated
} else { | ||
// don't add the failed node because a previous one already failed further back | ||
result.children = result.children[0 .. firstLongestFailedMatch+1]; // discard any intermediate correct nodes | ||
failedChildFixup(result.children[firstLongestFailedMatch]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe explain what needs fixing up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope this is clearer now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
@veelo Thanks for taking the time to review. I have addressed your points. I will add some unittests (hopefully later today). |
pegged/peg.d
Outdated
} else { | ||
// don't add the failed node because a previous one already failed further back | ||
result.children = result.children[0 .. firstLongestFailedMatch+1]; // discard any intermediate correct nodes | ||
failedChildFixup(result.children[firstLongestFailedMatch]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
I just found out that the fixup interferes with the memoization. No solution as of yet. |
Right, that can be tricky. Good you discovered that. Philippe just made me a repository collaborator, so I’ll be able to merge once the PR is ready. |
Great. I think I have found the culprit but will first test on our codebase. If this gets merged I suggest to make a beta tag and after some testing at least a minor version bump. The ParseTree will be different under invalid input, and people's current error handling might not expect that. |
- note how much of a literal is parsed when failed - only do fixup for children which match failEnd - recalculate failEnd in fixup - have the `or` template return a list of failed children instead of last one - propagate failEnd better in `oneOrMore` and `option` - keep failedChild in decimateTree
- wanted to iterate byGrapheme, but that won't work in ctfe - decimateTree should keep nodes with no matches when the root parsetree has failed (there may be failedChild elements in there useful for error reporting)
Are you ready for me to tag a beta release with this? |
Yes, please do, last couple of weeks everything was stable here. |
Here I am introducing the
failEnd
andfailedChild
members of the ParseTree.Currently pegged discards the longest failed match in the option?, the zeroOrMore* or the oneOrMore+ operators.
Let me illustrate:
Let us run this grammer on the following input: "abc". It will fail of course and the error message will be
expected "ghi" but found end of line
. This is silly because A did actually correctly match (except it is discarded) and the error ought to beexpected "def" but found end of line.
This also occurs with the (A B)? or the (A B)+ (after the first match).
Essentially the option, zeroOrMore and the oneOrMore rules eat errors, and that can be problematic if during those rules it actually parsed the longest failing match.
There are also more complex causes where some nested rule is correctly parsed, but it could contain a longer failed match, so it is not just about rules that are not successful. E.g.
With input "abc de" it will complain about
expected "ghi" but got "de"
. Of course rule B? is matched longer, and it should complain about that.This PR introduces additional logic to keep track of the longest failed match.
It is not in a perfect stage (it still needs tests), but I wanted to submit it already to get some feedback.