keep track of the longest failed match to improve error messages #269

skoppe · 2020-01-19T18:58:40Z

Here I am introducing the failEnd and failedChild members of the ParseTree.

Currently pegged discards the longest failed match inside the option (`?`), zeroOrMore (`*`) and oneOrMore (`+`) operators.

Let me illustrate:

Thing <- (A B)* C
A <- "abc"
B <- "def"
C <- "ghi"

Let us run this grammar on the input "abc". It fails, of course, and the error message is: expected "ghi" but found end of line. That is misleading, because A did match correctly (its result is simply discarded when B fails), and the error ought to be: expected "def" but found end of line.

The same happens with (A B)? and with (A B)+ (after the first match).

Essentially, the option, zeroOrMore and oneOrMore rules swallow errors, which is a problem when the longest failed match occurred inside one of those rules.
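To make the error swallowing concrete, here is a minimal toy sketch in Python. It is not pegged's implementation: the helpers `lit`, `seq` and `zero_or_more` are hypothetical, and the message format (an offset instead of "found ...") is an assumption chosen for brevity. It only shows that the failure inside the repetition is dropped, so the error reported for the first grammar comes from C.

```python
# Toy PEG combinators (hypothetical helpers, not pegged's code).
# Each parser takes (text, pos) and returns (ok, end, error_message).

def lit(s):
    def parse(text, pos):
        if text.startswith(s, pos):
            return True, pos + len(s), None
        return False, pos, f'expected "{s}" at offset {pos}'
    return parse

def seq(*parsers):
    def parse(text, pos):
        cur = pos
        for p in parsers:
            ok, cur, err = p(text, cur)
            if not ok:
                return False, pos, err  # report the first failing element
        return True, cur, None
    return parse

def zero_or_more(p):
    def parse(text, pos):
        while True:
            ok, end, _dropped = p(text, pos)  # the inner failure is discarded here
            if not ok or end == pos:
                return True, pos, None  # zero_or_more itself always succeeds
            pos = end
    return parse

A, B, C = lit("abc"), lit("def"), lit("ghi")
thing = seq(zero_or_more(seq(A, B)), C)  # Thing <- (A B)* C

inner = seq(A, B)("abc", 0)  # what the repetition saw before giving up
ok, end, err = thing("abc", 0)
print(inner[2])  # expected "def" at offset 3
print(err)       # expected "ghi" at offset 0
```

The more useful message (about "def" at offset 3) exists inside the repetition but never reaches the caller.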

There are also more complex cases in which a nested rule is parsed successfully but contains a longer failed match, so it is not only about rules that fail. E.g.

Thing <- AB C
AB <- A B?
A <- "abc"
B <- "def"
C <- "ghi"

With input "abc de" it complains: expected "ghi" but got "de". However, the B inside B? got further before failing, and the error should report that instead.

This PR introduces additional logic to keep track of the longest failed match.
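As a rough illustration of the general idea (tracking the failure that reached furthest into the input), here is a toy sketch in Python. It is not the code in this PR and does not use `failEnd` or `failedChild`. The `Farthest` class and the helpers `lit`, `seq`, `option`, `zero_or_more` and `run` are hypothetical, and the second input is written as "abcde" (without a space) purely so that the partial literal match is visible.

```python
# Toy farthest-failure tracking (hypothetical helpers, not pegged's code).
# Each parser takes (text, pos, far) and returns the end offset, or None on failure.

class Farthest:
    """Remembers the failure that reached furthest into the input."""
    def __init__(self):
        self.pos, self.expected = -1, None

    def note(self, pos, expected):
        if pos > self.pos:  # on a tie, the earlier failure is kept
            self.pos, self.expected = pos, expected

def lit(s):
    def parse(text, pos, far):
        if text.startswith(s, pos):
            return pos + len(s)
        n = 0  # count how many leading characters of the literal did match
        while n < len(s) and pos + n < len(text) and text[pos + n] == s[n]:
            n += 1
        far.note(pos + n, s)
        return None
    return parse

def seq(*parsers):
    def parse(text, pos, far):
        for p in parsers:
            pos = p(text, pos, far)
            if pos is None:
                return None
        return pos
    return parse

def option(p):
    def parse(text, pos, far):
        end = p(text, pos, far)  # a failure inside p stays recorded in far
        return pos if end is None else end
    return parse

def zero_or_more(p):
    def parse(text, pos, far):
        while True:
            end = p(text, pos, far)
            if end is None or end == pos:
                return pos
            pos = end
    return parse

def run(parser, text):
    far = Farthest()
    if parser(text, 0, far) == len(text):
        return "ok"
    return f'expected "{far.expected}" (reached offset {far.pos})'

A, B, C = lit("abc"), lit("def"), lit("ghi")
thing1 = seq(zero_or_more(seq(A, B)), C)  # Thing <- (A B)* C
thing2 = seq(seq(A, option(B)), C)        # Thing <- AB C, AB <- A B?

print(run(thing1, "abc"))        # expected "def" (reached offset 3)
print(run(thing2, "abcde"))      # expected "def" (reached offset 5)
print(run(thing1, "abcdefghi"))  # ok
```

Because the failure record lives outside the individual parsers, `option` and `zero_or_more` can still succeed without losing the information about how far their inner rule got.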

It is not finished yet (it still needs tests), but I wanted to submit it early to get some feedback.

veelo

Thanks for addressing this much-requested improvement, Sebastiaan. And welcome to the Pegged club!

veelo · 2020-01-20T08:52:12Z

pegged/peg.d

@@ -1333,6 +1348,29 @@ template and(rules...) if (rules.length > 0)
    }
 }

+// This function rewrites the ParseTree to move the failedChild into its parents children
+// this is done whenever an 'and' rule fails and there are such a child further down


"there are such a child" = "there is such a child"

This could be a nested function inside and(), to clarify that it is only called from there.

I initially misunderstood what this function does. The way this is documented, it seemed like a descendant was being pulled up the ancestry tree to be inserted higher up. That would be very strange, of course.

veelo · 2020-01-20T10:22:13Z

pegged/peg.d

+                auto firstLongestFailedMatch = result.children.countUntil!(c => c.failEnd > temp.end);
+                if (firstLongestFailedMatch == -1) {
+                    result.children ~= temp;// add the failed node, to indicate which failed
+                    //  result.end = temp.end;


I assume you don't mean to commit this line that is commented out.

veelo · 2020-01-20T10:23:39Z

pegged/peg.d

+                } else {
+                    // don't add the failed node because a previous one already failed further back
+                    result.children = result.children[0 .. firstLongestFailedMatch+1]; // discard any intermediate correct nodes
+                    failedChildFixup(result.children[firstLongestFailedMatch]);


Maybe explain what needs fixing up.

I hope this is clearer now.

skoppe · 2020-01-21T14:16:04Z

@veelo Thanks for taking the time to review. I have addressed your points. I will add some unittests (hopefully later today).

veelo · 2020-01-21T20:43:29Z

pegged/peg.d

+                } else {
+                    // don't add the failed node because a previous one already failed further back
+                    result.children = result.children[0 .. firstLongestFailedMatch+1]; // discard any intermediate correct nodes
+                    failedChildFixup(result.children[firstLongestFailedMatch]);


skoppe · 2020-01-29T19:42:00Z

I just found out that the fixup interferes with the memoization. No solution as of yet.

veelo · 2020-01-29T23:17:59Z

Right, that can be tricky. Good that you discovered it.

Philippe just made me a repository collaborator, so I’ll be able to merge once the PR is ready.

skoppe · 2020-01-30T10:35:20Z

Great. I think I have found the culprit, but I will first test on our codebase. If this gets merged, I suggest making a beta tag and, after some testing, at least a minor version bump. The ParseTree will be different for invalid input, and people's current error handling might not expect that.

- note how much of a literal is parsed when failed
- only do fixup for children which match failEnd
- recalculate failEnd in fixup
- have the `or` template return a list of failed children instead of last one
- propagate failEnd better in `oneOrMore` and `option`
- keep failedChild in decimateTree

- wanted to iterate byGrapheme, but that won't work in ctfe
- decimateTree should keep nodes with no matches when the root parsetree has failed (there may be failedChild elements in there useful for error reporting)

veelo · 2020-02-25T23:24:29Z

Are you ready for me to tag a beta release with this?

skoppe · 2020-02-26T08:26:49Z

Yes, please do. Everything has been stable here for the last couple of weeks.

veelo · 2020-02-26T14:46:03Z

@skoppe Any idea what could be the cause of the spike in compiler memory consumption? See #272.

skoppe force-pushed the better-error branch from a07e2b2 to e09db77 on January 19, 2020 19:00

keep track of the longest failed match to improve error messages

d4ce46b

skoppe force-pushed the better-error branch from e09db77 to d4ce46b on January 19, 2020 19:03

veelo requested changes Jan 20, 2020

Fix wording, remove commented out code

f8163be

Add unittests for failedChild

014394f

veelo approved these changes Jan 21, 2020

skoppe changed the title ~~keep track of the longest failed match to improve error messages~~ WIP: keep track of the longest failed match to improve error messages Jan 29, 2020

skoppe added 3 commits February 3, 2020 10:52

Fix unicode bug and fix decimateTree

9add4b5

fix expected string length bug

c6d1a03

skoppe changed the title ~~WIP: keep track of the longest failed match to improve error messages~~ keep track of the longest failed match to improve error messages Feb 26, 2020

veelo merged commit b1c5adf into dlang-community:master Feb 26, 2020

veelo mentioned this pull request Feb 26, 2020

v0.4.5-beta.1: dmd runs out of memory #272

Open

veelo mentioned this pull request Mar 25, 2020

Config travis CI or another free CI #276

Closed
