-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v0.4.9: Invalid grammar errors are not reported #335
Comments
Is there a way I can have a look at your Grammar? |
@veelo Sure, I've uploaded it. (It contains a few rules for Unicode because I needed them and didn't know if they were predefined anywhere, and the docs aren't clear on whether I could match Unicode general categories, so the grammar doesn't start until around line 3237.) Sorry about that -- if there's a better way of handling that, I'd happily do it. :) |
I finally found some time to reproduce this. Given this Ada source with Text_IO; use Text_IO;
procedure hello is
begin
Put_Line("Hello world!");
end hello; the parse tree generated by Pegged with your grammar is
So the grammar succeeds while consuming none of the input.
The latter meaning 0 or more, so it happily accepts 0 and calls it a day. You can force
With this modification, the parser now produces the failing parse tree
I haven't studied what is the problem from this point on, but it looks like whitespace is ignored. I hope this will help you further. See also the Extended Pascal example, which uses |
@veelo Thank you, I didn't know eoi was an actual rule! I'm unsure why it's failing; I still can't figure out why tracing isn't working (maybe the documentation needs to be improved on that.... It says to use the tracer subconfiguration, and I do that, but |
Also, one last question: which extended token (colon, semicolon, ...) do I need to use to effectively say "don't create an AST node for this entire node, but instead replace this node with it's children"? I.e., in the tree you posted above, you'll notice a (lot) of unnecessary stuff I want to get rid of, but there's also stuff I want to keep. Take the context item, for instance: that's a limited with clause or a nonlimited with clause. I don't need to know that a context item exists in the tree; I just need to get access to the with clauses themselves and I can ignore that, hey, there's context items, because that's obvious. Same for a basic declaration: I could care less that there's a basic declaration; I want to know about the children of that node only. The fact that it's a "basic declaration" is a bit immaterial and just creates clutter. (I might end up keeping a lot of this anyway though, I'm unsure what I'll need to get rid of and what I need to keep.) |
@veelo Okay, I think I know exactly why it's failing to match. My identifier rule is defined as fusing everything it matches. For some reason, pegged is taking that to mean "eat everything". Or something like that is going on. If you look at the parse tree:
You'll notice that it's stripped out every bit of whitespace and just gobbled up the quotes, the strings, everything, and just merged it into one element. I don't know if my delimiters are messed up or what. I really need to get tracing working... |
I am looking at the tracing issue. It appears |
I just updated the wiki with updated tracing instructions. |
Thanks, will try again and get back to you. |
So tracing is working but it really isn't all that helpful. But I think I know the problem: I'm misusing the |
Use of |
Do you see any matches of |
I'm not sure what's causing this then. Here's what it's doing:
You'll notice that, for whatever reason, it's devouring everything. First WS is stripped, and then the entire string is just consumed, leaving nothing left to parse, and an invalid input. I've defined keywords as:
Same for delimiters. Pretty much all the rules, minus a couple, use the space
But still, I don't see how that should be causing this issue. The |
I see lines like this, if your talking about this:
I'm not sure what an actual "match" would look like. |
Update: yes, separator matches:
|
Good, so it recognises the space. Then look above and below that line in the trace why it continues to consume input for the identifier. It should not. |
That was for the with clause.... For the procedure part, it's doing something a bit strange:
It looks like it's just scanning further and further ahead, consuming more and more input, because it's failing to match. I did define the separator rule as:
But I used |
I just tried debugging it in gdb and... Well, GDB didn't like all the template code. Got stuck trying to step into the main rule. Just sat there wasting CPU cycles until I killed it. |
The difference between The trace looks weird indeed. That first line should succeed, and consume the |
should be
no? |
I don't really know why it'd cause problems. I can try to read the file as UTF-32, but still.... Those code points are:
You can find the full list here (for space separators), here (for line separators), and here (for paragraph separators). |
And I'm unsure.... I think the tutorial uses Update: just tried and it completely broke.... Now it expects spaces. So |
Okay, I know the problem. This rule:
Gets translated into: static TParseTree Spacing(string s)
{
if(__ctfe)
{
return pegged.peg.defined!(pegged.peg.zeroOrMore!(pegged.peg.or!(Separator, pegged.peg.or!(pegged.peg.literal!("\r"), pegged.peg.literal!("\n"), pegged.peg.literal!(`\`), pegged.peg.literal!("f"), pegged.peg.literal!("\t"), pegged.peg.literal!(`\`), pegged.peg.literal!("v")))), "Ada.Spacing")(TParseTree("", false,[], s));
}
else
{
forgetMemo();
return hooked!(pegged.peg.defined!(pegged.peg.zeroOrMore!(pegged.peg.or!(Separator, pegged.peg.or!(pegged.peg.literal!("\r"), pegged.peg.literal!("\n"), pegged.peg.literal!(`\`), pegged.peg.literal!("f"), pegged.peg.literal!("\t"), pegged.peg.literal!(`\`), pegged.peg.literal!("v")))), "Ada.Spacing"), "Spacing")(TParseTree("", false,[], s));
}
}
static string Spacing(GetName g)
{
return "Ada.Spacing";
} This can be worked around pretty easily, but in general I think that the parser should pass-through any and all escape sequences. Otherwise you get weird things like this. It makes me wonder what else is getting screwed up in this manner. |
Are you talking about |
No, it took \f and \v and thought I meant the character sequences \ or f or \ or v. (Changing it to the code points didn't solve it....) The new trace is this.... This just gets weirder and weirder:
It's like it doesn't even try matching the others from what I can tell. It just tries separator and moves on if that dies. |
Here's the latest trace data.... Frankly I'm stumped at this point. And gdb can't debug the generated parser for some strange reason.... It gets stuck when trying to step into the function. (Search for |
One last suggestion before I log off for the day. If you think some input should be parsed by particular rules in the grammar but it is not, isolate those rules in a new grammar and feed it only that input. Then it may be easier to see why it doesn't. And if it does, then other rules must be the problem, and you can gradually extend the grammar and see when it breaks. Good luck! |
On second look I think this is OK. The tracer seems to skip built in parsers, such as |
Yeah it's difficult to understand what it's doing. |
So, since resolving #333, everything works fine, but no parse tree is reported. (I don't know if my grammar is valid; no errors are reported at all.) I tried the grammar debugging article (building with the tracer version for the pegged dependency) and that didn't print anything. (The code in the grammar debugging article also seems to be wrong, since
setTraceConditionFunction
takes a delegate with two arguments, not one).The text was updated successfully, but these errors were encountered: