Remove Leading and trailing automatic space capture #332
Comments
I'm not sure that your assessment
is correct. Let's write out your initial grammar, replacing
This is equivalent to your grammar. Now, feeding it this input
the second statement on the first line of input will consume the line ending, because
In cases where spacing has semantic relevance depending on context, like this one, I recommend to not use
/+dub.sdl:
dependency "pegged" version="~>0.4"
+/
import pegged.grammar;
import std.stdio;
mixin(grammar(`
MyGrammar:
File <- Statement_list eoi
Statement_list <- Statement (Separator Statement)*
Separator <- ',' / eol+
Statement <- :space* '1' :blank* '+' :blank* '2' :space*
`));
void main()
{
    // A single statement.
    auto parseTree = MyGrammar("1 + 2");
    assert(parseTree.matches == ["1", "+", "2"]);
    writeln(parseTree);
    writeln;
    // Two statements on one line, separated by a comma.
    parseTree = MyGrammar("1 + 2, 1 + 2");
    assert(parseTree.matches == ["1", "+", "2", ",", "1", "+", "2"]);
    writeln(parseTree);
    writeln;
    // Comma and newline separators; the last statement spans two lines.
    parseTree = MyGrammar("1 + 2, 1 + 2\n1\n+2");
    assert(parseTree.matches == ["1", "+", "2", ",", "1", "+", "2", "\n", "1", "+", "2"]);
    writeln(parseTree);
    writeln;
}
As you see, this parses your input correctly, because
As for your proposal, we cannot change the current behaviour, as that would break existing grammars. So a new shorthand would have to be introduced. I haven't studied this further, but perhaps parameterized rules may help you avoid the need for a new shorthand.
Hope this helps,
You reiterated what I described in the
Thank you for the fast reply!
Sorry about that. That wasn't clear to me when I read your code, but at least we are on the same page :-)
I accept pull requests. Beware though that some files are generated, and those should not be edited by hand. See https://github.com/PhilippeSigaud/Pegged/tree/master/pegged/dev
That is what I was thinking about with my parameterized rules suggestion. Totally not sure whether that is applicable though.
What's that? We have the existing
My pleasure!
The behavior is completely different. What that does is drop the parent AND the child nodes. What I mean is to be able to mark a node for decimation. I tried to pull this off by altering the name of the parent with a semantic action, so that it would be culled, but as it happens
Sorry, I meant
I still don't quite understand what you want. What happens to the children of a culled node? Do they replace it?
If you recall how decimation works: like that. The children are added to the tree. What I ended up doing is using a recursive script that renames everything with a certain prefix.
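To make the "children replace the culled node" idea concrete, here is a minimal sketch in plain D. It does not use Pegged's own ParseTree and is not the renaming script mentioned above; the `Node` struct, the `promote` function and the `cull_` prefix are all hypothetical names chosen for illustration.

```d
import std.algorithm : startsWith;
import std.stdio : writeln;

// Hypothetical stand-in for a parse-tree node; this is NOT Pegged's ParseTree.
struct Node
{
    string name;
    Node[] children;
}

// Return the children of `node`, where every child whose name starts with
// `prefix` is dropped and replaced by its own (recursively processed) children.
Node[] promote(Node node, string prefix)
{
    Node[] result;
    foreach (child; node.children)
    {
        auto kept = promote(child, prefix);
        if (child.name.startsWith(prefix))
            result ~= kept;                    // drop the marked node, keep its children
        else
            result ~= Node(child.name, kept);  // keep the node with cleaned-up children
    }
    return result;
}

void main()
{
    auto tree = Node("Root", [
        Node("cull_Group", [Node("A"), Node("B")]),
        Node("C"),
    ]);
    tree.children = promote(tree, "cull_");
    foreach (c; tree.children)
        writeln(c.name);
}
```

With this approach the marked node disappears while its children take its place in the parent, which is the behaviour described above as different from dropping the parent and the children together.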
Problem
Using <- in a rule that contains rules using < will nullify the effectiveness of the former. For example, imagine you want a grammar like the following:
Valid input that yields 3 statements might look like
The intention of this grammar is that only one statement can occur per line. Basically statements can be separated by a number of newlines, or one comma, but not both, while the contents of those statements can be spread across multiple lines.
These rules are impossible with this more intuitive setup, because < captures leading and trailing whitespace. There is a way to pull this off, but it requires instead setting a custom spacing rule with no newlines, and then manually specifying everywhere a newline can occur in the middle of a statement. For instance, here's an excerpt from my own code.
Proposal
There is an elegant solution that should save on some spacing checks and enable the code I presented at the top to work as intended. Essentially, Spacing is only inserted between characters, never at the start or end of the rule. There is one exception to this though, which is the entrypoint rule: with this change, the entrypoint must have non-space characters at the first position of the input. The solution for that is to either make an exception where < on the entrypoint will capture leading whitespace, or just require the user to handle it explicitly with "" or using Space directly.
This shouldn't cause breakages for most code, consider:
Expands to:
The only case where spacing is missed is where A uses <-, forcing 2 and 3 to be adjacent, or allowing only what the user wants.
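As a rough illustration of the difference between the two behaviours, the following sketch models the expansion of a < rule as plain strings. It is not Pegged's generator and may not match its exact output; `expandCurrent` and `expandProposed` are hypothetical helpers, and the "current" form simply assumes Spacing before, between and after the elements, as described in this issue.

```d
import std.array : join;
import std.stdio : writeln;

// Hypothetical helpers: they only build strings and do not call Pegged.

/// Behaviour as described in this issue: Spacing before, between and after.
string expandCurrent(string[] elements)
{
    return "Spacing " ~ elements.join(" Spacing ") ~ " Spacing";
}

/// Proposed behaviour: Spacing only between elements.
string expandProposed(string[] elements)
{
    return elements.join(" Spacing ");
}

void main()
{
    auto seq = ["'1'", "'+'", "'2'"];
    writeln("current:  ", expandCurrent(seq));
    writeln("proposed: ", expandProposed(seq));
}
```

Under the proposed form, whatever follows the rule (for example a newline used as a statement separator) is left for the enclosing rule to match.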