Remake of the parser #58

Merged
merged 66 commits into from
Apr 6, 2024

@thi8v thi8v commented Feb 22, 2024

With the new log system in place, I'm remaking the parser: the old one is outdated and poorly written, and I also want to change the grammar of the language, so this kills two birds with one stone.

TODO:

  • new log / error system -> Improve the error handling in the compiler #55
    • allow errors on multiple lines
    • improve note and help to be able to point to code
    • in parse_parenthesized_expr, add a note to the "expected closing parenthesis" error that points to the opening one.
  • package clause
  • import declaration
  • function top level declaration
  • extern function tl. decl
  • var & const-var top level declaration
  • types
    • Primitive Types
    • Pointer & const pointer
  • expression
    • int literal
    • char literal
    • string literal
    • boolean literal
    • identifier expr
    • call expr
    • undefined expr -> see unresolved questions
    • parenthesized expr
    • member access
    • binary a + b -> Operator precedence parser
    • unary -a, !a, &a, and a.*
    • if-else expr -> a if predicate else b
    • make Block stmt an expression
  • statements
    • var & const-var declaration
    • if-else
    • block stmt
    • return
    • while
    • labeled stmts
      • while
      • block
    • break
    • continue
    • expr stmt
    • short var declaration ? -> $IdentifierList := $ExprList
    • assignment
    • postfix & prefix increment and decrement i++, i--, ++i and --i -> will potentially be done later, the question hasn't been resolved yet
  • move the operator-related code in zom_common::token into its own module
  • add tests -> Adding Tests #6
    • in the parser
    • in the lexer
    • in the log crate
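
For the "binary a + b -> Operator precedence parser" item, one common way to do it is precedence climbing. The sketch below is only an illustration under assumptions: `Token`, `Expr`, `Parser`, the precedence values and the error strings are all hypothetical and are not taken from Zom's code. It also shows how the parenthesized-expression error could carry a note pointing back to the opening parenthesis.

```rust
// Hypothetical token and AST types for illustration; they are not Zom's real ones.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Int(i64),
    Plus,
    Minus,
    Star,
    Slash,
    LParen,
    RParen,
}

#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Binary(Box<Expr>, Token, Box<Expr>),
}

/// Precedence of a binary operator, or `None` if the token is not one.
/// The numeric values are assumptions chosen for this sketch.
fn precedence(tok: Token) -> Option<u8> {
    match tok {
        Token::Plus | Token::Minus => Some(1),
        Token::Star | Token::Slash => Some(2),
        _ => None,
    }
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Option<Token> {
        self.tokens.get(self.pos).copied()
    }

    fn primary(&mut self) -> Result<Expr, String> {
        match self.peek() {
            Some(Token::Int(n)) => {
                self.pos += 1;
                Ok(Expr::Int(n))
            }
            Some(Token::LParen) => {
                let open = self.pos; // remembered so the error can point back to it
                self.pos += 1;
                let inner = self.expr(0)?;
                if self.peek() == Some(Token::RParen) {
                    self.pos += 1;
                    Ok(inner)
                } else {
                    Err(format!("expected `)`; note: `(` opened at token {open}"))
                }
            }
            other => Err(format!("found {other:?}, expected expression")),
        }
    }

    /// Precedence climbing: only consume operators whose precedence is at least `min_prec`.
    fn expr(&mut self, min_prec: u8) -> Result<Expr, String> {
        let mut lhs = self.primary()?;
        while let Some(op) = self.peek() {
            let Some(prec) = precedence(op) else { break };
            if prec < min_prec {
                break;
            }
            self.pos += 1;
            // `prec + 1` makes operators of equal precedence left-associative.
            let rhs = self.expr(prec + 1)?;
            lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        }
        Ok(lhs)
    }
}

fn parse(tokens: Vec<Token>) -> Result<Expr, String> {
    let mut p = Parser { tokens, pos: 0 };
    p.expr(0)
}
```

With these assumed precedences, `1 + 2 * 3` groups as `1 + (2 * 3)` and `1 - 2 - 3` as `(1 - 2) - 3`. A real implementation would report spans through the log system rather than plain strings.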

Unresolved questions

  • Is the undefined expression still relevant? With the new path Zom is taking, the path of simplicity, should we instead allow variable declarations where the expression is optional: const test: u32?
    -> We allow declaring a variable without an initializer expression, and we remove the 'undefined' expression and keyword.
  • Should we make := its own operator?
    -> Will not be answered here / now.
  • If we add the postfix increment / decrement, we can't also have an array concatenation operator like Zig's a ++ b. What do we do?
    -> Will not be answered here / now.

thi8v added 28 commits January 20, 2024 15:46
-> unterminated quote literal errors now correctly show the faulty code
-> renamed the 'location' method of LogContext to 'line_col' because the old name didn't make sense
With the new error system and the remake of the parser, the old error system is useless, so I removed it.
Some methods have been renamed, 'add' into 'push' and 'add_raw' into 'push_raw', because the old names didn't make sense.
Also created the LogStream type.
After removing the old parser, there is the new parser with the new error system.
It is intended to be simpler than its predecessor.
Before, you could simply return 'Error' without ever pushing the error; now you return the error(s) of the lexing functions.
While remaking the parser, I redid the language's grammar. There are 3 new keywords: 'package', 'import',
and 'as'. A source file must now contain a package clause, and may contain some import declarations.

I also added some methods to types, and changed the ParsingResult type.
…Parse trait

Before, the macro was just looking for a "parse" method, but now the macro explicitly calls the "parse"
method of the "Parse" trait.
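
As a rough illustration of why the explicit trait call matters: a macro that expands to `T::parse(..)` resolves to whatever `parse` the type exposes, and an inherent method takes priority over a trait method. The sketch below uses fully qualified syntax to force the trait call. Everything in it (`Parser`, `Ident`, the `parse!` macro name and its `=>` syntax) is hypothetical and not Zom's actual macro.

```rust
// Hypothetical parser state and trait; Zom's real definitions may differ.
struct Parser {
    tokens: Vec<String>,
    pos: usize,
}

trait Parse: Sized {
    fn parse(parser: &mut Parser) -> Result<Self, String>;
}

#[derive(Debug, PartialEq)]
struct Ident(String);

impl Ident {
    /// An unrelated inherent method that happens to be named `parse`. A macro
    /// expanding to `Ident::parse(..)` would resolve to this one instead.
    fn parse(_text: &str) -> Option<Ident> {
        None
    }
}

impl Parse for Ident {
    fn parse(parser: &mut Parser) -> Result<Self, String> {
        let tok = parser
            .tokens
            .get(parser.pos)
            .cloned()
            .ok_or("unexpected end of input")?;
        parser.pos += 1;
        Ok(Ident(tok))
    }
}

/// Fully qualified syntax forces the call through the `Parse` trait.
macro_rules! parse {
    ($parser:expr => $ty:ty) => {
        <$ty as Parse>::parse($parser)
    };
}
```

A call such as `parse!(&mut parser => Ident)` then always reaches the `Parse` implementation, and a type that does not implement `Parse` is rejected at compile time.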
…tion

Backquotes were added because seeing { without quotes in an error message may be confusing,
so for consistency backquotes are used: `.
It is used in error messages when an unexpected token is found: instead of saying, e.g., "found char literal,
expected identifier, keyword `fn` or comma", it says "found char literal, expected expression". It's shorter
and easier for the user to understand.
Before, it only checked whether the EOF token had been popped; now it checks whether the EOF has been popped or the last
token is EOF.
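
One possible reading of this check, sketched with hypothetical names (`Token`, `Parser`, `pop`, `reached_eof`), and assuming "the last token is EOF" means that the only token left to read is the EOF token:

```rust
use std::collections::VecDeque;

// Hypothetical token type for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Eof,
}

struct Parser {
    /// Tokens not yet consumed; the front is the next token.
    tokens: VecDeque<Token>,
    /// Set once the EOF token itself has been popped.
    eof_popped: bool,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser {
            tokens: tokens.into(),
            eof_popped: false,
        }
    }

    fn pop(&mut self) -> Option<Token> {
        let tok = self.tokens.pop_front();
        if tok == Some(Token::Eof) {
            self.eof_popped = true;
        }
        tok
    }

    /// True if EOF was already popped, or if EOF is the next token to be read.
    fn reached_eof(&self) -> bool {
        self.eof_popped || self.tokens.front() == Some(&Token::Eof)
    }
}
```

Under this reading, the parser can stop cleanly as soon as EOF is next, instead of having to consume it first.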
Allowed `unreachable_code` and `unused_variables` in the expect_token! macro because
we may use a noreturn expression as $result.
@thi8v thi8v added this to the 0.1.0 milestone Feb 22, 2024
@thi8v thi8v linked an issue Feb 22, 2024 that may be closed by this pull request
@thi8v thi8v mentioned this pull request Mar 1, 2024
@thi8v thi8v merged commit a4120a2 into main Apr 6, 2024
12 checks passed
Successfully merging this pull request may close these issues.

Improve the error handling in the compiler Implement the base