Avoiding possible incompatible regexp features in future development #23021

Open
zherczeg opened this issue Feb 22, 2025 · 5 comments

@zherczeg

I am sorry if this is not the right place for such discussion. Please let me know the right place for it.

In the PCRE2 regular expression engine we have been adding some new regexp features, and it would be good to avoid incompatibilities in the future, i.e. Perl will not reuse their syntax for something else. Feature flags could still be used, but it is better if we don't need them.

  • This one is already released. I think there is a low chance of reusing it.

Syntax: (*scan_substring:(CAPTURE_LIST)PATTERN) or (*scs:(CAPTURE_LIST)PATTERN)

More about it: https://zherczeg.github.io/sljit/scan_substring.html

  • The next one has a higher chance:

The (?PARNO) recursive subpattern syntax is extended with a capture list: (?PARNO:CAPTURE_LIST). The capture list is a comma-separated list of capturing brackets. The values of these captures are not restored after the recursive matching is completed.

This is not released, so the syntax can be changed.

CC @NWilson

@jkeenan
Contributor

jkeenan commented Feb 23, 2025

> I am sorry if this is not the right place for such discussion. Please let me know the right place for it.

Thank you for calling our attention to these developments. Since what you are in effect requesting is for Perl to take a certain development track going forward, at this point the best place to have this discussion is on the perl5-porters mailing list (https://www.nntp.perl.org/group/perl.perl5.porters/). That's because the initial stage of this discussion has to be seen by the widest range of people concerned with Perl's development. Once we reach a consensus on what Perl's policy should be with respect to keeping development in sync with PCRE2, we can use some mixture of our PPC process and this issue tracker to guide development.

@demerphq
Collaborator

demerphq commented Feb 23, 2025 via email

@zherczeg
Author

Thank you for the feedback! I remember we had discussions about the syntax several years ago, but I could not find where. It would be great to continue those plans. Perhaps setting up a low-traffic mailing list for it?

It looks like I totally misunderstood the naming of (*id: constructs. I thought capital letters were reserved exclusively for verbs, and lowercase letters for generic constructs. Perl has some of the latter: (*script_run: or (*pla:. In PCRE2, we have non-atomic versions, such as (*napla:.

The (*scan_substring:(CAPTURE_LIST)PATTERN) syntax was meant to resemble conditional blocks: (?(condition)yes-pattern|no-pattern). The ? is replaced by *scan_substring:, which represents the "command", and the condition is extended to a list. I suspect this feature is less interesting for Perl, since captures are available as variables, and code blocks can be nested into patterns.
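
For readers less familiar with the conditional-block shape mentioned above, here is a minimal sketch using Python's `re` module, which supports the `(?(id)yes-pattern|no-pattern)` form. It only illustrates the existing conditional syntax; it does not implement `(*scan_substring:...)`, which Python's `re` does not provide. The pattern and the `matches` helper are illustrative choices, not part of PCRE2 or Perl.

```python
import re

# Conditional group: if group 1 (an opening '<') took part in the match,
# the yes-pattern '>' is required; otherwise the no-pattern '$' is required.
PATTERN = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

def matches(text: str) -> bool:
    """Return True if text matches the conditional pattern from its start."""
    return PATTERN.match(text) is not None

if __name__ == "__main__":
    for sample in ("<user@host.com>", "user@host.com", "<user@host.com"):
        print(sample, matches(sample))
```

The parenthesised condition sits directly after the introducer, which is the position `(*scan_substring:(CAPTURE_LIST)PATTERN)` reuses for its capture list.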

(?PARNO:(CAPTURE_LIST)) was one of the variants we discussed, so we will change the syntax. Honestly, any syntax is fine with me as long as it is not overly complex.

@demerphq
Collaborator

demerphq commented Feb 23, 2025 via email

@zherczeg
Author

We have just released the code, so we have at least six months before the next release. That is plenty of time to make any decisions.
