Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: replace all Range with TSNode as much as possible #674

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ofseed
Copy link

@ofseed ofseed commented Aug 29, 2024

The Range type should only be used internally. To leverage more upstream API, we should convert it into a TSNode as much as possible.

@ofseed ofseed marked this pull request as draft August 29, 2024 15:31
@ofseed
Copy link
Author

ofseed commented Aug 29, 2024

@clason we could continue our discussion in #672 (comment) here.

@ofseed
Copy link
Author

ofseed commented Aug 29, 2024

Some textobjects need the make-range directive to be defined, and then we read them in metadata, so it's impossible to convert these Range into TSNode. Luckily there is already #612 which attempts to remove usages of this directive. @ribru17 Is my understanding of this PR correct?

@ribru17
Copy link
Member

ribru17 commented Aug 29, 2024

Pretty much, except I think there was a discussion to keep the predicate definition (although now unneeded) for backwards compatibility with user queries

@ribru17
Copy link
Member

ribru17 commented Aug 29, 2024

Also I haven't done a thorough review, but is this a fully safe change? Sometimes metadata.range is needed over node:range() because the latter doesn't account for range changes from something like #offset!

@ofseed
Copy link
Author

ofseed commented Aug 30, 2024

but is this a fully safe change?

This is what I am concerned about most too.

I think we can handle the problem with #offset!. #offset! tries to change the range of a TSNode, while #make-range! tries to create a new range, which has no corresponding TSNode. We can store TSMetadata like store TSNode.

@clason
Copy link
Collaborator

clason commented Aug 30, 2024

I think we always have to check the metadata; the question is just when (do we precompute that or not); that is the same as for any other query.

The difference here is that the original code uses the "metadata" ranges as primary objects, which could come from nodes or from #make-range!. So if the latter remains supported, we can't switch to a node-centric model. And if we drop support for it, the predicate is useless, and keeping it defined (but not working) would be a worse user experience than removing it (and telling users that relied on it for non-textobjects usage to just copy the old code in their config or plugin).

So the signs are starting to point to a full breaking rewrite on main and freezing master.

@ofseed
Copy link
Author

ofseed commented Aug 30, 2024

The difference here is that the original code uses the "metadata" ranges as primary objects,

Honestly, it is a rather complicated situation. It uses ranges from TSNode primarily, and stores the range from metadata. the range in metadata is a Range4, so only when functions need a Range4(and the range in metadata exists), the range in metadata will be returned, otherwise it will always use the range from TSNode. This part of the code should be a patch added later.

So If #make-range! could be fully removed, #offset! would not be a big problem. Because there are only 15 usages of #offset! and all of them are used to exclude or include some chars, like parenthesis or spaces. This means we still could use TSNode internally instead of Range because these chars should not have an impact on sorting or comparing them.

So the signs are starting to point to a full breaking rewrite on main and freezing master.

Indeed. At least #612 must be merged before we move on.

@ofseed
Copy link
Author

ofseed commented Aug 30, 2024

Brief summary, three sources of range:

  • From TSNode:range()
  • From metadata[capture_id].range, which is created by #offset!, could be treated as modified TSNode:range()
  • From #make-range!, no relevant TSNode object.

The first two sources could be handled if we use TSNode, but the last one does not.

@clason
Copy link
Collaborator

clason commented Aug 30, 2024

Because there are only 15 usages of #offset! and all of them are used to exclude or include some chars, like parenthesis or spaces.

In our queries. Experience has taught me the hard way that user configs (and third-party plugins) can be full of all kinds of stuff...

@ofseed
Copy link
Author

ofseed commented Aug 30, 2024

Since the main branch has already broken user configurations, it wouldn't be a particularly big issue if it continues to break user queries, lol.

However, one point I want to discuss is whether, if #make-range! is removed, we can still transition from Range to TSNode while preserving all existing features. The current discussion leads me to believe that this is possible.

@clason
Copy link
Collaborator

clason commented Aug 30, 2024

Since the main branch has already broken user configurations, it wouldn't be a particularly big issue if it continues to break user queries, lol.

Well, that's not the right way of looking at this. Right now, main is a new plugin that users can but need not use. The breakage will happen when we freeze master and make main the default branch, which is what we are talking about doing since nobody wants to maintain two separate plugins (especially queries, and freezing master will make it unusable as soon as the next breaking parser update happens in nvim-treesitter).

TL;DR: We can and should do this, but it's absolutely not a "lol"ing matter.

@clason clason added the NEXT issues and PRs relating to the 1.0 refactor on `main` label Aug 30, 2024
@ofseed
Copy link
Author

ofseed commented Aug 30, 2024

Yes, that might be a joke that isn't funny, queries could not be synced between two branches after that.

But if we are certain about making this change, we could first remove #make-range in the master branch. This will likely only require changes to the query, because after this, they will become TSNode, and the code on the master branch will still be able to handle them.

@clason
Copy link
Collaborator

clason commented Aug 30, 2024

But if we are certain about making this change, we could first remove #make-range in the master branch. This will likely only require changes to the query, because after this, they will become TSNode, and the code on the master branch will still be able to handle them.

That would also get quick feedback on outside use of #make-range!. But I'm also OK with making the switch early, but that requires adding some mapping helpers to reduce the unacceptable setup boilerplate on main (which is the only thing missing before we could declare it as an "early access" replacement for master).

However, one point I want to discuss is whether, if #make-range! is removed, we can still transition from Range to TSNode while preserving all existing features. The current discussion leads me to believe that this is possible.

Yes, that is the main design decision we need to make. And I agree that given the current information, this looks possible and desirable. Personally, I think #make-range! has value but is so intrusive, it should be handled upstream (ideally tree-sitter); if textobjects don't need it, we should not provide it.

@ofseed
Copy link
Author

ofseed commented Sep 28, 2024

@clason Could you please take a look at e83e3eb? Previously, we were using #make-range!, so the corresponding range didn’t have a capture, I think this is why we used iter_matches. However, #612 removed #make-range!, switching to iter_captures should be safe since now a capture, which is a TSNode, should correspond to the text object we need. I haven’t maintained the queries in this repository, so I’m not 100% sure.

@clason
Copy link
Collaborator

clason commented Sep 28, 2024

Looks good to me; if this is incorrect, it should fail pretty badly (easy to manually test with one of the quantified captures here).

@ribru17 ?

@ofseed
Copy link
Author

ofseed commented Sep 28, 2024

Oh, I realized there is indeed an issue with this change regarding quantified captures. My previous manual tests were too simple, sorry about that.

Quantified captures don’t combine multiple nodes into a single node; instead, they match these nodes separately while keeping them within the same pattern. Therefore, we need to use iter_matches to handle them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NEXT issues and PRs relating to the 1.0 refactor on `main`
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants