Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: op log: add --bookmark option to filter operations by bookmark #5316

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

bnjmnt4n
Copy link
Member

@bnjmnt4n bnjmnt4n commented Jan 9, 2025

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added tests to cover my changes

@bnjmnt4n bnjmnt4n force-pushed the bnjmnt4n/push-nyrwlqqkowpz branch from 9494f02 to bb8ff6d Compare January 12, 2025 00:01
@bnjmnt4n bnjmnt4n changed the base branch from main to bnjmnt4n/push-mxksouwxwotu January 12, 2025 00:04
@bnjmnt4n bnjmnt4n marked this pull request as ready for review January 12, 2025 00:04
@bnjmnt4n
Copy link
Member Author

I've wrote a preliminary version of this feature, which allows filtering op log entries by bookmark. Unfortunately it's not very performant, because it requires doing a toposort of the whole op log before filtering out irrelevant entries. In contrast, the plain (unfiltered) op log has a lazy iterator to go through linear portions of the graph, which I think would be large portions of the graph for most.

I'm not sure if there is a way to correctly construct the graph with elided revisions in a lazy manner? My understanding (do correct me if I'm wrong) is that the regular jj log is able to construct the graph lazily due to it already knowing the items in the "input set", as well as the use of the index and generation count.

I'm also not that familiar with graphs, so perhaps there are better algorithms for this?

@bnjmnt4n bnjmnt4n force-pushed the bnjmnt4n/push-nyrwlqqkowpz branch from bb8ff6d to 824aa3c Compare January 12, 2025 00:19
@bnjmnt4n bnjmnt4n force-pushed the bnjmnt4n/push-mxksouwxwotu branch from 2738323 to 0f7cee6 Compare January 12, 2025 01:38
Base automatically changed from bnjmnt4n/push-mxksouwxwotu to main January 12, 2025 02:03
@bnjmnt4n bnjmnt4n force-pushed the bnjmnt4n/push-nyrwlqqkowpz branch from 824aa3c to 126517c Compare January 12, 2025 02:08
@yuja
Copy link
Contributor

yuja commented Jan 12, 2025

Maybe we can apply the same workaround as the lazy iterator? It just splits the graph into linear and non-linear parts, and processes non-linear part eagerly.

FWIW, we'll need opset language at some point, but adding --bookmark flag seems okay until then.

@bnjmnt4n
Copy link
Member Author

Maybe we can apply the same workaround as the lazy iterator? It just splits the graph into linear and non-linear parts, and processes non-linear part eagerly.

I was thinking the same, but I'll probably need a bit more time to go through the code to understand and adapt the algorithm.

FWIW, we'll need opset language at some point, but adding --bookmark flag seems okay until then.

I also wanted to add the ability to filter by change ID, so perhaps the opset language might come in sooner rather than later?

This will be used in a subsequent commit to allow rendering elided nodes
in the operation log.
This allows a graph to be constructed in reverse toplogical order, while
allowing nodes to be filtered out.
@bnjmnt4n bnjmnt4n force-pushed the bnjmnt4n/push-nyrwlqqkowpz branch from 126517c to c9987ae Compare January 12, 2025 10:35
@yuja
Copy link
Contributor

yuja commented Jan 12, 2025

I was thinking the same, but I'll probably need a bit more time to go through the code to understand and adapt the algorithm.

Since op log is usually linear, it's probably okay to not implement skip_transitive_edges. It might help reuse the existing code by somehow pre-processing the input op -> parents chains.

I also wanted to add the ability to filter by change ID, so perhaps the opset language might come in sooner rather than later?

It might be also okay to add minimal opset parser (something like strip_prefix("bookmarks(") ...)

@bnjmnt4n
Copy link
Member Author

Is the eventual intention to get something like jj op log --op "@ | id1..id2 working?

Since op log is usually linear, it's probably okay to not implement skip_transitive_edges. It might help reuse the existing code by somehow pre-processing the input op -> parents chains.

This is referring to the code in RevsetGraphWalk right? Without an op store index, I don't think it's possible to easily reuse that code because it depends on the input being already sorted?

Unless we intend to create an operation store index, I was thinking we could attempt to resolve any resolvable ops in the opset first (apart from bookmarks(x) for now), then iterate through the op graph and filter ops if they are not in the input set and if the bookmarks predicate doesn't apply. I think I might be able to convert the filtering to a lazy method with some time.

@yuja
Copy link
Contributor

yuja commented Jan 13, 2025

Is the eventual intention to get something like jj op log --op "@ | id1..id2 working?

Yes, it's nice if op log supports arbitrary range/filter query.

Since op log is usually linear, it's probably okay to not implement skip_transitive_edges. It might help reuse the existing code by somehow pre-processing the input op -> parents chains.

This is referring to the code in RevsetGraphWalk right? Without an op store index, I don't think it's possible to easily reuse that code because it depends on the input being already sorted?

I meant dag_walk::topo_order_reverse_lazy_ok() might be reusable by filtering the inputs. To achieve skip_transitive_edges = true, the graph iterator would have to know whether the edge is direct or not, but that wouldn't be needed for op log where the graph is usually linear so it's rare that filtered graph had redundant edges.

Unless we intend to create an operation store index, I was thinking we could attempt to resolve any resolvable ops in the opset first (apart from bookmarks(x) for now), then iterate through the op graph and filter ops if they are not in the input set and if the bookmarks predicate doesn't apply. I think I might be able to convert the filtering to a lazy method with some time.

I think most part of "opset" expression will be transformed to Fn(&Opration) -> bool function. Some common range expression (like ::<head> & ..) might be extracted to optimize range query.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants