-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimizer fast path #28715
Optimizer fast path #28715
Conversation
3af68c6
to
d4d3d18
Compare
MitigationsCompleting required mitigations increases Resilience Coverage.
Risk Summary:The pull request has a moderate risk score of 73, driven by predictors such as the number of files modified, the length of the PR description, the number of function declarations within files, and the delta of statements. Historically, PRs with these predictors are 55% more likely to cause a bug than the repository baseline. Additionally, the observed bug trend in the repository is decreasing. Note: The risk score is not based on semantic analysis but on historical predictors of bug occurrence in the repository. The attributes above were deemed the strongest predictors based on that history. Predictors and the score may change as the PR evolves in code, time, and review activity. |
d4d3d18
to
030e37d
Compare
030e37d
to
89b631b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adapter code LGTM
@@ -409,7 +409,7 @@ ws-text | |||
ws-text | |||
{"query": "SELECT 1 / 0 FROM mz_sources LIMIT 1"} | |||
---- | |||
{"type":"Notice","payload":{"message":"{\n \"plans\": {\n \"raw\": {\n \"text\": \"Finish limit=1 output=[#0]\\n Project (#15)\\n Map ((1 / 0))\\n Get mz_catalog.mz_sources\\n\\nTarget cluster: mz_catalog_server\\n\",\n \"json\": {\n \"Project\": {\n \"input\": {\n \"Map\": {\n \"input\": {\n \"Get\": {\n \"id\": {\n \"Global\": {\n \"System\": 456\n }\n },\n \"typ\": {\n \"column_types\": [\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"Oid\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": {\n \"Array\": \"MzAclItem\"\n },\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n }\n ],\n \"keys\": [\n [\n 0\n ],\n [\n 1\n ]\n ]\n }\n }\n },\n \"scalars\": [\n {\n \"CallBinary\": {\n \"func\": \"DivInt32\",\n \"expr1\": {\n \"Literal\": [\n {\n \"data\": [\n 42,\n 1\n ]\n },\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ]\n },\n \"expr2\": {\n \"Literal\": [\n {\n \"data\": [\n 41\n ]\n },\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ]\n }\n }\n }\n ]\n }\n },\n \"outputs\": [\n 15\n ]\n }\n }\n },\n \"optimized\": {\n \"global\": {\n \"text\": \"t75:\\n Finish limit=1 output=[#0]\\n ArrangeBy keys=[[#0]]\\n ReadGlobalFromSameDataflow t74\\n\\nt74:\\n Map (error(\\\"division by zero\\\"))\\n Project ()\\n ReadIndex on=mz_sources mz_sources_ind=[*** full scan ***]\\n\\nTarget cluster: mz_catalog_server\\n\",\n \"json\": {\n \"plans\": [\n {\n \"id\": \"t75\",\n \"plan\": {\n \"ArrangeBy\": {\n \"input\": {\n \"Get\": {\n \"id\": {\n \"Global\": {\n \"Transient\": 74\n }\n },\n \"typ\": {\n \"column_types\": [\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ],\n \"keys\": []\n },\n \"access_strategy\": \"SameDataflow\"\n }\n },\n \"keys\": [\n [\n {\n \"Column\": 0\n }\n ]\n ]\n }\n }\n },\n {\n \"id\": \"t74\",\n \"plan\": {\n \"Map\": {\n \"input\": {\n \"Project\": {\n \"input\": {\n \"Get\": {\n \"id\": {\n \"Global\": {\n \"System\": 456\n }\n },\n \"typ\": {\n \"column_types\": [\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"Oid\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": {\n \"Array\": \"MzAclItem\"\n },\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n }\n ],\n \"keys\": [\n [\n 0\n ],\n [\n 1\n ]\n ]\n },\n \"access_strategy\": {\n \"Index\": [\n [\n {\n \"System\": 697\n },\n \"FullScan\"\n ]\n ]\n }\n }\n },\n \"outputs\": []\n }\n },\n \"scalars\": [\n {\n \"Literal\": [\n {\n \"Err\": \"DivisionByZero\"\n },\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ]\n }\n ]\n }\n }\n }\n ],\n \"sources\": []\n }\n },\n \"fast_path\": {\n \"text\": \"Explained Query (fast path):\\n Finish limit=1 output=[#0]\\n Project (#15)\\n Map (error(\\\"division by zero\\\"))\\n ReadIndex on=mz_catalog.mz_sources mz_sources_ind=[*** full scan ***]\\n\\nTarget cluster: mz_catalog_server\\n\",\n \"json\": {\n \"plans\": [\n {\n \"id\": \"Explained Query (fast path)\",\n \"plan\": {\n \"PeekExisting\": [\n {\n \"System\": 456\n },\n {\n \"System\": 697\n },\n null,\n {\n \"mfp\": {\n \"expressions\": [\n {\n \"Literal\": [\n {\n \"Err\": \"DivisionByZero\"\n },\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ]\n }\n ],\n \"predicates\": [],\n \"projection\": [\n 15\n ],\n \"input_arity\": 15\n }\n }\n ]\n }\n }\n ],\n \"sources\": []\n }\n }\n }\n },\n \"insights\": {\n \"imports\": {\n \"s697\": {\n \"name\": {\n \"schema\": \"mz_catalog\",\n \"item\": \"mz_sources_ind\"\n },\n \"type\": \"compute\"\n }\n },\n \"fast_path_clusters\": {},\n \"fast_path_limit\": null,\n \"persist_count\": []\n },\n \"cluster\": {\n \"name\": \"mz_catalog_server\",\n \"id\": {\n \"System\": 2\n }\n },\n \"redacted_sql\": \"SELECT '<REDACTED>' / '<REDACTED>' FROM [s456 AS mz_catalog.mz_sources] LIMIT '<REDACTED>'\"\n}","code":"MZ001","severity":"notice"}} | |||
{"type":"Notice","payload":{"message":"{\n \"plans\": {\n \"raw\": {\n \"text\": \"Finish limit=1 output=[#0]\\n Project (#15)\\n Map ((1 / 0))\\n Get mz_catalog.mz_sources\\n\\nTarget cluster: mz_catalog_server\\n\",\n \"json\": {\n \"Project\": {\n \"input\": {\n \"Map\": {\n \"input\": {\n \"Get\": {\n \"id\": {\n \"Global\": {\n \"System\": 456\n }\n },\n \"typ\": {\n \"column_types\": [\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"Oid\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": {\n \"Array\": \"MzAclItem\"\n },\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n }\n ],\n \"keys\": [\n [\n 0\n ],\n [\n 1\n ]\n ]\n }\n }\n },\n \"scalars\": [\n {\n \"CallBinary\": {\n \"func\": \"DivInt32\",\n \"expr1\": {\n \"Literal\": [\n {\n \"data\": [\n 42,\n 1\n ]\n },\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ]\n },\n \"expr2\": {\n \"Literal\": [\n {\n \"data\": [\n 41\n ]\n },\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ]\n }\n }\n }\n ]\n }\n },\n \"outputs\": [\n 15\n ]\n }\n }\n },\n \"optimized\": {\n \"global\": {\n \"text\": \"t75:\\n Finish limit=1 output=[#0]\\n ArrangeBy keys=[[#0]]\\n ReadGlobalFromSameDataflow t74\\n\\nt74:\\n Project (#15)\\n Map (error(\\\"division by zero\\\"))\\n ReadIndex on=mz_sources mz_sources_ind=[*** full scan ***]\\n\\nTarget cluster: mz_catalog_server\\n\",\n \"json\": {\n \"plans\": [\n {\n \"id\": \"t75\",\n \"plan\": {\n \"ArrangeBy\": {\n \"input\": {\n \"Get\": {\n \"id\": {\n \"Global\": {\n \"Transient\": 74\n }\n },\n \"typ\": {\n \"column_types\": [\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ],\n \"keys\": []\n },\n \"access_strategy\": \"SameDataflow\"\n }\n },\n \"keys\": [\n [\n {\n \"Column\": 0\n }\n ]\n ]\n }\n }\n },\n {\n \"id\": \"t74\",\n \"plan\": {\n \"Project\": {\n \"input\": {\n \"Map\": {\n \"input\": {\n \"Get\": {\n \"id\": {\n \"Global\": {\n \"System\": 456\n }\n },\n \"typ\": {\n \"column_types\": [\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"Oid\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": false\n },\n {\n \"scalar_type\": {\n \"Array\": \"MzAclItem\"\n },\n \"nullable\": false\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n },\n {\n \"scalar_type\": \"String\",\n \"nullable\": true\n }\n ],\n \"keys\": [\n [\n 0\n ],\n [\n 1\n ]\n ]\n },\n \"access_strategy\": {\n \"Index\": [\n [\n {\n \"System\": 697\n },\n \"FullScan\"\n ]\n ]\n }\n }\n },\n \"scalars\": [\n {\n \"Literal\": [\n {\n \"Err\": \"DivisionByZero\"\n },\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ]\n }\n ]\n }\n },\n \"outputs\": [\n 15\n ]\n }\n }\n }\n ],\n \"sources\": []\n }\n },\n \"fast_path\": {\n \"text\": \"Explained Query (fast path):\\n Finish limit=1 output=[#0]\\n Project (#15)\\n Map (error(\\\"division by zero\\\"))\\n ReadIndex on=mz_catalog.mz_sources mz_sources_ind=[*** full scan ***]\\n\\nTarget cluster: mz_catalog_server\\n\",\n \"json\": {\n \"plans\": [\n {\n \"id\": \"Explained Query (fast path)\",\n \"plan\": {\n \"PeekExisting\": [\n {\n \"System\": 456\n },\n {\n \"System\": 697\n },\n null,\n {\n \"mfp\": {\n \"expressions\": [\n {\n \"Literal\": [\n {\n \"Err\": \"DivisionByZero\"\n },\n {\n \"scalar_type\": \"Int32\",\n \"nullable\": false\n }\n ]\n }\n ],\n \"predicates\": [],\n \"projection\": [\n 15\n ],\n \"input_arity\": 15\n }\n }\n ]\n }\n }\n ],\n \"sources\": []\n }\n }\n }\n },\n \"insights\": {\n \"imports\": {\n \"s697\": {\n \"name\": {\n \"schema\": \"mz_catalog\",\n \"item\": \"mz_sources_ind\"\n },\n \"type\": \"compute\"\n }\n },\n \"fast_path_clusters\": {},\n \"fast_path_limit\": null,\n \"persist_count\": []\n },\n \"cluster\": {\n \"name\": \"mz_catalog_server\",\n \"id\": {\n \"System\": 2\n }\n },\n \"redacted_sql\": \"SELECT '<REDACTED>' / '<REDACTED>' FROM [s456 AS mz_catalog.mz_sources] LIMIT '<REDACTED>'\"\n}","code":"MZ001","severity":"notice"}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this test useful for anyone? I have no idea what it was testing, what's changed in it, or if it's correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the last paragraph of the PR description.
89b631b
to
7a3157d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great, and it has a positive performance impact. It reduces the average query latency. As expected, the tail latency stays largely unchanged.
// whole optimizer pipeline, but just a tiny subset of it. (But we'll need to run | ||
// `create_fast_path_plan` later again, because, e.g., running `LiteralConstraints` is still | ||
// ahead of us.) | ||
let use_fast_path_optimizer = match create_fast_path_plan( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be a bit wasteful to create the fast path plan here and then run through the optimization pipeline again? Especially when we already here determine it's a constant or simple peek, we could probably use the result instead of running the optimizer at all? I'm fine with merging as-is, but we should take note.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run through the optimization pipeline again
Do you mean "again", in the sense that we have already run the optimize_mir_local
optimizer above? Running optimize_dataflow
below still does meaningful things, i.e., not all the things that we want to happen for fast path peeks are done in optimize_mir_local
. The most important thing that we still need to do is LiteralConstraints
, which can turn a full index scan into an index lookup. (And then we need CanonicalizeMfp
again, because LiteralConstraints
can leave the MFP in a non-canonical form. And we also need FoldConstants
again, because LiteralConstraints
might have got us a constant, if it found contradicting literal constraints. And ReduceScalars
is just good to run all the time for various obscure reasons, see e.g. the last paragraph of 3. here.)
(Note that create_fast_path_plan
doesn't run any optimizations. It should be a cheaper thing than even the fast_path_optimizer
.)
A possible improvement that we could still do here is to not even run the optimize_mir_local
for peeks that look like they are gonna be fast path, but just run fast_path_optimizer
on the initial MIR plan. I tried to do this when originally working on this PR, and ran into some minor troubles, which I can't recall at the moment (maybe it was just that it would have required a somewhat larger refactoring of sequencing), so I just decided to keep this PR simple. But if the time comes when the optimizer run time becomes a significant portion of the run time of fast path peeks again, due to optimizations in other parts of the system, then I'm happy to revisit this. (I think I did some measurements with just hackily commenting out optimize_mir_local
, and it showed that the run time of optimize_mir_local
is significantly lower than the full optimizer pipeline, which this PR is eliminating.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've created an issue for eliminating optimize_mir_local
from fast path peeks, so that this doesn't get forgotten: https://github.com/MaterializeInc/materialize/issues/29501
7a3157d
to
72a8483
Compare
Edit: I've done some benchmarking in release mode now, and the performance improvement in release mode is quite small (see here), so I'm now not sure whether the complexity that this PR is introducing is actually worth it. We might want to hold off on merging this, so taking back to draft.
Edit 2: Now with @antiguru's other improvements, the speedup from this might be worth it. So marking ready for review.
This PR is about running only a tiny subset of the optimizer for fast path queries. The approach is that when we are about to call
optimize_dataflow
, we quickly check whether we already have a simple enough plan that it would go to the fast path. If yes, then we call a special, tiny optimizer instead of the whole thing.Commits:
create_fast_path_plan
, we haven't yet eliminatedmz_now()
, andcreate_fast_path_plan
might choke on that. This commit makes us ignore this error and just not take the fast path optimizer.There is a tricky reason for the change in
src/environmentd/tests/testdata/http/ws
, I'm investigating the details now. It's probably not an issue, since the actual fast path plan that we arrive at is the same, just the non-fast-path plan is slightly different, but even there the physical plan would be the same, the diff is only in the MIR plan. Edit: Figured out exactly what's happening: The reason for the diff is that in the normal optimizer pipeline we callProjectionPushdown
, which puts the MFP in a non-canonical state (Project
at the bottom), which is usually canonicalized byCanonicalizeMfp
, but hereCanonicalizeMfp
is not able to put back theProject
at the top for this query, because it callsextract_non_errors_from_expr_mut
, which gives up when it sees the error in this MFP. But this doesn't matter becausecreate_fast_path_plan
then calls a different MFP extraction method:extract_from_expression
, so the fast path plan will have the same MFP as before. (AndProjectionPushdown
is not doing anything useful here, because MFP optimization will take care of the projections.)Motivation
https://materializeinc.slack.com/archives/C0761MZ3QD9/p1722545887422389?thread_ts=1722544237.348389&cid=C0761MZ3QD9
In my simplistic preliminary benchmarks (running a SELECT with a simple literal constraint on an indexed view locally in debug mode), I'm seeing a run time reduction from 20ms to 11ms. Edit: The difference was smaller in release mode. Edit 2: We should benchmark again, now that other parts of the system have been optimized in the meantime.
Tips for reviewer
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.