Tables #564
✅ Deploy Preview for salsa-rs canceled.
CodSpeed Performance Report: merging #564 will improve performance by ×2.7.
Wow, very impressive (I haven't read through the code yet). One question to check my understanding:
By "salsa struct", do you mean both inputs and tracked structs, or only tracked structs?
It's interesting that some of the other benchmarks regress.
#[allow(type_alias_bounds)]
type ArcMemo<'lt, C: Configuration> = ArcSwap<Memo<<C as Configuration>::Output<'lt>>>;
pub(super) type ArcMemo<'lt, C: Configuration> = Arc<Memo<<C as Configuration>::Output<'lt>>>;
What's the remaining use case for `ArcMemo`? Aren't we now allocating the data in the pages, or do the pages only store references to the arc?
In this PR, the pages store a reference to the `Arc<Memo>`. I was debating whether we could replace it with an `Id` of its own.
Hmm, my understanding was that the main change in this PR is that memos are no longer stored in `Arc`s; instead they're stored directly in `Page`s to get an arena-like allocation. I remembered that it is important to you that all page headers have the same layout. Is this the reason why we're keeping an `Arc` (or `Id`), or are there other reasons for not storing the value in the pages? Or am I misunderstanding the change?
Not exactly. I need to write up some docs on the layout. The big advantage in this PR though is that we do (on the fast path) two array lookups instead of a hash. But the memoized data is still stored in "arcs".
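A minimal sketch of what "two array lookups instead of a hash" could look like, assuming an id packs a page index and a slot index. The bit split, the constant `PAGE_LEN_BITS`, and the method names are illustrative assumptions, not the layout this PR actually uses (which, as noted, still needs docs):

```rust
// Hypothetical sketch: names, constants, and layout are illustrative only.
const PAGE_LEN_BITS: u32 = 10;
const PAGE_LEN: usize = 1 << PAGE_LEN_BITS;

/// An id that encodes (page index, slot index) in a single integer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Id(u32);

/// A fixed-capacity page of slots.
struct Page<T> {
    slots: Vec<T>,
}

/// A table is a growable list of pages.
struct Table<T> {
    pages: Vec<Page<T>>,
}

impl<T> Table<T> {
    fn new() -> Self {
        Table { pages: Vec::new() }
    }

    /// Stores `value` in the next free slot and returns its id.
    fn push(&mut self, value: T) -> Id {
        // Start a new page when there is none yet or the last one is full.
        if self.pages.last().map_or(true, |p| p.slots.len() == PAGE_LEN) {
            self.pages.push(Page { slots: Vec::with_capacity(PAGE_LEN) });
        }
        let page_index = self.pages.len() - 1;
        let page = &mut self.pages[page_index];
        let slot_index = page.slots.len();
        page.slots.push(value);
        // This sketch ignores overflow of the 32-bit id space.
        Id(((page_index << PAGE_LEN_BITS) | slot_index) as u32)
    }

    /// Fast path: one index into `pages`, one index into `slots`, no hashing.
    fn get(&self, id: Id) -> Option<&T> {
        let raw = id.0 as usize;
        let page_index = raw >> PAGE_LEN_BITS;
        let slot_index = raw & (PAGE_LEN - 1);
        self.pages.get(page_index)?.slots.get(slot_index)
    }
}
```

In this sketch, resolving an id costs a shift, a mask, and two bounds-checked indexing operations, which is the kind of trade the comment above describes: slightly more work than following a raw pointer, but no centralized hash lookup.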
You might already be aware of it, but constant queries no longer compile:
    /// Salsa query to get the builtins scope.
    ///
    /// Can return None if a custom typeshed is used that is missing `builtins.pyi`.
    #[salsa::tracked]
    pub(crate) fn builtins_scope(db: &dyn Db) -> Option<ScopeId<'_>> {
        let builtins_name =
            ModuleName::new_static("builtins").expect("Expected 'builtins' to be a valid module name");
        let builtins_file = resolve_module(db, builtins_name)?.file();
        Some(global_scope(db, builtins_file))
    }
Salsa struct = interned | input | tracked
What regresses?
Hmm. Seems like we are missing a test. I did change how constant queries are implemented (it's a bit less efficient now than it was, as it winds up with a silly hashmap; could be fixed, just didn't).
Here: https://codspeed.io/salsa-rs/salsa/branches/nikomatsakis:tables. For whatever reason, the comment doesn't show that change; you have to click into the report.
It looks like CodSpeed considers those "regressions" to be within the margin of error/noise. I don't know how sophisticated the statistics are that CodSpeed uses to make those decisions, but I'm also not sure that we should read too much into results that CodSpeed has decided are not significant.
Gotcha! The thing I'm unsure of is whether criterion considers these to be noise. Does CodSpeed delegate to criterion on that front?
Yeah, good q, I don't know...
There's a settings page, and it defaults to a 10% margin. It also supports commenting on improvements/regressions only.
My hunch is that this branch is slightly worse on some microbenchmarks (among other things, it is using array lookups instead of pure pointers), but much better in more complex scenarios by avoiding centralized hashing.
That makes sense. A single-digit regression on those benchmarks feels fine and is nothing to be concerned about.
Is my understanding correct that this is achieved by using the salsa struct ID as ID into the query cache and is based on the assumption that a query is "dense" (likely to be called for all arguments)?
Not exactly. Each tracked function is assigned a
This also retools a tiny bit how deletion works. We will reuse ids faster than before, actually.
This returns the memos attached to a given slot. Not all slots have affiliated memos, so return an `Option`.
This will allow us to invoke callbacks when deleting a memo with `Arc<dyn Any>` values.
The goal here is that ALL `Id` values come from a `Table`.
We want to ensure that accessing the memos only occurs in revision R after the struct is created.
`Id` values are used in a very tailored way now, no reason to let people construct arbitrary ones.
Rebased at @davidbarsky's request. Once the CI stuff re-runs I'll merge this, presuming it looks good.
This is what I see on CodSpeed:
thanks for rebasing! those numbers look good for now; i think this branch is fine to land for now? things can be improved after.
Hmm, I think there's still the issue with const queries not compiling.
@MichaReiser made an issue: #565
This PR modifies the way we store salsa structs to use a central table/page system. The results of tracked functions are now stored in memos/syncs attached to a particular slot.
This lays the foundation for serialization/deserialization as well as speculative execution by using copy-on-write pages, but it also avoids a ton of centralized hashing. Tracked functions that take a single salsa struct as argument now avoid hashtable lookups altogether.
In the future I'd like to improve the way that tracked functions with >1 argument work by extending the memo table to optionally store a per-struct hashmap, but that can be a separate PR.
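To make "memos attached to a particular slot" concrete, here is a rough sketch under stated assumptions: each tracked function gets a small dense index (called `FunctionIndex` here, a made-up name), and each slot carries a vector of type-erased memos indexed by it. `MemoTable`, `Slot`, and the method signatures are hypothetical and are not taken from the PR:

```rust
use std::any::Any;
use std::sync::Arc;

// Hypothetical: a small dense index identifying one tracked function.
#[derive(Clone, Copy)]
struct FunctionIndex(usize);

/// Per-slot memo storage: one optional, type-erased memo per tracked function.
#[derive(Default)]
struct MemoTable {
    memos: Vec<Option<Arc<dyn Any + Send + Sync>>>,
}

impl MemoTable {
    /// Stores the memo for function `f`, growing the vector if needed.
    fn insert<M: Any + Send + Sync>(&mut self, f: FunctionIndex, memo: Arc<M>) {
        if self.memos.len() <= f.0 {
            self.memos.resize_with(f.0 + 1, || None);
        }
        let erased: Arc<dyn Any + Send + Sync> = memo;
        self.memos[f.0] = Some(erased);
    }

    /// Returns the memo for function `f`, if one exists and has type `M`.
    /// This is a plain vector index, with no hash lookup.
    fn get<M: Any + Send + Sync>(&self, f: FunctionIndex) -> Option<Arc<M>> {
        let entry = self.memos.get(f.0)?.as_ref()?;
        Arc::clone(entry).downcast::<M>().ok()
    }
}

/// A slot holds the struct's own fields plus the memos attached to it.
struct Slot<T> {
    fields: T,
    memos: MemoTable,
}
```

Under these assumptions, a tracked function that takes a single salsa struct resolves the struct's id to a slot and then indexes that slot's memo vector by its own function index. A function with more than one argument has no single slot to hang its memo on, which is presumably why the description defers that case to a per-struct hashmap.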