Add CachedViews #146
Awesome; this is a super cool idea! I’d love to incorporate this. Here are some initial high-level impressions:
I don’t think we need to worry about py2 support; we should officially drop that if we haven’t already.
Thanks for giving this a look!
Will do.
There is a slight confusion in the naming, as I mentioned in the first TODO. The "invalid" state in the code today means that the underlying view itself has been invalidated - the key doesn't exist due to some structural modification. In this case, it is not possible to get any value from the current handle, no matter the template. We have to raise an exception in this case. See unit tests for some examples. Do note that this will be a pretty uncommon occurrence - the code is here for the sake of completeness. For cases when the view does remain valid, we keep the value as How about renaming the current
I did explore this API initially, but as you noticed, it will require keeping a mapping from templates to values inside a view. The major issue here would be hashing the template on every call to
I see that the Build CI has failed on the Py2.7 setup because of the new type annotation syntax. Could you please specify which versions of Python you would like to target going forward? (I'd prefer >=3.6 due to the type syntax changes)
Aha, thank you for explaining! I obviously read your original explanation of the "invalid" state too quickly. This makes sense! I suppose I like the idea of renaming the state. Maybe something like "missing" would be most evocative? (I know this is a stretch, but as someone who thinks about cache coherence protocols in my day job, I tend to think of "invalid" as meaning merely "I don't have the data locally, but I know where to go get it if anybody needs it." Not that this is the normal usage in the real world, of course…) And good point about the CI and Python 2. Yes, the new minimum should be 3.6 (matching beets). I'll work on updating the docs, metadata, and Actions config…
Also removes the ConfigHandleInvalidatedError since it is the same as NotFoundError.
I realized that Do we want to support providing a default value for this case as an arg to
That unification of exception types seems like a good idea to me! If I understand correctly, I think it's probably not necessary to add a second "layer" of defaults… that is, hopefully, you can get all the default behavior you want by using appropriate templates and fallback sources. Maybe put differently, it seems like the "law" should be that
This matches the behaviour of view.get(template)
That does sound very clean. I have updated the code to call
Looking great!! Here are a few comments/thoughts… one of which is somewhat philosophical; I apologize if that one went off the deep end.
confuse/cache.py
Outdated
_INVALID = object()
_MISSING = object()
Maybe some short comments on these to explain what they mean would be helpful?
Added.
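For readers following along, here is a rough sketch of how two sentinels like these could separate the two states discussed earlier in the thread. It assumes that, after the rename, `_INVALID` marks a stale cached value and `_MISSING` marks a view whose key no longer exists. The `Handle` class, its method names, and the use of `KeyError` are hypothetical stand-ins, not the code in `confuse/cache.py`.

```python
# Hypothetical sketch: the sentinel names mirror the diff, the logic is assumed.
_INVALID = object()  # cached value is stale; recompute on the next get()
_MISSING = object()  # underlying key no longer exists; get() must raise


class Handle:
    """Toy handle that caches the result of a zero-argument callable."""

    def __init__(self, compute):
        self._compute = compute
        self._value = _INVALID  # nothing cached yet

    def get(self):
        if self._value is _MISSING:
            # Stand-in for the library's "not found" error (assumption).
            raise KeyError("underlying view no longer exists")
        if self._value is _INVALID:
            self._value = self._compute()
        return self._value

    def invalidate(self):
        """Drop the cached value; the next get() recomputes it."""
        self._value = _INVALID

    def mark_missing(self):
        """Record that the view itself is gone; get() raises from now on."""
        self._value = _MISSING
```

Using unique `object()` instances rather than `None` keeps `None` available as a legitimate cached value.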
# keep track of all the handles from this view
self.handles: List[CachedHandle] = []
# need to cache the subviews to be able to access their handles
self.subviews: Dict[str, CachedConfigView] = {}
I don't know if this is a good idea, but just to try it out here: would it simplify things at all to keep the list of handles only on the RootView? That is, any time any data changes anywhere in the entire configuration, we'd invalidate everything. I know this is possibly a bit more inefficient, but it would centralize the tracking of handles and avoid the need for all views to keep track of their subviews for invalidation purposes.
The reason I'm slightly nervous (perhaps unfoundedly) about the latter thing is that we have never before required that subviews be unique… that is, doing config['foo'] twice in "normal" Confuse could give you back two different objects (equivalent objects, but different ones). These objects were short-lived and disposable; they generally did not stick around. Now we're saying that they must stick around, and in fact, that doing config['foo'] twice always gives you back exactly the same memoized view object. I'm just a tad worried that this means that, if there were some other way of constructing a view object onto the same "place" in the configuration hierarchy (such as via iteration?), it would not enjoy the same invalidation benefits. So this seems to place a few more restrictions on the way that views can be created, stored, and used.
I should mention that I could perhaps try prototyping this idea if it seems plausible to you!
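To make the root-only idea concrete, here is a minimal sketch of what such a prototype might look like. It does not use Confuse itself: `Root`, `View`, and `RootHandle` are hypothetical toy classes over a nested dict, and a "template" is simply a callable applied to the raw value. The point is only that views can stay disposable when the root owns every handle.

```python
class RootHandle:
    """Toy handle; recomputes lazily after the root marks it stale."""

    def __init__(self, view, template):
        self.view = view
        self.template = template
        self.stale = True
        self.value = None

    def get(self):
        if self.stale:
            self.value = self.view.get(self.template)
            self.stale = False
        return self.value


class Root:
    """Hypothetical root: owns the data and the only list of handles."""

    def __init__(self, data):
        self.data = data
        self.handles = []

    def __getitem__(self, key):
        return View(self, (key,))  # a fresh, disposable view every time

    def invalidate_all(self):
        for handle in self.handles:
            handle.stale = True


class View:
    def __init__(self, root, path):
        self.root = root
        self.path = path

    def __getitem__(self, key):
        return View(self.root, self.path + (key,))

    def get(self, template):
        node = self.root.data
        for key in self.path:
            node = node[key]
        return template(node)

    def set(self, value):
        node = self.root.data
        for key in self.path[:-1]:
            node = node[key]
        node[self.path[-1]] = value
        # Any change anywhere invalidates every handle in the configuration.
        self.root.invalidate_all()

    def get_handle(self, template):
        handle = RootHandle(self, template)
        self.root.handles.append(handle)
        return handle
```

In this sketch a `set` through any view object, memoized or not, reaches every handle, at the cost of recomputing values that did not actually change. The root also holds strong references to all handles, so a real version would need some way to drop handles that are no longer used.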
Two things-
@sampsyo Do we want to resume this?
Motivation
view.get(template) in the global scope once when the module loads.
Insight
The main pattern to notice is that we are extremely likely to use the same template to parse any given view over time. Once we decide the specific template for a view, we want to keep calling get with the same template over and over again (in case the value has changed).
Method
This calls for the introduction of a caching layer that can store the values for expensive templates. In most cases, the config gets are much more frequent than sets, so it will benefit from such caching.
API
ConfigView, called the CachedConfigView.
get_handle(self, template) method. Instead of computing the value of the view according to the given template eagerly, it returns a CachedConfigHandle object that can be used as a proxy for the value.
get on a CachedConfigHandle, the template is applied to the view, and the value is stored internally. All subsequent gets can re-use this cached value.
get on them will cause a re-computation of the value.
get directly on a view for cases when the caching doesn't make sense (cheap to compute values).
Example
To view this in action, see the ipc branch of my app.
You can also get an overview of the API from the unit tests.
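For a quick feel of the pattern without opening the branch, here is a self-contained sketch of the API described above. It is a toy model over a nested dict, not the PR's implementation: `CachedView` and `CachedHandle` are simplified stand-ins for `CachedConfigView` and `CachedConfigHandle`, a "template" is just a callable, and the invalidation policy (this view, its subviews, and its ancestors) is an assumption.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class CachedHandle:
    """Proxy for the value of one (view, template) pair."""

    def __init__(self, view: "CachedView", template: Callable[[Any], Any]) -> None:
        self.view = view
        self.template = template
        self._fresh = False
        self._value: Any = None

    def get(self) -> Any:
        if not self._fresh:  # first call, or invalidated by a set
            self._value = self.view.get(self.template)
            self._fresh = True
        return self._value

    def invalidate(self) -> None:
        self._fresh = False


class CachedView:
    def __init__(self, data: Dict[str, Any], path: Tuple[str, ...] = (),
                 parent: Optional["CachedView"] = None) -> None:
        self.data = data
        self.path = path
        self.parent = parent
        self.handles: List[CachedHandle] = []
        self.subviews: Dict[str, "CachedView"] = {}

    def __getitem__(self, key: str) -> "CachedView":
        # Subviews are memoized so their handles stay reachable.
        if key not in self.subviews:
            self.subviews[key] = CachedView(self.data, self.path + (key,), self)
        return self.subviews[key]

    def get(self, template: Callable[[Any], Any]) -> Any:
        """Uncached path: apply the template to the current raw value."""
        node: Any = self.data
        for key in self.path:
            node = node[key]
        return template(node)

    def get_handle(self, template: Callable[[Any], Any]) -> CachedHandle:
        handle = CachedHandle(self, template)
        self.handles.append(handle)
        return handle

    def set(self, value: Any) -> None:
        if not self.path:
            raise ValueError("this sketch only supports set on subviews")
        node: Any = self.data
        for key in self.path[:-1]:
            node = node[key]
        node[self.path[-1]] = value
        self._invalidate_down()
        view = self.parent
        while view is not None:  # ancestors contain the changed value too
            for handle in view.handles:
                handle.invalidate()
            view = view.parent

    def _invalidate_down(self) -> None:
        for handle in self.handles:
            handle.invalidate()
        for sub in self.subviews.values():
            sub._invalidate_down()
```

A typical use would be `handle = config["limits"]["size"].get_handle(expensive_template)` once at module load, then `handle.get()` wherever the value is needed; the template only runs again after a `set` touches that part of the tree.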
TODO
This is just a draft PR, tailored to my specific use-case. I am sure I will have missed some edge cases.
invalidate and unset is misleading, and does not fit the usual meanings - an invalidated cache is usually supposed to be fixable by performing a recomputation, but that is unset here. The invalidated here means the underlying view itself has been rendered invalid, probably because some of its parents' values changed without preserving the structure. Need to find better names for these. (Can't tackle both the hard things in Computer Science at once 😮💨)
get_handle on the RootView?
I will be happy to clean it up after an initial review from your end.
Related
This PR is kind of the other half of #130. Once this is done, I probably will need to solve the "persist changes to disk" problem too.