[NEW] OBSERVE command for enhanced observability in Valkey #1167
Comments
I like the concepts and directionality here. I think it would be profitable to split this into two subsections. One section would be to get more specific on the events that would feed into the observability framework. A second section would focus on the representation and processing of those events. I mention this because there's quite a bit of overlap in the functionality of the second section and any implementation of a timestream processing module. In other words, can we get both timestream processing of the observability event stream (part 1) and more generic timestream data processing capabilities in the same development effort, or conversely split the development effort into two parts that cooperate?
@allenss-amazon Thank you for the insightful feedback! We’re definitely on the same page about maintaining flexibility for future enhancements, particularly around timestream processing and event-based observability. For now, our approach is to keep the implementation streamlined and tailored to the current state of Valkey’s codebase. By initially focusing on passing command execution details directly as the input to our observability pipeline, we’ll establish a foundation that can then be adapted for timestream events in the future. If and when we introduce events, we could replace or extend the input system for our pipeline to accommodate event-based data, allowing for similar processing while offering additional data input types. Keeping a direct dependency between command execution and the observability implementation will help us maintain a simple architecture and deliver a more focused solution in the short term. I’ve also included a simplified diagram of this first-approach implementation to illustrate the flow we envision. I’d love to hear your thoughts on whether this approach makes sense. What do you think? Thanks again for the input—we really appreciate the foresight around extensibility here!
I think it's important to provide an architecture that's lightweight enough that it could be enabled nearly everywhere -- maybe even by default. That means that the data collection needs to be fast. I think it's important to get more specific about the kinds of data events you're going to rely on and how this data is generated.
Absolutely. Performance is one of the primary goals. My initial approach is to check whether any observe functionality needs to do something after a command executes. Currently, this creates an observeUnit struct:

```c
typedef struct observeUnit {
    int command_id;                  /* ID of the executed command */
    robj **argv;                     /* command arguments */
    size_t argv_len;                 /* number of arguments */
    size_t response_size_bytes;      /* size of the reply sent to the client */
    long long duration_microseconds; /* command execution time */
} observeUnit;
```

If the observe functionality is disabled or unconfigured, there’s no impact on command execution performance—it’s just a simple check of a boolean flag. If it’s enabled, I construct this unit on the stack and attempt to process it through the pipeline. (Note that the pipeline processing part hasn’t been implemented yet.)
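A minimal sketch of that fast-path gating, assuming hypothetical names (observeEnabled, observePostCommand, observeProcessUnit are illustrative, not existing Valkey APIs):

```c
#include <stddef.h>

extern int observeEnabled;                          /* toggled by OBSERVE START/STOP (hypothetical) */
extern void observeProcessUnit(observeUnit *unit);  /* pipeline entry point, not yet implemented */

void observePostCommand(int command_id, robj **argv, size_t argv_len,
                        size_t response_size_bytes,
                        long long duration_microseconds) {
    /* Fast path: when no pipeline is enabled, the only cost is this branch. */
    if (!observeEnabled) return;

    /* Slow path: build the unit on the stack and hand it to the pipeline. */
    observeUnit unit = {
        .command_id = command_id,
        .argv = argv,
        .argv_len = argv_len,
        .response_size_bytes = response_size_bytes,
        .duration_microseconds = duration_microseconds,
    };
    observeProcessUnit(&unit);
}
```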
At this early stage, the struct feels lightweight and execution should be fast, but I anticipate that 'data gathering' may need additional complexity. @allenss-amazon What do you think about my approach? Do you have any suggestions for enhancing 'data gathering' for observability?
I'm skeptical of the "one size fits all" interface style here. For example, using the end of call as your insertion point will miss all client commands that block. But that clearly can be fixed by tapping into the unblocking machinery also. This also misses all non-command related activity -- cluster bus, evictions, etc. Which I think could also be quite interesting. I'd propose that we implement a mechanism to self-monitor the existing metrics in the core. For example, having a periodic task to execute an "info" command and collect the various values into samples to feed into the machinery could also be quite valuable. The automatic info scheme has a low enough overhead (you adjust the frequency of collection to match your CPU wallet ;-)) that it could reasonably be left on in any production environment. It also creates an incentive to increase instrumentation in the core.
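A rough sketch of what such periodic INFO self-sampling could look like. Every name here is hypothetical (renderInfoText stands in for the core's INFO renderer, observeFeedSample for the pipeline's sample sink); Valkey's actual internal APIs differ:

```c
#include <stdlib.h>
#include <string.h>

extern char *renderInfoText(void);                 /* stand-in for the core's INFO renderer */
extern void observeFeedSample(const char *name, double value);

/* Called from a periodic timer every sample interval: render the INFO
 * text, parse "metric:value" lines, and emit one sample per metric. */
void observeInfoSamplerTick(void) {
    char *info = renderInfoText();
    for (char *line = strtok(info, "\r\n"); line; line = strtok(NULL, "\r\n")) {
        char *sep = strchr(line, ':');
        if (!sep) continue;                        /* skip section headers like "# Memory" */
        *sep = '\0';
        char *end;
        double value = strtod(sep + 1, &end);
        if (end == sep + 1) continue;              /* skip non-numeric fields */
        observeFeedSample(line, value);
    }
    free(info);
}
```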
The primary goal is to design observability pipelines with enough flexibility to support a variety of input sources. In my initial implementation, I focused on integrating the first input source as 'command executions.' However, with the right design, we should be able to expand this model to support additional sources or even rework the data push mechanism to accommodate a new system based on event streams. Here’s a diagram illustrating this approach: the concept is that multiple sources can feed into the Observe Units Processor, which will process them through the pipelines. Additionally, the Observe Units Processor should allow us to easily implement new input sources. (I may need some guidance on how best to structure this in the code.)
You’re right that the initial input source doesn’t cover several internal cluster activities (and blocking commands), but the design should allow us to extend the list of sources in future iterations. Structuring this properly may require some guidance, particularly to ensure compatibility with all potential Valkey input streams. Output from the INFO command is already accessible to clients, so I wonder—if we limited ourselves to just these results, would it be valuable enough to build out the entire observability pipeline? It’s feasible for users to set up a custom client to periodically fetch INFO data and compute time-series metrics. To bring real value to the observability pipeline, I aimed to start with something not readily accessible in the current Valkey feature set, while aligning with Google's 4 Golden Signals. Hence, I opted to implement this at the command level (though we could debate whether this approach is ideal). What do you think? Are there any other arguments for limiting ourselves to the INFO results besides simplicity and incentives?
@allenss-amazon said:
This is an excellent idea. @mwarzynski said:
@mwarzynski I like your approach to Allen's concept … IIRC, you want to split (a) the data collection (which can include filters for efficiency) and (b) the trigger from (c) what you term "Observe Units Processing". Then the "data collection" can come from (i) command post-processing like you're prototyping, or (ii) be triggered by another command's execution, including but not limited to the INFO command, or (iii) even timers for fully internal sampling. This frees us up to focus on multiple different parts of the implementation. We can discuss the programmable pipeline implementation while keeping the "data collection" part open for significant extension as more use cases emerge. @mwarzynski said:
I think @allenss-amazon's idea is not to limit us to INFO but rather to make INFO command execution able to feed the processing pipeline as a data source, in addition to what you have already been working on. I gather from Allen's comment that the flexibility to feed data (e.g. INFO or other commands) would motivate developers to feed more data into the observability pipeline and potentially have it "always on". Overall, I see the conceptual two stages, "collection" and "programmable processing", as a powerful combination. Also note that pulling detailed command results (e.g. from INFO) out to the client is very expensive, which might limit use cases. However, as Allen hints, having verbose output instead go into a server-side pipeline, especially with filters, could make it feasible to run "always on" observability. @allenss-amazon Please correct as appropriate … I don't want to go in a direction that you didn't intend. Overall, I think we have immediate use cases for memory sizing that require data collection from observePostCommand, so I'd appreciate it if we can start with that and add support for (x) INFO (among other commands) as input and (y) time/event/notification triggers as a second step. Mostly, I just want to lock down the initial scope so that @mwarzynski can start work on the programmable pipeline part, which is where we will discover how tricky this business is.
Yes, multiple sources of data feeding the analysis engine. I believe a time-series processing module is in Valkey's future (likely with fidelity to the Redis time series module) and that this observability proposal should use that module as its analysis engine rather than something unique. Thus the discussion could bifurcate into two threads: one about time-series processing, and this thread, which focuses on data collection mechanisms.

I proposed a source of data, which is a periodic self-sampling of the "INFO" metrics. An initial implementation of this could trivially be built by having a periodic timer recursively invoke the INFO command and parse the results. Long term, I envision this as driving a re-architecting of stat collection within the Valkey universe to avoid the serialize/deserialize overhead of this approach, gaining efficiency and therefore usability. This would also provide a degree of uniformity in format and semantics for info stats as well as a reflection mechanism (i.e., COMMAND GETKEYSANDFLAGS for info stats) that could drive more generic tools like a Grafana connector.

@mwarzynski proposed a source of data which is to tap into the command processing and invoke a Lua script with the command, its arguments, and execution time. This is simple, but expensive in that it's going to duplicate a lot of the work that the core already does for you. For example, rather than a single tap-in point for all commands, why not have a per-command tap-in point? I mean the ability to establish a separate Lua script to be invoked for each command. This would avoid needless Lua execution for commands that aren't of interest. Also, the per-command Lua scripts run faster because the command parsing is already completed.

With that thought in mind, there are other potential tap-in points. For example, leveraging the ACL infrastructure would allow you to tap into core code that validates read and write access for keys independent of the commands, again something that reduces redundant parsing overhead. I'm sure there will be more points that would prove profitable.

If we're in a world of multiple data sources and Lua scripts, then we should think about how those different Lua environments interact. Is there a single global Lua environment for all of OBSERVE or is there a need for multiple environments?
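One way to sketch the per-command tap-in dispatch, with all names hypothetical: keep a table mapping command name to an optional Lua script, so commands with no registered script pay only a flag check and a lookup:

```c
#include <stddef.h>

typedef struct robj robj;              /* opaque: the server's object type */
typedef struct luaScript luaScript;    /* opaque: a compiled Lua script handle */

extern int observeScriptCount;         /* number of registered per-command scripts */
extern luaScript *observeScriptForCommand(const char *cmd_name); /* NULL if none */
extern void observeRunScript(luaScript *script, robj **argv, size_t argc,
                             long long duration_us);

void observeCommandTap(const char *cmd_name, robj **argv, size_t argc,
                       long long duration_us) {
    if (observeScriptCount == 0) return;   /* fast path: nothing registered */
    luaScript *script = observeScriptForCommand(cmd_name);
    if (script == NULL) return;            /* no tap for this command */
    /* The command is already parsed, so the script gets structured args
     * without re-parsing overhead. */
    observeRunScript(script, argv, argc, duration_us);
}
```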
TL;DR: I propose improving observability in Valkey, e.g. with built-in RED (Rate, Errors, Duration) time-series metrics.
Overview
This proposal outlines a new OBSERVE command to improve Valkey’s observability capabilities. By enabling advanced time-series metrics, custom gathering pipelines, and in-server data aggregation, OBSERVE will equip Valkey users with first-class monitoring commands for granular insight into server behavior and performance.
Background
After discussions with Irfan Ahmad, an attendee at the '24 Valkey Summit, I developed this initial proposal to introduce native observability pipelines within Valkey. Currently, Valkey lacks comprehensive, customizable observability tools embedded directly within the server, and this proposal aims to fill that gap.
Note: This proposal is a work in progress. Feedback on the overall approach and any preliminary design concerns would be greatly appreciated.
Current Observability Limitations in Valkey
Currently, Valkey’s observability relies on commands like MONITOR, SLOWLOG, and INFO. While useful, these commands have limitations:
- MONITOR: Streams every command, generating high data volume that may overload production environments.
- SLOWLOG: Logs only commands exceeding a set execution time, omitting quick operations and general command patterns.
- INFO: Provides server statistics but lacks detailed command- and key-specific insights.
These commands lack the flexibility for in-depth, customizable observability exposed directly within the valkey-server instance, such as filtering specific events, sampling data, executing custom processing steps, or aggregating metrics over time windows.
Feature proposal
Problem statement and goals
The proposed OBSERVE command suite will bring observability as a core Valkey feature. Through user-defined “observability pipelines,” Valkey instances can produce detailed insights in a structured, efficient manner. These pipelines will be customizable to support diverse use cases, providing users with foundational building blocks for monitoring without overwhelming server resources. This new functionality could be enhanced by integration with tools like Prometheus and Grafana for visualization or alerting, though its primary purpose is fully customizable in-server analysis.
Proposed solution -- Commands
The OBSERVE command set introduces the concept of observability pipelines — user-defined workflows for collecting, filtering, aggregating, and storing metrics.
Core Commands
OBSERVE CREATE <pipeline_name> <configuration>
Creates an observability pipeline with a specified configuration. Configuration details, specified in the next section, define steps such as filtering, partitioning, sampling, and aggregation.
The pipeline and its configuration are persisted only in runtime memory (i.e. the user needs to re-create the pipeline after a server restart).
OBSERVE START <pipeline_name>
Starts data collection for the specified pipeline.
OBSERVE STOP <pipeline_name>
Stops data collection for the specified pipeline.
OBSERVE DELETE <pipeline_name>
Deletes the pipeline and its configuration.
OBSERVE RETRIEVE <pipeline_name>
Retrieves collected data. Alternatively, GET could potentially serve this function, but further design discussion is needed.
OBSERVE LOADSTEPF <step_name> <lua_code>
Allows defining custom processing steps in Lua, for cases where the built-in steps do not meet requirements.
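A hypothetical example of registering a custom step (the exact Lua API surface for steps is not yet designed; the function shape below is only illustrative). The step name matches the map_top_keys step used in the hot-key example later:

```
OBSERVE LOADSTEPF map_top_keys "
  -- Illustrative Lua: given a table of key -> count pairs,
  -- return the n entries with the highest counts.
  return function(counts, n)
    local items = {}
    for key, count in pairs(counts) do items[#items + 1] = {key, count} end
    table.sort(items, function(a, b) return a[2] > b[2] end)
    local top = {}
    for i = 1, math.min(n, #items) do top[i] = items[i] end
    return top
  end
"
```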
Pipeline configuration
Pipelines are configured as chains of data processing stages, including filtering, aggregation, and output buffering. The format is similar to Unix piping.
Key stages in this pipeline model include:
- filter(f): Filters events based on defined conditions (e.g., command type).
- partition(f): Partitions events according to a function (e.g., by key prefix).
- sample(f): Samples events at a specified rate.
- map(f): Transforms each event with a specified function.
- window(f): Aggregates data within defined time windows.
- reduce(f): Reduces data over a window via an aggregation function.
- output(f): Directs output to specified sinks.
Example configuration syntax:
Output
The goal is to capture time-series metrics within the defined pipeline outputs; e.g., for the pipeline above, the output would be structured as follows:
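One possible shape (illustrative only) is a ring buffer of timestamped, per-partition samples under the output key:

```
cmd_counts (buffer_size=1440):
  window_start=2024-11-01T12:00:00Z  { "user:": 312, "session:": 121, ... }
  window_start=2024-11-01T12:01:00Z  { "user:": 298, "session:": 140, ... }
  ...
```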
It remains uncertain whether storing output data in a format compatible with direct retrieval via GET (or another existing command) will be feasible. Consequently, we might need to introduce an OBSERVE RETRIEVE <since_offset> command for clients polling result data. This command would provide the items produced since the given offset, the current offset, and a lag_detected flag (see the sketch below). Here, offset represents the sequence number of items produced by the pipeline, including any items removed due to buffer constraints. This approach allows clients to poll for results while adjusting their polling frequency based on the lag_detected flag. If lag_detected is true, clients would be advised to increase polling frequency to reduce data loss.
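A sketch of what the reply might contain (field names and layout are hypothetical):

```
> OBSERVE RETRIEVE cmd_counts_by_prefix 1500
1) next_offset   -> 1523
2) lag_detected  -> false   (true if items were evicted before being read)
3) items         -> [ (offset=1501, window_start=..., value=...), ... ]
```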
Use-Case Examples
Below are examples of how the proposed OBSERVE command and pipeline configurations could be used to address various observability needs.
Counting Specific Commands Per Minute with Buffer Size
Use Case: Count the number of GET commands executed per minute.
Pipeline Creation:
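A hypothetical configuration in the illustrative syntax used above:

```
OBSERVE CREATE get_per_minute
  "filter(command == 'GET') | window(60s) | reduce(count) | output(timeseries('get_command_count', 1440))"
```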
Explanation: This pipeline filters for GET commands, counts them per minute, and stores the counts in a time-series key get_command_count with a buffer size of 1440 (e.g., one day's worth of minute-level data).
Hot Key Analysis
Use Case: Identify and monitor the most frequently accessed keys within a certain time window, allowing for proactive load management and identification of potential bottlenecks.
Pipeline Creation:
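An illustrative configuration (syntax and stage arguments are assumptions):

```
OBSERVE CREATE hot_keys_pipeline
  "filter(command == 'GET') | sample(0.005) | partition(key) | window(60s) | reduce(count) | map_top_keys(10) | output(timeseries('hot_keys', 60))"
```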
Explanation: This pipeline samples 0.5% of GET commands, partitions events by the accessed key, and aggregates their counts in one-minute intervals.
The map_top_keys(10) step then selects the top 10 most frequently accessed keys in each interval along with the access counts.
The result is stored as a time-series in hot_keys with a buffer size of 60, retaining one hour of hot key data.
Average Latency Per Time Window with Buffer
Use Case: Monitor average latency of SET commands per minute.
Pipeline Creation:
Explanation: This pipeline filters for SET commands, extracts their latency, aggregates the average latency every minute, and stores it with a buffer size of 720 (e.g., 12 hours of minute-level data).
Client Statistics
Use Case: Gather command counts per client for GET and SET commands, sampled at 5%.
Pipeline Creation:
Explanation: This pipeline filters for GET and SET commands, samples 5% of them, extracts client information, counts commands per client every minute, and stores the data under client_stats with a buffer size of 1440.
Error Tracking
Use Case: Monitor the number of errors occurring per minute.
Pipeline Creation:
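An illustrative configuration (the event_type predicate is a hypothetical name):

```
OBSERVE CREATE error_tracking
  "filter(event_type == 'error') | window(60s) | reduce(count) | output(timeseries('total_errors', 1440))"
```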
Explanation: This pipeline filters events of type 'error', counts them every minute, and stores the totals in total_errors with a buffer size of 1440.
TTL Analysis
Use Case: Analyze the average TTL of keys set with the SETEX command per minute.
Pipeline Creation:
Explanation: This pipeline filters for SETEX commands, extracts the TTL values, calculates the average TTL every minute, and stores it in average_ttl with a buffer size of 1440.
Distribution of Key and Value Sizes
Use Case: Create a histogram of value sizes for SET commands.
Pipeline Creation:
Explanation: This pipeline filters for SET commands, extracts the size of the values, aggregates them into histogram buckets every minute, and stores the distributions with a buffer size of 1440.
Feedback Request
Feedback is requested on the following points:
Does the OBSERVE command align with your vision for Valkey’s observability?
Let's first reach consensus on the feature scope. If the answer is yes, we can discuss the designs.
I am ready to commit to building this feature as soon as the designs are accepted, even in draft form.
Thank you for your time and consideration. I look forward to discussing this proposal further.