-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Logging enhancements #19580
Comments
I'll be honest and frank - I'm skeptical we need to change the logging levels at runtime. It sounds like a problem with querying and processing the logs. If we're collecting the logs properly and storing them - why would the level they are stored at matter. What if you deployed more sophisticated log querying solution in the container - it feels like the right way to sort through noisy logs. It feels like maybe Loki -> {Alloy or Promtail} -> Grafana might be an option - I'm not familiar with this space and I might be speaking nonsense but the options look promising to my naive eyes? Writing docs for integrating these tools and adapting our logs for richer metadata parsing would seem like a better use of developer time than runtime changing the log levels? The UI would probably be better and it would be more generally useful and a more familiar workflow for admins and developers. |
Thanks for the feedback. Honest and frank is always good. The issue isn't with querying and processing logs, the issue is that not enough logging is being done for us (AnVIL et al) to debug problems when Galaxy is running in Kubernetes. We just spent months figuring out a problem with data managers (the By tons of logging statements I mean adding
|
This is a planning and discussion issue for enhancements I would like to make to Galaxy's logging system.
TL;DR
I want to add
logger.trace()
calls to the Galaxy codebase, but before doing that I would like to get some infrastructure in place.What is proposed
trace()
method that is already present ingalaxy.util.custom_logging
galaxy.yml
or a separate logging config fileWhat is not proposed
Proof of concept
I have a proof of concept running that implements the first three items (everything but the admin panel). The code is available in the logging-enhancements branch of my Galaxy fork. See this draft PR for details.
However, before I start submitting PRs there are a number of design decisions to resolve.
The trace() method
My proof of concept is based on the collective wisdom of StackOverflow.
If we monkeypatch Python's
logging
library early enough in the code loading process then thetrace()
method is automagically available to all loggers. There are ~750 configured loggers in a running Galaxy instance.This works well in a running Galaxy instance, but may present problems for Galaxy packages meant to be used as libraries. Calling
galaxy.util.logging.addTraceLoggingLevel()
in the package.init.py may be sufficient. The goal is to make it as easy as possible for developers to addtrace()
calls to their code and not require the use of a special logger class.Logging configuration
The proof of concept implementation adds a
logging_levels
section to thegalaxy.yml
file. This could also be done in a separate configuration file. The purpose of thelogging_levels
section is to provide an easier way to set the logging levels for groups of loggers in a single statement.Changes at runtime
Adding
trace()
level logging statements has the potential to blow up log files and Galaxy should not be configured to useTRACE
as the default logging level. Therefore we will need some way to enableTRACE
level logging at runtime. Options include, but are not limited to:These are not mutually exclusive options, but in some situations, e.g. AnVIL, users may not have access to the configuration files.
The proof of concept provides a simple API that allows the level to be changed at runtime. Adding a config watcher can be added at a later date if needed.
For brevity I have omitted setting the API key. Only administrators can list or change the logging levels.
The first
GET
method returns JSON containing information for all currently configured loggers. The secondGET
method returns information for a single logger or all loggers in a particular package. ThePOST
method can be used to set the logging level for a single logger or all the loggers in a package.TODO
Authorization
All API endpoints specify
require_admin=True
in the@router
decorator. Is this sufficient? I was thinking of adding checks in relevantgalaxy.util.logging
methods, but is this necessary?Admin GUI
I have not started on an admin panel yet, but I envision it would be a new option in the Server section of the admin panel. The logging panel would contain a table and/or a tree view of all the loggers and their current levels and the ability to set the levels for loggers. I will need help and guidance with this task.
Naming
I've used that I thought were reasonable names for the names of packages, API endpoints, function names, and configuration options, but I'm open to suggestions. This also applies to general code layout and structure.
The text was updated successfully, but these errors were encountered: