Registry manifest and Schema diff #400
base: main
# Conflicts: # crates/weaver_semconv/src/group.rs
# Conflicts: # .clippy.toml # Cargo.toml # crates/weaver_semconv_gen/src/lib.rs # src/registry/search.rs # src/registry/stats.rs # src/registry/update_markdown.rs
docs/schema-changes-use-cases.md
Outdated
```yaml
# Version n+1
groups:
  - id: registry.network.deprecated
    type: attribute_group
    attributes:
      - id: net.peer.name
        type: string
        brief: Deprecated, use `server.address` on client spans and `client.address` on server spans.
        deprecated:
          type: conditionally_renamed
          forward: >
            switch span_kind {
              case 'client' => attributes['server.address'] = attributes['net.peer.name'],
              case 'server' => attributes['client.address'] = attributes['net.peer.name']
            }
          backward: >
            switch span_kind {
              case 'client' => attributes['net.peer.name'] = attributes['server.address'],
              case 'server' => attributes['net.peer.name'] = attributes['client.address']
            }
        stability: experimental
        examples: ['example.com']
```
In this example, the attribute has been deprecated in the `attribute_group`, but the condition is related to its usage in a span. Should the forward / backward instructions be on the span definition where `net.peer.name` is referenced? There may be different instructions for different spans that use `net.peer.name`.
Good point. It's a good topic for discussion for our next Semantic Conventions Tooling SIG.
Following the discussion in the last SIG meeting, it was decided to start with a more minimalist approach initially. As a result, I have updated this PR accordingly, both in the code and the main documentation (not this doc).
However, I am keeping this exploration document in place so that we can revisit this topic later if needed.
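For readers revisiting this exploration later, here is a minimal Rust sketch of what the `conditionally_renamed` forward / backward rules in the example above could mean at runtime. It is an illustration only, not code from this PR: the `SpanKind` enum, the `Attributes` alias, and the `move_attribute`, `forward`, and `backward` helpers are hypothetical names, and the sketch assumes the source attribute is removed once its value has been copied to the target.

```rust
use std::collections::BTreeMap;

/// Span kinds relevant to the example (hypothetical, simplified).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SpanKind {
    Client,
    Server,
    Other,
}

/// Attributes of a single span, simplified to string values.
type Attributes = BTreeMap<String, String>;

/// Moves the value stored under `from` to `to`, if `from` is present.
/// Assumption: the old key is dropped after the move.
fn move_attribute(attrs: &mut Attributes, from: &str, to: &str) {
    if let Some(value) = attrs.remove(from) {
        attrs.insert(to.to_owned(), value);
    }
}

/// Forward transformation (version n to n+1), keyed on the span kind.
fn forward(kind: SpanKind, attrs: &mut Attributes) {
    match kind {
        SpanKind::Client => move_attribute(attrs, "net.peer.name", "server.address"),
        SpanKind::Server => move_attribute(attrs, "net.peer.name", "client.address"),
        // The example defines no rule for other span kinds.
        SpanKind::Other => {}
    }
}

/// Backward transformation (version n+1 to n), the inverse of `forward`.
fn backward(kind: SpanKind, attrs: &mut Attributes) {
    match kind {
        SpanKind::Client => move_attribute(attrs, "server.address", "net.peer.name"),
        SpanKind::Server => move_attribute(attrs, "client.address", "net.peer.name"),
        SpanKind::Other => {}
    }
}
```

Because the rule needs the span kind as an input, the sketch also shows why the question of where to attach the instructions (attribute group vs. span definition) matters.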
docs/schema-changes-use-cases.md
Outdated
```yaml
# Version n+1
groups:
  - id: registry.db.deprecated
    type: attribute_group
    stability: experimental
    attributes:
      - id: db.instance.id
        type: string
        brief: 'Deprecated, no general replacement at this time. For Elasticsearch, use `db.elasticsearch.node.name` instead.'
        deprecated:
          type: conditionally_deprecated
          forward: >
            if attributes['db.system'] == 'elasticsearch' then attributes['db.elasticsearch.node.name'] = attributes['db.instance.id']
            else drop attributes['db.instance.id']
          backward: >
            if attributes['db.system'] == 'elasticsearch' then attributes['db.instance.id'] = attributes['db.elasticsearch.node.name']
        stability: experimental
```
Like my previous comment. When the instructions involve other attributes then the change is dependent on the presence of those attributes in that signal definition. Perhaps defining the instructions on the attribute_group means it's universal for all uses via references? Maybe this can be overridden with further instructions on the span, for example, to deviate from this default?
Yes, in the `attribute_group`, the setting is intended to be universal. However, I agree that we should demonstrate how this setting can be overridden when dealing with a specific signal.
See my previous comment.
},
/// A top-level telemetry object from the baseline registry was marked as deprecated in the head
/// registry.
Deprecated {
Is this its own change or should it be attached to other changes?
I.e. is it just an implication of change?
I think this was called out verbally, but it's the one I'm least sure belongs with the other "semantic" changes, especially given "uncategorized" as an option.
See my comment below.
Responded. Once updated, this PR LGTM.
Structurally/rust-wise you have all the pieces I'd look for. It's just naming/surface syntax at this point.
docs/schema-changes.md
Outdated
- `renamed`: A top-level telemetry object from the baseline registry was renamed in the head registry.
- `deprecated`: A top-level telemetry object from the baseline registry was marked as deprecated in the head registry.
- `updated`: One or more fields in a top-level telemetry object have been updated in the head registry.
- `removed`: A top-level telemetry object from the baseline registry was removed in the head registry.
For semconv specifically, we definitely don't want to allow this; instead, we deprecate.
Also, my concern with "deprecated" is that when we rename, we're effectively deprecating the old name.
I'm reading this and think "deprecated" is too generic and too much of a catch-all. I'd rather use "uncategorized", where deprecation is a consequence of the change vs. the change itself.
I.e. we almost need a "removed" where we mark the type as deprecated and prevent further usage, but don't remove our knowledge that it once existed.
The `deprecated` type is indeed probably too much of a catch-all. However, I believe these three types are truly distinct, and I probably didn’t do a great job explaining them.
Currently, the general concept of deprecation is used for several types of changes in semantic conventions (renaming, “soft” removal, and other exotic changes). I propose refining my initial suggestion and the corresponding definitions as follows:
- Rename the change type `deprecated` to `obsoleted` to clearly indicate that this change corresponds to an attribute or a signal that is discontinued without a valid replacement.
- In my view, `removed` should exist at the Weaver level, if only to identify that there has been an actual deletion in a registry under validation. This type of change should never be issued for a published registry, but it is clearly a transitional change that can occur during the development of a registry. We could even build a policy leveraging this type of change in the future.
- `uncategorized` is the catch-all change type representing all complex types of changes that we haven’t precisely codified. The idea of this type, as you mentioned during the meeting, is that we should gradually eliminate it from the registry.
Do we agree on this definition of things?
I like moving `deprecated` to `obsoleted`, where `deprecated` can remain as a catch-all for "we changed this thing in a way" and `obsoleted` implies "do not use anymore, here for legacy reasons".
I agree we need to actually model removed in some way. `obsoleted` as soft-delete works for me.
So yes, I agree on this.
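To make the agreed definitions concrete, here is a small Rust sketch of the change types and of the kind of policy mentioned above ("`removed` should never be issued for a published registry"). This is an assumption-laden illustration, not the enum from this PR: `ChangeType`, `allowed_in_published_registry`, and `violations` are hypothetical names, and `Added` is included only because the diff output also reports new objects.

```rust
/// Change types as discussed in this thread (hypothetical names,
/// not necessarily the final Weaver data model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChangeType {
    /// New object in the head registry.
    Added,
    /// Object renamed in the head registry.
    Renamed,
    /// Discontinued without a valid replacement (soft delete).
    Obsoleted,
    /// Catch-all for complex changes (split, merge, ...).
    Uncategorized,
    /// One or more fields of the object were updated.
    Updated,
    /// Actually deleted from the registry.
    Removed,
}

/// Policy sketch: a published registry must never contain a hard removal.
fn allowed_in_published_registry(change: ChangeType) -> bool {
    change != ChangeType::Removed
}

/// Returns the names of the objects whose change violates the policy.
fn violations<'a>(changes: &[(&'a str, ChangeType)]) -> Vec<&'a str> {
    changes
        .iter()
        .filter(|(_, change)| !allowed_in_published_registry(*change))
        .map(|(name, _)| *name)
        .collect()
}
```

Keeping `Removed` as a distinct variant is what allows such a check to exist at all; folding it into `Obsoleted` would hide real deletions during registry development.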
A few doc nits. Plus a schema suggestion to avoid obscuring the original brief for the item.
/// complex reasons (split, merge, ...) which are currently not precisely define
/// in the supported deprecation reasons.
///
/// The `brief` field should contain the reason why the field has been obsoleted.
Suggested change:
- /// complex reasons (split, merge, ...) which are currently not precisely define
- /// in the supported deprecation reasons.
- ///
- /// The `brief` field should contain the reason why the field has been obsoleted.
+ /// complex reasons (split, merge, ...) which are currently not precisely defined
+ /// in the supported deprecation reasons.
+ ///
+ /// The `brief` field should contain the reason for this uncategorized deprecation.
},
{
"type": "object",
"description": "The telemetry object containing the deprecated field has been deprecated for complex reasons (split, merge, ...) which are currently not precisely define in the supported deprecation reasons.",
Suggested change:
- "description": "The telemetry object containing the deprecated field has been deprecated for complex reasons (split, merge, ...) which are currently not precisely define in the supported deprecation reasons.",
+ "description": "The telemetry object containing the deprecated field has been deprecated for complex reasons (split, merge, ...) which are currently not precisely defined in the supported deprecation reasons.",
Variant 1
```yaml
brief: <text explaining the reason of the renaming>
```
Would it be clearer for the `deprecated` object to have its own note to explain the reason for deprecation? Then the existing brief can remain intact, e.g.

    brief: Here is the original brief for the attribute
    deprecated:
      explanation: <text explaining the reason of the renaming>
      reason: renamed
      renamed_to: <name of the telemetry object>
We currently replace the original brief with something like "Deprecated, use 'another.attribute' instead", but nothing stops us from keeping the original brief or adding deprecation details. There is also the attribute note, and we can decide to keep the brief around and add deprecation details to the note.
I'd prefer not to add new properties (since we already have brief and note), but it's not a strong opinion.
@@ -41,6 +41,48 @@
}
},
"$defs": {
"Deprecated": {
could you please also update https://github.com/open-telemetry/weaver/blob/main/schemas/semconv-syntax.md - an informal and reader-friendly version of it?
registry_attributes:
  - name: http.server_name # attribute name
    type: obsoleted # change type
    note: This attribute is deprecated.
is the note populated from note or brief?
Note: The scope of this PR has been reduced to focus only on the schema diff feature. GitHub issues have been created to track the features that have been postponed: #482, #483.
This PR implements the command `registry diff`; see the following example:
In this example, the diff is displayed in markdown format. The following formats are supported: json, markdown, ansi, ansi_stats. YAML format will be supported once PR #525 is finalized.
A detailed description of the schema diff data model and the diffing process is visible here.
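As a rough intuition for the diffing process, the following Rust sketch compares a baseline registry with a head registry and emits one change per attribute. It is a simplified illustration under stated assumptions, not the implementation in this PR: the `Registry` alias, the `Deprecation` and `Change` enums, and the `diff` function are hypothetical, a registry is reduced to a map from attribute name to optional deprecation info, and `updated` changes are not modeled.

```rust
use std::collections::BTreeMap;

/// Deprecation info attached to an attribute (hypothetical, simplified).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Deprecation {
    Renamed { new_name: String },
    Obsoleted,
    Uncategorized,
}

/// A registry reduced to: attribute name -> optional deprecation info.
type Registry = BTreeMap<String, Option<Deprecation>>;

/// A change detected between the baseline and the head registry.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Change {
    Added,
    Renamed { new_name: String },
    Obsoleted,
    Uncategorized,
    Removed,
}

/// Computes the changes going from `baseline` to `head`.
fn diff(baseline: &Registry, head: &Registry) -> Vec<(String, Change)> {
    let mut changes = Vec::new();
    for (name, head_deprecation) in head {
        match baseline.get(name) {
            // Present only in head: a new attribute.
            None => changes.push((name.clone(), Change::Added)),
            // Not deprecated in baseline: report a deprecation introduced in head.
            Some(None) => {
                let change = match head_deprecation {
                    Some(Deprecation::Renamed { new_name }) => Change::Renamed {
                        new_name: new_name.clone(),
                    },
                    Some(Deprecation::Obsoleted) => Change::Obsoleted,
                    Some(Deprecation::Uncategorized) => Change::Uncategorized,
                    None => continue, // unchanged in this simplified model
                };
                changes.push((name.clone(), change));
            }
            // Already deprecated in baseline: nothing new to report.
            Some(Some(_)) => {}
        }
    }
    // Present only in baseline: an actual removal.
    for name in baseline.keys() {
        if !head.contains_key(name) {
            changes.push((name.clone(), Change::Removed));
        }
    }
    changes
}
```

In this model, a rename shows up as two entries: the old name is reported as `Renamed` and the new name as `Added`.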
Notes:
- `weaver_otel_schema` is not essential for this PR; it was initially included as part of the preparations for the `registry schema-update` command. We have decided to implement this command in a future PR. However, for simplicity, I prefer to keep the preparation code in place instead of removing it. Same thing for `all_changes` in `weaver_version`.

List of modifications to apply to the semantic conventions repository after the release of the Weaver containing the current PR:
- `registry-manifest.yaml` file with the version of the next release.

Closes: #186
The following command comparing the versions 1.29 and 1.30 produces the following markdown output:
Registry Attributes
New registry_attributes:
Deprecated registry_attributes:
- (… `code.column.number`)
- (… `code.function.name` instead)
- (… `code.line.number` instead)
- (… `cassandra.consistency.level` instead.)
- (… `cassandra.coordinator.dc` instead.)
- (… `cassandra.coordinator.id` instead.)
- (… `cassandra.query.idempotent` instead.)
- (… `cassandra.page.size` instead.)
- (… `cassandra.speculative_execution.count` instead.)
- (… `azure.client.id` instead.)
- (… `azure.cosmosdb.connection.mode` instead.)
- (… `cosmosdb.consistency.level` instead.)
- (… `azure.cosmosdb.operation.contacted_regions` instead.)
- (… `azure.cosmosdb.operation.request_charge` instead.)
- (… `azure.cosmosdb.request.body.size` instead.)
- (… `azure.cosmosdb.response.sub_status_code` instead.)
- (… `elasticsearch.node.name` instead.)
- (… `db.operation.parameter` instead.)
- (… `db.system.name` instead.)
- (… `gen_ai.request.seed`.)
- (… `network.connection.state` instead.)

Metrics
New metrics:
Deprecated metrics:

Spans
New spans: