Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make k8sattributesprocessor enrich based on profile's internal attributes #37269

Open
ilyamor opened this issue Jan 14, 2025 · 11 comments
Open
Labels
data:profiles Profiles related issues processor/k8sattributes k8s Attributes processor

Comments

@ilyamor
Copy link

ilyamor commented Jan 14, 2025

Is your feature request related to a problem? Please describe.
Currently, the k8sattributeprocessor enriches data based on high-level attributes. In the case of profiling data, the relevant attributes come from ResourceProfiles resources, which only represent host-level attributes. Since the OpenTelemetry profiler agent runs as a DaemonSet on a host and generates profile data, we can’t rely on connection-based pod_association. Instead, we must use resource_attribute.
Describe the solution you'd like
Enrich resources based on internal attributes
https://github.com/open-telemetry/opentelemetry-proto/blob/ffade295895a2be5a6e7931eb0cda1c72bc4c0f6/opentelemetry/proto/profiles/v1development/profiles.proto#L207

Describe alternatives you've considered
Maybe change the format and upgrade some attributes to high level?

@mx-psi
Copy link
Member

mx-psi commented Jan 14, 2025

Hey @ilyamor, thanks for your issue!

If your proposal is something that can be done with the current format, I will transfer this to opentelemetry-collector-contrib. However, if it would require changes in the data model, this would need to be changed in opentelemetry-proto. Could you clarify which of the two you are proposing?

@ilyamor
Copy link
Author

ilyamor commented Jan 16, 2025

i am actually not sure, i think a change in the k8s k8sattributesprocessor(it's part of this repo) is maybe a better option.
@dmathieu @florianl, what do you think?

@florianl
Copy link
Contributor

It will be recommended to deploy OTel eBPF profiler as a daemon set. In such a case, ResourceProfiles.resource in the OTel profiling signal holds only general information about the node the daemon set is running on.
As a whole system profiler, OTel eBPF profiler extracts and processes stack traces/profiles from pods on the node. More specific pod information are attached in the OTel profiling signal as attribute in message Sample.

From a user perspective, it can be interesting to enrich these pod specific information with components like k8sattributesprocessor. But as these pod specific information are not part of ResourceProfiles.resource components like k8sattributesprocessor can not enrich/process them easily.

Hope this adds some context to the request.

@dmathieu
Copy link
Member

My understanding of this issue is that the k8sattributesprocessor only looks for resources when processing profiles (or any other signal).

func (kp *kubernetesprocessor) processProfiles(ctx context.Context, pd pprofile.Profiles) (pprofile.Profiles, error) {
rp := pd.ResourceProfiles()
for i := 0; i < rp.Len(); i++ {
kp.processResource(ctx, rp.At(i).Resource())
}
return pd, nil
}

However, at least for profiles, it should also be looking at attributes, as resources are only host-level, and in the case of profiles with a daemonset architecture, some of the resources the processor needs will be available through the profile's attribute table.

So yes, I think this issue belongs in collector-contrib.
This profiling issue may also warrant a fix for the other signals though, as for all of them, we only look for resources, never for attributes.

@mx-psi mx-psi transferred this issue from open-telemetry/opentelemetry-collector Jan 16, 2025
@mx-psi mx-psi added the processor/k8sattributes k8s Attributes processor label Jan 16, 2025
Copy link
Contributor

Pinging code owners for processor/k8sattributes: @dmitryax @fatsheep9146 @TylerHelmuth @ChrsMark. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.

@ChrsMark
Copy link
Member

ChrsMark commented Jan 16, 2025

However, at least for profiles, it should also be looking at attributes, as resources are only host-level, and in the case of profiles with a daemonset architecture, some of the resources the processor needs will be available through the profile's attribute table.

So yes, I think this issue belongs in collector-contrib.
This profiling issue may also warrant a fix for the other signals though, as for all of them, we only look for resources, never for attributes.

That would be doable. So far the pod association rules only work with connection and resource_attributes but I guess that can be extended to also work with attributes as well.

Do we have specific, attributes based, association rules that we would like to support?

@mx-psi mx-psi added the data:profiles Profiles related issues label Jan 16, 2025
@florianl
Copy link
Contributor

florianl commented Jan 16, 2025

Just a follow up thought:
Besides detecting and processing the attributes on the correct level, I think, the challenge will also be to enrich the data correctly.
If I follow the code corretly, this is currently done with setResourceAttribute:

func setResourceAttribute(attributes pcommon.Map, key string, val string) {
attr, found := attributes.Get(key)
if !found || attr.AsString() == "" {
attributes.PutStr(key, val)
}
}

However, the OTel Profiling signal does not use an attribute map, of type repeated opentelemetry.proto.common.v1.KeyValue, in message Sample but it uses indices in message Sample to point to the respective attribute in the shared attribute table in message Profile.

Indices for attribute lookup in message Sample:

https://github.com/open-telemetry/opentelemetry-proto/blob/ffade295895a2be5a6e7931eb0cda1c72bc4c0f6/opentelemetry/proto/profiles/v1development/profiles.proto#L389-L390

Shared attribute table in message Profile for all samples:

https://github.com/open-telemetry/opentelemetry-proto/blob/ffade295895a2be5a6e7931eb0cda1c72bc4c0f6/opentelemetry/proto/profiles/v1development/profiles.proto#L206-L207

@Gandem
Copy link

Gandem commented Jan 23, 2025

Alternatively, since the profile holds data for the entire host, what about having a list of ResourceProfiles.resource per Resource profile instead of one?

One way we could model this, is for the profiler to create placeholder ResourceProfiles.resource which contains only the cgroup name as an attribute.

Then in each sample, the profiler would add a resource_id attribute, to point each sample to a single reference in the list of resource profiles.

In that case, on the processor side, the enrichment would only need to iterate on the top level list of ResourceProfiles.resource, get the cgroup name and fill out the rest of attributes (without having to add the information to each individual sample).

Edit: We'll likely also need a way to distinguish samples coming from containerized workloads, and samples coming from services running directly on the host. Could likely be solved by having the first element of the ResourceProfiles.resource list always refer to the host attributes?

@korniltsev
Copy link

It would be nice to make pod association based on cgroup work.
Currently ebpf profiler reports something like this

/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod88920755_ce59_4ef8_9bce_616ae21b0172.slice/cri-containerd-0dba2bf2144e2497c340ea5ba71693bc3bc9b7a4fdcbf686252e509ae97126bf.scope

And kube client is likely returning something like

containerd://0dba2bf2144e2497c340ea5ba71693bc3bc9b7a4fdcbf686252e509ae97126bf

or

0dba2bf2144e2497c340ea5ba71693bc3bc9b7a4fdcbf686252e509ae97126bf

@povilasv
Copy link
Contributor

povilasv commented Feb 17, 2025

I would avoid changing to slice of resources for profiling signal. Or changing k8sattributes to dig into internal attributes.

I think collector assumes the current resource attribute model in too many places, examples:

Maybe instead we could use transform processor with flatten data, similiar to what we have for logs:

transform:
  flatten_data: true
  log_statements:
    - set(resource.attributes["k8s.pod.id"], profile.attributes["k8s.pod.id"])

This would debatch, move attributes to resource attributes and then batch profiles based on resource attributes

Ref: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/transformprocessor/README.md#transformflattenlogs

@Gandem
Copy link

Gandem commented Feb 19, 2025

@povilasv It is unclear to me whether it would be possible to have a transform processor that would allow us to both stay efficient and at the same time be compatible with the k8sattributesprocessor.

I wrote a problem statement: open-telemetry/opentelemetry-proto#628 that should hopefully highlight why (whether we do this transformation at the profiler level, or a processor level). Curious to have your thoughts! 🙇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
data:profiles Profiles related issues processor/k8sattributes k8s Attributes processor
Projects
None yet
Development

No branches or pull requests

8 participants