Demo - remove root namespace from system-specific attributes and metrics #1711
| [`db.namespace`](/docs/attributes-registry/db.md) | string | The name of the Elasticsearch cluster which the client connects to. [10] | `customers`; `test.users` | `Recommended` | ![Release Candidate](https://img.shields.io/badge/-rc-mediumorchid) | | ||
| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [11] | `2`; `3`; `4` | `Recommended` | ![Release Candidate](https://img.shields.io/badge/-rc-mediumorchid) | | ||
| [`db.query.text`](/docs/attributes-registry/db.md) | string | The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string. [12] | `"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"` | `Recommended` [13] | ![Release Candidate](https://img.shields.io/badge/-rc-mediumorchid) | | ||
| [`elasticsearch.node.name`](/docs/attributes-registry/elasticsearch.md) | string | Represents the human-readable identifier of the node/instance to which a request was routed. [14] | `instance-0000000001` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | |
curious what Elastic folks think
`elasticsearch.node.name` seems ok to me. I think for Elasticsearch the proposed change would not make a significant difference, because it naturally falls under the `db` domain, so there was no discrepancy there anyway. E.g. the redis example in the PR description shows the problem very well. I'd say we haven't had this issue with Elasticsearch, definitely not to that extent.
So, having said that, I'd say, for consistency (if we go with this proposal and drop it due to other use-cases, e.g. redis), it'd be ok to drop `db.` for the Elasticsearch specific attributes.
| [`db.cosmosdb.regions_contacted`](/docs/attributes-registry/db.md) | string[] | List of regions contacted during operation in the order that they were contacted. If there is more than one region listed, it indicates that the operation was performed on multiple regions i.e. cross-regional call. [3] | `["North Central US", "Australia East", "Australia Southeast"]` | `Conditionally Required` If available. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`db.cosmosdb.request_charge`](/docs/attributes-registry/db.md) | double | Request units consumed for the operation. | `46.18`; `1.0` | `Conditionally Required` when available | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`db.cosmosdb.sub_status_code`](/docs/attributes-registry/db.md) | int | Cosmos DB sub status code. | `1000`; `1002` | `Conditionally Required` when response was received and contained sub-code. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`cosmosdb.connection_mode`](/docs/attributes-registry/cosmosdb.md) | string | Cosmos client connection mode. | `gateway`; `direct` | `Conditionally Required` [1] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | |
curious what CosmosDB folks would think about this (removing `db` in front of `cosmosdb` everywhere)
| [`messaging.batch.message_count`](/docs/attributes-registry/messaging.md) | int | The number of messages sent, received, or processed in the scope of the batching operation. [2] | `0`; `1`; `2` | `Conditionally Required` [3] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`messaging.destination.name`](/docs/attributes-registry/messaging.md) | string | The message destination name [4] | `MyQueue`; `MyTopic` | `Conditionally Required` [5] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`messaging.kafka.message.tombstone`](/docs/attributes-registry/messaging.md) | boolean | A boolean that is true if the message is a tombstone. | | `Conditionally Required` [6] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`kafka.message.tombstone`](/docs/attributes-registry/kafka.md) | boolean | A boolean that is true if the message is a tombstone. | | `Conditionally Required` [2] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | |
curious what messaging folks think (about removing the `messaging` prefix from system-specific attributes and metrics)
| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [9] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Conditionally Required` If available. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | ||
| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [8] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Conditionally Required` If available. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | ||
| [`servicebus.disposition_status`](/docs/attributes-registry/servicebus.md) | string | Describes the [settlement type](https://learn.microsoft.com/azure/service-bus-messaging/message-transfers-locks-settlement#peeklock). | `complete`; `abandon`; `dead_letter` | `Conditionally Required` if and only if `messaging.operation` is `settle`. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`servicebus.message.delivery_count`](/docs/attributes-registry/servicebus.md) | int | Number of deliveries that have been attempted for this message. | `2` | `Conditionally Required` [9] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | |
As the owner of the corresponding instrumentations, I'd be really excited about this change. I struggle with trying to fit everything about a specific messaging lib/system into the `messaging` namespace.
| [`gen_ai.openai.request.seed`](/docs/attributes-registry/gen-ai.md) | int | Requests with same seed value more likely to return same result. | `100` | `Conditionally Required` if the request includes a seed | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`gen_ai.openai.request.service_tier`](/docs/attributes-registry/gen-ai.md) | string | The service tier requested. May be a specific tier, default, or auto. | `auto`; `default` | `Conditionally Required` [4] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`gen_ai.openai.response.service_tier`](/docs/attributes-registry/gen-ai.md) | string | The service tier used for the response. | `scale`; `default` | `Conditionally Required` [5] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`openai.request.response_format`](/docs/attributes-registry/openai.md) | string | The response format that is requested. | `json` | `Conditionally Required` if the request includes a response_format | ![Experimental](https://img.shields.io/badge/-experimental-blue) | |
curious what gen_ai folks think about removing the `gen_ai` prefix from system-specific conventions like openai.
E.g. what good does the `gen_ai` prefix do for something like file upload: would we want to report a `gen_ai.client.openai.uploaded_files` vs `openai.client.uploaded_files` metric?
@open-telemetry/semconv-genai-approvers @xrmx @aabmass @codefromthecrypt @trentm @lzchen
hmm gut feel is that we want the dimension/correlation of the source type vs embed it inside a naming convention, right?
above comment was about the attribute in general, but now noticing your question was limited to if it should be prefixed or not (nothing else changes except the prefixing).
Answer is similar, I think the attribute name should be owned by the blob upload component, not gen_ai. However, if this is about basically all openai related properties becoming top-level and not being under gen_ai, the main concern is that it crowds the top-level namespace.
- This puts a lot more pressure on picking top-level names that are not ambiguous across domains. This might work better for trademarked names than non-trademarked ones.
- This can help when other types of attributes are considered, like do we feel like saying GENAI_OPENAI_API_VERSION vs OPENAI_API_VERSION
- semconv ownership is easier to figure out when it is prefixed, e.g. this team owns openai attributes, however, this is maybe not very useful to end users vs us
Gut feel is that removing the top-level namespaces for trademarked names, ones that don't have registry clash risk, sounds like it helps more than hurts.
I know it is a different issue, but in any text, most of these attributes are about the api or library in use, not necessarily the LLM provider (e.g. the xAI provider has openai compat support and would use these attributes for that traffic). We would need to keep that in mind when attributing to what's currently called "system". This clarification may not end up in this PR.
this is a question of whether the `gen_ai` prefix adds any value for any attribute that describes something specific to a certain library/system/api/provider/whatever. If something is specific to the openai API, we'd still use `openai` somewhere in the name.
It does add some value in hinting that the span or event contains information pertinent to the topic of generative AI. Without the category prefix, it is harder to do things like "Show users all spans that involved the use of GenAI".
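A minimal sketch of the prefix-based filtering described in this comment. The span structure (a dict with `name` and `attributes`) and the helper `spans_with_prefix` are assumptions for illustration only; the attribute names are taken from the tables in this PR.

```python
def spans_with_prefix(spans, prefix):
    """Return spans carrying at least one attribute whose name starts with prefix."""
    return [
        span for span in spans
        if any(key.startswith(prefix) for key in span["attributes"])
    ]


# Hypothetical spans: one uses the current name, one the proposed name.
spans = [
    {"name": "chat-current", "attributes": {"gen_ai.openai.request.service_tier": "auto"}},
    {"name": "chat-proposed", "attributes": {"openai.request.response_format": "json"}},
    {"name": "query", "attributes": {"db.namespace": "customers"}},
]

genai_spans = spans_with_prefix(spans, "gen_ai.")
print([span["name"] for span in genai_spans])
```

With these assumed inputs, the span that only carries the un-prefixed `openai.*` attribute is no longer matched by the `gen_ai.` filter, which is the tradeoff raised above.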
| [`rpc.grpc.status_code`](/docs/attributes-registry/rpc.md) | int | The [numeric status code](https://github.com/grpc/grpc/blob/v1.33.2/doc/statuscodes.md) of the gRPC request. | `0`; `1`; `2` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`rpc.grpc.request.metadata.<key>`](/docs/attributes-registry/rpc.md) | string[] | gRPC request metadata, `<key>` being the normalized gRPC Metadata key (lowercase), the value being the metadata values. [1] | `rpc.grpc.request.metadata.my-custom-metadata-attribute=["1.2.3.4", "1.2.3.5"]` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`rpc.grpc.response.metadata.<key>`](/docs/attributes-registry/rpc.md) | string[] | gRPC response metadata, `<key>` being the normalized gRPC Metadata key (lowercase), the value being the metadata values. [2] | `rpc.grpc.response.metadata.my-custom-metadata-attribute=["attribute_value"]` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | ||
| [`grpc.status_code`](/docs/attributes-registry/grpc.md) | int | The [numeric status code](https://github.com/grpc/grpc/blob/v1.33.2/doc/statuscodes.md) of the gRPC request. | `0`; `1`; `2` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | |
@jsuereth what would be your thoughts on getting rid of the `rpc.` prefix from `rpc.grpc`? (and other similar changes in PR)
how big of a problem do we think this is, and are there other mitigations we could consider? as we are looking at tradeoffs, it feels like there's some tradeoff to promoting all of these (many) database systems to top-level namespaces, over keeping them scoped under |
I think consistent naming is a problem regardless of the top-level namespace. We need to make a choice between |
A potential concern with moving system-specific attributes outside the prefix of the general category to which they belong: it makes it difficult to know which attributes are topically relevant. For example, let's suppose that you are implementing the display of telemetry collection data and you want to give users the ability to filter down to information that is focused on "Databases" or "GenAI", etc. Being able to map that to a prefix filter on "db." or a prefix filter on "gen_ai." makes it possible to provide such a topic-focused experience. If vendor-specific implementations of a general concept are hoisted up a level where they are no longer prefixed with the category, then it becomes harder to match attributes related to the topic. A potential way to address this concern might be to record in the OTel SemConv metadata information about the attributes that can connect them to the topics that they are about. For example:
That would (albeit with significantly more effort) at least make it technically achievable to provide this kind of topical filtering. But it would still involve much more effort than having the category be a consistent prefix.
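One way to picture the metadata idea from this comment is a lookup from attribute name to topics. This is only a sketch: the `ATTRIBUTE_TOPICS` table and `attributes_for_topic` helper are hypothetical and do not correspond to any existing OTel SemConv metadata format; the attribute names are the proposed ones shown in this PR.

```python
# Hypothetical topic metadata; a set per attribute allows more than one topic.
ATTRIBUTE_TOPICS = {
    "db.namespace": {"db"},
    "cosmosdb.connection_mode": {"db"},
    "kafka.message.tombstone": {"messaging"},
    "openai.request.response_format": {"gen_ai"},
}


def attributes_for_topic(attributes, topic, topics=ATTRIBUTE_TOPICS):
    """Keep only the attributes whose metadata lists the given topic."""
    return {
        name: value
        for name, value in attributes.items()
        if topic in topics.get(name, set())
    }


recorded = {
    "cosmosdb.connection_mode": "gateway",
    "kafka.message.tombstone": True,
}
print(attributes_for_topic(recorded, "db"))
```

The filter then depends on the metadata table being kept up to date rather than on a string prefix, which is the extra effort the comment points out.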
The filtering can also be applied in the following way:
Qualifying an attribute as relevant to a domain is not trivial (and which domains):
The fact that attribute name starts with To summarize pros and cons: Having the
I'd argue that those are more important than topical filtering. Given that we still provide consistency for |
@lmolkova is it guaranteed that the value in My concern is mainly having to keep abreast of all possible vendor-specific systems that belong to some domain. I agree that categorization is hard, but it would be more scalable if contributors to the conventions self-categorize. If there were metadata about each attribute specifying relevant categories, it may be possible to specify more than one category, which would eliminate the concerns about having to choose the ideal category. I don't think it is necessarily a blocker to the proposed renaming, but I did want to raise it to attention that the naming prefixes do provide a useful way of topical filtering that is a tradeoff/loss as a result of this transformation (and that, out of band of this change, it may be worth thinking about how topical filtering might be enabled scalably). |
Demonstrates #1708 (comment) and #1618 (comment)
- `jvm.*` and not `runtime.jvm.*`
- `db.cosmosdb` and not `cosmosdb`
- `messaging.kafka` and not `kafka`
- `gcp` and not `cloud.gcp`
- `cache.redis`, `db.redis` and `messaging.redis`? `eventhubs` (or `az.eventhubs`) makes more sense.

This PR shows what the world would look like if we (in general) removed the domain prefix from all system-specific things.
The naming guidance would be:
- (`db.namespace` or `db.client.operation.duration`)
- (`jvm.cpu.time`, `cosmosdb.client.request_charge`)
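The renames visible in the tables of this PR (for example `db.cosmosdb.*` to `cosmosdb.*`, `messaging.kafka.*` to `kafka.*`, `rpc.grpc.*` to `grpc.*`) can be sketched as a small function. The `SYSTEM_NAMESPACES` set and the `drop_domain_prefix` helper are assumptions made for this illustration, not part of the PR or of any semconv tooling.

```python
# Hypothetical set of system names that would become top-level namespaces.
SYSTEM_NAMESPACES = {"cosmosdb", "kafka", "servicebus", "openai", "grpc", "elasticsearch"}


def drop_domain_prefix(name, systems=SYSTEM_NAMESPACES):
    """Drop the leading domain segment when the second segment names a known system."""
    _domain, _, rest = name.partition(".")
    system = rest.split(".", 1)[0]
    return rest if system in systems else name


for old in (
    "db.cosmosdb.request_charge",
    "messaging.kafka.message.tombstone",
    "rpc.grpc.status_code",
    "db.namespace",
):
    print(old, "->", drop_domain_prefix(old))
```

Names that are common to a domain, such as `db.namespace`, are left unchanged because their second segment is not in the assumed system set.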