Virtual GenAI
Virtual GenAI represents the Generative AI service nodes detected by server agents’ plugins. The performance metrics of the GenAI operations are from the GenAI client-side perspective.
For example, a Spring AI plugin in the Java agent could detect the latency of a chat completion request. As a result, SkyWalking would show traffic, latency, success rate, token usage (input/output), and estimated cost in the GenAI dashboard.
Data Sources
Virtual GenAI metrics are derived from distributed tracing data. SkyWalking OAP can ingest and analyze trace data adhering to GenAI semantic conventions from the following sources:
- Native SkyWalking traces via the SkyWalking Java Agent
- OpenTelemetry format traces
- Zipkin format traces
Span Contract
The GenAI operation span should have the following properties:
- It is an Exit span
- Span’s layer == GENAI
- Tag key = gen_ai.provider.name, value = The Generative AI provider, e.g. openai, anthropic, ollama
- Tag key = gen_ai.response.model, value = The name of the GenAI model, e.g. gpt-4o, claude-3-5-sonnet
- Tag key = gen_ai.usage.input_tokens, value = The number of tokens used in the GenAI input (prompt)
- Tag key = gen_ai.usage.output_tokens, value = The number of tokens used in the GenAI response (completion)
- Tag key = gen_ai.server.time_to_first_token, value = The duration in milliseconds until the first token is received (streaming requests only)
- If the GenAI service is a remote API (e.g. OpenAI), the span’s peer should be the network address (IP or domain) of the GenAI server.
Provider Configuration
SkyWalking uses gen-ai-config.yml to map model names to providers and configure cost estimation.
When the gen_ai.provider.name tag is present in the span, it is used directly. Otherwise, SkyWalking matches the model name
against prefix-match rules to identify the provider. For example, a model name starting with gpt is mapped to openai.
To configure cost estimation, add models with pricing under the provider:
providers:
  - provider: openai
    prefix-match:
      - gpt
    models:
      - name: gpt-4o
        input-estimated-cost-per-m: 2.5   # estimated cost per 1,000,000 input tokens
        output-estimated-cost-per-m: 10   # estimated cost per 1,000,000 output tokens
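The per-million pricing above implies a simple calculation for a single call. The following is a minimal sketch of that arithmetic only, not SkyWalking's actual implementation. The `PRICING` dictionary and the `estimate_cost` function are hypothetical names introduced for illustration, and the example assumes input and output costs are simply added together.

```python
# Hypothetical in-memory copy of the pricing from the gen-ai-config.yml example.
# Values are the estimated cost per 1,000,000 tokens, in the config's own unit.
PRICING = {
    "gpt-4o": {"input_per_m": 2.5, "output_per_m": 10.0},
}


def estimate_cost(model, input_tokens, output_tokens, pricing=PRICING):
    """Return the estimated cost of one call, or None if the model has no pricing."""
    rates = pricing.get(model)
    if rates is None:
        # Assumption: a model without configured pricing yields no cost estimate.
        return None
    total = (input_tokens * rates["input_per_m"]
             + output_tokens * rates["output_per_m"])
    return total / 1_000_000


# Example: a call that consumed 1,200 input tokens and 300 output tokens.
print(estimate_cost("gpt-4o", 1200, 300))
```

With these rates, a call that uses one million input tokens and one million output tokens would be estimated at 2.5 + 10 = 12.5.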
Metrics
The following metrics are available at the provider (service) level:
- gen_ai_provider_cpm - Calls per minute
- gen_ai_provider_sla - Success rate
- gen_ai_provider_resp_time - Average response time
- gen_ai_provider_latency_percentile - Latency percentiles
- gen_ai_provider_input_tokens_sum / avg - Input token usage
- gen_ai_provider_output_tokens_sum / avg - Output token usage
- gen_ai_provider_total_estimated_cost / avg_estimated_cost - Estimated cost
The following metrics are available at the model (service instance) level:
- gen_ai_model_call_cpm - Calls per minute
- gen_ai_model_sla - Success rate
- gen_ai_model_latency_avg / percentile - Latency
- gen_ai_model_ttft_avg / percentile - Time to first token (streaming only)
- gen_ai_model_input_tokens_sum / avg - Input token usage
- gen_ai_model_output_tokens_sum / avg - Output token usage
- gen_ai_model_total_estimated_cost / avg_estimated_cost - Estimated cost
Requirement
Version
SkyWalking Java Agent version >= 9.7
Semantic Conventions and Compatibility
The tag keys used in Virtual GenAI follow the OpenTelemetry GenAI Semantic Conventions. SkyWalking OAP identifies GenAI-related spans based on the following criteria depending on the data source:
- SkyWalking Native Agent: Requires an Exit span with SpanLayer == GENAI and relevant gen_ai.* tags.
- OTLP / Zipkin Traces: Any span containing the gen_ai.response.model tag will be identified as a GenAI operation.
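The two identification criteria above can be sketched as a single check. This is an illustrative sketch, not OAP code. The span is modelled as a plain dictionary, and the field names `type`, `layer`, and `tags`, the `source` labels, and the function `is_genai_span` are all hypothetical. "Relevant gen_ai.* tags" is interpreted here as "at least one tag with the gen_ai. prefix", which is an assumption.

```python
def is_genai_span(span, source):
    """Return True if the span would count as a GenAI operation for this source.

    span   -- hypothetical dict with optional keys "type", "layer" and "tags"
    source -- hypothetical label: "native", "otlp" or "zipkin"
    """
    tags = span.get("tags", {})
    if source == "native":
        # Native agent: Exit span, GENAI layer, and at least one gen_ai.* tag
        # (the "at least one" reading is an assumption).
        return (span.get("type") == "Exit"
                and span.get("layer") == "GENAI"
                and any(key.startswith("gen_ai.") for key in tags))
    if source in ("otlp", "zipkin"):
        # OTLP / Zipkin: the model tag alone identifies a GenAI operation.
        return "gen_ai.response.model" in tags
    raise ValueError(f"unknown source: {source}")


native_span = {
    "type": "Exit",
    "layer": "GENAI",
    "tags": {"gen_ai.response.model": "gpt-4o"},
}
otlp_span = {"tags": {"gen_ai.response.model": "claude-3-5-sonnet"}}

print(is_genai_span(native_span, "native"))
print(is_genai_span(otlp_span, "otlp"))
```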
Note on OTLP / Zipkin Provider Identification: To ensure broad compatibility with different OpenTelemetry instrumentation versions, SkyWalking OAP identifies the GenAI provider using the following prioritized logic:
1. gen_ai.provider.name: SkyWalking first looks for this tag (the latest OTel semantic convention).
2. gen_ai.system: If the above is missing, it falls back to this legacy tag for backward compatibility with older instrumentation (e.g., current OTel Python auto-instrumentation).
3. Prefix Matching: If neither tag is present, SkyWalking attempts to identify the provider by matching the model name against the prefix-match rules defined in gen-ai-config.yml.
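The prioritized lookup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the OAP implementation. `PREFIX_RULES` is a hypothetical in-memory form of the prefix-match rules, `resolve_provider` is a hypothetical function name, and returning `None` when nothing matches is an assumption, since the text does not say what happens in that case.

```python
# Hypothetical in-memory form of the prefix-match rules from gen-ai-config.yml:
# provider name -> list of model-name prefixes.
PREFIX_RULES = {
    "openai": ["gpt"],
}


def resolve_provider(tags, prefix_rules=PREFIX_RULES):
    """Resolve the GenAI provider from span tags using the three-step priority."""
    # 1. Prefer the current semantic-convention tag.
    provider = tags.get("gen_ai.provider.name")
    if provider:
        return provider
    # 2. Fall back to the legacy tag emitted by older instrumentation.
    legacy = tags.get("gen_ai.system")
    if legacy:
        return legacy
    # 3. Otherwise match the model name against the configured prefixes.
    model = tags.get("gen_ai.response.model", "")
    for name, prefixes in prefix_rules.items():
        if any(model.startswith(prefix) for prefix in prefixes):
            return name
    # Assumption: no match means the provider stays unidentified.
    return None


print(resolve_provider({"gen_ai.response.model": "gpt-4o"}))
print(resolve_provider({"gen_ai.system": "anthropic",
                        "gen_ai.response.model": "claude-3-5-sonnet"}))
```

In this sketch an explicit tag always takes precedence over prefix matching, so a span tagged with gen_ai.provider.name is never reclassified by its model name.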