Advanced Topics
Nginx Reverse Proxy
To use Nginx as a reverse proxy in front of the gRPC service, disable the transport filter on both proxy and agent:
```shell
java -jar prometheus-proxy.jar --tf_disabled
java -jar prometheus-agent.jar --tf_disabled --config myconfig.conf
```
Or via environment variable:

```shell
TRANSPORT_FILTER_DISABLED=true
```

Or via config:

```hocon
proxy.transportFilterDisabled = true
agent.transportFilterDisabled = true
```
Delayed Disconnect Detection
With transportFilterDisabled, agent disconnections are not immediately detected.
Agent contexts on the proxy are removed after the inactivity timeout
(default: 60 seconds, controlled by proxy.internal.maxAgentInactivitySecs).
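To regain faster disconnect detection when the transport filter is off, the inactivity timeout can be shortened. A minimal sketch (the values here are illustrative, not recommendations):

```hocon
proxy {
  transportFilterDisabled = true
  internal {
    maxAgentInactivitySecs = 15   // evict disconnected agents after 15s instead of 60s
    staleAgentCheckPauseSecs = 5  // scan for stale agents every 5s
  }
}
```

Shorter timeouts detect dead agents sooner, but overly aggressive values risk evicting healthy agents on slow or lossy networks.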
Example Nginx and proxy configuration files are available in the repository.
Prometheus Federation
Scrape an existing Prometheus instance via the /federate endpoint:
```hocon
agent {
  pathConfigs: [
    {
      name: "Federated Prometheus"
      path: federated_metrics
      url: "http://prometheus-server:9090/federate?match[]={__name__=~\"job:.*\"}"
    }
  ]
}
```
This leverages Prometheus's built-in federation support, allowing you to pull metrics from another Prometheus server through the proxy.
A complete federation config is available at `examples/federate.conf`.
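On the scraping side, the Prometheus job then points at the proxy rather than at the federated server directly. A sketch, assuming the proxy's default scrape port of 8080 and a hypothetical hostname:

```yaml
scrape_configs:
  - job_name: "federated_prometheus"
    metrics_path: "/federated_metrics"        # the path registered by the agent above
    static_configs:
      - targets: ["proxy.example.com:8080"]   # hypothetical proxy host
```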
Consolidated Mode
By default, each scrape path is owned by a single agent. If a second agent tries to register the same path, it displaces the first agent.
In consolidated mode, multiple agents can register the same path for redundancy.
When a scrape request arrives for a consolidated path, the proxy selects one of the available agents. If one agent disconnects, the remaining agents continue serving the path.
Use cases:
- High availability -- multiple agents serving the same endpoints
- Load distribution -- spread scrape load across agents
- Rolling upgrades -- new agent registers before old one deregisters
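A sketch of an agent config with consolidated mode enabled (this assumes the `agent.consolidated` property; check the options reference for the exact property, CLI flag, and environment variable names). The same config would run on each redundant agent:

```hocon
agent {
  consolidated = true   // allow other agents to register the same paths
  pathConfigs: [
    {
      name: "App metrics"
      path: app_metrics                     // hypothetical path
      url: "http://localhost:8080/metrics"  // hypothetical scrape target
    }
  ]
}
```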
gRPC Reflection
gRPC Reflection is enabled by default, allowing tools like grpcurl to inspect the service:
```shell
# List available gRPC services:
grpcurl -plaintext localhost:50051 list

# Output:
# ProxyService
# grpc.health.v1.Health
# grpc.reflection.v1alpha.ServerReflection

# Describe the ProxyService:
grpcurl -plaintext localhost:50051 describe ProxyService

# Disable reflection:
java -jar prometheus-proxy.jar --ref_disabled
```
Note
When using grpcurl with the -plaintext option, ensure the proxy is running
without TLS. When TLS is enabled, provide the appropriate certificate flags.
Performance Tuning
Concurrent Scraping
Increase the number of parallel scrapes for high-throughput scenarios:
```shell
# Increase concurrent scraping capacity:
java -jar prometheus-agent.jar \
    --max_concurrent_clients 10 \
    --client_timeout_secs 30 \
    --chunk 64 \
    --config myconfig.conf

# Tune HTTP client cache:
java -jar prometheus-agent.jar \
    --max_cache_size 200 \
    --max_cache_age_mins 60 \
    --max_cache_idle_mins 20 \
    --config myconfig.conf
```
Or via config:

```hocon
agent {
  http {
    maxConcurrentClients = 10    // Parallel scrape limit
    clientTimeoutSecs = 30       // HTTP client timeout
    clientCache {
      maxSize = 200              // More cached clients
      maxAgeMins = 60            // Longer cache lifetime
      maxIdleMins = 20           // Longer idle tolerance
      cleanupIntervalMins = 10   // Less frequent cleanup
    }
  }

  chunkContentSizeKbs = 64       // Larger chunks for big payloads
  minGzipSizeBytes = 256         // Compress more aggressively
  scrapeTimeoutSecs = 30         // More time for slow targets
}
```
Key Tuning Parameters
| Parameter | Default | Guidance |
|---|---|---|
| `maxConcurrentClients` | 1 | Increase for many endpoints or slow targets |
| `clientTimeoutSecs` | 90 | Lower for fast-failing scrapes |
| `chunkContentSizeKbs` | 32 | Increase for large payloads to reduce chunk count |
| `minGzipSizeBytes` | 512 | Lower to compress more aggressively |
| `scrapeTimeoutSecs` | 15 | Increase for slow targets |
| `clientCache.maxSize` | 100 | Increase if many unique auth credentials are used |
gRPC Keepalive Tuning
Fine-tune gRPC keepalive behavior for specific network environments:
```hocon
// Proxy gRPC keepalive settings:
proxy.grpc {
  keepAliveTimeSecs = 7200             // Interval between PING frames
  keepAliveTimeoutSecs = 20            // Timeout for PING ack
  permitKeepAliveWithoutCalls = false
  permitKeepAliveTimeSecs = 300        // Min interval between client PINGs
  maxConnectionIdleSecs = -1           // -1 = unlimited
  maxConnectionAgeSecs = -1            // -1 = unlimited
}

// Agent gRPC keepalive settings:
agent.grpc {
  keepAliveTimeSecs = -1               // -1 = use server default
  keepAliveTimeoutSecs = 20
  keepAliveWithoutCalls = false
}
```
See the gRPC keepalive guide for detailed tuning advice.
Stale Agent Cleanup
The proxy periodically checks for inactive agents and evicts them:
```hocon
proxy.internal {
  staleAgentCheckEnabled = true
  maxAgentInactivitySecs = 60      // Evict after 60s of inactivity
  staleAgentCheckPauseSecs = 10    // Check every 10s
  scrapeRequestTimeoutSecs = 90    // Timeout for scrape requests
}
```
Info
When transportFilterDisabled is true, stale agent cleanup is automatically
force-enabled, regardless of the staleAgentCheckEnabled setting. This ensures
leaked agent contexts are eventually cleaned up.
Zipkin Tracing
Both proxy and agent support distributed tracing via Zipkin: