Understanding query_response in Prometheus Blackbox's tcp prober
Prometheus Blackbox is somewhat complicated to understand. One of its fundamental abstractions is a 'prober', a generic way of probing some service (such as making HTTP requests or DNS requests). One prober is the 'tcp' prober, which makes a TCP connection and then potentially conducts a conversation with the service to verify its health. For example, here's a ClamAV daemon health check, which connects, sends a line with "PING", and expects to receive "PONG":
  clamd_pingpong:
    prober: tcp
    tcp:
      query_response:
        - send: "PING\n"
        - expect: "PONG"
The conversation with the service is detailed in the query_response
configuration block (in YAML). For a long time I thought that this
was what it looks like here, a series of entries with one directive
per entry, such as 'send', 'expect', or 'starttls' (to switch to
TLS after, for example, you send a 'STARTTLS' command to an SMTP
or IMAP server).
However, much like an earlier case with Alertmanager, this is not actually what the YAML syntax is.
In reality each step in the query_response YAML array can have
multiple keys. To quote the documentation:
    [ - [ [ expect: <string> ],
          [ expect_bytes: <string> ],
          [ labels:
            - [ name: <string>
                value: <string>
              ], ...
          ],
          [ send: <string> ],
          [ starttls: <boolean | default = false> ]
        ], ...
    ]
When there are multiple keys in a single step, Blackbox handles
them in almost the order listed here: first expect, then labels
if the expect matched, then expect_bytes, then send, then
starttls. Normally you wouldn't have both expect and expect_bytes
in the same step (and combining them is tricky). This order is not
currently documented, so you have to read prober/query_response.go
to determine it.
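As a sketch of what this order means in practice, here is a hypothetical SMTP STARTTLS module. The module name and the simplified SMTP strings are my assumptions for illustration, not something taken from the example blackbox.yml. It relies only on the order described above: within one step, expect is handled before send, and send before starttls.

```yaml
  # Hypothetical module name; an illustrative sketch only.
  smtp_starttls_sketch:
    prober: tcp
    tcp:
      query_response:
        - expect: "^220 "
        - send: "EHLO prober"
        - expect: "^250-STARTTLS"
        - send: "STARTTLS"
        # One step, two keys: the expect is handled first, and only
        # then does the connection switch to TLS.
        - expect: "^220 "
          starttls: true
        # Kept as a separate step so that it happens after the TLS
        # switch. If this send were folded into the step above, the
        # order described above says it would be handled before
        # starttls, and so would go out before TLS was set up.
        - send: "EHLO prober"
        - expect: "^250-"
```

The point of the sketch is the placement of the second EHLO, not the SMTP details; a real probe would need whatever line endings and banner patterns your server expects.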
One reason to combine expect and send together in a single step
is that then send can use regular expression match groups from
the expect in its text. There's an example of this in the example
blackbox.yml file:
  irc_banner:
    prober: tcp
    tcp:
      query_response:
        - send: "NICK prober"
        - send: "USER prober prober prober :prober"
        - expect: "PING :([^ ]+)"
          # cks: note use of ${1}, from PING
          send: "PONG ${1}"
        - expect: "^:[^ ]+ 001"
The 'labels:' key is something added in v0.26.0, in #1284. As shown in the example blackbox.yml file, it can be used to do things like extract SSH banner information into labels on a metric:
  ssh_banner_extract:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        - expect: "^SSH-2.0-([^ -]+)(?: (.*))?$"
          labels:
            - name: ssh_version
              value: "${1}"
            - name: ssh_comments
              value: "${2}"
This creates a metric that looks like this:
  probe_expect_info{ssh_comments="Ubuntu-3ubuntu13.14", ssh_version="OpenSSH_9.6p1"} 1
At the moment there are some undocumented restrictions on the
'labels' key (or action or whatever you want to call it). First,
it only works if you use it in a step that has an 'expect'. Even
if all you want to do is set constant label values (for example to
record that you made it to a certain point in your steps), you need
to expect something; you can't use 'labels' in a step that
otherwise only has, say, 'send'. Second, you can only have one
'labels' in your entire query_response section; if you have
more than one, you'll currently experience a Go panic when the
probe reaches the second.
This is unfortunate because Blackbox is currently lacking good
ways to see how far your query_response steps got if the probe
fails.
Sometimes it's obvious where your probe failed, or irrelevant, but
sometimes it's both relevant and not obvious. If you could use multiple labels,
you could progressively set fixed labels and tell how far you got
by what labels were visible in the scrape metrics.
(And of course you could also record various pieces of useful information that you don't get all at once.)
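To make the two restrictions on 'labels' concrete, here is a minimal sketch of a step that sets a constant label. The module name, the label name and value, and the IMAP strings are all hypothetical, chosen only for illustration.

```yaml
  # Hypothetical module; an illustrative sketch only.
  imap_banner_stage:
    prober: tcp
    tcp:
      query_response:
        # labels has to share a step with an expect, even though the
        # label value here is a constant and uses no match groups.
        - expect: '^\* OK'
          labels:
            - name: stage
              value: "banner"
        - send: "a1 LOGOUT"
        # No labels on this step: per the restriction described
        # above, only one step in query_response can carry labels.
        - expect: "^a1 OK"
```

If the probe gets past the first expect, the scrape should include a probe_expect_info metric with stage="banner". What you can't currently do is add a second 'labels' to the final step to record that the probe got that far as well.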
Sidebar: On (not) condensing expect and send together
My personal view is that I normally don't want to condense 'expect'
and 'send' together into one step entry unless I have to, because
most of the time it inverts the relationship between the two. In
most protocols and protocol interactions, you send something and
expect a response; you don't receive something and then send a
response to it. In my opinion this is more naturally written in the
style:
  query_response:
    - expect: "something"
    - send: "my request"
    - expect: "reply to my request"
    - send: "something else"
    - expect: "reply to something else"
rather than as:
  query_response:
    - expect: "something"
      send: "my request"
    - expect: "reply to my request"
      send: "something else"
    - expect: "reply to something else"
What look like pairs (an expect/send in the same step) are not actually pairs; the 'expect' is for the previous step's 'send', and the 'send' pairs with the 'expect' in the next step. So it's clearer to write them all as separate steps, which doesn't create any expectations of pairing.