Understanding query_response in Prometheus Blackbox's tcp prober

utcc.utoronto.ca/~cks · cks · 2026-01-24 02:54

Prometheus Blackbox is somewhat complicated to understand. One of its fundamental abstractions is a 'prober', a generic way of probing some service (such as making HTTP requests or DNS requests). One prober is the 'tcp' prober, which makes a TCP connection and then potentially conducts a conversation with the service to verify its health. For example, here's a ClamAV daemon health check, which connects, sends a line with "PING", and expects to receive "PONG":

  clamd_pingpong:
    prober: tcp
    tcp:
      query_response:
        - send: "PING\n"
        - expect: "PONG"

The conversation with the service is detailed in the query_response configuration block (in YAML). For a long time I thought that this block was what it looks like here: a series of entries with one directive per entry, such as 'send', 'expect', or 'starttls' (to switch to TLS after, for example, you send a 'STARTTLS' command to an SMTP or IMAP server).
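
As a rough sketch of that one-directive-per-entry style with 'starttls', an SMTP STARTTLS check could look something like this. The module name, the regular expressions, and the line endings are assumptions for illustration, based on how SMTP normally behaves:

  smtp_starttls_sketch:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        # wait for the server's greeting banner
        - expect: "^220 "
        - send: "EHLO prober\r"
        # the server should advertise STARTTLS in its EHLO reply
        - expect: "^250[- ]STARTTLS"
        - send: "STARTTLS\r"
        - expect: "^220"
        # switch the connection to TLS, then carry on inside TLS
        - starttls: true
        - send: "EHLO prober\r"
        - expect: "^250[- ]"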

However, much like an earlier case with Alertmanager, this is not actually what the YAML syntax is. In reality each step in the query_response YAML array can have multiple keys. To quote the documentation:

 [ - [ [ expect: <string> ],
       [ expect_bytes: <string> ],
       [ labels:
         - [ name: <string>
             value: <string>
           ], ...
       ],
       [ send: <string> ],
       [ starttls: <boolean | default = false> ]
     ], ...
 ]

When there are multiple keys in a single step, Blackbox handles them in almost the order listed here: first expect, then labels if the expect matched, then expect_bytes, then send, then starttls. Normally you wouldn't have both expect and expect_bytes in the same step (and combining them is tricky). This order is not currently documented, so you have to read prober/query_response.go to determine it.
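
One practical consequence of this fixed handling order is that the order in which you write the keys inside a single step shouldn't matter. Here is a small sketch of that, where the module name and the SMTP-style strings are assumptions for illustration:

  key_order_sketch:
    prober: tcp
    tcp:
      query_response:
      # 'send' is written first here, but going by the handling
      # order described above, the 'expect' is still checked first
      # and the 'send' only happens after the banner has matched.
      - send: "EHLO prober\r"
        expect: "^220 "
      - expect: "^250 "

Written this way the step is misleading to read, which is a good reason to keep 'expect' first whenever you do combine keys.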

One reason to combine expect and send together in a single step is that then send can use regular expression match groups from the expect in its text. There's an example of this in the example blackbox.yml file:

  irc_banner:
    prober: tcp
    tcp:
      query_response:
      - send: "NICK prober"
      - send: "USER prober prober prober :prober"
      - expect: "PING :([^ ]+)"
        # cks: note use of ${1}, from PING
        send: "PONG ${1}"
      - expect: "^:[^ ]+ 001"

The 'labels:' key is something added in v0.26.0, in #1284. As shown in the example blackbox.yml file, it can be used to do things like extract SSH banner information into labels on a metric:

  ssh_banner_extract:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
      - expect: "^SSH-2.0-([^ -]+)(?: (.*))?$"
        labels:
        - name: ssh_version
          value: "${1}"
        - name: ssh_comments
          value: "${2}"

This creates a metric that looks like this:

probe_expect_info {ssh_comments="Ubuntu-3ubuntu13.14", ssh_version="OpenSSH_9.6p1"} 1

At the moment there are some undocumented restrictions on the 'labels' key (or action or whatever you want to call it). First, it only works if you use it in a step that has an 'expect'. Even if all you want to do is set constant label values (for example to record that you made it to a certain point in your steps), you need to expect something; you can't use 'labels' in a step that otherwise only has, say, 'send'. Second, you can only have one 'labels' in your entire query_response section; if you have more than one, you'll currently experience a Go panic when checking reaches the second.

This is unfortunate because Blackbox is currently lacking good ways to see how far your query_response steps got if the probe fails. Sometimes it's obvious where your probe failed, or irrelevant, but sometimes it's both relevant and not obvious. If you could use multiple labels, you could progressively set fixed labels and tell how far you got by what labels were visible in the scrape metrics.

(And of course you could also record various pieces of useful information that you don't get all at once.)
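
To illustrate the two 'labels' restrictions described above, here is a sketch that records a single constant marker label. The module name, label name, and label value are hypothetical, and the SMTP-style strings are assumptions for illustration:

  smtp_marker_sketch:
    prober: tcp
    tcp:
      query_response:
      # 'labels' has to ride along with an 'expect', even for a
      # constant value that uses no match groups.
      - expect: "^220 "
        labels:
        - name: reached
          value: "banner"
      - send: "QUIT\r"
      # No second 'labels' here: more than one in a query_response
      # currently causes a Go panic.
      - expect: "^221 "

If the probe gets past the banner, the probe_expect_info metric should then carry a reached="banner" label, but that single marker is all you get.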

Sidebar: On (not) condensing expect and send together

My personal view is that I normally don't want to condense 'expect' and 'send' together into one step entry unless I have to, because most of the time it inverts the relationship between the two. In most protocols and protocol interactions, you send something and expect a response; you don't receive something and then send a response to it. In my opinion this is more naturally written in the style:

      query_response:
      - expect: "something"
      - send: "my request"
      - expect: "reply to my request"
      - send: "something else"
      - expect: "reply to something else"

than as:

      query_response:
      - expect: "something"
        send: "my request"
      - expect: "reply to my request"
        send: "something else"
      - expect: "reply to something else"

What look like pairs (an expect/send in the same step) are not actually pairs; the 'expect' is for a previous 'send', and the 'send' pairs with the next 'expect' in the next step. So it's clearer to write them all as separate steps, which doesn't create any expectations of pairing.