What timestamps you get back along with Prometheus query results

January 10, 2021

When you make a Prometheus query in PromQL, the result has both values and timestamps for those values (as covered in the API documentation). This is the case for both instant queries and range queries. Tools usually ignore the timestamp on instant queries; on range queries they use it to order the values for graphing or otherwise displaying the results.

(In Grafana, the timestamp is one of the fields you can display in a table. For reasons that we'll cover, the timestamp of typical queries is usually uninteresting and you routinely hide it from being displayed.)
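To make the shape of these results concrete, here is a small Python sketch that pulls (labels, timestamp, value) triples out of both kinds of result. The two responses are hand-written in the format the HTTP API documentation describes; the host name, the numbers, and the iter_points() helper are all made up for illustration.

```python
import json

# Hand-written responses in the shape the Prometheus HTTP API documents.
# The metric, labels, and numbers are invented for illustration.
INSTANT_RESPONSE = json.loads("""
{"status": "success",
 "data": {"resultType": "vector",
          "result": [{"metric": {"__name__": "node_load1", "instance": "host1:9100"},
                      "value": [1610236800.0, "0.42"]}]}}
""")

RANGE_RESPONSE = json.loads("""
{"status": "success",
 "data": {"resultType": "matrix",
          "result": [{"metric": {"__name__": "node_load1", "instance": "host1:9100"},
                      "values": [[1610236800.0, "0.42"],
                                 [1610236815.0, "0.40"]]}]}}
""")


def iter_points(response):
    """Yield (labels, timestamp, value) from a vector or matrix result."""
    data = response["data"]
    for series in data["result"]:
        if data["resultType"] == "vector":
            points = [series["value"]]   # exactly one [timestamp, "value"] pair
        elif data["resultType"] == "matrix":
            points = series["values"]    # a list of [timestamp, "value"] pairs
        else:
            raise ValueError("unsupported result type: " + data["resultType"])
        for timestamp, value in points:
            # Values arrive as quoted strings so that NaN and Inf survive JSON.
            yield series["metric"], float(timestamp), float(value)


for labels, timestamp, value in iter_points(RANGE_RESPONSE):
    print(labels["instance"], timestamp, value)
```

An instant query gives one timestamp per series under "value", while a range query gives a list of pairs under "values"; everything else is the same.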

Simplifying somewhat, the result of most PromQL expressions is what we can consider to be an instant vector, which is to say that there are a bunch of metric points, their values, and an associated timestamp. This is true both for instant queries and for range queries; for range queries, the PromQL expression is evaluated at each query step and then all of those individual query results are put together and returned (where Grafana or the Prometheus console will generally pick them apart to generate a graph).
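As an illustration of 'evaluated at each query step', this sketch lists the evaluation times that a range query with a given start, end and step (all in seconds) would cover. The function name is my own invention and this is only a model of the behaviour, not Prometheus's actual code.

```python
def range_query_steps(start, end, step):
    """Return the evaluation times start, start+step, ... that are <= end."""
    if step <= 0:
        raise ValueError("step must be positive")
    times = []
    k = 0
    # Multiply rather than repeatedly add so float steps do not drift.
    while start + k * step <= end:
        times.append(start + k * step)
        k += 1
    return times


# One minute of a range query at a 15 second step.
print(range_query_steps(1610236800, 1610236860, 15))
```

Each of those times becomes the timestamp of the values produced by that step, regardless of when the underlying samples were actually scraped.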

For normal PromQL queries that result in these instant vectors, the timestamp associated with each value generated by the query is the time at which the query was evaluated. For an instant query, this is 'right now' (or whatever time you told the query to evaluate at), even if you used offset in the expression. For a range query, the evaluation time is the time of that particular query step. As I found out before, this time can be surprising for subqueries because Prometheus rounds off the time. This timestamp is emphatically not the time of the metric (or metrics) that the query is using, and we can see the gap by looking at the results of a query like:

time() - timestamp( node_load1 )

(How big the difference can be obviously depends on how frequently Prometheus scrapes the metric.)
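To show where that gap comes from, here is a rough Python model of how I understand an instant vector selector to pick its sample: the most recent one at or before the evaluation time, provided it falls within the lookback window (five minutes by default). The function name and the scrape times are made up, staleness markers are ignored, and the exact handling of the window boundary is an assumption.

```python
from bisect import bisect_right


def eval_minus_sample_time(scrape_times, eval_time, lookback=300.0):
    """Model of time() - timestamp(metric) at one evaluation time.

    scrape_times must be sorted. Returns None when no sample is usable.
    """
    i = bisect_right(scrape_times, eval_time)
    if i == 0:
        return None                      # nothing scraped yet at eval_time
    sample_time = scrape_times[i - 1]    # most recent sample <= eval_time
    if eval_time - sample_time > lookback:
        return None                      # too old (boundary rule is an assumption)
    return eval_time - sample_time


scrapes = [1000.0, 1015.0, 1030.0]       # a 15 second scrape interval
print(eval_minus_sample_time(scrapes, 1040.0))
```

With a 15 second scrape interval the gap in this model stays under 15 seconds while the target is being scraped, and grows up to the lookback window if scrapes stop.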

However, a PromQL expression that uses a range vector selector on a simple metric, and so returns a range vector as its result, is different. As I described in how to extract raw time series data, such a query returns a set of values and timestamps where each timestamp is the underlying timestamp of the metric point in Prometheus's time series database (TSDB), the same value that timestamp() would give you. You get as many elements in the range vector as Prometheus actually scraped from the metrics source over the time range and has in its TSDB. Right now (and probably in the future), such PromQL queries must be instant queries; if you try to make a range query with a PromQL expression that returns a range vector, you will get an error from Prometheus.
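Because these timestamps are the real sample times, you can use them to see how regularly a target was scraped. This Python sketch works on a hand-written response shaped like the result of an instant query for something like 'node_load1[1m]'; the numbers and the scrape_gaps() helper are invented for illustration.

```python
import json

# Hand-written response in the shape of an instant query that returns a
# range vector; the timestamps and values are invented for illustration.
RAW_RESPONSE = json.loads("""
{"status": "success",
 "data": {"resultType": "matrix",
          "result": [{"metric": {"__name__": "node_load1"},
                      "values": [[1610236801.5, "0.42"],
                                 [1610236816.5, "0.40"],
                                 [1610236831.75, "0.39"]]}]}}
""")


def scrape_gaps(response):
    """Return the seconds between consecutive stored samples of each series."""
    gaps = []
    for series in response["data"]["result"]:
        stamps = [float(ts) for ts, _ in series["values"]]
        gaps.append([later - earlier for earlier, later in zip(stamps, stamps[1:])])
    return gaps


print(scrape_gaps(RAW_RESPONSE))
```

The gaps here are not perfectly even, which is what you would expect from real scrape times and would never see in the step-aligned timestamps of a range query.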

A PromQL expression with a 'bare' subquery (one not reduced down by aggregation operators and so on) can also return a range vector from an instant query, but the timestamps of the values behave as if you made a range query; they are the evaluation time of each query step (as altered and rounded by Prometheus). Effectively a subquery acts as a range query, and I believe it's more or less implemented as that inside Prometheus.
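As a sketch of what 'altered and rounded' means here, my understanding is that the steps of a subquery like 'metric[1m:15s]' are evaluated at multiples of the step that fall inside the subquery's range, rather than at offsets from the outer evaluation time. The Python below models that; the function name is hypothetical, and whether a step exactly on the start of the range is included is an assumption that may differ between Prometheus versions.

```python
import math


def subquery_eval_times(eval_time, range_seconds, step):
    """Model: multiples of step within [eval_time - range_seconds, eval_time]."""
    if step <= 0:
        raise ValueError("step must be positive")
    # First multiple of step at or after the start of the range.
    # (Inclusion of the exact start boundary is an assumption.)
    first = math.ceil((eval_time - range_seconds) / step) * step
    times = []
    t = first
    while t <= eval_time:
        times.append(t)
        t += step
    return times


# Evaluated at t=1007 with a 60 second range and a 15 second step.
print(subquery_eval_times(1007, 60, 15))
```

Note that none of the resulting timestamps is 1007 itself; in this model the outer evaluation time only shows up if it happens to be a multiple of the step.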

PS: Technically PromQL expressions can also return a scalar or a string, per the API documentation. I believe that you get scalars from PromQL expressions that don't involve any metrics, for example just 'time()'. I'm not sure what PromQL expression could give you just a string. Both scalar and string results have timestamps, and for scalar results the time is definitely the time the expression was evaluated at.

(This is easy to see by making a query for 'time()'; the result has the same number for the timestamp and the value.)
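In API terms, a scalar result is a single [timestamp, "value"] pair instead of a list of series. This Python sketch checks the 'same number' property on a hand-written response shaped like the result of querying 'time()'; the number and the scalar_point() helper are made up for illustration.

```python
import json

# Hand-written response in the shape of an instant query for time();
# the number is invented for illustration.
SCALAR_RESPONSE = json.loads("""
{"status": "success",
 "data": {"resultType": "scalar", "result": [1610236800.5, "1610236800.5"]}}
""")


def scalar_point(response):
    """Return (timestamp, value) from a scalar result."""
    data = response["data"]
    if data["resultType"] != "scalar":
        raise ValueError("not a scalar result: " + data["resultType"])
    timestamp, value = data["result"]
    return float(timestamp), float(value)


timestamp, value = scalar_point(SCALAR_RESPONSE)
print(timestamp == value)
```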

Written on 10 January 2021.
Last modified: Sun Jan 10 00:35:41 2021