Query of the month: Flatline and Spike alert

Use this Humio query to get an alert when event volume falls outside the normal range

September 3rd, 2020

Max Aguirre and Jesper Makholm Byskov


We love hearing from customers in our public Slack channel @meethumio.slack.com. Our engineers in the US and Europe participate in discussions and answer questions, and other customers jump in to help solve problems and share ideas. If you’re not already a member, take a minute to join!

Earlier this month, a customer asked whether Humio has flatline and spike alert functionality like that in ElastAlert. Because Humio's query language is flexible, this kind of alert can easily be expressed as a Humio query.


Humio engineer Jesper Makholm Byskov looked into it and shares the query below, which does this in Humio.

 
"start new query"
| bucket(buckets=100)
| @timestamp := _bucket
| stats([head(100), min(_bucket, as=firstBucketStart), max(_bucket, as=lastBucketStart)])
| middle := (lastBucketStart + firstBucketStart)/2
| diffFromMiddle := _bucket - middle
| case { diffFromMiddle < 0 | group := "first";
* | group := "second"
}
| groupBy(group, function=sum(_count))
| transpose(header = group)
| factor := second / first
| factor > 2

  • "start new query": The first line is the filter where you specify which events you are interested in. This can be a more complex filter spanning more lines than the one above.

  • | bucket(buckets=100): The next line, "bucket(buckets=100)", puts these events into 100 time buckets and counts the number of events in each bucket. An alert runs as a live query, so in practice this produces 101 buckets, where the last one is still being "filled" while the query runs.

  • | @timestamp := _bucket | stats([head(100), min(_bucket, as=firstBucketStart), max(_bucket, as=lastBucketStart)]): The next two lines exclude this last bucket so that the remaining query only looks at the 100 already filled buckets.

  • | middle := (lastBucketStart + firstBucketStart)/2 | diffFromMiddle := _bucket - middle | case { diffFromMiddle < 0 | group := "first"; * | group := "second" }: The lines up to the "groupBy" put a "group" field on each bucket, indicating whether the bucket belongs to the first or the second half of the search interval. The "groupBy" then sums the counts of all the buckets in each group.

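  • | groupBy(group, function=sum(_count)) | transpose(header = group) | factor := second / first: After the "groupBy", the "transpose" turns the two group sums into the fields "first" and "second" on a single row, which makes it possible to calculate the factor of events in the second group compared to the first.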

  • | factor > 2: The last line defines when you want your alert to trigger.

  • You need to specify a search interval that covers the two time periods that you want to compare. So if you want to compare the last hour to the hour before, you need to set the search interval to two hours.
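To make the arithmetic behind these steps concrete, here is a minimal Python sketch of the same half-over-half comparison. It is not Humio code and not how Humio evaluates the query. The function names, the (start, count) tuple layout, the assumption that buckets arrive oldest first, the handling of an empty first half, and the flatline threshold of 0.5 are all assumptions made for illustration. Only the "factor > 2" spike condition comes from the query above.

```python
from typing import List, Tuple


def half_over_half_factor(buckets: List[Tuple[int, int]], keep: int = 100) -> float:
    """Compare event counts in the second half of the interval to the first half.

    buckets: (bucket_start, event_count) pairs, assumed sorted oldest first.
    keep: number of filled buckets to use, mirroring head(100) in the query.
    """
    # Drop the trailing bucket that is still being filled.
    filled = buckets[:keep]
    starts = [start for start, _ in filled]
    # Same midpoint as "middle := (lastBucketStart + firstBucketStart)/2".
    middle = (min(starts) + max(starts)) / 2
    first = sum(count for start, count in filled if start - middle < 0)
    second = sum(count for start, count in filled if start - middle >= 0)
    if first == 0:
        # Assumption: treat an empty first half as an unbounded increase.
        return float("inf")
    return second / first


def classify(factor: float, spike: float = 2.0, flatline: float = 0.5) -> str:
    """Label a factor. The flatline threshold is hypothetical, not from the query."""
    if factor > spike:
        return "spike"
    if factor < flatline:
        return "flatline"
    return "normal"


# Four filled buckets plus one bucket that is still filling.
sample = [(0, 5), (10, 5), (20, 12), (30, 13), (40, 1)]
factor = half_over_half_factor(sample, keep=4)
print(factor, classify(factor))  # 2.5 spike
```

With the sample data, the first two buckets sum to 10 events and the last two filled buckets sum to 25, so the factor is 2.5 and the spike condition is met.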

The result of the query only updates once a bucket is full, which happens each time the search interval divided by 100 has elapsed. If you want it to update more or less frequently, you can change the number 100 in the query. Remember to change it in both places where it occurs, and keep it an even number so that the two groups contain the same number of buckets. Beware that large numbers mean more memory is used to store the counts for the buckets.
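As a quick check of how often the result updates, the following Python sketch divides the search interval by the number of buckets. The function name and the validation are illustrative assumptions, not part of Humio.

```python
from datetime import timedelta


def bucket_width(search_interval: timedelta, buckets: int = 100) -> timedelta:
    """Return how long one bucket spans, which is how often the result updates."""
    # The query needs an even bucket count so both halves are the same size.
    if buckets <= 0 or buckets % 2 != 0:
        raise ValueError("buckets must be a positive even number")
    return search_interval / buckets


# Comparing the last hour to the hour before means a two-hour search interval.
print(bucket_width(timedelta(hours=2)))  # 0:01:12
```

So with a two-hour search interval and 100 buckets, the result updates every 72 seconds. Halving the bucket count to 50 would double that to 144 seconds.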