Recap of Observability and Monitoring from AWS Services Workshop

See how our internal DevOps team gets the most out of Humio and AWS CloudWatch

July 28th, 2020

Humio excels as an observability tool by collecting data from cloud native systems in real time to provide an overview of complex distributed AWS systems.

To give us a closer look at Humio’s role interacting with AWS, Head of Infrastructure Engineer Grant Schofield takes us behind the scenes in our Streaming observability and monitoring from AWS services workshop. He explains how Humio gets observability into its own cloud systems. He shares his deeper understanding of how the system is configured, and offers tips you may find helpful in your own system.

"You can choose really good tools that make your system observable. You have to know what’s important to you."

To fully understand how Humio makes AWS observable, we take a look at the types of data generated in AWS Services, and we share an overview of Humio’s monitoring goals.

Overview of AWS Data

AWS has hundreds of optional services, most of which provide abundant free logs and metrics. Most of them send their data to CloudWatch as an aggregation point, but some can send directly to S3. Data is sent out of CloudWatch by using serverless Lambdas, and it can be sent in a variety of ways.

Options for getting data out of CloudWatch:

  • Send data straight to S3

  • Send to a CloudWatch log stream, which gets written to S3

  • Create events for services

CloudWatch Events

CloudWatch events can be used to automate a response to security or performance issues.

Events can trigger actions based on rule patterns, or they can be scheduled to run on a regular basis. They typically work by triggering a Lambda that adds a message to an SQS queue or sends an email, for example an alert notifying you when your site is scaling up.
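To make the idea of a rule pattern concrete, here is a minimal sketch in Python. It is not how CloudWatch evaluates rules internally: the matcher only handles exact-value lists and nested objects, and the pattern, the sample event, and the function names are illustrative assumptions.

```python
def matches(pattern, event):
    """Return True if `event` satisfies `pattern`.

    Simplified semantics: a list in the pattern means "the event value must
    equal one of these"; a dict means "apply the same check to the nested object".
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical rule: react when an Auto Scaling group launches an instance.
SCALE_UP_PATTERN = {
    "source": ["aws.autoscaling"],
    "detail-type": ["EC2 Instance Launch Successful"],
}

# Hypothetical event, trimmed to the fields this sketch uses.
sample_event = {
    "source": "aws.autoscaling",
    "detail-type": "EC2 Instance Launch Successful",
    "detail": {"AutoScalingGroupName": "web-asg"},
}


def notification_for(event):
    """Build the text of an alert for a matching event, or return None."""
    if not matches(SCALE_UP_PATTERN, event):
        return None
    group = event.get("detail", {}).get("AutoScalingGroupName", "unknown")
    return f"Scaling up: {group} launched a new instance"
```

In a real setup the rule itself lives in AWS and the Lambda only receives events that already matched; the matcher is here just to show what a pattern expresses.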

AWS CloudWatch Logs

Because of the decentralized nature of cloud services, logs are not simply written to a single file. Every application or service is going to belong to a log group, and that log group is a collection of log streams. Grant describes the architecture of CloudWatch:

"Distributed systems are fairly complex, with data coming from all over the world that is getting streamed into CloudWatch. So there isn’t one log file that is getting appended to, there are log streams that are getting appended from the application. The way you look at those in Humio is from the Log Groups."

Overview of Humio

Based on index-free logging, Humio is a SaaS solution designed to meet the needs of high-volume data sources. It’s designed to handle terabytes of incoming data a day while returning search results in just a few seconds.

Humio is built to accept data from any major data shipper. This minimizes setup time. If you’re transitioning to using Humio from another logging solution, you can keep your current file shipper; you just need to set Humio as the destination.

Once data arrives in Humio, it is parsed. Common formats such as JSON are easier to configure, but Humio will work with any form of structured or unstructured data. The most important factor is that each piece of data has an associated timestamp. If there is no timestamp with incoming data, Humio will give it one based on ingestion time.
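The timestamp rule can be sketched in a few lines of Python. This illustrates the fallback logic only and is not Humio's parser: the field name `timestamp`, the ISO 8601 format, and the function name are assumptions.

```python
import json
import time
from datetime import datetime


def parse_event(raw_line, now_ms=None):
    """Return (timestamp_ms, fields) for one incoming line.

    Uses the event's own timestamp when one can be parsed, otherwise falls
    back to the ingestion time passed in as `now_ms` (or the current time).
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)

    try:
        fields = json.loads(raw_line)
    except json.JSONDecodeError:
        fields = None

    if not isinstance(fields, dict):
        # Unstructured text: keep the raw line and stamp it at ingestion.
        return now_ms, {"message": raw_line}

    stamp = fields.get("timestamp")  # assumed field name
    if isinstance(stamp, str):
        try:
            # Include a UTC offset in the string; naive values are read as local time.
            parsed = datetime.fromisoformat(stamp)
        except ValueError:
            return now_ms, fields
        return int(parsed.timestamp() * 1000), fields

    # Structured event without a usable timestamp: ingestion time again.
    return now_ms, fields
```

For example, `parse_event("plain text line", now_ms=42)` keeps the line as a message and stamps it with the supplied ingestion time, while a JSON event that carries its own `timestamp` keeps the time it was generated.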

Unique aspects of Humio’s CloudWatch structure

"Having disparate data sources is important for us to have a total view of our systems."
Grant Schofield, Director of Infrastructure at Humio

Currently, we have two Lambdas exporting data — one for our logs and metrics and one for our events. Since Humio runs Kubernetes extensively, we use a Prometheus CloudWatch exporter for our metrics. This isn’t required, but it was easy to set it up this way because we use Prometheus for other tasks.

The exporter runs as a Java process and exposes CloudWatch metrics in a form Prometheus can scrape. Pods scrape the data automatically and ship it using Filebeat. To configure automated CloudWatch events, we use Terraform, but using the UI is also perfectly acceptable.

One data-generating AWS service we use extensively is GuardDuty. It is inexpensive, and it provides extensive event data about both external security threats and internal irregularities. We use GuardDuty event data to construct dashboards in Humio and to send alerts for any event with a severity rating above 4.
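The severity rule is simple enough to express as a filter. The Python sketch below assumes GuardDuty findings arrive as CloudWatch events with a numeric `detail.severity`; the function names and sample findings are hypothetical, and in practice the same condition could equally be written as a Humio alert query.

```python
SEVERITY_THRESHOLD = 4  # from the text: alert on anything above 4


def should_alert(event, threshold=SEVERITY_THRESHOLD):
    """Return True for GuardDuty findings whose severity exceeds the threshold."""
    if event.get("source") != "aws.guardduty":
        return False
    severity = event.get("detail", {}).get("severity")
    return isinstance(severity, (int, float)) and severity > threshold


def alerts(events):
    """Return (severity, finding type) pairs worth alerting on, highest first."""
    hits = [
        (e["detail"]["severity"], e["detail"].get("type", "unknown"))
        for e in events
        if should_alert(e)
    ]
    return sorted(hits, reverse=True)


# Hypothetical findings, trimmed to the fields this sketch reads.
sample_findings = [
    {"source": "aws.guardduty", "detail": {"severity": 2, "type": "Recon:EC2/Portscan"}},
    {"source": "aws.guardduty", "detail": {"severity": 5, "type": "UnauthorizedAccess:EC2/SSHBruteForce"}},
    {"source": "aws.guardduty", "detail": {"severity": 8, "type": "Backdoor:EC2/C&CActivity.B"}},
    {"source": "aws.ec2", "detail": {"severity": 9}},
]
```

Note that a finding with a severity of exactly 4 is not alerted on here, matching the "above 4" wording.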

Tips for execution

It’s hard to improve on the reliability and availability of using Humio alongside AWS services. The real opportunity to optimize when using the two together is to reduce costs. To accomplish this, you can actually use Humio to monitor billing, because CloudWatch events include billing events. You can ship them from CloudWatch and construct a billing dashboard that will keep track of all your spending.

To control costs, keep in mind:

  • Using logs may be cheaper than metrics due to API calls - Each API call incurs a small cost. The cost is small at first, but a misconfigured system that makes millions of unexpected API calls can increase your bill by hundreds of dollars.

  • Many metrics don’t update every minute - Reduce costs by not polling more often than they are updated.

  • Use AWS Organizations if available for a service - This reduces the number of Lambdas you have to manage to one for all users rather than one for each user.

  • Be mindful of the costs and only get the metrics you need - Humio has the bandwidth to accept unlimited amounts of incoming data, but your budget might not be unlimited.

When setting up CloudWatch, we recommend keeping track of your bill for a week or two and making sure you aren’t incurring unexpected additional costs.
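A rough back-of-the-envelope calculation shows why polling frequency matters. The Python sketch below assumes one API request per metric per poll and takes the price as a parameter; the figure used in the example is a placeholder, not a quoted AWS price, so check current pricing for your region before relying on the numbers.

```python
def effective_poll_seconds(desired_seconds, metric_update_seconds):
    """Never poll more often than the metric itself is updated."""
    return max(desired_seconds, metric_update_seconds)


def monthly_requests(metric_count, poll_seconds, days=30):
    """Requests per month, assuming one request per metric per poll."""
    polls = (days * 24 * 60 * 60) // poll_seconds
    return metric_count * polls


def monthly_cost(metric_count, poll_seconds, price_per_1000, days=30):
    """Estimated monthly spend; price_per_1000 is supplied by the caller."""
    return monthly_requests(metric_count, poll_seconds, days) / 1000 * price_per_1000


# Hypothetical setup: 100 metrics that only update every 5 minutes.
PLACEHOLDER_PRICE = 0.01  # assumed price per 1,000 requests, for illustration only

every_minute = monthly_requests(100, 60)
every_update = monthly_requests(100, effective_poll_seconds(60, 300))
```

With these assumptions, polling every minute issues 4,320,000 requests a month, while matching the 5-minute update period issues 864,000, a fivefold reduction for the same data.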

Watch the full workshop

See examples of GuardDuty dashboards, parsing schema, and more CloudWatch configurations by watching the full Streaming observability and monitoring from AWS services workshop on-demand. In the workshop, Grant answers additional questions about shipping metrics to third-party tools, observability, and migrating to Humio from Elastic.

To get a personalized look into using Humio for cloud services, request a live demo.

To learn more about Grant’s work at Humio, read about how his team achieved an ingest rate of over 100 TB/day on only 25 Humio nodes.