5 ways to increase efficiency of logging in AWS

Use Humio to provide real-time visibility of AWS cloud data

July 21st, 2020

The race to adopt cloud-native technologies presents a challenge for engineers tasked with monitoring. When monitoring is considered an afterthought, keeping stock of all of your application layers and endpoints in a distributed cloud environment can be, quite frankly, a mess.

In an environment where customers demand a real-time response and cybersecurity threats can exfiltrate data in seconds, it’s no longer acceptable to ignore monitoring until after a problem hits. Don’t wait for a disaster to find out that your logging practices have been inefficient. Take a proactive approach to streamlining your monitoring and you can begin to get positive business value and insights out of your logs with minimal maintenance.

The main premise behind our 5 ways to increase the efficiency of logging in Amazon Web Services (AWS) is to know all of the features Amazon offers and choose solutions that optimize how you manage the flow of data. Consolidating and centralizing log data at several points in the process will make maintenance easier, improve security, and speed up performance across your cloud systems.

1. Use multiple AWS organizations and manage them with the same tools and processes

AWS Organizations lets you control access and set policies across accounts in a variety of ways. You can create organizational units (OUs) based on departments, applications, environments, or projects. Once configured, you no longer have to configure access individually or write a custom script for each account. New policies can be applied organization-wide, to a single organizational unit, or to an individual account.
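As one illustration of a policy applied at the OU level, the sketch below builds a service control policy (SCP) document that stops member accounts from switching off CloudTrail logging. The policy name, the OU ID, and the choice of denied actions are assumptions made for this example, not something the article prescribes. The code only builds and prints the document; attaching it would go through the AWS Organizations API, as outlined in the trailing comment.

```python
import json


def build_protect_logging_scp() -> dict:
    """Return an SCP document that denies tampering with CloudTrail.

    The set of denied actions is an illustrative assumption.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyCloudTrailTampering",
                "Effect": "Deny",
                "Action": [
                    "cloudtrail:StopLogging",
                    "cloudtrail:DeleteTrail",
                ],
                "Resource": "*",
            }
        ],
    }


policy_json = json.dumps(build_protect_logging_scp(), indent=2)
print(policy_json)

# Attaching the policy to an OU would look roughly like this with boto3
# (not executed here; the name and OU ID are hypothetical placeholders):
#   org = boto3.client("organizations")
#   created = org.create_policy(
#       Name="protect-logging", Description="Keep CloudTrail on",
#       Type="SERVICE_CONTROL_POLICY", Content=policy_json)
#   org.attach_policy(
#       PolicyId=created["Policy"]["PolicySummary"]["Id"],
#       TargetId="ou-examp-12345678")
```

Because the policy is attached to the OU rather than to each account, any account later moved into that OU inherits it without further scripting.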

2. Use Amazon GuardDuty organizations to centralize the management of GuardDuty CloudWatch events

Amazon GuardDuty automates part of the security monitoring of AWS accounts. It consolidates log data associated with AWS user behavior, including VPC Flow Logs, AWS CloudTrail event logs, and DNS logs. GuardDuty compares that activity against threat intelligence feeds that list known malicious indicators, such as IP addresses, and then provides alerts through CloudWatch.
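To make the alert flow concrete, here is a minimal sketch of the event pattern that selects GuardDuty findings from the CloudWatch event stream, together with a local helper that applies the same match to sample events. The `source` and `detail-type` values are the ones GuardDuty uses for findings; the severity floor of 7.0, the helper name, and the trimmed sample events are assumptions for illustration only.

```python
import json

# Event pattern matching GuardDuty findings delivered as CloudWatch events.
GUARDDUTY_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
}


def is_guardduty_finding(event: dict, min_severity: float = 0.0) -> bool:
    """Local check mirroring the pattern above, plus a severity floor."""
    return (
        event.get("source") in GUARDDUTY_PATTERN["source"]
        and event.get("detail-type") in GUARDDUTY_PATTERN["detail-type"]
        and event.get("detail", {}).get("severity", 0) >= min_severity
    )


# Hypothetical, heavily trimmed events for illustration only.
sample_events = [
    {"source": "aws.guardduty", "detail-type": "GuardDuty Finding",
     "detail": {"severity": 8.0, "type": "Example:HighSeverityFinding"}},
    {"source": "aws.guardduty", "detail-type": "GuardDuty Finding",
     "detail": {"severity": 2.0, "type": "Example:LowSeverityFinding"}},
    {"source": "aws.ec2",
     "detail-type": "EC2 Instance State-change Notification",
     "detail": {"state": "running"}},
]

# Keep only the findings at or above the assumed severity floor.
high = [e for e in sample_events if is_guardduty_finding(e, min_severity=7.0)]
print(json.dumps(GUARDDUTY_PATTERN))
print(len(high))
```

Routing only the higher-severity findings to an on-call channel, while still shipping every finding to your log platform, is one way to keep alert volume manageable.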

3. Understand object lifecycles in S3 and consider using different storage classes when appropriate

Amazon provides a variety of options for reducing costs by moving data from frequently accessed S3 storage classes to less expensive classes that are slower to retrieve from. By choosing a slower archival storage class, you can reduce storage costs.

To understand which archive schedule works best for your organization, conduct an audit to determine which data you’ll need and when. Ask yourself:

  1. How long do you need data to be active and searchable? - This will inform active storage.

  2. How long do you need to store data for compliance reasons? - This will inform archive storage.

  3. Is there a time after which it is permissible to delete data? - Keep in mind that modern log management with high rates of compression may reduce the cost of long-term retention to a negligible amount.

Explore the various rules and restrictions involved by reading Amazon’s lifecycle transition general considerations.
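The answers to the audit questions map fairly directly onto an S3 lifecycle rule. The sketch below turns an active period and a total retention period into a lifecycle configuration that archives objects and later expires them. The function name, the `logs/` prefix, the bucket name, the day counts, and the choice of `GLACIER` as the archive class are all assumptions for illustration; check them against Amazon's transition rules before applying anything similar.

```python
def build_log_lifecycle(prefix: str, active_days: int, retention_days: int) -> dict:
    """Build an S3 lifecycle configuration for log objects under a prefix.

    active_days:    how long logs stay in their original storage class
    retention_days: total age after which logs are deleted
    """
    if active_days <= 0:
        raise ValueError("active_days must be positive")
    if retention_days <= active_days:
        raise ValueError("retention_days must exceed active_days")
    return {
        "Rules": [
            {
                "ID": f"archive-then-expire-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move objects to the assumed archive class once they
                # are no longer needed for active search.
                "Transitions": [
                    {"Days": active_days, "StorageClass": "GLACIER"}
                ],
                # Delete objects when the compliance window has passed.
                "Expiration": {"Days": retention_days},
            }
        ]
    }


config = build_log_lifecycle("logs/", active_days=30, retention_days=365)
print(config)

# Applying it would look roughly like this with boto3 (not executed here;
# the bucket name is a hypothetical placeholder):
#   s3 = boto3.client("s3")
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="example-log-bucket", LifecycleConfiguration=config)
```

If there is no point at which deletion is permissible, the `Expiration` entry would simply be left out of the rule.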

If you use Humio for log management, you can optimize for retention longer than six months by increasing the size of stored log files. Storing logs in larger files means fewer files overall. To see how to do this, visit Performance Tuning for Long Retention Settings.

4. Optimize transfer rates

Start by placing your monitoring servers on AWS in the same region as your other servers.

If you still need faster transfer to S3 and your servers are not in EC2, you can use AWS Direct Connect to shorten the network path and avoid sending data across the public internet.

If you are dealing with high volumes of log data, managing concurrency settings becomes important to keep latency in check. As data volumes increase, any delay is multiplied. Even a delay of only milliseconds becomes unacceptable if millions of objects are generating logs.
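A quick back-of-the-envelope calculation shows how per-object latency adds up and why concurrency matters. This is a deliberately simplified model that assumes every transfer takes the same time and that concurrent transfers do not slow each other down; the function name and the figures are illustrative, not measurements.

```python
import math


def estimated_transfer_seconds(object_count: int, latency_ms: int,
                               concurrency: int = 1) -> float:
    """Estimate total time to transfer objects, in seconds.

    Objects are assumed to move in rounds of `concurrency` at a time,
    each round costing one fixed per-object latency.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    rounds = math.ceil(object_count / concurrency)
    return rounds * latency_ms / 1000


# One million objects at 5 ms each, one at a time, then 50 at a time.
serial = estimated_transfer_seconds(1_000_000, latency_ms=5)
parallel = estimated_transfer_seconds(1_000_000, latency_ms=5, concurrency=50)
print(serial, parallel)
```

Under these assumptions, five milliseconds per object turns into well over an hour when transfers run one at a time, and under two minutes at a concurrency of 50.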

In Humio, the storage chunk size is set to the maximum by default. You can adjust the number of files that upload or download concurrently.

If you’re undertaking a bulk ingest, managing concurrency settings alone is unlikely to resolve latency issues. In these circumstances, consider using AWS Snowball for bulk shipping.

5. Upgrade third-party log management tools

When looking at logging data from AWS, you are often forced to look at a siloed version of your data. For AWS Lambda functions, you can only look at the corresponding CloudWatch data for one Lambda function at a time. To get a holistic view of your system, you have to use an external tool to aggregate all log data.

Use a third-party tool in order to analyze data from multiple invocations at once. Humio provides you with a way to aggregate data and observe metrics like latency and garbage collection times across all services and functions.
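To show what aggregating across functions means in practice, the sketch below parses the `REPORT` lines that Lambda writes to its logs and computes an average duration per function from a combined stream. It is a plain Python illustration of the idea, not Humio's query language or API; the function names, request IDs, and durations in the sample records are made up.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches the request ID and duration in a Lambda REPORT log line.
REPORT_RE = re.compile(
    r"REPORT RequestId: (?P<request_id>\S+)\s+"
    r"Duration: (?P<duration_ms>[\d.]+) ms"
)


def average_duration_by_function(records):
    """Average invocation duration (ms) per function name.

    `records` is an iterable of (function_name, log_line) pairs drawn
    from any number of functions; non-REPORT lines are ignored.
    """
    durations = defaultdict(list)
    for function_name, line in records:
        match = REPORT_RE.search(line)
        if match:
            durations[function_name].append(float(match.group("duration_ms")))
    return {name: mean(values) for name, values in durations.items()}


# Hypothetical log lines from two functions, merged into one stream.
records = [
    ("checkout", "REPORT RequestId: aaa-1\tDuration: 100.0 ms\tBilled Duration: 100 ms"),
    ("checkout", "REPORT RequestId: aaa-2\tDuration: 300.0 ms\tBilled Duration: 300 ms"),
    ("search", "REPORT RequestId: bbb-1\tDuration: 50.0 ms\tBilled Duration: 50 ms"),
    ("search", "START RequestId: bbb-2 Version: $LATEST"),
]

averages = average_duration_by_function(records)
print(averages)
```

A log management platform does the same kind of grouping at scale and in real time, which is what makes cross-function comparisons practical.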

Built without indexes, Humio has built-in efficiency that provides sub-second latency for search results. Legacy log management tools may slow down monitoring because they sometimes take minutes to ingest data. With Humio’s millisecond ingestion, the slowest part of accessing data will be internet lag, not the monitoring software.

To learn more about Humio, sign up for a free demo and see how it brings speed and efficiency to AWS.

Attend our Streaming observability and monitoring from AWS services workshop to see how to expand cloud service monitoring to include Kubernetes and additional application logs.

Watch our workshop to see how to set up Humio in AWS for real-time visibility.