Amazon Web Services (AWS) comprises more than 90 services and covers everything from computing and storage to analytics and Internet of Things tools. Using these services to build applications at scale requires constantly monitoring the entire software stack to make sure the wheels keep turning. But it’s when issues arise, and the wheels come off, that every developer puts their AWS logging setup to the test.
In this article, we’ll cover a mixture of tips for AWS’s own logging services as well as how to get the most out of third-party log management tools, so that no matter what your logging setup looks like, it gives you the data you need to keep your applications running.
Running applications on AWS can often involve using many services, each of which produces logs that need to be monitored. Amazon CloudWatch Logs, a service that collects and stores logs from your application and infrastructure running on AWS, provides the same features expected of any log management tool: real-time monitoring, searching and filtering, and alerts.
Because CloudWatch Logs sits directly on the AWS platform, it integrates easily with other AWS services and provides a simple way to monitor AWS logs. For developers and sysadmins who want to get their logging up and running quickly, CloudWatch Logs is a compelling choice.
Creating actionable logs involves writing all contextual information that could be useful for diagnosing issues in the future. This can include sensitive data that would present a security issue if leaked outside of your organization. To prevent that, you need to encrypt your logs.
CloudWatch Logs encrypts log data in transit and at rest by default. If you need more control over exactly how the data is encrypted, CloudWatch Logs allows you to encrypt log data using an AWS Key Management Service (AWS KMS) customer master key (CMK). For even tighter control, you can use IAM roles to restrict which users and services can access those keys and decrypt log data.
If you want to collect event messages in CloudWatch Logs but need to use other tools to analyze them, it’s possible to forward logs to an Amazon Simple Storage Service (S3) bucket. Storing all logs in one place with S3 makes it easier to find the log file you’re looking for, and it means analysis tools only need access to a single location.
If you’re running a large enterprise environment, restricting where logs are stored can help you to meet auditing and compliance requirements. And if you need to collate logs from multiple accounts, there’s a way to do that with CloudWatch Logs, too.
The CloudWatch Logs Agent runs on your instances (whether that’s Windows or Linux) and handles sending EC2 logs to CloudWatch Logs.
As you collect logs for more and more instances, you can use log groups to easily locate related data, such as all the logs from applications deployed to staging servers.
To correlate your logs in CloudWatch from multiple EC2 instances, use the log_group_name option in your CloudWatch Logs Agent configuration file. This option specifies the name of the log group to store logs in.
Another idea for improving the utility of logs in CloudWatch Logs is to specify a custom timestamp_format option. If you don’t provide one yourself, the time that the log was ingested into CloudWatch Logs is used, rather than the time the event actually occurred. This can make troubleshooting more difficult because the timestamps in your logs no longer line up with the moment an incident happened.
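As a rough illustration of the two options above, the sketch below builds a configuration fragment that sets both log_group_name and timestamp_format for a single log file. It assumes the JSON layout used by the unified CloudWatch agent; the exact file format depends on which agent version you run. The file path, group name, and the build_agent_config helper are hypothetical.

```python
import json


def build_agent_config(file_path, log_group_name, timestamp_format):
    """Return a 'logs' section that collects one file into one log group.

    The nesting below is assumed to match the unified CloudWatch agent's
    JSON configuration; check it against the agent version you deploy.
    """
    return {
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": file_path,
                            # Every instance using this name writes to the same group.
                            "log_group_name": log_group_name,
                            # One stream per instance keeps sources distinguishable.
                            "log_stream_name": "{instance_id}",
                            # Parse the event time from each line instead of
                            # falling back to the ingestion time.
                            "timestamp_format": timestamp_format,
                        }
                    ]
                }
            }
        }
    }


# Hypothetical values for a staging application.
config = build_agent_config(
    "/var/log/myapp/app.log", "staging-app", "%Y-%m-%d %H:%M:%S"
)
print(json.dumps(config, indent=2))
```

Deploying the same fragment to every staging instance should place all of their logs in one group, with event times taken from the log lines themselves.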
AWS has several ways to store and monitor logs, but if you’re using an external tool such as Fluentd, SolarWinds® Loggly®, or SolarWinds Papertrail™, you can use AWS Lambda to forward them. AWS Lambda is a serverless compute service that automatically allocates whatever resources are needed to execute code, and that makes it ideal for executing short functions.
Because Lambda uses an event-driven architecture, you can use it to read from CloudWatch Logs (where all your AWS logs have been aggregated) and forward them to a third-party centralized log service. Many ready-made Lambda functions exist for forwarding logs to log managers.
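A forwarding function of this kind might look like the minimal sketch below. CloudWatch Logs delivers subscription data to Lambda as a base64-encoded, gzip-compressed JSON payload under the awslogs key, so the handler unpacks it first. The make_handler helper and the send_line callable are assumptions for illustration; a real forwarder would replace send_line with whatever transport your log service expects.

```python
import base64
import gzip
import json


def decode_cloudwatch_event(event):
    """Unpack the base64 + gzip payload CloudWatch Logs sends to Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def make_handler(send_line):
    """Build a Lambda-style handler that forwards each log message.

    send_line is a hypothetical callable (for example, a syslog or HTTPS
    sender) supplied by the caller; it is not part of any AWS API.
    """

    def handler(event, context):
        payload = decode_cloudwatch_event(event)
        # Control messages carry no log data, so skip them.
        if payload.get("messageType") != "DATA_MESSAGE":
            return 0
        count = 0
        for log_event in payload["logEvents"]:
            send_line(
                f'{payload["logGroup"]} {payload["logStream"]} {log_event["message"]}'
            )
            count += 1
        return count

    return handler


# Using print as a stand-in for a real log destination.
handler = make_handler(print)
```

Passing the sender in as an argument keeps the decoding logic independent of any particular log management product.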
There is a caveat to be aware of when using AWS Lambda to forward logs: you may hit the account-level concurrent execution limit. This limit places an upper bound on the number of concurrent executions in a given AWS region, and by default, that limit is 1,000.
It’s possible to suddenly run into this limit when executing Lambda functions that trigger logs to be pushed to CloudWatch, since each function that triggers a push causes a second Lambda function to execute to forward the log data. In that case, 500 functions executing concurrently, each with its own forwarder, is all that’s needed to hit this limit.
To work around this issue, you first need to stream your logs from CloudWatch Logs to Amazon Kinesis Data Streams, Amazon’s scalable real-time data streaming service. Once you’ve sent your log data to one or more shards, the data from those shards is consumed by Lambda functions. Crucially, only one Lambda function is executed to process a shard, so you can control the number of concurrent Lambda functions by controlling the number of shards.
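The arithmetic behind this workaround can be sketched as follows. This is a simplified worst-case model based on the description above: it assumes every application function triggers exactly one forwarder when forwarding directly, and that exactly one forwarder runs per shard when Kinesis Data Streams sits in between. The function names and the LIMIT constant are illustrative.

```python
LIMIT = 1000  # concurrent execution limit discussed above


def peak_concurrency_direct(app_functions):
    """Each application function triggers one forwarding function."""
    return app_functions * 2


def peak_concurrency_with_kinesis(app_functions, shards):
    """Forwarders are capped at one per shard, regardless of log volume."""
    return app_functions + shards


for n in (400, 500, 900):
    direct = peak_concurrency_direct(n)
    streamed = peak_concurrency_with_kinesis(n, shards=4)
    print(
        f"{n} functions: direct={direct} "
        f"(at or over limit: {direct >= LIMIT}), "
        f"with 4 shards={streamed} "
        f"(at or over limit: {streamed >= LIMIT})"
    )
```

In this model, the forwarding overhead stops growing with the number of application functions and depends only on the shard count you choose.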
Similar to the way you can group logs in CloudWatch Logs using the log_group_name option, you can store logs from related instances together with some log management products. Grouping makes it easier to analyze issues and their impact: you can see, for example, that an issue is only affecting instances in the staging group and that production machines are running fine.
Papertrail is one log management tool that can map log senders to groups. In fact, Papertrail provides more than one way to establish the mapping for instances running on ECS: you can select a group by changing the log destination or by using the group operations available in the REST API.
Most AWS services provide a mechanism for running scripts and automated configuration tasks after an instance starts. If your log management setup requires you to access a REST API shortly after booting (to add your instance to a group, for example), this is the ideal place to do it.
With EC2, you can add your scripts in the launch wizard as base64-encoded text, and Elastic Beanstalk has a commands configuration option to specify which commands to run immediately after an instance boots.
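To illustrate the base64 step, the sketch below encodes a small startup script the way user data is expected when supplied in encoded form. The script body, the endpoint URL, and the group name are hypothetical placeholders, not a real log management API.

```python
import base64

# Hypothetical boot script: registers the instance with a log group
# through a made-up REST endpoint. Replace it with your tool's real API.
USER_DATA_SCRIPT = """#!/bin/bash
curl -s -X POST "https://logs.example.com/api/groups/staging/systems" \\
  -d "hostname=$(hostname)"
"""


def encode_user_data(script):
    """Return the script as base64-encoded ASCII text."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


encoded = encode_user_data(USER_DATA_SCRIPT)
print(encoded)
```

Decoding the result with base64.b64decode should give back the original script byte for byte, which is a quick way to check the encoding before launching an instance.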
While any decent log format will include machine identifiers, it’s a good idea to include AWS-specific details for applications running on AWS. These extra details help you not only identify the common traits of instances affected by an issue (matching ami-id strings can help diagnose which images exhibit the problem) but also understand the scale of problems and whether all instances in a specific region or availability zone (availability-zone) are impacted.
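One way to attach these details is a log formatter that merges them into every record, as in the minimal sketch below. The AwsContextFormatter class and the identifier values are hypothetical; on a real instance the values would typically be read once at startup (for example, from the EC2 instance metadata service) rather than hardcoded.

```python
import json
import logging


class AwsContextFormatter(logging.Formatter):
    """Format records as JSON and add fixed AWS-specific fields."""

    def __init__(self, aws_context):
        super().__init__()
        self.aws_context = dict(aws_context)

    def format(self, record):
        entry = {
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Add fields such as ami-id and availability-zone to every line.
        entry.update(self.aws_context)
        return json.dumps(entry, sort_keys=True)


# Hypothetical values, hardcoded here for illustration only.
aws_context = {
    "ami-id": "ami-0abc1234",
    "availability-zone": "us-east-1a",
    "instance-id": "i-0123456789abcdef0",
}

formatter = AwsContextFormatter(aws_context)
record = logging.LogRecord(
    "app", logging.ERROR, "app.py", 0, "disk full", None, None
)
line = formatter.format(record)
print(line)
```

Because each line carries the same keys, a log management tool can filter or group on ami-id or availability-zone directly.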
Collecting and storing logs with CloudWatch Logs isn’t right for every application, especially if your software also runs on infrastructure that lives outside of AWS. Fortunately, Amazon ECS instances can be paired with a dedicated logspout container to send logs directly to a log management tool. Logspout is a highly configurable log router for Docker that blindly forwards logs to another destination, and the only change required to the ECS instance is to use the journald logging driver.
A step-by-step guide for using logspout on ECS is available in the Papertrail knowledge base.
AWS CloudTrail continuously monitors and logs API calls made to your AWS account. Those log files contain a detailed access history that comes in handy during security and forensic investigations when you need to provide log files that have not been tampered with. With CloudTrail, you can validate log file integrity, detect whether log files have been modified or deleted, and even determine which log files were delivered to your account during a specific time period.
Using SHA-256 hashing and digital signing algorithms, CloudTrail creates a hash of every log file delivered to your account and stores it in a signed digest file. You can enable this feature using the AWS Management Console, AWS CLI, or CloudTrail API.
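The hash comparison at the heart of this check can be sketched in a few lines. This example only shows the SHA-256 step: it does not parse digest files or verify their signatures, which CloudTrail's own validation tooling handles. The sample log content and the helper names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data):
    """Return the hex-encoded SHA-256 hash of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_digest(log_bytes, expected_hex):
    """Compare a log file's hash with the value recorded for it."""
    return hmac.compare_digest(sha256_hex(log_bytes), expected_hex.lower())


# Stand-in for a delivered log file and the hash stored alongside it.
original = b'{"Records": []}'
recorded_hash = sha256_hex(original)

print(matches_digest(original, recorded_hash))         # unchanged file
print(matches_digest(original + b" ", recorded_hash))  # modified file
```

Even a single added byte produces a different hash, which is why a mismatch is a reliable signal that a file changed after delivery.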
The collection of platforms under the Amazon Web Services name provides everything you need to build world-class services that scale with demand, and that includes logging tools such as CloudWatch Logs and CloudTrail. For developers and sysadmins who also run their applications outside of AWS, AWS log management also involves collecting logs in a centralized place. But whether you use AWS’s logging solutions or your own, the 11 tips in this article will help you get the most out of AWS logging.