Announcing search alert minimum thresholds

Posted by Troy Davis on February 11, 2013

As users know, Papertrail has offered painless log search alerts for some time. Notify a Campfire chat room, email a team, or kick off a PagerDuty escalation process when something important happens. Get periodic summaries about less-critical issues.

Notifications are inherently binary: either an alert fires or it doesn’t. You want to know or you don’t. In monitoring, though, there’s often a gray area where 2 of something is fine and 20 is not.

Papertrail now lets you define that gray area. Search alerts can now include a minimum number of events that must occur during the alert interval for the alert to be invoked. The default, which matches the existing behavior, is a minimum of 1.
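To make the rule concrete, here is a minimal sketch of the check in Python. It is an illustration only, not Papertrail's implementation: the function names, the substring match, and the sample events are all assumptions.

```python
from datetime import datetime, timedelta


def count_matches(events, query, window_start, window_end):
    """Count events in [window_start, window_end) whose message contains query."""
    return sum(
        1
        for timestamp, message in events
        if window_start <= timestamp < window_end and query in message
    )


def should_fire(match_count, minimum=1):
    """Fire only when the interval's match count reaches the minimum."""
    return match_count >= minimum


# Hypothetical sample data: three 404s inside one 10-minute interval.
start = datetime(2013, 2, 11, 12, 0)
end = start + timedelta(minutes=10)
events = [
    (start + timedelta(minutes=1), "GET /missing 404"),
    (start + timedelta(minutes=2), "GET /index 200"),
    (start + timedelta(minutes=4), "GET /old-page 404"),
    (start + timedelta(minutes=9), "GET /favicon.ico 404"),
]

matches = count_matches(events, "404", start, end)
print(should_fire(matches))              # default minimum of 1: the alert fires
print(should_fire(matches, minimum=20))  # below the threshold: the alert stays quiet
```

With a minimum of 1, any match fires the alert, which is the behavior alerts had before thresholds existed.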

What’s possible with a minimum threshold?

  • Allow for endemic problems. Some issues produce a constant undercurrent of events. Internet-facing problems like 404s, brute-force attacks, and site scrapers are the obvious examples. I don’t care that they happen, but I do care when they rise above what I consider noise.
  • Allow for known issues. If you’re short on RAM, the fact that Linux’s “oom_killer” killed a process once to save RAM probably isn’t severe. If it kills 5, it is. This is particularly handy with code exceptions (known bugs) and services where log output can’t easily be modified.
  • Detect when less-critical issues become trends. For example, I’d like to know when more than 10 slow queries occur across all of my database servers. That’s not a slow query, that’s a trend.
  • Combine these. Take velocity into account when deciding what the alert should do. When 50 of something happen in a 10-minute period, it’s urgent; tell me in HipChat. Separately, email a summary to my team every day, regardless of volume.
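The last item, combining alerts on the same search, can be sketched as a list of rules that each carry their own interval, minimum, and destination. This is a hypothetical model for illustration: `AlertRule`, `destinations_to_notify`, and the destination names are assumptions, not part of Papertrail.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    destination: str       # hypothetical label for where the alert goes
    interval_minutes: int  # how often the search is evaluated
    minimum: int = 1       # events required within the interval


def destinations_to_notify(rules, counts_by_interval):
    """Return destinations whose rule threshold is met.

    counts_by_interval maps an interval length in minutes to the number of
    matching events seen in the most recent interval of that length.
    """
    return [
        rule.destination
        for rule in rules
        if counts_by_interval.get(rule.interval_minutes, 0) >= rule.minimum
    ]


rules = [
    AlertRule("hipchat", interval_minutes=10, minimum=50),  # urgent spike
    AlertRule("team-email", interval_minutes=1440),         # daily summary
]

# A quiet 10 minutes: only the daily summary goes out.
print(destinations_to_notify(rules, {10: 12, 1440: 300}))

# A spike of 75 events in 10 minutes: both destinations are notified.
print(destinations_to_notify(rules, {10: 75, 1440: 300}))
```

Keeping the threshold on each rule rather than on the search means one saved search can feed both an urgent channel and a low-priority summary.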

How to use it

Use this in combination with log filtering and “Show related logs” links to ignore the events that don’t matter, react to those that appear to be important, and link directly from your admin dashboard to those that you know are relevant.

Perhaps the best part of Papertrail’s search alert threshold is that it doesn’t cost you a ton more time or effort. There aren’t a zillion controls because there don’t need to be; there’s the one that matters.

As product designers, we always try to find the knobs that solve the most problems – expose the most power – with the least, and least visible, complexity. There aren’t many better examples of it than this.