Saturday, July 29, 2023

Finding Log Volume Ingestion Anomalies in Splunk

 

This is for my man Destry, who I met recently in person. He was poking a bit of good-natured fun at me for not posting more frequently. So Destry, this is for you!

I’m doing a Splunk tips & tricks workshop this week with some folks who, among other things, asked for a query to identify log volume anomalies. Ahh, volume anomalies. So many variations of this. The user community has published several apps for it on Splunkbase. One might ask why Splunk hasn’t incorporated more of this sort of thing in the Monitoring Console /shrug.

My normal recommendation to folks is to run a few queries that capture log volume (from the license usage log in the _internal index) and event counts (via tstats) into a ‘summary index’ for long-term retention and quicker analysis. Some of that is likely found in the _introspection index, but I’ve not done a deep dive there TBH. The workshop I’m doing is with folks in a multi-tenant environment where each tenant would like to do their own quick analysis.
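For illustration, here is roughly what those two summary-populating searches might look like. Treat this as a sketch under assumptions rather than a drop-in: the index name summary_volume_example is made up, the one-day window and one-hour span are arbitrary choices, and each search would be saved and scheduled on its own.

Event counts per index, sourcetype, and host via tstats:

```
| tstats count where index=* earliest=-1d@d latest=@d by _time span=1h, index, sourcetype, host
| collect index=summary_volume_example
```

Bytes ingested from the license usage log (b is bytes, idx is index, st is sourcetype, h is host):

```
index=_internal source=*license_usage.log* type=Usage earliest=-1d@d latest=@d
| bin _time span=1h
| stats sum(b) AS bytes BY _time, idx, st, h
| collect index=summary_volume_example
```

One caveat worth checking in your own environment: as far as I know, the license log can leave the host and source fields empty when there are too many unique combinations, so the tstats version tends to be the more reliable one for per-host work.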

So let’s define a few goals:

  • Detect when a host is sending abnormally more or less of a data type than other hosts
  • Detect when a host is sending abnormally more or less of a data type than it normally does itself
  • Do both comparisons in one query, to keep compute down and, for simplicity, avoid intermediate steps (like populating or reading from a lookup)
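To make those goals concrete, here is one possible shape for such a query. It is a minimal sketch, not necessarily the approach this post ends up with: the z-score method, the 7-day lookback, the hourly buckets, and the threshold of 3 are all assumptions picked for illustration.

```
| tstats count where index=* earliest=-7d@h latest=@h by _time span=1h, index, sourcetype, host
| eventstats avg(count) AS self_avg stdev(count) AS self_stdev BY index, sourcetype, host
| eventstats avg(count) AS peer_avg stdev(count) AS peer_stdev BY _time, index, sourcetype
| eval self_z = if(self_stdev > 0, (count - self_avg) / self_stdev, 0)
| eval peer_z = if(peer_stdev > 0, (count - peer_avg) / peer_stdev, 0)
| where _time >= relative_time(now(), "-1h@h") AND (abs(self_z) > 3 OR abs(peer_z) > 3)
```

The first eventstats compares each host to its own history, the second compares it to the other hosts sending the same sourcetype in the same hour, and the final where keeps only the most recent complete hour. Everything happens in a single pass with no lookup. One limitation of this sketch: a host that goes completely silent produces no rows for that hour, so it will not show up here as sending "less".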