Saturday, November 23, 2013

Splunk trick to display a one to many relationship in a table with granular and aggregate values

Another oil change and another blog post. Good thing there isn’t always a relationship between those two things or my cars wouldn’t be running. Figured I would spend a minute talking about one of my new favorite Splunk tricks. I think I ran across this in the Implementing Splunk book but then could never find again. Was pleased that it showed up in the Advanced Searching and Reporting class I took at .conf2013 (great class BTW). The trick relates to formatting data and covers a variety of use cases mostly related to displaying a one to many relationship at various levels of granularity. 

Let’s say you want to generate a list of the hosts in each index and the number of events associated with each over some period of time. You could start with a search like the following:

index=* | stats count by index host

but the index field because visually redundant. My training area only has one index and 2 hosts but for that search returns the following 

What I’d really prefer is to have an output that more resembles using the values command as in the following


index=* | stats count values(host) by index









The problem with this search though is it is showing the total events for both hosts but how many are associated with each? What I really want is the aggregate and eaches in the same view. The solution is a two step process where you first collect the information by host and the second is using the stats list command. The search is as follows

index=* | stats count by host index | stats dc(host) as Unique_Hosts sum(count) as Total_Events list(host) as Hosts list(count) as Events by index







There are a couple caveats to this. The first is if you want to do some sorting based on data collected by the first stats command you need to do that before the second stats command otherwise it will throw off results from the count field (Events in the case above) if you throw a …. | sort –Hosts at the end of the search. I’ve also seen a minor formatting issue when you generate a PDF. The last issue is if you export the data to csv you end up with a space delimited list of all the items in the cell (in this case the Hosts and Events field).


Hope you find this helpful. I’ve used this in all kinds of visualizations – especially since you can sort/filter on the aggregate values.

No comments:

Post a Comment