
Saturday, February 7, 2015

Gaining visibility into ad-hoc data exports from Splunk

Along the same lines as understanding how your users are using Splunk, and dovetailing into whether users are abusing their access to data in Splunk, is taking a periodic look at what data they might be exporting. By that I mean exporting to a CSV or maybe generating a PDF of a dashboard. Ideally you would like to know, for example, that this Mark character has exported something, what format it was in, what the search was, and how many records or results were included in the download.
There are a couple of challenges:

  1. Search results (result count, events searched, etc.) are in the internal search completion logs, while the search parameters are in the internal search initiation logs.
  2. Those logs are separate from the web logs that indicate someone has performed one of the export actions.
  3. The various Splunk commands you might use to merge all of this data have limitations you will need to keep in mind. For example, a subsearch used to grab something like a search_id and pass it to a parent search is limited by default to a 60s runtime and/or 10k results. A join or append is limited to a 60s runtime and/or 50k rows, again by default. Even a moderately sized deployment will run thousands of searches over the course of several days once you factor in users, scheduled content, and internal Splunk processes. I suppose one way to mitigate this is to review the detection query output every day, but that seems a little too frequent to me. A sketch of the kind of merged query I have in mind follows this list.
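
Roughly, something like this. The _internal and _audit indexes and the splunk_web_access sourcetype are real, but the uri pattern, the rex extractions, and the assumption that the sid in the export URL lines up cleanly with the audit logs' search_id will all vary by Splunk version, so treat it as a starting point rather than a finished detection:

    index=_internal sourcetype=splunk_web_access uri="*/search/jobs/*/results*"
    | rex field=uri "jobs/(?<search_id>[^/]+)/results"
    | rex field=uri "output_mode=(?<format>\w+)"
    | join type=left search_id
        [ search index=_audit action=search info=completed
          | fields search_id user search event_count result_count ]
    | table _time user format search event_count result_count

Note the join here runs into exactly the default limits described in item 3, so on a busy deployment you would want to narrow the time range or raise those limits before trusting the output. PDF generation shows up under different URI paths, so a real version would need additional patterns as well.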

Monday, May 5, 2014

Splunk DateParserVerbose logs - Part 2

In part 1 of this subject we talked about what Splunk's DateParserVerbose internal logs are, and I gave an example query that at its heart attempts to roll up and summarize timestamp-related issues. In this post I'll present a query that takes the sourcetypes Splunk is having timestamp trouble with and displays the relevant props configs. We've thrown both queries into the same dashboard to make things easier to work through. I should note a couple of things here. The first is that the foreach command is only available in Splunk 6 (I believe). The second is that the REST endpoint I'm getting the config data from is likely only available in 6 as well.

With that out of the way, here is the query.
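
Or rather, here is a minimal sketch of it, since the exact version will depend on your environment. The rest command and the configs/conf-props endpoint are real in Splunk 6, but the specific props attributes pulled (TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD, DATETIME_CONFIG, TZ) are simply the timestamp-related ones I'd want to see, and you may need to adjust the endpoint path for your deployment:

    | rest /servicesNS/-/-/configs/conf-props splunk_server=local
    | rename title AS sourcetype
    | fields sourcetype TIME_PREFIX TIME_FORMAT MAX_TIMESTAMP_LOOKAHEAD DATETIME_CONFIG TZ
    | foreach TIME_PREFIX TIME_FORMAT MAX_TIMESTAMP_LOOKAHEAD DATETIME_CONFIG TZ
        [ eval <<FIELD>>=coalesce('<<FIELD>>', "(not set)") ]

The foreach at the end just fills in "(not set)" for any attribute a sourcetype doesn't explicitly configure, which makes it obvious at a glance where Splunk is falling back to automatic timestamp detection.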

Thursday, July 26, 2012

A cappella log management

A curious thought hit me the other day in church; hopefully I can articulate it well enough to be understood, even by me. For those that aren't familiar, there is often full congregational singing as part of the worship service. As you might expect this includes musicians playing various instruments, several folks up on the stage singing as part of the choir, and in the olden days a hymnal in your hands that has the words. I'm going to murder the terms, but the way I translate it in my head, you have the technical musicians turning symbols on a page of music into sounds on their instruments; the next layer up are the singers, who combine the lyrics with the symbols to know how to sing the words; and then you have a presentation layer where the rest of the congregation is able to see the words and take their singing cues from the music and choir. More often than not these days those words are presented on a projector screen without the musical score, which works for me since I can't read that bit anyway. At any rate there is a cumulative layering effect that allows folks like me who aren't musically inclined or trained to participate. (Is there a musical OSI equivalent?)

The difference a couple weeks ago was the music was done a cappella and only a small number of the choir was singing. While I could see the words on the projector I was much slower to catch on to the “tune.” I and many others didn’t sing as loudly that day (this is a good thing in my case).

So what's the link, you ask? Going through the process of leaving my job to pursue another, I have been somewhat introspective, thinking about what I could have done differently to explain the need for log management and where it fits into larger discussions of security strategy, incident detection, incident investigation, monitoring, etc. By and large I think the message wasn't understood the higher you went up the management chain. And while I'm somewhat saddened and disappointed by that fact, I'm not going to beat myself up over it. I tried several times, in several different ways, to communicate the need and why certain paths were better than others, not only for immediate needs but also for where they positioned us 6 months, a year, several years down the road.

If you are reading this blog you, like me, can look at something like an incident detection use case and, at least at a conceptual level, see most of what is required: what has to be in place from a log collection perspective for the alert to be tripped, what needs to happen once it goes off, how that use case fits into the larger picture, etc., much like my wife can look at a sheet of music and hear it in her head. The challenge I pose to us all is this: are we communicating our message a cappella, so that our audience is perhaps just seeing "the words"? Something to think about the next time you give a presentation, perhaps.

Tuesday, June 26, 2012

Statistical vs Rule-based Threat Detection

A number of different discussions have led me to think about the difference between log management and SIEM when it comes to their role in threat detection. Of the many items that could be discussed, what I landed on is the difference between statistical and rule-based threat detection.

An oft-used analogy, even referenced in the Verizon DBIR, is the difference between looking at haystacks and needles in haystacks. A statistical detection methodology might be to review the top N of X activity within Y timeframe. The point of this exercise is twofold. The first is simply to look at the "curve" of the numbers involved: the top 5 relative to the top 10 relative to the top 30, or just a spike in a line graph showing the volume of logs collected. Any of these might indicate something has happened and be worth diving deep on. The second is to help dial in your tools by whitelisting systems performing normal activity. If you are looking for outbound SMTP traffic sorted by volume in a day, you should be able to easily spot your email gateways; whitelist them, and the next time the report is run the top of the list might contain compromised systems. By and large, many log management systems should be able to accomplish this sort of activity.
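
As a concrete, if simplified, sketch of that SMTP example in Splunk-style search language, where the index name, the lookup file, and the field names are all stand-ins for whatever your environment actually uses:

    index=firewall dest_port=25 action=allowed
        NOT [ | inputlookup smtp_gateways.csv | fields src_ip ]
    | stats sum(bytes_out) AS total_bytes count AS sessions BY src_ip
    | sort - total_bytes
    | head 20

Anything unexpected sitting near the top once the known gateways are whitelisted is a candidate for a closer look.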

At some point, though, you will want to focus in on specific threat activity. Take today's SANS Diary update on Run Forest Run, or Sality, Tidserv, whatever. In this case you have specific information and want to receive an alert when one of your internal systems hits an IP, URL, or multi-step pattern of activity. This is your rule-based needle-finding capability. Generally speaking this requires a rule engine of varying sophistication located within some point solution or product. Statistical detection won't really be able to get you this.
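
At its simplest, such a rule can be a match of traffic against an indicator list. Another hedged sketch, again with made-up index, lookup, and field names:

    index=proxy
        [ | inputlookup threat_indicators.csv | fields dest_ip ]
    | stats count earliest(_time) AS first_seen latest(_time) AS last_seen BY src_ip dest_ip
    | convert ctime(first_seen) ctime(last_seen)
    | sort - count

Saved as an alert that fires on any result, this is the needle finder; real rule engines layer sequencing and state on top of this kind of match.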

The challenge is that many point solutions, by design or omission, aren't able to factor in the larger view when reviewing the needles found. MSSPs are notorious (too strong?) for this, but then so are things like IPS. "We saw this and this so we wrapped it up in a pretty bow for you" ...great, but I need more context. The SIEM technology space, in general, was supposed to fill this gap. Not only can you develop specific rules to find needles in the overall stream of log consciousness but, to a greater or lesser extent depending on vendor/tool/administrator, you can use them in a more statistical way. I think a lot of the "Mah SIEM sux" mentality comes from how you approach your SIEM relative to this overall issue. That is a rabbit hole I don't want to go down, though.

Especially when you first start out, if you dive directly into a rule-based approach you will have a harder time seeing the forest for the trees, and depending on the tool used you will be frustrated that you can't move that lens backward. In other words, dealing with individual infections IS key, but if you are so focused on the individual detections that you lose sight of the bigger picture, you can do yourself a disservice. On the other hand, if you take only a statistical approach and never grow, you are going to miss the needles you need to find.

I would argue there is a direct correlation between shop maturity and the ability to fully leverage rule-based technology. If you are just starting out, I suggest you will see more value from a log management, statistical threat detection methodology. This will allow you to get to know your data, as strange as that might sound, which will in turn let you better dial in your rule-based solutions.

Wednesday, March 7, 2012

Treating the symptom

Today in the car I heard a credit card debt consolidation commercial that sort of drove me crazy. While there is a place for those companies, the last line got to me: "If credit card debt is the problem, we are the solution." Heads up there, high speed: the real problem is you can't stop buying stuff you can't afford! Your credit card debt is just the visible symptom, and treating the symptom instead of the problem only lands you back in the same spot. This goes hand in hand with the diet pills that basically say take this magical pill to lose weight without even changing your daily habits... like overeating and getting no exercise... which is what got you where you are.

Friday, February 17, 2012

News flash – IP addresses aren’t computers

Crazy thought, I know, but it isn't hard to get caught up in that mentality. I was trying to think of a way to tell the story of the resources/logs needed to identify sources of badness during incident response of one flavor or another. Visually I was drawing it out somewhat like IP ~ Computer Name ~ User Name all at the top level, and branching under that you have various logs like DNS, DHCP, asset management, authentication, etc. All of these play a part in answering questions about which computers are infected and which users are doing 'bad things.' Anyway, it wasn't until I put that down on the whiteboard that the thought hit me: IP addresses are a supporting factor in identifying a particular computer, not equal to it. Funny what tool limitations will do to your thinking.
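
To sketch that branching in query form (every index, field name, and the IP here are stand-ins for whatever your environment actually calls them), resolving an IP of interest to a machine and then to a user might look like:

    index=dhcp "DHCPACK" ip=10.1.2.3 earliest=-24h
    | stats latest(host_name) AS host_name BY ip
    | join type=left host_name
        [ search index=wineventlog EventCode=4624 earliest=-24h
          | stats latest(user) AS last_user BY host_name ]

The point stands either way: the DHCP and authentication logs are what turn an IP into a computer and a user; the IP by itself identifies neither.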

If none of that makes sense to anyone other than me, I blame the NyQuil.

Wednesday, February 1, 2012

The unrecognized APT story....

2 hobbits sneaking into Mordor.


Somehow I don't see myself using that analogy in the executive boardroom, though.

Wednesday, October 19, 2011

What's the value of chasing alerts and other musings

I don't know what your thoughts on this are, but I'm trying to work out an illustration of how the churn of the hamster wheel of run-of-the-mill incident detection and response doesn't, by itself, lead to much of an increase in security posture or reduction in risk. Don't get me wrong: that work needs to be done and isn't a trivial component of your overall program. At the same time, I think this is one of those cases where activity doesn't necessarily indicate/translate/equal accomplishment. Great, you cleaned X machines with Y malware. Next week it will be N machines with Z malware. Soooo... does that mean you are more or less secure than the month before?

I'm of the general opinion that an improved security posture/reduced risk/reduced exposure is a by-product of doing analysis on the information gathered from your incidents and using it to drive change in various configuration settings or policies (or both). Not rocket science or an original thought, really. Hopefully that makes some sense.

Saturday, September 3, 2011

Monitoring business segments

I think we have all heard the 'best practice' that speaks to monitoring portions of your network. The question is how you actually do that, even from a strictly network traffic view, given the volume of traffic you are likely to see. I mean, we can all talk ethereally about taking things piecemeal rather than trying to boil the ocean, establishing baselines, and monitoring for anomalies, because AV as a whole isn't as trustworthy as we would like it to be from a detection standpoint. My question is: how are people actually doing this?

In my old job I used ArcSight to create an approach that, while it worked and addressed a number of areas, I'm not sure was the best. Perhaps sharing it will spark some discussion, or if nothing else help someone else get a step farther down the road if they are trying to do the same thing.

Wednesday, June 16, 2010

Monitor vs Alert

Several weeks ago now I took some money out of the ATM, and the bottom of the transaction receipt showed a balance that seemed a bit low to me. I brought it up to my wife at lunch and watched her mentally parse a multitude of variables, including the day of the month, what bills had been paid, and the times and amounts of money that move out of checking and into sub accounts. About 5 seconds later she said the amount sounded about right. This is the difference between monitoring and alerting.

Don't get me wrong. My wife isn't the type that pulls up the account information every day to check the ebb and flow of cash, but by doing the bills and interacting with the accounts she has a familiarity with how and when the money flows. The trick in my mind with a SIEM is to figure out the balance for both types of actions. Not only do you have to balance how and where analysts get data, but also things like how much should simply be monitored vs alerted on, whether the reports you create have too much or too little data in them to be actionable, and whether a report on a particular data source can be given some historical context or cross-referenced against other data sources.

To many of those ends, my new favorite report template in ArcSight is one that comes out of the box: four charts on one page and then a table. Most of the daily reports we have been creating were of the one-table variety with a level of detail in the information: sources, destinations, times, counts, etc. While we humans are good at visually picking patterns and nuances out of spreadsheets, putting what amounts to a rolled-up summary of the top information in multiple charts at the top of the report is great. Now within the report not only have you framed aspects of the data, but since the more detailed information is in the table section, readers can pull an IP/user name/whatever from the top page and then search for it within the report.

Tuesday, May 11, 2010

Monitoring logs and SIEMs

So there are several SIEM articles out there that tackle SIEM vs log management, reasons you might want one in your environment, and what you might hope to see with one. I'm kinda a simple guy, though, and while I have recently tried to wrap my head around the nuances between monitoring log-management style vs "just" alerting, and how you might use a SIEM to both ends, I was hit with a rather simplistic realization: the SIEM is already monitoring the logs.

See, I said it was simplistic. Here is what I mean, though. I currently have multiple tickets open with ArcSight support, and having reviewed multiple sets of manager logs, one of the techs was kind enough to open an additional ticket based on some things she saw. I had three takeaways:
1. I should monitor the ArcSight logs. (I'm insightful, aren't I?)
2. If I had been monitoring them I would have picked up on the warnings/errors/whatever she saw.
3. My manager logs rolled every 2 hours (used to, anyway).

This, to me, makes the manager logs a perfect candidate for insertion into the SIEM. Granted, part of the "perfect" is that the logs are being generated by something designed to suck in logs, so it doesn't seem to make a whole lot of sense to have something other than itself actually monitor... itself. The other side of this is that while there is value in actually looking at the logs periodically, I think this is also a perfect case for just setting up alerts.

The real takeaway, then, is maybe an adjustment to the argument that SIEMs are an "analyst in a box." While they aren't really an analyst, they are at least some/part/multiple/an FTE who is, or is at least capable of, monitoring logs. The next takeaway is that, according to the 2009 Verizon Data Breach Investigations Report, we should be looking at application logs anyway. I mean, the OS logs I am pulling in from the server hosting the ESM certainly aren't telling me anything about these "guardoverflow" events that I should apparently search for in my logs in order to fix some data monitors.