Friday, April 20, 2012

Vulnerability Management Evolution: Core Technologies

An interesting read on the Securosis blog about vulnerability management.

As we discussed in the last couple of posts, any VM platform must be able to scan infrastructure and scan the application layer. But that’s still mostly tactical stuff. Run the scan, get a report, fix stuff (or not), and move on. When we talk about a strategic and evolved vulnerability management platform, the core technology needs to evolve to serve more than merely tactical goals – it must provide a foundation for a number of additional capabilities. Before we jump into the details, let’s reiterate the key requirements. You need to be able to scan/assess:

  1. Critical Assets: This includes the key elements in your critical data path; it requires both scanning and configuration assessment/policy checking for applications, databases, server and network devices, etc.
  2. Scale: Scalability requirements are largely in the eye of the beholder. You want to be sure the platform’s deployment architecture will provide timely results without consuming all your network bandwidth.
  3. Accuracy: You don’t have time to mess around, so you don’t want a report with 1,000 vulnerabilities, 400 of them false positives. There is no way to totally avoid false positives (aside from not scanning at all), so accuracy is a key selection criterion.

Yes, that was pretty obvious. With a mature technology like vulnerability management the question is less about what you need to do and more about how – especially when positioning for evolution and advanced capabilities. So let’s first dig into the foundation of any strategic platform: the data model.

Integrated Data Model


What’s the difference between a tactical scanner and an integrated vulnerability/threat management platform? Data sharing, of course. The platform needs the ability to consume and store more than just scan results. You also need configuration data, third party and internal research on vulnerabilities, research on attack paths, and a bunch of other data types we will discuss in the next post on advanced technology. Flexibility and extensibility are key for the data schema. Don’t get stuck with a rigid schema that won’t allow you to add whatever data you need to most effectively prioritize your efforts – whatever data that turns out to be.
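
As an illustration only, here is a minimal sketch (in Python) of what such a flexible model might look like; the field names are assumptions for the example, not any vendor’s actual schema.

    from dataclasses import dataclass, field
    from typing import Any

    @dataclass
    class Finding:
        plugin_id: str       # which test or research feed produced the finding
        severity: float      # normalized 0-10 score
        source: str          # "scan", "config_check", "third_party_feed", ...
        details: dict[str, Any] = field(default_factory=dict)

    @dataclass
    class Asset:
        asset_id: str
        addresses: list[str] = field(default_factory=list)
        config_state: dict[str, Any] = field(default_factory=dict)  # configuration/policy data
        findings: list[Finding] = field(default_factory=list)       # scan results and research matches
        # free-form attributes, so new data types can be attached without a schema change
        attributes: dict[str, Any] = field(default_factory=dict)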

Once the data is in the foundation, the next requirement involves analytics. You need to set alerts and thresholds on the data and be able to correlate disparate information sources to glean perspective and help with decision support. We are focused on more effectively prioritizing security team efforts, so your platform needs analytical capabilities to help turn all that data into useful information.
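
A minimal sketch of threshold alerting with a simple correlation rule might look like the following; the 7.0 cutoff and the corroboration bump are assumptions for the example, and the flat dict layout stands in for whatever storage model you end up with.

    def alerts(assets, severity_threshold=7.0):
        """Yield (asset_id, finding) pairs whose effective severity crosses the threshold.

        `assets` is an iterable of dicts, e.g.
        {"asset_id": "web-01",
         "config_state": {"non_compliant": True},
         "findings": [{"severity": 9.8, "source": "scan", "plugin_id": "CVE-..."}]}
        """
        for asset in assets:
            non_compliant = asset.get("config_state", {}).get("non_compliant", False)
            for finding in asset.get("findings", []):
                # correlation: bump scanner findings that a configuration check corroborates
                bump = 1.0 if (finding["source"] == "scan" and non_compliant) else 0.0
                if finding["severity"] + bump >= severity_threshold:
                    yield asset["asset_id"], finding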

When you start evaluating specific vendor offerings you may get dragged into a religious discussion of storage approaches and technologies. You know – whether a relational backend, an object store, or even a proprietary flat file system provides the performance, flexibility, etc. to serve as the foundation of your platform. Understand that it really is a religious discussion. Your analysis efforts need to focus on the scale and flexibility of whatever data model underlies the platform.

Also pay attention to evolution and migration strategies, especially if you plan to stick with your current vendor as they move to a new platform. This transition is akin to a brain transplant, so make sure the vendor has a clear and well-thought-out path to the new platform and data model. Obviously if your vendor stores their data in the cloud it’s not your problem, but don’t put the cart before the horse. We will discuss the cloud versus customer premises later in this post.

Discovery


Once you get to platform capabilities, first you need to find out what’s in your environment. That means a discovery process to find devices on your network and make sure everything is accounted for. You want to avoid the “oh crap” moment, when a bunch of unknown devices show up – and you have no idea what they are, what they have access to, or whether they are steaming piles of malware. Or at least shorten the window between something showing up on your network and the “oh crap” discovery moment.

There are a number of techniques for discovery, including actively scanning your entire address space for devices and profiling what you find. That works well enough and tends to be the main way vulnerability management offerings handle discovery, so active discovery is still table stakes for VM offerings. You need to balance the network impact of active discovery against the need to quickly find new devices. Also make sure you can search your networks completely, which means both your IPv4 space and your emerging IPv6 environment. Oh, you don’t have IPv6? Think again. You’d be surprised at the number of devices that ship with IPv6 active by default, and if you don’t plan to discover that address space as well you’ll miss a significant attack surface. You never want to hold up a network deployment while your VM vendor gets their act together.
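
To make the idea concrete, here is a minimal sketch of active discovery across an IPv4 or IPv6 range using plain TCP probes; real scanners use ARP, ICMP, and many ports, so treat the port and timeout as illustrative assumptions.

    import ipaddress
    import socket

    def discover(network, port=443, timeout=0.5):
        """Return hosts in `network` (an IPv4 or IPv6 CIDR string) that answer on `port`."""
        live = []
        for host in ipaddress.ip_network(network).hosts():
            family = socket.AF_INET6 if host.version == 6 else socket.AF_INET
            with socket.socket(family, socket.SOCK_STREAM) as s:
                s.settimeout(timeout)
                if s.connect_ex((str(host), port)) == 0:  # 0 means the connection succeeded
                    live.append(str(host))
        return live

    # e.g. discover("10.0.0.0/28") or discover("2001:db8::/120")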

You can supplement active discovery with a passive capability that monitors network traffic and identifies new devices based on network communications. Depending on the sophistication of the passive analysis, devices can be profiled and vulnerabilities can be identified, but the primary goal of passive monitoring is to find new unmanaged devices faster. Once a new device is identified passively, you could then launch an active scan to figure out what it’s doing. Passive discovery is also helpful for devices that use firewalls to block active discovery and vulnerability scanning.
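
As a rough illustration, a passive discovery sketch might simply watch traffic for source addresses it has never seen; this assumes the scapy library and capture privileges on the monitored segment.

    from scapy.all import IP, sniff

    known_hosts = set()

    def note_host(pkt):
        if IP in pkt and pkt[IP].src not in known_hosts:
            known_hosts.add(pkt[IP].src)
            print(f"new device seen passively: {pkt[IP].src}")
            # a real platform would queue an active scan and profiling job here

    sniff(prn=note_host, store=False)  # runs until interrupted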

But that’s not all – depending on the breadth of your vulnerability/threat management program you might want to include endpoints and mobile devices in the discovery process. We always want more data, so we favor discovering all the assets in your environment. That said, for determining what’s important in your environment (see the asset management/risk scoring section below), endpoints tend to be less important than databases with protected data, so prioritize the effort you expend on discovery and assessment.

Finally, another complicating factor for discovery is the cloud. With the ability to spin up and take down instances at will, your platform needs to both track and assess cloud resources, which requires integrating with cloud consoles to make sure your platform knows about new devices and can assess them appropriately. This is an emerging capability, but realistically you’ll see a lot more private and public cloud-based resources in your environment.
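
For example, a minimal sketch of console integration against AWS using the boto3 SDK might look like the following; other providers expose similar inventory APIs, and the region and fields shown are assumptions for the example.

    import boto3

    def cloud_assets(region="us-east-1"):
        """Yield basic inventory for EC2 instances so new ones can be queued for assessment."""
        ec2 = boto3.client("ec2", region_name=region)
        for reservation in ec2.describe_instances()["Reservations"]:
            for instance in reservation["Instances"]:
                yield {
                    "instance_id": instance["InstanceId"],
                    "private_ip": instance.get("PrivateIpAddress"),
                    "state": instance["State"]["Name"],
                }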

Asset Management and Risk Scoring


The key capability of the evolved vulnerability management platform is its ability to help you prioritize efforts, so any calculation of a risk score largely depends on 1) the ‘importance’ of the asset and 2) how ‘exposed’ it is to attack at any given point in time. Evaluating what’s important is really an asset management function. Of course many operations teams run extensive asset management efforts. The VM platform can and should take advantage of any existing resources and integrate with those tools. But many organizations don’t have an existing asset database (scary as that sounds), so the VM platform may need to serve as the authoritative registry of IT assets. Either way, the platform needs to store and/or access asset information.

Once you have the assets defined in the system the next step is to tag, group, and/or categorize them. The more flexible the system the better; every organization groups its assets differently, so your platform should support the way you categorize assets – not force you to fit your assets into vendor-defined buckets. Assign an (admittedly subjective) importance to each group or category of assets. We suggest a simple approach, with 3 or 5 levels of importance. Really important means someone would be fired if the asset were compromised, while some assets are simply unimportant. You don’t need complexity or fine precision, but you at least need to identify devices which hold (or have access to) critical data.
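
As a simple illustration, flexible grouping with a five-level importance scale might be modeled like this; the group names and levels are assumptions, not a recommended taxonomy.

    # importance levels: 5 = someone gets fired, 1 = nobody notices
    IMPORTANCE = {"crown_jewels": 5, "customer_facing": 4, "internal": 3, "lab": 2, "unimportant": 1}

    # each asset carries whatever tags make sense to you, plus one importance group
    asset_groups = {
        "billing-db-01": {"tags": ["database", "pci"], "group": "crown_jewels"},
        "intranet-web":  {"tags": ["web"],             "group": "internal"},
    }

    def importance(asset_id):
        return IMPORTANCE[asset_groups[asset_id]["group"]]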

As you evaluate the vulnerability of each asset through the platform’s various tests you can determine a risk score to drive prioritization.
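
To make that concrete, a minimal sketch of a risk score that weights exposure by importance might look like the following; the multiplication is an illustrative assumption, not any vendor’s algorithm.

    def risk_score(importance_level, finding_severities):
        """Combine asset importance (1-5) with a crude aggregate of open finding severities."""
        exposure = sum(finding_severities)
        return importance_level * exposure

    # e.g. risk_score(5, [9.8, 6.5]) ranks a crown-jewel asset with two open findings
    # far above risk_score(1, [9.8, 6.5]) for the same findings on an unimportant one.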

The point here is flexibility. You want to group assets in a way that makes sense for your organization. You want to derive a risk score based on your calculation of risk, not a black box calculation that may or may not be relevant for your organization. And you need the ability to change everything the next time a significant technology or organizational disruption happens, like cloud computing or a big M&A deal.

To Cloud or Not to Cloud


The next aspect of the core technology underlying the evolved vulnerability/threat management platform is the cloud buzzword. If you thought people got religious about data models and engines, ask a cloud vendor about an on-premises solution, or vice versa. That’s always fun. At the end of the day, this cloud discussion involves two things.

  1. Scale: You will hear a lot from cloud-based providers about infinite scale and the limitations of customer premise-based offerings. It is true that scalability is the vendor’s problem in a cloud scenario. That offers some advantages, but any solution can scale with a suitable deployment architecture.
  2. Technology Updates/Change: The other big message you’ll hear from cloud bigots is that cloud platforms handle software updates more quickly and transparently than on your own gear. Again, there is truth to this, but every vulnerability management vendor has been sending new rules and tests to its devices for years, so it’s not like they haven’t figured out software distribution.

These two objections to customer premise-based solutions are really much ado about nothing. The ‘decision’ isn’t really a decision at all – what is and isn’t ‘cloud’ nowadays is largely a matter of semantics. Let’s get back to your requirements. You need to be able to test your environment from outside – most attackers are outside your perimeter. That works best with a cloud service. But you also need a presence within your perimeter to scan internal devices, especially those on protected networks. So every cloud service must include an on-site component for internal scans.

That on-site component might be a dedicated appliance, a virtual machine, a dynamic instance downloaded to a device inside the network at scan time, or a combination. Ultimately the deployment model is beside the point – choose the model that best fits your operational processes. There is no point in getting religious about deployment models, so the leading platform vendors will offer hybrid approaches to meet your specific needs. If it’s easier to provision the device once and let ops deal with it, then opt for the internal scanning appliance or dedicate a VM to scanning in your virtual data center.

But don’t get caught up in hype. You need an external component to test your environment from the outside and an internal component for testing inside your perimeter.

To Agent or Not to Agent. To Credential or Not to Credential.


You will also hear a lot about agent vs. agent-less scanning. This is also mostly hyperbole and semantics. In order to do any kind of granular scan of a device, you need a persistent agent on the device, the ability to download a temporary agent, or full administrator rights (credentials) to the device to remotely poll it for the things you are looking for (configurations, patches, logs, etc.). As usual, the answer is all of the above. A temporary agent offers advantages because you don’t have to manage persistent software distribution to every device you worry about. But ultimately the scanning model you choose depends on your access to the device, the type of device, and what kinds of data it has access to.

When thinking about credentialed vs. non-credentialed scans, the answer is also both. Non-credentialed scans give you the external attacker’s view, but of course there are limits to the detail that can be gleaned from a non-credentialed scan. So to gain a full understanding of the security posture of a device you also want a credentialed scan with full access to configurations, patch levels, logs, entitlements, applications, etc.
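
As an illustration, a credentialed check might log in over SSH and read patch state directly, which a banner-grabbing non-credentialed scan could only guess at; this sketch assumes the paramiko library, password authentication, and a Debian-style target, all of which are assumptions for the example.

    import paramiko

    def credentialed_patch_check(host, username, password):
        """Log in with credentials and return kernel and package version details."""
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(host, username=username, password=password)
        try:
            # details a non-credentialed scan could only infer from banners and behavior
            _, stdout, _ = client.exec_command("uname -r && dpkg -l openssl | tail -1")
            return stdout.read().decode()
        finally:
            client.close()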

Keep in mind that you cannot actively scan certain devices. Think brittle control systems which fall over under the onslaught of a vulnerability scan. So it’s probably not in your best interest to scan those devices. Above we mentioned passively discovering assets by monitoring the network. A similar approach can be used to find vulnerabilities on devices you can’t actively scan. Obviously it doesn’t provide the same detail as a credentialed scan, but if the alternative is knocking down the device any data is better than no data.

Security Research


Finally, any vulnerability/threat management platform needs to be driven by research. Things move fast in the attack space and your threat management tools need to stay current. So your vendor needs to make considerable investments in a dedicated team to track the field, observe and analyze new attacks, figure out how to search for those attacks using their tools, ensure the quality of their tests to minimize false positives, and finally get the tests into your hands as quickly as possible. For a more granular view into the process of analyzing attacks and malware check out the Analyze Malware subprocess in Malware Analysis Quant. It provides an idea of what’s involved in profiling malware files and figuring out how to find them in your environment.

To compare research groups, evaluate the sophistication of their analysis. Do you understand how to remediate issues that your scanner finds? Can you determine the seriousness of the attack? Do you believe them? Is the data coming just from the vendor, or do they integrate third party data? And most importantly, do they provide coverage for the assets in your environment? You know, the OSes, databases, and critical applications that drive your business.

There is a lot to the evolved vulnerability/threat management platform. But we see these capabilities as table stakes. A lot of innovation is happening in this space, and advanced – and in some cases adjacent – technologies will be the focus of the next post. We will dive into capabilities such as attack path analysis, penetration testing, and benchmarking.

- Mike Rothman