Thursday, February 23, 2012

Implementing DLP- Ongoing Management

Managing DLP tends to not be overly time consuming unless you are running off badly defined policies. Most of your time in the system is spent on incident handling, followed by policy management.



To give you some numbers, the average organization can expect to need about the equivalent of one full time person for every 10,000 monitored employees. This is really just a rough starting point- we’ve seen ratios as low as 1/25,000 and as high as 1/1000 depending on the nature and number of policies.



Managing Incidents



After deployment of the product and your initial policy set you will likely need fewer people to manage incidents. Even as you add policies you might not need additional people since just having a DLP tool and managing incidents improves user education and reduces the number of incidents.



Here is a typical process:



Manage incident handling queue



The incident handling queue is the user interface for managing incidents. This is where the incident handlers start their day, and it should have some key features:




  • Ability to customize the incident for the individual handler. Some are more technical and want to see detailed IP addresses or machine names, while others focus on users and policies.

  • Incidents should be pre-filtered based on the handler. In a larger organization this allows you to automatically assign incidents based on the type of policy, business unit involved, and so on.

  • The handler should be able to sort and filter at will; especially to sort based on the type of policy or the severity of the incident (usually the number of violations- e.g. a million account numbers in a file versus 5 numbers).

  • Support for one-click dispositions to close, assign, or escalate incidents right from the queue as opposed to having to open them individually.



Most organizations tend to distribute incident handling among a group of people as only part of their job. Incidents will be either automatically or manually routed around depending on the policy and the severity. Practically speaking, unless you are a large enterprise this cloud be a part-time responsibility for a single person, with some additional people in other departments like legal and human resources able to access the system or reports as needed for bigger incidents.



Initial investigation



Some incidents might be handled right from the initial incident queue; especially ones where a blocking action was triggered. But due to the nature of dealing with sensitive information there are plenty of alerts that will require at least a little initial investigation.



Most DLP tools provide all the initial information you need when you drill down on a single incident. This may even include the email or file involved with the policy violations highlighted in the text. The job of the handler is to determine if this is a real incident, the severity, and how to handle.



Useful information at this point is a history of other violations by that user and other violations of that policy. This helps you determine if there is a bigger issue/trend. Technical details will help you reconstruct more of what actually happened, and all of this should be available on a single screen to reduce the amount of effort needed to find the information you need.



If the handler works for the security team, he or she can also dig into other data sources if needed, such as a SIEM or firewall logs. This isn’t something you should have to do often.



Initial disposition



Based on the initial investigation the handler closes the incident, assigns it to someone else, escalates to a higher authority, or marks it for a deeper investigation.



Escalation and Case Management



Anyone who deploys DLP will eventually find incidents that require a deeper investigation and escalation. And by “eventually” we mean “within hours” for some of you.



DLP, by it’s nature, will find problems that require investigating your own employees. That’s why we emphasize having a good incident handling process from the start since these cases might lead to someone being fired. When you escalate, consider involving legal and human resources. Many DLP tools include case management features so you can upload supporting documentation and produce needed reports, plus track your investigative activities.



Close



The last (incredibly obvious) step is to close the incident. You’ll need to determine a retention policy and if your DLP tool doesn’t support retention needs you can always output a report with all the salient incident details.



As with a lot of what we’ve discusses you’ll probably handle most incidents within minutes (or less) in the DLP tool, but we’ve detailed a common process for those times you need to dig in deeper.



Archive



Most DLP systems keep old incidents in the database, which will obviously fill it up over time. Periodically archiving old incidents (such as anything 1 year or older) is a good practice, especially since you might need to restore the records as part of a future investigation.



Managing Policies



Anytime you look at adding a significant new policy you should follow the Full Deployment process we described above, but there are still a lot of day to day policy maintenance activities. These tend not to take up a lot of time, but if you skip them for too long you might find your policy set getting stale and either not offering enough security, or causing other issues due to being out of date.



Policy distribution



If you manage multiple DLP components or regions you will need to ensure policies are properly distributed and tuned for the destination environment. If you distribute policies across national boundaries this is especially important since there might be legal considerations that mandate adjusting the policy.



This includes any changes to policies. For example, if you adjust a US-centric policy that’s been adapted to other regions, you’ll then need to update those regional policies to maintain consistency. If you manage remote offices with their own network connections you want to make sure policy updates pushed out properly and are consistent.



Adding policies



Brand new policies will take the same effort as the initial polices, other than you’ll be more familiar with the system. Thus we suggest you follow the Full Deployment process again.



Policy reviews



As with anything, today’s policy might not apply the same in a year, or two, or five. The last thing you want to end up with is a disastrous mess of stale, yet highly customized and poorly understood polices as you often see on firewalls.



Reviews should consist of:




  • Periodic reviews of the entire policy set to see if it still accurately reflects your needs and if new policies are required, or older ones should be retired.

  • Scheduled reviews and testing of individual policies to confirm that they still work as expected. Put it on the calendar when you create a new policy to check it at least annually. Run a few basic tests, and look at all the violations of the policy over a given time period to get a sense of how it works. Review the users and groups assigned to the policy to see if they still reflect the real users and business units in your organization.

  • Ad-hoc reviews when a policy seems to be providing unexpected results. A good tool to help figure this out is your trending reports- any big changes or deviations from a trend are worth investigating at the policy level.

  • Policy reviews during product updates, since these may change how a policy works or give you new analysis or enforcement options.



Updates and tuning



Even effective policies will need periodic updating and additional tuning. While you don’t necessarily need to follow the entire Full Deployment process for minor updates, they should still be tested in a monitoring mode before you move into any kind of automated enforcement.



Also make sure you communicate and noticeable changes to affected business units so you don’t catch them by surprise. We’ve heard plenty examples of someone in security flipping a new enforcement switch or changing a policy in a way that really impacted business operations. Maybe that’s the goal, but it’s always best to communicate and hash things out ahead of time.



If you find a policy really seems ineffective then it’s time for a full review. For example, we know of one very large DLP user who had unacceptable levels of false positives on their account number protection due to the numbers being too similar to other numbers commonly in use in regular communications. They solved the problem (after a year or more) by switching from pattern matching to a database fingerprinting policy that checked against the actual account numbers in a customer database.



Retiring policies



There are a lot of DLP policies you might use for a limited time, such as a partial document matching policy to protect corporate financials before they are released. After the release date, there’s no reason to keep the policy.



We suggest you archive these policies instead of deleting them. And if your tool supports it, set expiration dates on policies… with notification so it doesn’t shut down and leave a security hole without you knowing about it.



Backup and archiving



Even if you are doing full system backups it’s a good idea to perform periodic policy set backups. Many DLP tools offer this as part of the feature set. This allows you to migrate policies to new servers/appliances or recover policies when other parts of the system fail and a full restore is problematic.



We aren’t saying these sorts of disasters are common; in fact we’ve never heard of one, but we’re paranoid security folks.



Archiving old policies also helps if you need to review them while reviewing an old incident as part of a new investigation or a legal discovery situation.



Analysis



Analysis, as opposed to incident handling, focuses on big picture trends . We suggest three kinds of analysis:



Trend analysis



Often built into the DLP server’s dashboard, this analysis looks across incidents to evaluate overall trends such as:




  • Are overall incidents increasing or decreasing?

  • Which policies are having more or less incidents over time?

  • Which business units experience more incidents?

  • Are there any sudden increases in violations by a business unit, or of a policy, that might not be seen if overall trends aren’t changing?

  • Are a certain type of incidents tied to a business process that should be changed?



The idea is to mine your data to evaluate how your risk is increasing or decreasing over time. When you’re in the muck of day to day incident handling it’s often hard to notice these trends.



Risk analysis



A risk analysis is designed to show what you are missing. DLP tools only look for what you tell them to look for, and thus won’t catch unprotected data you haven’t built a policy for.



A risk analysis is essentially the Quick Wins process. You turn on a series of policies with no intention of enforcing them, but merely to gather information and see if there are any hot spots you should look at more in depth or create dedicated policies for.



Effectiveness analysis



This helps assess the effectiveness of your DLP tool usage. Instead of looking at general reports think of it like testing to tool again. Try some common scenarios to circumvent your DLP to figure out where you need to make changes.



Content discovery/classification



Content discovery is the process of scanning storage for the initial identification of sensitive content and tends to be a bit different than network or endpoint deployments. While you can treat it the same, identifying policy violations and responding to them, many organizations view content discovery as a different process, often part of a larger data security or compliance project.



Content discovery projects will off turn up huge amounts of policy violations due to files being stored all over the place. Compounding the problem is the difficulty in identifying the file owner or business unit that’s using the data, and why they have it. Thus you tend to need more analysis, at least with your first run through a server or other storage repository, to find the data, identify who uses and owns it, the business need (if any), and alternative options to keep the data more secure.



Troubleshooting



We’ve covered most non-product-specific troubleshooting throughout this series. Problems people encounter tend to fall into the following categories:




  • Too many false positives or negatives, which you can manage using our policy tuning and analysis recommendations.

  • System components not talking to each other. For example, some DLP tools separate out endpoint and network management (often due to acquiring different products) and then integrate them at the user interface level. Unless there is a simple network routing issue, fixing these may require the help of your vendor.

  • Component integrations to external tools like web and email gateways may fail. Assuming you were able to get them to talk to each other previously, the culprit is usually a software update introducing an incompatibility. Unfortunately, you’ll need to run it down in the log files if you can’t pick out the exact cause.

  • New or replacement tools may not work with your existing DLP tool. For example, swapping out a web gateway or using a new edge switch with different SPAN/Mirror port capabilities.



We really don’t hear about too many problems with DLP tools outside of getting the initial installation properly hooked into infrastructures and tuning policies.



Maintenance for DLP tools is relatively low, consisting mostly of five activities (two of which we already discussed):




  • Full system backups, which you will definitely do for the central management server, and possibly any remote collectors/servers depending on your tool. Some tools don’t require this since you can swap in a new default server or appliance and then push down the configuration.

  • Archiving old incidents to free up space and resources. But don’t be too aggressive since you generally want a nice library of incidents to support future investigations.

  • Archiving and backing up policies. Archiving policies means removing them from the system, while backups include all the active policies. Keeping these separate from full system backups provides more flexibility for restoring to new systems or migrating to additional servers.

  • Health checks to ensure all system components are still talking to each other.

  • Updating endpoint and server agents to the latest versions (after testing, of course).



Reporting



Ongoing reporting is an extremely important aspect of running a Data Loss Prevention tool. It helps you show management and other stakeholders that you, and your tool, are providing value and managing risk.



At a minimum you should produce quarterly, if not monthly, rollup reports on trends and summarizing overall activity. Ideally you’ll show decreasing policy violations, but if there is an increase of some sort you can use that to get the resources to investigate the root cause.



You will also produce a separate set of reports for compliance. These may be on a project basis, tied to any audit cycles, or scheduled like any other reports. For example, running quarterly content discovery reports showing you don’t have any unencrypted credit card data in a storage repository and providing these to your PCI assessor to reduce potential audit scope. Or running monthly HIPAA reports for the HIPAA compliance officer (if you work in healthcare).



Although you can have the DLP tool automatically generate and email reports, depending on your internal political environment you might want to review these before passing them to outsiders in case there are any problems with the data. Also, it’s never a good idea to name employees in general reports- keep identifications to incident investigations and case management summaries that have a limited audience.