Incident Response & Disaster Recovery

Security Core Concept 5: Incident Response (IR) & Disaster Recovery (DR)

A little background first.

My Core Concept series is split 50/50 between mostly technical and mostly business concepts:

  1. Risk Assessment (RA) & Business Impact Analysis (BIA): Business
  2. Security Control Selection & Implementation: Technical
  3. Security Management Systems: Technical
  4. Governance & Change Control: Business
  5. Incident Response (IR) & Disaster Recovery (DR): Technical
  6. Business Continuity Management (BCM) & Business As Usual (BAU): Business

I’ve done this because the majority of consultants fall [mostly] into one or the other camp. There are very few true generalists, even though the majority of regulations require just that.

PCI, for example, requires a fairly in-depth knowledge of everything from policies to encryption, and from software development to access control. If no one consultant can possibly know all of this stuff, in a depth sufficient to provide true guidance, why do you more often than not get only one assessor?

A consultant is not the same as a Subject Matter Expert (SME), and the two should not be confused. A consultant knows enough about everything to tell you what else you need. Or, in the old but VERY relevant cliche: “I don’t know, but I know someone who does.”

I have taken DR / IR out of Business Continuity Management, so that it can be addressed by the relevant technical SMEs.

OK, enough background, what is Incident Response and Disaster Recovery?

Incident Response can be defined as: “The reaction to an incident that could lead to loss of, or disruption to, an organisation’s operations, services or functions.”

Disaster Recovery, therefore, is: “The recovery from an incident that caused loss of, or disruption to, an organisation’s operations, services or functions.”

What does this mean in reality? It means that whatever your business, you must know enough about its processes that anything out of the ordinary is either prevented outright, or detected soon enough to stop the incident from becoming a disaster. While you absolutely must have formalised DR capability, your IR should be robust enough to – hopefully – negate its use. In theory…

In practice, it does not work that way. Organisations generally do not have sufficient knowledge of the normal workings of their systems (infrastructure and applications) to detect when things go wrong. Or if they do, it’s probably too late to do anything about it except initiate DR.

The whole point of the Security Core Concept series is to help you stay in business; otherwise, why bother? The first 4 Core Concepts help bring your environment to a baseline that can, and must, be maintained:

  1. The Risk Assessment told you what was most important to you, and put a value on it;
  2. The controls you put in place mitigated the risk from threats;
  3. The ISMS forces you to continually optimise your systems in a way that supports their baseline functions; and
  4. Governance hopefully removes (or at least reduces) the internal threats.

What IR does is force you to standardise, centralise, and simplify.

You will only have the ability to baseline your systems if you have a few ‘known good’ templates. If you have 10 flavours of Windows, and 10 more *nix, all configured differently, you really don’t stand a chance of baselining anything. You must therefore develop standard templates for all systems, wherever possible.

How do you manage 1,000 devices if not centrally? You don’t. Without a way to centralise the management and monitoring of your disparate systems, again, you will never have a baseline.

IR becomes self-explanatory in the face of known baselines; anything NOT within the baseline is an event to be investigated. The process for this investigation must be rapid, comprehensive, and above all PRACTICED! You can have 11 of the best football players in the world and still lose if they don’t play as a team. That’s the simplification.
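To make the "anything NOT within the baseline is an event" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the setting names, the `WEB_BASELINE` template, the host name, and the `find_deviations` helper are all hypothetical, and a real environment would pull this data from whatever central management tooling you actually run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One deviation from the known-good template, to be investigated."""
    host: str
    setting: str
    expected: object
    observed: object


def find_deviations(host, baseline, observed):
    """Return one Event per setting that differs from the baseline."""
    events = []
    # Settings the baseline defines that are missing or changed on the host.
    for setting, expected in baseline.items():
        actual = observed.get(setting)
        if actual != expected:
            events.append(Event(host, setting, expected, actual))
    # Settings present on the host that the baseline never defined.
    for setting in sorted(observed.keys() - baseline.keys()):
        events.append(Event(host, setting, None, observed[setting]))
    return events


# Hypothetical 'known good' template for one class of server.
WEB_BASELINE = {"ssh_root_login": "no", "open_ports": (22, 443), "av_enabled": True}

# Hypothetical state reported by a single host.
reported = {"ssh_root_login": "yes", "open_ports": (22, 443), "telnet_enabled": True}

for event in find_deviations("web-01", WEB_BASELINE, reported):
    print(f"{event.host}: {event.setting} "
          f"expected={event.expected!r} observed={event.observed!r}")
```

The point of the sketch is the shape, not the detail: with one template per class of system, detection reduces to a comparison, and every difference (changed, missing, or unexpected) lands in the same queue for investigation.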

As for DR, that too is fairly simple, IF, and ONLY if, you know what your limits are. Back to the e-commerce example: if you need 100% up-time, forget it, it’s not possible, but 99.9% should be. However, going from 99% to 99.9% is exponentially more expensive, so you need to understand the VALUE of your business assets to define what is acceptable downtime for your business. That’s what the Risk Assessment is supposed to do: provide the input into your IR/DR plans and components.
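The availability targets above are easier to argue about once they are turned into hours. The sketch below does that arithmetic in Python: 99% up-time allows roughly 87.6 hours of downtime a year, while 99.9% allows roughly 8.76. The `REVENUE_PER_HOUR` figure is a made-up placeholder, not a recommendation; the real number is what your Risk Assessment is supposed to give you.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(availability_percent):
    """Hours per year a service may be down and still meet the target."""
    return HOURS_PER_YEAR * (100 - availability_percent) / 100


def downtime_exposure(availability_percent, revenue_per_hour):
    """Revenue at risk per year if all of the allowed downtime is used."""
    return allowed_downtime_hours(availability_percent) * revenue_per_hour


# Hypothetical value of one hour of trading; replace with your own figure.
REVENUE_PER_HOUR = 5_000

for target in (99.0, 99.9):
    hours = allowed_downtime_hours(target)
    exposure = downtime_exposure(target, REVENUE_PER_HOUR)
    print(f"{target}% up-time: {hours:.2f} hours down per year, "
          f"about {exposure:,.0f} in revenue at risk")
```

If the cost of moving to the higher target is more than the difference between those two exposure figures, the extra "9" is probably not worth buying.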

OK, so I really haven’t given you anything to work on, have I? But like most aspects of security, there are no standards / frameworks / good practices that will fit YOUR business exactly. Everything that’s written down for you to follow can only EVER be a beginning; the rest is up to you.

Your business is unique in some way (probably in many ways), so you must take only the parts that are appropriate from each of the guidance frameworks, or you’ll wind up with a security program that is unsustainable and most likely ignored. Your security program becomes as unique as your business, and even saying that it is ‘based on’ something is probably stretching it, in the same way that Hollywood ‘bases’ its movies on books.

Security is simple, it’s not easy, but it is simple. Your IR and DR processes must be just that if you hope to stay secure.


