Wednesday, May 7, 2014

Audit 101 For DevOps: Resource Guide for “The Phoenix Project” (Part 3: Correctly Scoping IT Using GAIT and GAIT-R) [feedly]

  

----
// IT Revolution

This blog article continues the description of the "body of knowledge" that underpins "The Phoenix Project," which started in Part 1: Reading Lists and Part 2: Kanban And DevOps.

Bad things happen to organizations when we incorrectly scope the IT portions of an audit, regardless of the type of audit (e.g., financial reporting such as SOX-404, contractual obligations such as PCI, and so forth).

In "The Phoenix Project", scoping errors are the reason why John, the Chief Information Security Officer (CISO), was perceived by his peers to be shrill, hysterical, and constantly focused on technical minutiae, sucking the will to live out of everyone he interacted with. By insisting that the organization focus on controls that neither helped prevent bad things from happening nor enabled quicker detection and recovery, John became marginalized by the organization.

However, John seemingly undergoes a miraculous transformation when he correctly scopes his security and compliance programs. By doing so, he not only reduces unnecessary security work, but also comes to be perceived by others as taking prudent (or even bold) risks that genuinely help the organization achieve its objectives.

To illustrate the problems created with incorrectly scoped audits, I'll first describe how and why the initial years of complying with the IT portions of Section 404 of the Sarbanes-Oxley Act of 2002 (SOX-404) were so wasteful and destructive, and how we developed GAIT to solve this problem.

What I love about the auditing body of knowledge is its incredibly precise language. I'll show you how to use the GAIT and GAIT-R tools to correctly scope IT audits, which will enable us to start thinking through how we substantiate and demonstrate to auditors that we actually have effective controls that can detect, prevent and recover from errors.

(This blog post will be relevant for any DevOps shop dealing with auditors who are freaking out over the absence of the standard controls such as separation of duty or a change approval process. Although this article is about audit scoping, I'll be writing more about substantiating the effectiveness of controls in a DevOps workstream later. In the meantime, join the Google+ Community we've created to help create the DevOps Audit Defense Toolkit. For a sneak peek of how we build an effective DevOps control environment, look at Principle 4c in the "GAIT-R Principles" section.)

The SOX-404 IT (And Resulting Business) Problem

For most IT professionals, the early years of SOX-404 will likely bring back feelings of horror and dismay. Why? Because every IT organization within publicly-held companies was hit with a dizzying list of IT control weaknesses by the external auditors (i.e., accounting firms), resulting in seemingly pointless, but horrendously painful, manual and soul-killing work.

My most jaw-dropping memory from this period: at one company, someone was hired to look at operating system log files all day, with very little training. This person was required to sign off daily that they actually reviewed the logs. What were they looking for? Why didn't they automate that task? Did they ever find anything that was worth acting upon? No one could answer those questions.

No wonder that in 2005, ComputerWorld ran a headline that called SOX-404 the "biggest IT time waster." So what went wrong?

The SOX-404 legislation was the countermeasure to the unprecedented and spectacular wave of internal control failures inside U.S. corporations: in 2001, Enron failed ($63B market capitalization), and in 2002, WorldCom failed ($117B market capitalization). These corporation-destroying failures were caused by some serious shenanigans going on at the executive level, where the balance sheets were tinkered with to make the companies appear to be doing far better than they actually were. These financial reporting errors went undetected until it was too late, by which point the firms had to be liquidated.

In response, SOX-404 required that all CEOs and CFOs of all publicly traded U.S. corporations personally attest to the accuracy of the financial statements. To help the SEC (Securities and Exchange Commission) enforce this legislation, Congress established the PCAOB (Public Company Accounting Oversight Board) to "audit the auditors."

So if SOX-404 was intended to enforce these behaviors of CEOs and CFOs, how was it resulting in such carnage for IT professionals?

Auditors Gone Wild

"We all know that Enron wasn't caused by an Oracle database change. So why are all these controls being put around DBAs?"

By 2005, which was the second year of SOX enforcement, we all started to suspect that SOX-404 was resulting in some surprising and undesired outcomes. That year, KPMG released a fantastic study (the full PPT can be found here).

What KPMG found was that IT control issues dominated the deficiencies, significant deficiencies, and material weaknesses identified through the SOX-404 assessment.

Category Of Material Weakness     % of Total Findings
IT Control                        23%  <-- Holy cow!!!
Financial Reporting And Close     14%
Procure To Pay                    12%

These are incredible statistics! Although SOX-404 was intended to ensure that controls existed to prevent undetected material errors on the financial statements, the largest single share of audit issues being found was IT-related (e.g., a firewall rule change, database ghost account, etc.). Could failures in IT controls actually result in an undetected material financial reporting error? Extremely unlikely!

In 2006, Intel released an even better illustration of the problem. They found that IT deficiencies outnumbered non-IT deficiencies by at least 8:1, and those deficiencies were often being found in applications that weren't even financially significant.

In other words, IT findings were being generated in areas that shouldn't have even been in-scope for testing during the SOX-404 audit!

Incorrectly Scoping The IT Portions Of The Audit

"Auditing amateurs think about IT controls. Auditing professionals think about IT scoping." — Adapted from Carl von Clausewitz

When auditors incorrectly scope the IT portions of an audit, they test controls that shouldn't be tested. And when those tests generate findings, they result in remediation work that often isn't needed.

Something similar happens during software testing when QA tests things that don't actually jeopardize the achievement of business goals (e.g., testing features that are no longer used).

The difference, of course, is that QA defect reports don't usually get reported to the board of directors or result in a potential footnote in the financial reporting statements filed with the SEC (i.e., the annual 10K or quarterly 10Q financial statements).

The chaos that ensues when these audit findings are generated is woefully predictable. We wrote about this in Visible Ops Security, where I describe the pattern of the "Bottom-Up SOX-404 Cautionary Tale", which is the basis of the scene where John has his meltdown, realizing the actual mechanism for detecting SOX-404 failures was not an IT control, but a downstream manual reconciliation control. I also describe how management would argue their way out of audit findings after the fact, using something called "Chart 3."

So how do we correctly scope audits so that we prevent unnecessary IT controls from being tested and incorrect IT audit findings from being generated?

What We Did About It: GAIT Principles and Methodology for SOX-404 From the Institute of Internal Auditors

From 2005 to 2007, I had the honor of being part of the leadership team for the GAIT (Generally Accepted IT Principles) task team at the Institute of Internal Auditors (IIA). We developed and published the GAIT Principles and Methodology, designed to help management and auditors appropriately scope the IT portions of SOX-404.

(Incidentally, the IIA remains the most effective professional organization I've ever been associated with – the IIA has over 120,000 dues-paying members. Every professional organization should aspire to advance its profession and deliver as much value to its membership as the IIA!)

Our goal was to get it ratified by internal auditors and security executives from the largest publicly-held companies (i.e., SEC registrants), as well as the external auditors from the Big Four and probably most importantly, the aforementioned PCAOB (who audits the auditors).

You can find a great presentation on applying GAIT to SOX-404 by Edward L. Hill (then Managing Director, Protiviti) and Jay R. Taylor, CIA, CFE, CISA (General Director of Audit, General Motors Corporation) here.

GAIT: The Missing Link Between The COSO And COBIT Constructs

Among the first things we did was to more precisely define the problem statement:

  • The IT portions of SOX-404 compliance have frustrated auditors and management
  • Significant key controls reside inside IT and IT processes, as well as in the business processes
    • A lack of well-established guidance for scoping IT work results in inconsistency and an overly subjective process
    • Lack of guidance can also result in overly broad scope and excessive testing costs (the outcome of overly broad scope)
    • Significant risks to financial assertions may be left unaddressed (the outcome of overly narrow scope)
    • Suboptimal use of scarce resources (auditors, as well as business and IT management)

The word that kept coming up was "linkage." In other words, if we cannot link an IT finding with the risk of an undetected material error, then the IT finding is out of scope.

But what exactly is being linked to what?

At the Top: COSO Internal Control Objectives

One of the most widely used constructs for describing business goals is the COSO Enterprise Risk Management Cube. Typically, at the very top of the organization, the goals as discussed by board directors and executives can be framed as COSO internal control objectives. The three primary ones are:

  • accurate financial reporting (e.g., are account balances and values accurate, etc.)
  • compliance with laws, regulations and contractual obligations (e.g., SOX-404, PCI DSS, FISMA, U.S. export laws, etc.)
  • operations (i.e., whether the organization runs its internal processes effectively and efficiently to achieve the organizational goals, such as software delivery, IT operations, sales, marketing, finance, etc.)

Every organization needs reliable processes and controls to achieve those internal control objectives.

At The Bottom: COBIT Control Objectives And Controls

On the other hand, at the bottom where the daily work is actually performed in Development and IT Operations, we use constructs like the COBIT framework. COBIT is a framework that provides an exhaustive list of controls for the processes by which we plan, acquire, implement and monitor software and service delivery.

The last time I read all of COBIT was ten years ago, when there were approximately 318 specific control objectives that spanned the entire Development, Test, IT Operations and Infosec value stream.

For instance, one of my favorite sections is AI6, which has the following subsections:

  • AI6 (Acquire and Implement, Section 6: Manage Changes)
    • AI 6.1. Change Request Initiation and Control
    • AI 6.2. Impact Assessment
    • AI 6.3. Control of Changes
    • AI 6.4. Emergency Changes
    • AI 6.5. Documentation and Procedures
    • AI 6.6. Authorized Maintenance
    • AI 6.7. Software Release Policy
    • AI 6.8. Distribution of Software

COSO helps frame what the organization needs to achieve, and COBIT lists all of the various controls that can mitigate technology risks. Both are valid frameworks, but in the early years of SOX-404, obviously something was going wrong in how we linked them together.

Thought Experiment: Two Extremes of the IT Scoping Spectrum

One of the major breakthroughs in the development of GAIT came from the following thought experiment: imagine a spectrum of businesses, with "simple business processes" on the left, and "complex business processes" on the right.

Simple Business Process: Grain Storage Business Process

On the simplest side of the spectrum, imagine a business that runs grain storage elevators. In order to confirm accurate account balances and values, we could perform a visual inspection of grain levels in each silo, and maybe sample them to ensure that they're actually grain (as opposed to sawdust or sand).

In this scenario, one could argue that all IT applications (e.g., SAP) should be out of scope of the SOX-404 audit, as well as all the supporting "IT general controls" (i.e., IT process controls, such as change controls, authorization controls, etc.).

Why? To establish account balances and values, we can physically measure the inventory levels, so we no longer rely upon the IT applications (and therefore, on any IT controls supporting them). Ergo, IT is out of scope.

Complex Business Process: Online Auction Business Process

On the very right side of the spectrum, imagine a business that runs a modern stock exchange or an online auction that is run on an IT application. If someone were to bypass our controls and introduce fraudulent transactions, or if an error in the code or infrastructure could cause us to erroneously add, modify or delete transactions, it could lead to an undetected financial reporting error.

Worse, even if we know that an error was made, there is no physical inventory to inspect or boxes we can count. In other words, we are entirely reliant upon what the IT systems tell us about the transactions, transaction values, and consequently, account balances and values.

In this case, once we lose assurance in the IT application, or in the IT controls supporting that application, we've lost all assurance in account balances and values, which is a material weakness (just like Enron, WorldCom, etc.).

There are well-documented cases of businesses failing because of situations like this: data center moves gone horribly wrong, accidental loss of transactions due to IT failures, etc. A saying that goes back to the 1980s says that "approximately 50% of businesses that experience major data loss fail within 18 months." Even if this statistic isn't completely accurate, we've all likely seen disasters that make this risk impossible to dismiss completely.

(A friend of mine, a partner at an audit firm, told me about his first project out of college in the 1980s: reconstruct a large company's financial transactions after virtually all their ERP data was lost during a data center move. This horrible project involved railroad boxcars full of punchcards, paper financial reports, etc. The project ended when the company went out of business.)

When there is 100% business process reliance on an IT application (e.g., online auction, stock exchange), it should be impossible to argue that the IT application can be out-of-scope. Furthermore, in order to assert that the IT systems are functioning as designed, all the IT general controls must also be in scope (i.e., to ensure that only authorized people were making authorized changes, etc.).

The GAIT Breakthrough: Defining "Reliance"

Developing these two scenarios was a breakthrough. Why? Everyone could agree that the IT systems in the grain elevator scenario could be out-of-scope, and everyone could agree that the IT systems in the online auction scenario must be in-scope.

The extremes were easy. However, scenarios between the two extremes always seemed to be in the gray area. That is, until we identified and fully defined the following two words: "significance" and "reliance."

After that, scoping became almost simple and straightforward.

For the SOX-404 context, we established that, for an IT application or control to be in-scope, it had to be "significant" and there had to be "reliance."

"Significant" was easy because the existing Audit Standard 2 (or "AS/2") guidance already defined it: "significant" was defined as "those accounts where there is a reasonably likely inherent risk of an error that is material to the financial statements arising in that account."

However, GAIT required defining "reliance," because it was not codified anywhere. We defined that a control was "relied upon" when the failure of the functionality of the control could result in an undetected material error.

In the grain elevator scenario, we could argue that there was no reliance on IT – instead, we were putting reliance on visual inspection of the grain silos. For instance, suppose that all the IT systems failed, as well as the IT general controls that supported them. We could still detect any resulting errors during our visual inspection audits.

On the other hand, in the online auction, we have 100% reliance on the IT systems and the supporting IT general controls in order to assert that the system is functioning as designed. There are no downstream controls, manual or automated, that can detect a resulting material error to the financial statements.

To state this another way, for highly complex and automated business processes, the controls need to be inside the IT system. For less complex business processes, reliance can be placed on controls outside the IT system.
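The "significant" plus "reliance" rule can be sketched in a few lines of Python. This is only an illustrative model, not part of GAIT itself: the `Application` class, its field names, and the two example entries are assumptions invented for this sketch, and a real scoping exercise involves far more judgment than two boolean flags.

```python
from dataclasses import dataclass


@dataclass
class Application:
    """Hypothetical record describing one IT application for scoping."""
    name: str
    # True if the application supports an account with a reasonably likely
    # inherent risk of material error (the "significant" test).
    significant: bool
    # True if a control outside the application (manual or automated)
    # would still detect a material error if the application failed.
    downstream_detection: bool


def is_relied_upon(app: Application) -> bool:
    """Reliance: a failure could result in an *undetected* material error."""
    return not app.downstream_detection


def in_scope(app: Application) -> bool:
    """In scope only when the application is both significant and relied upon."""
    return app.significant and is_relied_upon(app)


# The two extremes of the thought experiment (example values are assumptions).
grain_erp = Application("grain inventory ERP", significant=True,
                        downstream_detection=True)   # silos can be inspected
auction = Application("online auction platform", significant=True,
                      downstream_detection=False)    # nothing physical to count

for app in (grain_erp, auction):
    print(f"{app.name}: {'in scope' if in_scope(app) else 'out of scope'}")
```

The gray-area cases between the two extremes come down to how honestly the `downstream_detection` question is answered for each application.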

Revisiting The Scoping Error That John Makes In "The Phoenix Project"

In "The Phoenix Project," John hits bottom during the meeting with the auditors, where they meet with the CFO and the business line managers. In this all-day meeting, the auditors and management are playing out the following scene, showing that the IT findings were indeed out-of-scope.

Over and over again, they went through scenarios that assumed all the IT infrastructure was made of Swiss cheese, where any disgruntled or wrongdoing employee or external, malicious hacker could log in and commit fraud with impunity.

But they would still detect any material error in the financial statements.

Once, Dick pointed out that an entire department of twenty people is responsible for spotting erroneous, let alone fraudulent, orders. They, and not an IT control, served as the business safety net.

To use the GAIT language, the materials management business process was in scope for SOX-404, but reliance could be placed on downstream manual controls being executed by a 20 person department whose job is to find erroneous or fraudulent orders.

So John found himself pushing for IT controls that weren't actually needed by the business. (Make no mistake, though – automated controls are still likely a good idea to implement, but claiming them as an absolute requirement to remediate the audit finding is clearly incorrect.)

The GAIT-R Principles In Action

In 2007, the GAIT task team of the Institute of Internal Auditors released the GAIT Principles and Methodology to standardize audits for financial reporting. In March 2008, GAIT-R was released, which extended the principles and methodology to the other two COSO internal control objectives: compliance with laws and operations.

If you're not an IIA member, you can find the GAIT-R document here. I think GAIT-R is an amazing piece of work, and anyone doing compliance work without reading it is doing themselves and their organizations a great disservice.

The GAIT-R principles are listed below. I'll go through each one, explaining it in "non-auditor speak:"

Principle 1: The failure of technology is only a risk that needs to be assessed, managed, and audited if it represents a risk to the business.

For the SOX context, this means that business processes that aren't "significant" are not in scope of an audit, and therefore the IT applications and the IT general controls supporting them shouldn't be either.

For the other COSO internal control objectives (e.g., operations), if a risk in a business process doesn't pose a risk to the organization, neither do the IT applications and IT general controls that support it.

This principle forces us to identify what the business risks are from the business perspective, which allows us to exclude from testing any IT controls that cannot be directly linked to those risks.

Principle 2: Key controls should be identified through a top-down assessment of business risk, risk tolerance, and the controls (including automated controls and IT general controls) required to manage or mitigate business risk.

Once we identify the business risks that are derived from Principle 1, we must then identify where "critical functionality" resides inside the IT applications or systems.

"Critical functionality" is what we place reliance upon that ensures that our IT systems are operating as designed in order to detect or prevent errors. Critical functionality is the logic in a system that enables attainment of objectives.

For financial reporting, the critical functionality is often calculations or a control necessary to ensure the integrity of account balances and outputs. For operations, it is often the functionality of all of the components of the IT service required to fulfill the business objectives, of which any impairment will negatively impact the business.

Note that critical functionality typically refers to functionality inside an IT application, such as the famous ERP "three-way match" control that can ensure that we only pay invoices with valid purchase orders and packing slips.
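As a rough illustration of what such critical functionality looks like, here is a minimal sketch of a three-way match check in Python. The `Line` record and its fields are hypothetical simplifications; a real ERP matches many lines per document and usually applies price and quantity tolerances, none of which is modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Line:
    """Hypothetical single-line document: PO, packing slip, or invoice."""
    po_number: str
    item: str
    quantity: int


def three_way_match(purchase_order: Optional[Line],
                    packing_slip: Optional[Line],
                    invoice: Line) -> bool:
    """Approve payment only if a PO and packing slip exist and agree with the invoice."""
    if purchase_order is None or packing_slip is None:
        return False  # never pay an invoice with a missing supporting document
    return purchase_order == packing_slip == invoice


po = Line("PO-1", "widget", 10)
print(three_way_match(po, po, po))                        # all three agree
print(three_way_match(po, Line("PO-1", "widget", 8), po)) # short shipment
print(three_way_match(None, po, po))                      # no purchase order
```

If this logic is what the business relies upon to prevent erroneous payments, then it is the thing whose continued correct operation has to be assured.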

Principle 3: Business risks are mitigated by a combination of manual and automated key controls. To assess the system of internal control to manage/mitigate business risks, key automated controls need to be assessed.

The gist of this principle: When we find potential control weaknesses, even around critical functionality, we must step back, look at the complete system and ensure that there isn't another downstream control (automated or manual) that could detect a failure.

Principle 4: IT general controls may be relied upon to provide assurance of the continued and proper operation of automated key controls (e.g., change management, access information security, and operations).

Whenever there is an automated control (i.e., critical functionality) in the IT application that is relied upon, if we can prove that the IT general controls are effective, we can also assert that the automated control is working, too.

For instance, suppose we rely on our ERP system's automated "three-way match" control. As long as the three-way match setting is enabled and does not change, we can trust the results of the accounts payable process enabled by the ERP system.

Consequently, IT management must prove that no unauthorized changes were made to that three-way match setting. Why? Because an unauthorized change could disarm the IT functionality that we rely upon, which could result in an undetected financial statement error. (E.g., we rely on the correct configuration of the three-way match setting, which then relies on effective change controls.)
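One way to picture that proof is a reconciliation of observed changes against approved change records. The sketch below is a minimal illustration under stated assumptions: the `ConfigChange` record, the `three_way_match` setting name, and the `CHG-…` ticket identifiers are all hypothetical, and a real environment would pull both sides from its own change log and ticketing system.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class ConfigChange:
    """Hypothetical entry from a configuration change log."""
    setting: str
    new_value: str
    changed_by: str
    ticket_id: Optional[str]  # None when no change record was cited


def unauthorized_changes(changes: Iterable[ConfigChange],
                         approved_tickets: Set[str],
                         watched_setting: str) -> List[ConfigChange]:
    """Return changes to the watched setting that lack an approved ticket."""
    return [
        change for change in changes
        if change.setting == watched_setting
        and change.ticket_id not in approved_tickets
    ]


change_log = [
    ConfigChange("three_way_match", "enabled", "alice", "CHG-100"),
    ConfigChange("three_way_match", "disabled", "bob", None),
    ConfigChange("banner_text", "hello", "carol", None),  # not a relied-upon setting
]
flagged = unauthorized_changes(change_log, {"CHG-100"}, "three_way_match")
for change in flagged:
    print(f"unauthorized: {change.setting} -> {change.new_value} by {change.changed_by}")
```

An empty result for the audit period is the kind of evidence that supports continued reliance on the automated control; note that the unrelated `banner_text` change is deliberately ignored.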

Principle 4a: The IT general control process risks that need to be identified are those that affect critical IT functionality in significant applications and related data.

In other words, any IT general control not supporting an automated control relied upon is out of scope.

Principle 4b: The IT general control process risks that need to be identified exist in processes and at various IT layers: application program code, databases, operating systems, and network.

IT general controls must protect all layers of the application stack. If we have an ERP system that contains critical functionality, in order to prevent the rogue DBA risk, we also must have change control around the application, as well as the database, operating system, network, etc…
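Taken together, Principles 4a and 4b describe a simple filter followed by an expansion: keep only the applications with relied-upon critical functionality, then consider every layer beneath each of them. A minimal sketch, assuming a hypothetical inventory expressed as a dictionary from application name to a boolean flag (the application names are invented for illustration):

```python
from typing import Dict, List, Tuple

# The IT layers named in Principle 4b.
LAYERS = ("application program code", "database", "operating system", "network")


def itgc_scope(apps: Dict[str, bool]) -> List[Tuple[str, str]]:
    """List (application, layer) pairs where IT general control risks need assessing.

    `apps` maps an application name to True when it contains critical
    functionality that is relied upon (Principle 4a); every layer under such
    an application is then included (Principle 4b).
    """
    return [
        (name, layer)
        for name, has_relied_upon_functionality in apps.items()
        if has_relied_upon_functionality
        for layer in LAYERS
    ]


inventory = {"ERP": True, "intranet wiki": False}
for app_name, layer in itgc_scope(inventory):
    print(f"{app_name}: assess change control at the {layer} layer")
```

Everything supporting only the wiki drops out of scope, while the ERP pulls in all four layers.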

Principle 4c: Risks in IT general control processes are mitigated by the achievement of IT control objectives, not individual controls.

When we find a control weakness in a specific control, we must step back, and look at the entire control environment to determine other controls that would detect a control failure. (I.e., don't go overboard when we find one control weakness.)

Principle 4c will likely be the linchpin of the DevOps Audit Defense Toolkit. We must show that we can achieve the compliance objective through controls besides the typical separation of duty or change approval process controls.
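The idea of evaluating the objective rather than any single control can be sketched as follows. This is a deliberate simplification: it assumes each listed control would, on its own, be sufficient to achieve the objective, and the control names are hypothetical examples rather than a recommended control set.

```python
from typing import Dict


def objective_achieved(control_results: Dict[str, bool]) -> bool:
    """Treat the objective as met if at least one addressing control is effective.

    Assumes every control in the dictionary is individually sufficient;
    real assessments weigh how well each control actually covers the risk.
    """
    return any(control_results.values())


# Hypothetical controls addressing "only authorized changes reach production".
change_objective = {
    "manual change approval board": False,          # the "missing" standard control
    "peer-reviewed, automated deployment pipeline": True,
    "post-deployment configuration drift detection": True,
}

weak = [name for name, effective in change_objective.items() if not effective]
print("weak controls:", weak)
print("objective achieved:", objective_achieved(change_objective))
```

A finding against one control therefore prompts the question "is the objective still met?" rather than an automatic remediation project.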

Reference Materials

Press release for GAIT

2/8/2007: Yesterday, The Institute of Internal Auditors (IIA) released long-awaited guidance providing executive management, internal and external auditors, regulators, and the IT industry with a method of identifying which IT General Controls (ITGC) should be tested as a part of an annual assessment of internal controls over financial reporting.

The guidance called GAIT – the Guide to the Assessment of IT General Controls Scope Based on Risk, will help organizations and their auditors be more efficient and could possibly result in a reduction of compliance costs, such as those associated with Section 404 of the U.S. Sarbanes-Oxley Act of 2002 (SOX).

GAIT comes on the heels of recent survey results indicating that costly ITGC scoping inefficiencies still exist. Today, technology is inherent in most organizational processes, many of which are complex and not fully understood by management or auditors. Although some excellent IT control and audit frameworks have emerged from various countries, until now there was no common methodology for clearly identifying ITGC that significantly impact financial reporting. This frequently has resulted in overlooking critical ITGC, as well as testing too many controls, which can be costly. GAIT provides a universal methodology designed to efficiently scope ITGC, regardless of the internal control framework used.

Reference to 2005 KPMG statistics

  • IT controls dominate the deficiencies, significant deficiencies, and material weaknesses identified through the S-O 404 assessment.
    • The estimated percentages of material weaknesses identified include IT controls (27 percent), revenue (18 percent), taxes (11 percent), and financial reporting and close (10 percent)
    • The estimated percentage of significant deficiencies identified again shows IT controls leading the way (23 percent), followed by financial reporting and close (14 percent), procure to pay (13 percent), and revenue (12 percent).
    • The estimated percentage of deficiencies identified show IT controls accounting for the most (34 percent), followed distantly by revenue (13 percent), procure to pay (10 percent), and fixed assets (10 percent). 

The GAIT Task Team

We created the GAIT Task Team, with GAIT being an acronym for the (admittedly optimistic) "Generally Accepted IT Principles." The team included Ed Hill (who was then Managing Partner at Protiviti, former Partner at Arthur Andersen), Steve Mar (Director of Internal Audit, Microsoft), Norman Marks (Chief Audit Executive, Maxtor Corporation), Jay Taylor (General Director, Internal IT Audit, General Motors Corporation), Heriot Prentice (Director, Technology Practices, IIA), as well as Julia Allen and Eileen Forrester (both with the Software Engineering Institute at Carnegie Mellon University).

(My two primary collaborators in this project are shown on the right. Norman Marks is on the left, and his evil twin is Ed Hill. They were phenomenal team members, because they both believed there was a better way to do IT scoping, but they would often be on opposite sides of the argument. But by careful argumentation, we were able to derive the methodology.)

The post Audit 101 For DevOps: Resource Guide for "The Phoenix Project" (Part 3: Correctly Scoping IT Using GAIT and GAIT-R) appeared first on IT Revolution.

