
Needles & Haystacks

A long-time acquaintance of mine told me recently that, fresh out of the University of Virginia and new to forensic accounting, his first assignment consisted of searching, at the height of summer, through two un-air-conditioned trailers full of thousands of savings and loan records for what turned out to be just two documents critical to proving a loan fraud. He told me that he thought then that his job would always consist of finding needles in haystacks. Our profession and our tools have, thankfully, come a long way since then!

Today, digital analysis techniques afford the forensic investigator the ability to perform cost-effective financial forensic investigations. This is achieved through the following:

– The ability to test or analyze 100 percent of a data set, rather than merely sampling it.
– The ability to import massive amounts of data into working files, allowing the processing of complex transactions and the profiling of case-specific characteristics.
– The ability to quickly identify anomalies within databases, thereby reducing the number of transactions that require review and analysis (see the sketch after this list).
– The ability to easily customize the analysis to address the scope of the engagement.
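
To make the anomaly-identification point concrete, here is a minimal sketch of full-population testing in Python with pandas. The file name and column names (payments.csv, vendor_id, invoice_no, amount) are hypothetical; in practice the same tests run inside tools such as IDEA or ACL.

```python
import pandas as pd

# Every record is tested, not a sample.
payments = pd.read_csv("payments.csv")

# Possible duplicate payments: the same vendor, invoice, and amount
# appearing more than once anywhere in the full population.
dupes = payments[payments.duplicated(
    subset=["vendor_id", "invoice_no", "amount"], keep=False)]

# Round-dollar amounts are a common red flag for fabricated invoices.
round_dollar = payments[payments["amount"] % 100 == 0]

print(f"{len(dupes)} possible duplicate payments")
print(f"{len(round_dollar)} round-dollar amounts to review")
```

Because the whole population is tested, the output is a short list of exceptions to review rather than a statistical inference from a sample.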

Overall, digital analysis can streamline investigations that involve a large number of transactions, often turning a needle-in-the-haystack search into a refined and efficient investigation. Digital analysis is not designed to replace the pick-and-shovel aspect of an investigation. However, the proper application of digital analysis will permit the forensic operator to efficiently identify those specific transactions that require further investigation or follow-up.

As every CFE knows, there are an ever-growing number of software applications that can assist the forensic investigator with digital analysis. A few such examples are CaseWare International Inc.’s IDEA, ACL Services Ltd.’s ACL Desktop Edition, and the ActiveData plug-in, which can be added to Excel.

So, whether using the Internet in an investigation or using software to analyze data, fraud examiners can today rely heavily on technology in almost any investigation. More data is stored electronically than ever before: financial data, marketing data, customer data, vendor listings, sales transactions, email correspondence, and more, and evidence of fraud can be located within that data. Unfortunately, fraudulent data often looks like legitimate data when viewed in the raw. Taking a sample and testing it might or might not uncover evidence of fraudulent activity. Fortunately, fraud examiners can now sort through piles of information using special software and data analysis techniques. These methods can identify emerging trends within an industry, and they can be configured to identify breaks in audit control programs and anomalies in accounting records.

In general, fraud examiners perform two primary functions to explore and analyze large amounts of data: data mining and data analysis. Data mining is the science of searching large volumes of data for patterns. Data analysis refers to any statistical process used to analyze data and draw conclusions from the findings. These terms are often used interchangeably.

If properly used, data analysis processes and techniques are powerful resources. They can systematically identify red flags and perform predictive modeling, detecting a fraudulent situation long before many traditional fraud investigation techniques would be able to do so.
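
One classic red-flag test of this kind, offered here purely as an illustration, is a first-digit Benford’s law comparison: invented numbers tend to show first-digit frequencies that depart from the logarithmic distribution found in natural financial data. The amounts below are toy data.

```python
import math
from collections import Counter

def first_digit(amount):
    """Return the leading nonzero digit of an amount."""
    return int(str(abs(amount)).lstrip("0.")[0])

amounts = [4823.10, 1200.00, 1875.55, 9310.00, 1044.25]  # toy data
counts = Counter(first_digit(a) for a in amounts if a)
total = sum(counts.values())

# Benford's law: P(first digit = d) = log10(1 + 1/d). Large, persistent
# gaps between observed and expected frequencies are a red flag.
for d in range(1, 10):
    expected = math.log10(1 + 1 / d)
    observed = counts.get(d, 0) / total
    print(f"digit {d}: observed {observed:.2%}, expected {expected:.2%}")
```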

Big data is now a buzzword in the worlds of business, audit, and fraud investigation. Big data consists of high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization. Simply put, big data is information of extreme size, diversity, and complexity.

Rather than thinking of big data as a single, massive set of data, fraud investigators should think about the way data grow when data sets that might not normally be connected are linked together. Big data represents the continuous expansion of data sets, whose size, variety, and speed of generation make them difficult to manage and analyze.

Big data can be instrumental to fact gathering during an investigation. Distilled down to its core, how do fraud examiners gather data in an investigation? We look at documents and financial or operational data, and we interview people. The challenge is that people often gravitate to the areas with which they are most comfortable: attorneys will look at documents and email messages and then interview individuals; forensic accounting professionals will look at the accounting and financial data (structured data); some people are strong interviewers. The key is to consider all three data sources in unison, and big data helps make them work together to tell the complete story. With the ever-increasing size of data sets, data analytics has never been more important or useful. Big data requires creative and well-planned analytics due to its size and complexity. One of the main advantages of using data analytics in a big data environment is, as indicated above, that it allows the investigator to analyze an entire population of data rather than choose a sample and risk drawing faulty conclusions in the event of a sampling error.

To conduct an effective data analysis, a fraud examiner must take a comprehensive approach. Any direction can (and should) be taken when applying analytical tests to available data. The more creative fraudsters get in hiding their schemes, the more creative the fraud examiner must become in analyzing data to detect these schemes. For this reason, it is essential that fraud investigators consider both structured and unstructured data when planning their engagements.

Data are either structured or unstructured. Structured data is the type of data found in a database, consisting of recognizable and predictable structures. Examples of structured data include sales records, payment or expense details, and financial reports.

Unstructured data, by contrast, is data not found in a traditional spreadsheet or database. Examples of unstructured data include vendor invoices, email and user documents, human resources files, social media activity, corporate document repositories, and news feeds.

When using data analysis to conduct a fraud examination, the fraud examiner might use structured data, unstructured data, or a combination of the two. For example, conducting an analysis on email correspondence (unstructured data) among employees might turn up suspicious activity in the purchasing department. Upon closer inspection of the inventory records (structured data), the fraud examiner might uncover that an employee has been stealing inventory and covering her tracks in the records.
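
A minimal sketch of that example in Python follows. The file names, column names, and keyword list are all hypothetical.

```python
import pandas as pd

# Unstructured side: employees whose e-mail text contains suspect phrases.
emails = pd.read_csv("emails.csv")                    # columns: sender, body
pattern = "|".join(["write off", "adjust the count", "before the audit"])
suspects = emails.loc[
    emails["body"].str.contains(pattern, case=False, na=False), "sender"
].unique()

# Structured side: inventory adjustments posted by those same employees.
inventory = pd.read_csv("inventory_adjustments.csv")  # columns: employee, qty
hits = inventory[inventory["employee"].isin(suspects)]
print(hits.sort_values("qty").head(20))
```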

Data mining has roots in statistics, machine learning, data management and databases, pattern recognition, and artificial intelligence. All of these are concerned with certain aspects of data analysis, so they have much in common; yet they each have a distinct and individual flavor, emphasizing particular problems and types of solutions.

Although data mining technologies were developed to serve marketing and business activities, they can also surface financial data previously hidden within a company’s database, enabling fraud examiners to detect potential fraud.

Data mining software provides an easy-to-use process that lets the fraud examiner drill down to data at the required level of detail. Data mining combines several different techniques essential to detecting fraud, including the distillation of raw data into understandable patterns.

Data mining can also help prevent fraud before it happens. For example, computer manufacturers report that some of their customers use data mining tools and applications to develop anti-fraud models that score transactions in real time. The scoring is customized for each business, involving factors such as the locale and frequency of the order and the customer’s payment history, among others. Once a transaction is assigned a high-risk score, the merchant can decide whether to accept the transaction, deny it, or investigate further.
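
A rule-based sketch of such scoring appears below. The factors, weights, and cutoffs are hypothetical stand-ins for models that would be fitted to each merchant’s own history.

```python
def score_transaction(txn, history):
    """Score one transaction against the customer's history; higher = riskier."""
    score = 0
    if txn["country"] != history["usual_country"]:
        score += 40                                  # unusual locale
    if txn["orders_last_24h"] > 3:
        score += 30                                  # unusual order frequency
    if history["chargebacks"] > 0:
        score += 30                                  # poor payment history
    return score

txn = {"country": "RO", "orders_last_24h": 5}
history = {"usual_country": "US", "chargebacks": 1}
risk = score_transaction(txn, history)
action = "deny" if risk >= 80 else "review" if risk >= 50 else "accept"
print(risk, action)                                  # 100 deny
```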

Often, companies use data warehouses to manage data for analysis. Data warehouses are repositories of a company’s electronic data designed to facilitate reporting and analysis. By storing data in a data warehouse, data users can query and analyze relevant data stored in a single location. Thus, a company with a data warehouse can perform various types of analytic operations (e.g., identifying red flags, transaction trends, patterns, or anomalies) to assist management with its decision-making responsibilities.
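
As a sketch, an analytic query against a warehouse table might look like the following. sqlite3 stands in here for whatever engine is actually in use (such as SQL Server), and the table and column names are hypothetical; the query hunts for payments clustered just under a $10,000 review threshold.

```python
import sqlite3

conn = sqlite3.connect("warehouse.db")
rows = conn.execute("""
    SELECT vendor_id, COUNT(*) AS n, SUM(amount) AS total
    FROM payments
    WHERE amount BETWEEN 9000 AND 9999  -- just under a $10,000 review threshold
    GROUP BY vendor_id
    HAVING COUNT(*) > 5
    ORDER BY total DESC
""").fetchall()

for vendor_id, n, total in rows:
    print(vendor_id, n, total)
```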

In conclusion, after the fraud examiner has identified the data sources, s/he should identify how the information is stored by reviewing the database schema and technical documentation. Fraud examiners must be ready to face a number of pitfalls when attempting to identify how information is stored, from weak or nonexistent documentation to limited cooperation from the IT department.

Moreover, once collected, it’s critical to ensure that the data is complete and appropriate for the analysis to be performed. Depending on how the data was collected and processed, some manual work may be required to make it usable for analysis; for example, certain field formats (e.g., date, time, or currency) might need to be modified.
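
Much of that manual work reduces to small normalization helpers like this sketch; the raw formats shown are hypothetical examples.

```python
from datetime import datetime

def clean_currency(value):
    """Convert strings like '$1,234.56' or '(500.00)' to floats."""
    v = value.strip().replace("$", "").replace(",", "")
    if v.startswith("(") and v.endswith(")"):
        return -float(v[1:-1])                       # accounting-style negative
    return float(v)

def clean_date(value):
    """Normalize a few common date formats to ISO yyyy-mm-dd."""
    for fmt in ("%m/%d/%Y", "%d-%b-%y", "%Y%m%d"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None                                      # unparseable; leave for manual review

print(clean_currency("($1,250.00)"))                 # -1250.0
print(clean_date("15-Mar-19"))                       # 2019-03-15
```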

Concurrent Fraud Auditing (CFA) as a Tool for Fraud Prevention

One of our CFE chapter members left us a contact comment asking whether concurrent fraud auditing might not be a good anti-fraud tool for use by a retailer client of hers that receives hundreds of credit card payments for services each day. The foundational concepts behind concurrent fraud auditing owe much to the idea of continuous assurance auditing (CAA) that internal auditors have applied for years. Basically, at the heart of a concurrent fraud auditing (CFA) system, as with CAA, is the embedding of control-based software monitors in real-time, automated financial or payment systems to alert reviewers to transactional anomalies as close to their occurrence as possible. Today’s networked processing environments have made the implementation and support of such real-time review approaches operationally feasible in ways that the older, batch-processing environments couldn’t.

Our member’s client uses several online, cloud-based services to process its customer payments; these services provide the client with a large database of payment history, tantamount to a data warehouse, all available on SQL Server to in-house client IT applications like Oracle and Microsoft Access. In such a data-rich environment, CFEs and other assurance professionals can readily test for the presence of transactional patterns characteristic of defined, common payment fraud scenarios such as those associated with identity theft and money laundering. The objective of the CFA program is not necessarily to recover the dollars associated with online frauds but to continuously (in as close to real time as possible) adjust the edits in the payment collection and processing system so that certain fraudulent transactions (those associated with known fraud scenarios) stand a greater chance of not being processed in the first place. Over time, the CFA process should get better and better at editing out or flagging the anomalies associated with the defined scenarios.

The central component of any CFA system is an independent application that monitors for suspected fraud-related activity through, for example (as with our Chapter member), periodic (or even real-time) reviews of the cloud-based files of an automated payment system. Depending upon the criticality of what it observes, activity summaries of unusual items can be generated at any specified frequency and/or highlighted in an exception report folder and communicated to auditors via “red flag” e-mail notices. At the heart of the system lies a set of measurable, operational metrics or tags associated with defined fraud scenarios. The fraud prevention team would establish the metrics it wishes to monitor as well as supporting standards for those metrics. As a simple example, U.S. anti-money-laundering banking rules specify that cash transactions over $10,000 must be reported to regulators. Experience has shown the $10,000 threshold to be a generic fraud-related metric useful in identifying many money-laundering scenarios. Anti-fraud metric tags could be built into the cloud-based financial system of our Chapter member’s client to monitor in real time all accounts payable and other cash transfer transactions, with a rule that any transaction over $10,000 would be flagged and reviewed by a member of the audit staff. This same process could have multiple levels of metrics and standards, with exceptions fed up to a first-level assurance process that could monitor the outliers and, in some instances, send a correcting feedback transaction back to the financial system itself (an adjusting or corrective edit or transaction flag). The warnings our e-mail systems send us when our mailboxes are full are a familiar example of this type of real-time flagging and editing.
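
A minimal sketch of such a first-level metric tag follows. The $10,000 rule comes from the example above; the review queue and the commented-out notifier are hypothetical.

```python
REVIEW_THRESHOLD = 10_000.00                 # metric standard from the example above

def first_level_monitor(txn, review_queue):
    """Tag and route any transaction that trips the metric."""
    if txn["amount"] > REVIEW_THRESHOLD:
        txn.setdefault("flags", []).append("over_10k")
        review_queue.append(txn)             # routed to a member of the audit staff
        # send_red_flag_email(txn)           # hypothetical e-mail notifier
    return txn

queue = []
first_level_monitor({"id": 101, "amount": 12_500.00}, queue)
first_level_monitor({"id": 102, "amount": 850.00}, queue)
print([t["id"] for t in queue])              # [101]
```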

Yet other types of discrepancies would flow up to a second-level fraud monitoring or audit process. This level would produce pre-formatted reports to management or emergency exception notices. Beyond reports, this level could trigger more significant anti-fraud or assurance actions, such as the referral of a transaction or group of transactions to an enterprise fraud management committee as documentation of the need for an actual fraud prevention edit in the financial system. To continue the e-mail example, this is the level at which the system would initiate a transaction to prevent future mailbox access by an offending user.
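
Continuing the sketch, a second level might route first-level exceptions by severity; the cutoff and the routing targets are hypothetical.

```python
def second_level(exceptions):
    """Route first-level exceptions: routine reporting vs. committee referral."""
    for txn in exceptions:
        if txn["amount"] > 50_000:           # hypothetical severity cutoff
            print(f"COMMITTEE REFERRAL: txn {txn['id']} "
                  "(candidate for a permanent fraud prevention edit)")
        else:
            print(f"management report line: txn {txn['id']}")

second_level([{"id": 101, "amount": 12_500.00},
              {"id": 103, "amount": 75_000.00}])
```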

There is additionally a third level for our system: using the CFA to monitor the concurrent fraud auditing process itself. Control procedures can be built to report monitoring results to external auditors, governmental regulators, the audit committee, and corporate counsel as documented evidence of management’s performance of due diligence in its fight against fraud.

So it’s no surprise that I would certainly encourage our member to discuss the CFA approach with the management of her client. It isn’t the right tool for everyone, since such systems can vary greatly in cost depending upon the existing processing environment and the level of IT sophistication of the developing organization, but the discussion is worth the candle. CFAs are particularly useful for monitoring purchase and payment cycle applications, with an emphasis on controls over customer- and vendor-related fraud. CFA is an especially useful tool for any financial application where large amounts of cash are either coming in or going out the door, like banking applications, and especially for controlling all aspects of the processing of insurance claims.