Where the Money Is

One of the followers of our Central Virginia Chapter’s group on LinkedIn is a bank auditor heavily engaged in his organization’s analytics-based fraud control program. He was kind enough to share some of his thoughts regarding his organization’s sophisticated anti-fraud data modelling program as material for this blog post.

Our LinkedIn connection reports that, in his opinion, getting fraud data accurately captured, categorized, and stored is the first, vitally important challenge to using data-driven technology to combat fraud losses. This might seem relatively easy to those not directly involved in the process, but experience quickly reveals that storing fraud-related data reliably over a long period of time, and in a readily accessible format, is a significant challenge. Meeting it requires a systematic approach at all levels of any organization serious about the effective application of analytically supported fraud management. The idea that any single piece of data might be important to addressing a problem is a relatively new concept in the history of banking and of most other types of financial enterprises.

Accumulating accurate data starts with an overall vision of how the multiple steps in the process connect to affect the outcome. Every member of the fraud control team needs to understand how much each pre-defined process step matters in capturing the information correctly: from the person responsible for risk management in the organization, to the people who run the fraud analytics program, to the person who designs the data layout, to the person who enters the data. Even a customer service analyst or a fraud analyst who fails to mark a certain type of transaction correctly as fraud can have an ongoing impact on the development of an accurate fraud control system. It really helps to establish rigorous processes of data entry on the front end and to explain to all players exactly why those specific processes are in place. Process without communication and communication without process are both unlikely to produce desirable results. For staff to appreciate why fraud information must be recorded correctly, management should give everyone a general understanding of how a data-driven detection system (whether it’s based on simple rules or on sophisticated models) is developed.

Our connection goes on to say that even after an organization has implemented a fraud detection system that is based on sophisticated techniques and that can execute effectively in real time, it’s important for the operational staff to use the output recommendations of the system effectively. There are three ways that fraud management can improve results within even a highly sophisticated system like that of our LinkedIn connection.

The first strategy is never to allow operational staff to second-guess a sophisticated model at will. Very often, a model score of 900 (let’s say this is an indicator of very high fraud risk), sometimes on its own and sometimes combined with some decision keys, can perform extremely well as a fraud predictor. It’s good practice to use the scores that a tested model generates in this high-risk range as is and not allow individual analysts to adjust them further. This policy will have to be completely understood and controlled at the operational level. Using a well-developed fraud score as is, without watering it down, is one of the most important operational strategies for the long-term success of any model. Applying this rule also makes it simpler to identify instances of model scoring failure, because the scores are free of any subsequent analyst adjustments.

Second, fraud analysts will have to be trained to use the scores and the reason codes (reason codes explain why the score is indicative of risk) effectively in operations. Typically, this is done by writing some rules in operations that incorporate the scores and reason codes as decision keys. In the fraud management world, these rules are generally referred to as strategies. It’s extremely important to ensure strategies are applied uniformly by all fraud analysts. It’s also essential to closely monitor how the fraud analysts are operating using the scores and strategies.
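To make the idea of a strategy concrete, here is a minimal sketch in Python of a rule that uses a score and reason codes as decision keys. Everything specific in it is an assumption made for illustration and does not come from our connection’s system: the 900 and 700 cut-offs, the reason code NEW_DEVICE, the action names, and the names ScoredTransaction and apply_strategy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical cut-offs, assuming a scale on which a higher score means more risk.
HIGH_RISK_SCORE = 900
REVIEW_SCORE = 700


@dataclass(frozen=True)
class ScoredTransaction:
    transaction_id: str
    score: int                     # model score, used as is
    reason_codes: Tuple[str, ...]  # codes explaining why the score indicates risk


def apply_strategy(txn: ScoredTransaction) -> str:
    """Return the action for a transaction; the same rule applies to every analyst."""
    # Scores in the high-risk range are acted on directly, with no analyst adjustment.
    if txn.score >= HIGH_RISK_SCORE:
        return "decline"
    # Below that range, a reason code serves as an additional decision key.
    if txn.score >= REVIEW_SCORE and "NEW_DEVICE" in txn.reason_codes:
        return "refer"
    return "approve"


print(apply_strategy(ScoredTransaction("t-001", 950, ())))
print(apply_strategy(ScoredTransaction("t-002", 720, ("NEW_DEVICE",))))
```

Because the rule lives in one function rather than in each analyst’s judgment, it is applied uniformly, and its outcomes can be logged and monitored alongside the scores.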

Third, it’s very important to train the analysts to accurately mark, in the organization’s data store, the transactions that customers confirm or report as fraudulent.

All three of these strategies may seem very straightforward to accomplish, but in practical terms, they are not that easy without a lot of planning, time, and energy. A superior fraud detection system can be rendered almost useless if it is not used correctly. It is extremely important to allow the right level of employee to exercise the right level of judgment. Again, individual fraud analysts should not be allowed to second-guess the efficacy of a fraud score that is the result of a sophisticated model. Similarly, planners of operations should take into account all practical limitations while coming up with fraud strategies (fraud scenarios). Ensuring that all of this gets done the right way with the right emphasis ultimately leads the organization to good, effective fraud management.

At the heart of any fraud detection system is a rule or a model that attempts to detect a behavior that has been observed repeatedly, at various frequencies, in the past and classifies it as fraud or non-fraud with a certain rank ordering. We would like to figure out this behavior scenario in advance and stop it in its tracks. What we observe from historical data and our experience needs to be converted to some sort of rule that can be systematically applied to the data in real time in the future. We expect that these rules or models will improve our chance of detecting aberrations in behavior and help us distinguish between genuine customers and fraudsters in a timely manner. The goal is to stop the bleeding of cash from the account and to accomplish that as close to the start of the fraud episode as we can. If banks can accurately identify early indicators of ongoing fraud, significant losses can be avoided.

In statistical terms, what we define as a fraud scenario would be the dependent variable, or the variable we are trying to predict (or detect) using a model. We would try to use a few independent variables to detect fraud (the term is used loosely, as many of the variables used in the model tend to have some dependency on each other in real life). Fundamentally, at this stage we are trying to model the fraud scenario using these independent variables. Typically, a model attempts to detect fraud as opposed to predict fraud. We are not trying to say that fraud is likely to happen on this entity in the future; rather, we are trying to determine whether fraud is likely happening at the present moment, and the goal of the fraud model is to identify this as close as possible to the time that the fraud starts.

In credit risk management, we try to predict whether there will likely be serious delinquency or default risk in the future, based on the behavior the entity exhibits today. With respect to detecting fraud, not having accurate fraud data during the model-building process is akin to not knowing what the target is in a shooting range. If a model or rule is built on data that is only 75 percent accurate, the model’s accuracy and effectiveness will be suspect as well. There are two sides to this problem. Suppose we mark 25 percent of the fraudulent transactions inaccurately as non-fraud or good transactions. Not only do we miss the chance to learn from a significant portion of fraudulent behavior, but the misclassification also leads the model to assume that this behavior is actually good behavior. Hence, misclassification of data affects both sides of the equation. Accurate fraud data is fundamental to addressing the fraud problem effectively.
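The arithmetic behind the two-sided effect can be worked through with a short Python sketch. The portfolio size and the 1 percent true fraud rate below are invented purely for illustration, and the function name label_noise_summary is hypothetical; only the 25 percent mislabeling figure comes from the example above.

```python
def label_noise_summary(total: int, fraud_rate: float, mislabel_rate: float) -> dict:
    """Count what a model would see when some fraud is recorded as non-fraud."""
    true_fraud = round(total * fraud_rate)
    true_good = total - true_fraud
    # Fraudulent transactions wrongly recorded as good.
    hidden_fraud = round(true_fraud * mislabel_rate)
    # Side one: fewer fraud examples are available to learn from.
    labeled_fraud = true_fraud - hidden_fraud
    # Side two: the "good" class now contains fraudulent behavior.
    labeled_good = true_good + hidden_fraud
    return {
        "true_fraud": true_fraud,
        "labeled_fraud": labeled_fraud,
        "hidden_fraud_in_good": hidden_fraud,
        "labeled_good": labeled_good,
        "observed_fraud_rate": labeled_fraud / total,
    }


# Illustrative figures only: 100,000 transactions, 1% truly fraudulent.
summary = label_noise_summary(total=100_000, fraud_rate=0.01, mislabel_rate=0.25)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumed figures, 250 of the 1,000 fraudulent transactions disappear from the fraud class and reappear among the good transactions, so the model both sees less fraud and is taught that those 250 cases are examples of legitimate behavior.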

So, in summary, collecting accurate fraud data is not the responsibility of just one set of people in any organization. The entire mind-set of the organization should be geared around collecting, preserving, and using this valuable resource effectively. Interestingly, our LinkedIn connection concludes, the fraud data challenges faced by a number of other industries are very similar to those faced by financial institutions such as his own. Banks are probably further along in fraud management and can provide a number of pointers to other industries, but fundamentally, the problem is the same everywhere. Hence, many of the techniques he details in this post are applicable across industries, even though most of his experience is bank based. As fraud examiners and forensic accountants, we will no doubt witness analytically based fraud risk management being applied in an ever-growing number of client industries.
