The predictive power of blended models in assessing small business credit risk
The entrepreneurial spirit is alive and well, and living in America. In the last decade, thousands of well-educated and seasoned professionals, many of them downsized mid-career during the recession-ridden
By 1994, small business had become the growth engine of economic recovery, and it now constitutes a considerable force in the marketplace. Moreover, as corporate mergers and reorganization continue to displace workers, small business start-ups are only likely to increase. Financial institutions and other organizations can ill afford to ignore the presence of this burgeoning market segment. But neither can they afford to overlook the relatively high-risk, low-margin nature of small business generally. Making loans and extending credit within this segment can be dicey, and companies eager to capitalize on its robust growth have come to recognize that it can be leveraged profitably and without excessive exposure only through risk management techniques tailored to its particular volatility.
With increasing frequency, marketing and risk management methods honed in the consumer credit arena are being used to evaluate commercial enterprises. Credit risk scoring has proved especially useful in answering the demand for higher productivity at ever-lower costs. By applying statistical analysis and automated decision processing to credit and marketing functions, financial institutions and businesses following their lead have drastically reduced credit decision turnaround times. Many have begun applying risk scoring to the direct marketing selection process as well, thus enabling organizations, often for the first time, to establish consistency between targeting (direct marketing) and acquisition (application processing) strategies. However, applying risk scoring techniques to small businesses raises problems that stem from the often limited amount of commercial credit data available.
While the personal credit history of small business owners is frequently more plentiful and/or readily available than credit data for the business itself, some risk managers refrain from using it for commercial decisions due to the cost and implications of complying with the federal Fair Credit Reporting Act (FCRA). Others, while conceding the value of consumer data, disagree over the point at which predictive power shifts from consumer to business data in the life cycle of a small business.
To examine these issues more closely and evaluate the relative forecasting strength of commercial and consumer data considered separately and in combination, the Analytical Research & Development team at Experian recently undertook a study using three risk scoring models:
The Consumer Risk Scoring Model: based on consumer credit performance data. Updated in 1995, this model is based on a sample population of 1 million consumers, with a performance window of 24 months.
The Commercial Risk Scoring Model: based on business credit performance data. This model was built in 1992, using a sample of 600,000 businesses of all sizes, across all industries. It contains six scoring segments and is designed to assess the performance of small, mid-size and large companies. Its window is six months.
The Blended Small Business Risk Scoring Model: combining both business and owner (consumer) credit performance data. Built in 1992 and further refined in 1996, this model is based on a sample of 500,000 small businesses. This model contains nine scoring segments and requires the user to submit an inquiry for both business and owner. This allows it to produce a score whether the inquiry matches consumer data only, both business and consumer data, or business data only. The performance window is 12 months.
Methodology
To conduct the analysis, a sample population of 1.4 million small businesses was compiled from business registrations filed with secretaries of state during the past 10 years. Each business in the sample contained identifying information for both business and owner.
Once the sample population was defined, each record was matched to Experian's business and consumer credit databases to gather performance information from three historical periods: 11/96, 11/97 and 11/98. This enabled the analytic team to evaluate each model using 12- and 24-month performance windows. Businesses were then classified as "good" or "bad" based on credit characteristics (see inset), and segmented into categories based on the match rate achieved, including:
* The total population of records which matched consumer data only
* The total population of records which matched business data only
* The total population of records which matched both consumer and business data
By segmenting data in this manner, analysts were able to compare the performance of all three models in predicting the risk of business delinquency, based on their definition of a "bad" business. All of the sample records were processed through each model used in the study. Those scoring in the bottom 20 percent were classified as the "worst scoring," from which the percentage of bad records identified by each model was drawn.
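The "percent of bads in the worst-scoring band" measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual code: the function name `capture_rate`, the toy scores and the good/bad flags are all hypothetical, and the sketch assumes that a lower score indicates higher risk, consistent with the "bottom 20 percent" being the worst scoring.

```python
from typing import Sequence


def capture_rate(scores: Sequence[float], is_bad: Sequence[bool],
                 worst_fraction: float) -> float:
    """Return the share of all bad records that fall in the worst-scoring band.

    Assumption: lower scores mean higher risk, so the "worst" records
    are the ones with the lowest scores.
    """
    if len(scores) != len(is_bad):
        raise ValueError("scores and is_bad must have the same length")
    if not 0 < worst_fraction <= 1:
        raise ValueError("worst_fraction must be in (0, 1]")

    total_bad = sum(1 for flag in is_bad if flag)
    if total_bad == 0:
        return 0.0

    # Rank record indices from worst (lowest score) to best.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    cutoff = round(len(scores) * worst_fraction)

    bad_in_band = sum(1 for i in ranked[:cutoff] if is_bad[i])
    return bad_in_band / total_bad


# Hypothetical toy data: ten businesses, three of them "bad".
toy_scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
toy_is_bad = [True, True, False, False, False, True, False, False, False, False]

for fraction in (0.1, 0.2):
    rate = capture_rate(toy_scores, toy_is_bad, fraction)
    print(f"worst {fraction:.0%}: {rate:.1%} of bads captured")
```

Running the same calculation for each model over the same set of records is what makes the three models directly comparable: the population and the "bad" definition stay fixed, and only the ranking changes.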
Business and Consumer Data Present (Blended Sample)
The first analysis assessed performance over a 12-month period and included records in which data was present for both business and owner. This was perhaps the best data set to analyze because the output population was essentially the same for all three models, with each scoring slightly more than 413,650 records. As Table 1 illustrates, the blended model was significantly more effective in identifying and appropriately scoring records defined as "bad" within this population.
Of the "bad" records identified, the blended model scored 57.5 percent of them in the worst 5 percent of all records scored. In other words, more than half the "bad" records in the sample would be eliminated by declining the bottom 5 percent of all scores. By comparison, the consumer model captured only 18.1 percent of the "bads" in the worst 5 percent, while the commercial model identified just 38.6 percent in the same segment. Moreover, the superiority of the blended model was maintained when the performance window was extended to 24 months, as can be seen in Table 2. (Because this analysis was based on historical data from November 1996, and the original sample included business registrations through the present, the matched population was smaller across all three models.)
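To make the "declining the bottom 5 percent" reasoning concrete, the following Python sketch converts a capture rate into the bad rate inside the declined band and among the records that remain. It uses only the rounded 12-month figures reported for this sample (a 3.3 percent bad rate and capture rates of 57.5, 38.6 and 18.1 percent), so the results are approximate. The function name `decline_impact` and its return keys are hypothetical, introduced here for illustration.

```python
def decline_impact(bad_rate: float, capture: float,
                   declined_fraction: float) -> dict:
    """Translate a capture rate into the effect of declining the worst band.

    bad_rate:          share of all scored records that are bad (e.g. 0.033)
    capture:           share of all bads found in the declined band (e.g. 0.575)
    declined_fraction: share of all records declined (e.g. 0.05)
    """
    if not 0 < declined_fraction < 1:
        raise ValueError("declined_fraction must be strictly between 0 and 1")

    return {
        # How much more concentrated bads are in the band than in the population.
        "lift": capture / declined_fraction,
        # Bad rate among the declined records.
        "bad_rate_declined": capture * bad_rate / declined_fraction,
        # Bad rate among the records that are not declined.
        "bad_rate_remaining": (1 - capture) * bad_rate / (1 - declined_fraction),
    }


# Rounded figures from the 12-month business-and-consumer sample.
reported_capture = {"blended": 0.575, "commercial": 0.386, "consumer": 0.181}

for model, capture in reported_capture.items():
    impact = decline_impact(bad_rate=0.033, capture=capture, declined_fraction=0.05)
    print(f"{model:>10}: lift {impact['lift']:.1f}x, "
          f"declined bad rate {impact['bad_rate_declined']:.1%}, "
          f"remaining bad rate {impact['bad_rate_remaining']:.2%}")
```

On these rounded inputs, the blended model's worst 5 percent holds bad records at about 11.5 times the population rate, and declining that band would lower the bad rate among the remaining records from 3.3 percent to roughly 1.5 percent.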
The consumer model, originally designed using a 24-month performance window, showed less degradation from 12 to 24 months in this sample, but the overall performance was still significantly below that of the blended model. It is worth noting that even the commercial model exceeded the performance of the consumer model in scoring this population, indicating that when business data is present, commercial models are more effective in predicting small business risk.
Only Consumer Data Present (Consumer-Only Sample)
This population contained records in which data concerning the small business owner (only) matched Experian's consumer credit database, and there was no match for the business itself. As a result, the analysis compared the consumer-only segments of the blended model to the consumer model. Because the populations for the models using consumer data were larger than that for the commercial-only model, the respective "bad" rate (percent of all "bad" records in the sample) was smaller. Table 3 shows results for this analysis.
In this population of records, the blended model was even more effective than previously, identifying 67.5 percent of the "bads" in the worst 5 percent of all records scored, while the consumer model captured only 21.3 percent. (Because the commercial model uses only business performance data to score records, a sample could not be generated.) The results also demonstrated the blended model's clear superiority over the consumer model in evaluating small businesses, because different parameters are built into each model. The consumer model analyzes data according to consumer-relevant criteria, whereas the blended model analyzes the same data based on business-relevant criteria. The distinction between the two illustrates the fallacy in assuming a business owner's personal risk score as a consumer can be used, by itself, to assess the risk of his or her business. Additionally, the greater accuracy of the blended model holds whether the performance window covers 12 months or 24, even though expanding the window creates a sample situation exactly suited to the consumer model's capabilities (i.e., only consumer data is present and the performance window is consistent with the model's design). See Table 4.
Only Business Data Present (Business-Only Sample)
This population contained records for which only business data was matched to Experian's business credit database, and there was no match for the owner. Such cases were relatively rare when inquiry data was available for both business and owner: the hit rate using business and owner inquiry data normally exceeds 85 percent, as it did in this analysis. Still, there are situations in which a hit is established for the business and not the owner. This analysis compared the business-only segments of the blended and commercial models. The results appear in Table 5.
Because this analysis isolated a sample of small businesses, the blended model was, once again, more effective than the commercial model in identifying "bads" in all segments. This is not surprising, given that it was designed specifically to predict small business performance, while the commercial model was designed to forecast the performance of businesses of all sizes. The performance of both models declined slightly when the performance window was extended, but each still exhibited strong predictive power.
Findings
In each comparative situation, the analysis showed the blended model to be more effective in predicting small business risk. This held true whether data was available for both business and owner, only for the business, or only for the owner. It should be noted that the analysis showed all three models to be useful in identifying risk, but the blended model clearly outperformed the others.
Comparative analysis suggests that predicting the likelihood of a small business becoming delinquent is possible through use of a blended commercial model that incorporates the historical credit performance of both business and owner. Put another way, reliable risk scoring for small businesses can be achieved using an appropriate model and data. Thus, credit application processes can be streamlined, even as confidence in credit decisions and loan approvals is increased. By more accurately identifying bad accounts early on, the blended model enables greater risk control and expanded loan and credit approval rates without compromising risk management goals. It allows, for example, a tiered approach to offering terms based on potential risk. Additionally, reliable scoring permits greater automation of decline/approve decisions, ultimately reducing acquisition costs per account. Performance improvements such as these afford opportunities for risk managers to place greater emphasis on approval strategy and total portfolio management, rather than expending limited resources on processing individual small business accounts. And as more of these are approved using existing or fewer resources, the overall profitability of the small business portfolio segment is enhanced.
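The tiered approach mentioned above could look something like the following Python sketch. It is purely illustrative: the function name `assign_tier`, the tier labels and the cutoffs are hypothetical. The 5 and 20 percent cutoffs simply echo the worst-scoring bands reported in the study and are not recommended policy. The sketch assumes each applicant's score has already been converted to a percentile rank in which lower means riskier.

```python
def assign_tier(score_percentile: float) -> str:
    """Map a score percentile (0 = riskiest, 100 = safest) to a decision tier.

    The cutoffs below are hypothetical placeholders, not values from the study.
    """
    if not 0 <= score_percentile <= 100:
        raise ValueError("score_percentile must be between 0 and 100")

    if score_percentile < 5:
        # Worst-scoring band: automated decline.
        return "decline"
    if score_percentile < 20:
        # Elevated risk: approve, but with tighter terms or a lower credit line.
        return "approve_restricted_terms"
    # Everyone else: automated approval on standard terms.
    return "approve_standard_terms"


for percentile in (2.0, 12.5, 60.0):
    print(f"percentile {percentile:>5}: {assign_tier(percentile)}")
```

In practice, the cutoffs would be set from the observed bad rate in each band and the organization's own risk tolerance.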
Given current economic trends, businesses today cannot afford to discount the legitimate role of entrepreneurs in the marketplace. Fortunately, the means exist by which large organizations can afford to work with small ones, with less reluctance and far greater returns than in the past.
Scott Bronstein, Director of Product Marketing for Experian, holds an MBA from the University of California, Irvine, and a BA from San Jose State University. Angela Arriaza, Manager, Analytical Research & Development for Experian, holds a Masters in Statistics from the University of California, Los Angeles (UCLA).
Table 1. Business and Consumer Hit: 12-Month Performance Window

| | Blended Model | Commercial Model | Consumer Model |
|---|---|---|---|
| Businesses Scored | 413,654 | 413,687 | 413,687 |
| % Bad in Worst Scoring 5% | 57.5% | 38.6% | 18.1% |
| % Bad in Worst Scoring 10% | 66.9% | 56.1% | 31.6% |
| % Bad in Worst Scoring 20% | 75.7% | 67.6% | 51.4% |
| Bad Rate | 3.3% | 3.3% | 3.3% |
Table 2. Business and Consumer Hit: 24-Month Performance Window

| | Blended Model | Commercial Model | Consumer Model |
|---|---|---|---|
| Businesses Scored | 340,923 | 340,923 | 340,923 |
| % Bad in Worst Scoring 5% | 43.7% | 30.9% | 10.5% |
| % Bad in Worst Scoring 10% | 52.4% | 44.5% | 28.4% |
| % Bad in Worst Scoring 20% | 62.5% | 55.4% | 47.0% |
| Bad Rate | 3.6% | 3.6% | 3.6% |
Table 3. Consumer Hit Only: 12-Month Performance Window

| | Blended Model | Commercial Model | Consumer Model |
|---|---|---|---|
| Businesses Scored | 586,145 | N/A | 596,418 |
| % Bad in Worst Scoring 5% | 67.5% | N/A | 21.3% |
| % Bad in Worst Scoring 10% | 72.2% | N/A | 36.3% |
| % Bad in Worst Scoring 20% | 78.4% | N/A | 56.9% |
| Bad Rate | 0.9% | N/A | 0.9% |
Table 4. Consumer Hit Only: 24-Month Performance Window

| | Small Business Intelliscore | Commercial Intelliscore | National Risk Model Score |
|---|---|---|---|
| Businesses Scored | 666,653 | N/A | 676,254 |
| % Bad in Worst Scoring 5% | 52.5% | N/A | 22.4% |
| % Bad in Worst Scoring 10% | 57.0% | N/A | 35.7% |
| % Bad in Worst Scoring 20% | 64.9% | N/A | 55.7% |
| Bad Rate | 1.0% | N/A | 1.0% |
Table 5. Business Hit Only: 12-Month Performance Window

| | Blended Model | Commercial Model | Consumer Model |
|---|---|---|---|
| Businesses Scored | 130,179 | 119,029 | N/A |
| % Bad in Worst Scoring 5% | 58.9% | 33.6% | N/A |
| % Bad in Worst Scoring 10% | 69.7% | 55.3% | N/A |
| % Bad in Worst Scoring 20% | 73.1% | 62.7% | N/A |
| Bad Rate | 4.5% | 3.6% | N/A |