Explainable Performance (with S. Hué, C. Hurlin and S. Saurin) New!
We introduce the XPER (eXplainable PERformance) methodology to measure the specific contribution of the input features to the predictive or economic performance of a model. Our methodology offers several advantages. First, it is both model-agnostic and performance metric-agnostic. Second, XPER is theoretically founded as it is based on Shapley values. Third, the interpretation of the benchmark, which is inherent in any Shapley value decomposition, is meaningful in our context. Fourth, XPER is not plagued by model specification error, as it does not require re-estimating the model. Fifth, it can be implemented either at the model level or at the individual level. In an application based on auto loans, we find that performance can be explained by a surprisingly small number of features, that XPER decompositions are rather stable across metrics, and that some feature contributions nevertheless switch sign across metrics. Our analysis also shows that explaining model forecasts and model performance are two distinct tasks.
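To give intuition for a Shapley-based performance decomposition, here is a minimal sketch in the spirit of XPER, not the paper's implementation: the toy data, the hard-coded linear classifier, and the choice of accuracy as the metric are all illustrative assumptions. A coalition's "value" is the model's performance when features outside the coalition are replaced by draws from their empirical marginal, so the fitted model is never re-estimated.

```python
from itertools import combinations
from math import factorial
import numpy as np

rng = np.random.default_rng(0)

# Toy data: binary outcome driven mostly by feature 0
# (hypothetical setup, not the paper's auto-loan data).
n, p = 60, 3
X = rng.normal(size=(n, p))
y = (X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.normal(size=n) > 0).astype(int)

# A fixed, pre-trained model (here a hard-coded linear score):
# the decomposition only ever evaluates it, it never refits it.
def predict(X):
    return (X @ np.array([1.0, 0.3, 0.0]) > 0).astype(int)

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

def coalition_value(S):
    """Performance when only features in S are informative: features
    outside S are replaced by each other observation's values in turn,
    averaging over the empirical marginal."""
    S = set(S)
    out = [k for k in range(p) if k not in S]
    total = 0.0
    for j in range(n):               # donor observation for replacement
        Xm = X.copy()
        Xm[:, out] = X[j, out]
        total += accuracy(y, predict(Xm))
    return total / n

def shapley(p, value):
    """Exact Shapley values of each feature for the value function."""
    phi = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for r in range(p):
            for S in combinations(others, r):
                w = factorial(r) * factorial(p - r - 1) / factorial(p)
                phi[j] += w * (value(S + (j,)) - value(S))
    return phi

phi = shapley(p, coalition_value)
```

By the efficiency property of Shapley values, the contributions sum exactly to the performance of the full model minus the benchmark (the value of the empty coalition), and a feature the model ignores (here feature 2, with weight zero) receives a contribution of exactly zero.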
Computational Reproducibility in Finance: Evidence from 1,000 Tests (with O. Akmansoy, C. Hurlin, A. Menkveld, A. Dreber, F. Holzmeister, J. Huber, M. Johannesson, M. Kirchler, M. Razen, U. Weitzel)
We analyze the computational reproducibility of more than 1,000 empirical answers to six research questions in finance provided by 168 international research teams. Surprisingly, neither researcher seniority nor the quality of the research paper seems related to the level of reproducibility. Moreover, researchers exhibit strong overconfidence when assessing the reproducibility of their own research and underestimate the difficulty faced by their peers when attempting to reproduce their results. We further find that reproducibility is higher for researchers with better coding skills and for those exerting more effort. It is lower for more technical research questions and more complex code.
Non-Standard Errors (with A. Menkveld, A. Dreber, F. Holzmeister, J. Huber, M. Johannesson, M. Kirchler, M. Razen, U. Weitzel et al.) Updated
My contribution: I co-designed and co-implemented the reproducibility verification policy of the #fincap project.
In statistics, samples are drawn from a population in a data-generating process (DGP). Standard errors measure the uncertainty in estimates of population parameters. In science, evidence is generated to test hypotheses in an evidence-generating process (EGP). We claim that EGP variation across researchers adds uncertainty: non-standard errors (NSEs). We study NSEs by letting 164 teams test the same hypotheses on the same data. NSEs turn out to be sizable, but smaller for more reproducible or higher-rated research. Adding peer-review stages reduces NSEs. We further find that participants underestimate this type of uncertainty.
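The DGP/EGP distinction can be illustrated with a small simulation; everything here is a hypothetical stand-in (the "analytic choice" is outlier trimming, and the 30 "teams" are a grid of trimming thresholds), not the #fincap design. Each team estimates the same population mean on the same fixed sample; the standard error captures sampling uncertainty, while the dispersion of estimates across teams plays the role of a non-standard error.

```python
import numpy as np

rng = np.random.default_rng(2)

# One shared dataset: a single fixed sample from the DGP.
x = rng.normal(loc=1.0, scale=2.0, size=500)

# Hypothetical EGP: each "team" makes a different defensible analytic
# choice (how aggressively to trim outliers) before estimating the
# same population mean on the same data.
def team_estimate(trim_q):
    lo, hi = np.quantile(x, [trim_q, 1 - trim_q])
    kept = x[(x >= lo) & (x <= hi)]
    return kept.mean()

trims = np.linspace(0.0, 0.10, 30)            # 30 teams, 30 choices
estimates = np.array([team_estimate(q) for q in trims])

# Within-team sampling uncertainty vs. across-team (EGP) dispersion.
standard_error = x.std(ddof=1) / np.sqrt(len(x))
non_standard_error = estimates.std(ddof=1)
```

Even though every team sees identical data, the estimates disagree, so the across-team dispersion adds a layer of uncertainty on top of the usual standard error.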
The Fairness of Credit Scoring Models (with C. Hurlin and S. Saurin)
In credit markets, screening algorithms aim to discriminate between good-type and bad-type borrowers. However, when doing so, they also often discriminate between individuals sharing a protected attribute (e.g., gender, age, racial origin) and the rest of the population. In this paper, we show how (1) to test whether there exists a statistically significant difference between protected and unprotected groups, which we call lack of fairness, and (2) to identify the variables that cause the lack of fairness. We then use these variables to optimize the fairness-performance trade-off. Our framework provides guidance on how algorithmic fairness can be monitored by lenders, controlled by their regulators, and improved for the benefit of protected groups.
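As a simple illustration of testing for a statistically significant difference between groups, here is a two-proportion z-test for statistical parity of approval rates. This is one fairness definition among several and a deliberately simplified sketch; the simulated population, approval rates, and sample size are assumptions, and the paper's own test may differ.

```python
from math import sqrt, erf
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scored population: decision 1 = loan approved.
# The protected group (g = 1) is approved less often by construction.
n = 4000
g = rng.integers(0, 2, size=n)                       # protected attribute
approved = (rng.random(n) < np.where(g == 1, 0.55, 0.65)).astype(int)

def fairness_z_test(decision, group):
    """Two-proportion z-test for statistical parity: H0 says the
    approval rate is identical in both groups."""
    p1, n1 = decision[group == 1].mean(), (group == 1).sum()
    p0, n0 = decision[group == 0].mean(), (group == 0).sum()
    p_pool = decision.mean()
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n0))
    z = (p1 - p0) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

z, p_value = fairness_z_test(approved, g)
```

A small p-value rejects equal treatment of the two groups; a full analysis would then trace the gap back to the individual input variables responsible for it.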
The Economics of Research Reproducibility (with J.-E. Colliard and C. Hurlin) To be updated soon
We investigate why economics displays a relatively low level of research reproducibility. We first study the benefits and costs of reproducibility for readers (demand side) and authors (supply side), as well as the role of academic journals in matching both sides. Second, we prove that competition between journals to attract authors can lead to a suboptimally low level of reproducibility. Third, we show how to optimize the costs of reproducibility and estimate that reaching the highest level of reproducibility could cost USD 365 per paper. Finally, we discuss how leading journals can move economics out of a low-reproducibility equilibrium.