Some models are useful, but for how long?: A decision theoretic approach to choosing when to refit large-scale prediction models

Fiche du document

Date

22 mai 2024

Type de document
Périmètre
Identifiant
  • 2405.13926
Collection

arXiv

Organisation

Cornell University



Sujets proches En

Pattern Model

Citer ce document

Kentaro Hoffman et al., « Some models are useful, but for how long?: A decision theoretic approach to choosing when to refit large-scale prediction models », arXiv - économie


Partage / Export

Résumé 0

Large-scale prediction models (typically using tools from artificial intelligence, AI, or machine learning, ML) are increasingly ubiquitous across a variety of industries and scientific domains. Such methods are often paired with detailed data from sources such as electronic health records, wearable sensors, and omics data (high-throughput technology used to understand biology). Despite their utility, implementing AI and ML tools at the scale necessary to work with this data introduces two major challenges. First, it can cost tens of thousands of dollars to train a modern AI/ML model at scale. Second, once the model is trained, its predictions may become less relevant as patient and provider behavior change, and predictions made for one geographical area may be less accurate for another. These two challenges raise a fundamental question: how often should you refit the AI/ML model to optimally trade-off between cost and relevance? Our work provides a framework for making decisions about when to {\it refit} AI/ML models when the goal is to maintain valid statistical inference (e.g. estimating a treatment effect in a clinical trial). Drawing on portfolio optimization theory, we treat the decision of {\it recalibrating} versus {\it refitting} the model as a choice between ''investing'' in one of two ''assets.'' One asset, recalibrating the model based on another model, is quick and relatively inexpensive but bears uncertainty from sampling and the possibility that the other model is not relevant to current circumstances. The other asset, {\it refitting} the model, is costly but removes the irrelevance concern (though not the risk of sampling error). We explore the balancing act between these two potential investments in this paper.

document thumbnail

Par les mêmes auteurs

Sur les mêmes sujets

Sur les mêmes disciplines

Exporter en