Brief Summary
Influence functions and related data attribution scores take the form of inverse-sensitive bilinear functionals $v^\top H^{-1} u$, where $H$ is a curvature operator and $u, v \in \mathbb{R}^p$ are training and test gradients. In modern overparameterized models, forming or inverting $H \in \mathbb{R}^{p \times p}$ is prohibitive, motivating scalable influence computation via random projection with a sketch $P \in \mathbb{R}^{k \times p}$, $k \ll p$. This practice is commonly justified via the Johnson-Lindenstrauss (JL) lemma, which ensures approximate preservation of Euclidean geometry for a fixed dataset. However, preserving pairwise distances does not address how sketching behaves under inversion. Furthermore, there is no existing theory that explains how sketching interacts with other widely used techniques, such as ridge regularization (replacing $H$ with $H + \lambda I$) and structured curvature approximations.
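To make the objects concrete, here is a minimal numpy sketch of ridge influence under random projection. The sketched estimator $(Pv)^\top (PHP^\top + \lambda I_k)^{-1} (Pu)$ is used as an illustrative sketch-and-solve stand-in (the paper's exact estimator may differ), and the low-rank curvature, dimensions, and Gaussian sketch are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
p, k, lam = 500, 128, 1e-2                       # parameter dim, sketch dim, ridge

# Low-rank curvature H (as from a Gauss-Newton matrix) and train/test gradients.
G = rng.standard_normal((p, 20)) / np.sqrt(p)    # 20 gradient directions
H = G @ G.T                                      # rank-20 curvature operator
u = G @ rng.standard_normal(20)                  # training gradient, in range(H)
v = G @ rng.standard_normal(20)                  # test gradient, in range(H)

# Exact ridge influence: v^T (H + lam*I)^{-1} u.
exact = v @ np.linalg.solve(H + lam * np.eye(p), u)

# Sketched influence with a Gaussian projection P: project the gradients and
# the curvature, then solve only a k-dimensional ridge system.
P = rng.standard_normal((k, p)) / np.sqrt(k)
sketched = (P @ v) @ np.linalg.solve(P @ H @ P.T + lam * np.eye(k), P @ u)

print(f"exact={exact:.4f}  sketched={sketched:.4f}")
```

Note that the expensive $p \times p$ solve is replaced by a $k \times k$ one; the question the paper addresses is how large $k$ must be for the two printed values to agree.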
We develop a unified theory characterizing when projection provably preserves influence functions, with a focus on the required sketch size $k$. When the test gradient lies in $\operatorname{range}(H)$, we show that:
- Unregularized projection: exact preservation holds if and only if $P$ is injective on $\operatorname{range}(H)$, which necessitates $k \ge \operatorname{rank}(H)$;
- Regularized projection: ridge regularization fundamentally alters the sketching barrier, with approximation guarantees governed by the effective dimension $d_\lambda(H)$ of $H$ at the regularization scale $\lambda$. This dependence is both sufficient and worst-case necessary, and $d_\lambda(H)$ can be substantially smaller than $\operatorname{rank}(H)$;
- Factorized influence: for Kronecker-factored curvatures $H = A \otimes B$, the guarantees continue to hold for decoupled sketches $P = P_A \otimes P_B$, even though such sketches exhibit structured row correlations that violate canonical i.i.d. assumptions; the analysis further reveals an explicit computational-statistical trade-off inherent to factorized sketches.
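The effective dimension governing the regularized regime is cheap to evaluate from the spectrum of $H$. Assuming the common definition $d_\lambda(H) = \operatorname{tr}\big(H (H + \lambda I)^{-1}\big) = \sum_i s_i/(s_i + \lambda)$ (the paper may normalize differently), a small numerical illustration with an assumed power-law spectrum shows how far below the rank it can sit:

```python
import numpy as np

# Effective dimension d_lambda(H) = tr(H (H + lam*I)^{-1}) = sum_i s_i/(s_i + lam),
# evaluated on an assumed power-law spectrum s_i = i^{-2} of a rank-1000 curvature.
s = np.arange(1, 1001, dtype=float) ** -2.0
for lam in (1e-1, 1e-2, 1e-3):
    d_eff = np.sum(s / (s + lam))
    print(f"lam={lam:g}: d_eff={d_eff:6.1f}  (rank = {len(s)})")
```

Under this spectrum $d_\lambda$ stays in the tens even as the rank is 1000, which is the sense in which ridge regularization can shrink the required sketch size well below $\operatorname{rank}(H)$.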
Beyond this range-restricted setting, we analyze out-of-range test gradients and quantify a sketch-induced leakage term that arises when test gradients have components in $\operatorname{range}(H)^\perp$. This yields guarantees for influence queries on general, unseen test points.
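The leakage phenomenon is easy to reproduce numerically. In the toy setup below (dimensions, spectrum, and the pseudo-inverse form of the unregularized sketched estimator are all assumptions for illustration), the exact influence is blind to the out-of-range component of the test gradient, but the sketch mixes that component into the $k$-dimensional space, producing a nonzero leakage term:

```python
import numpy as np

rng = np.random.default_rng(1)
p, k, r = 400, 64, 10

# Rank-r curvature with orthonormal range basis U; u lies in range(H), while
# the test gradient v carries an extra component orthogonal to range(H).
U, _ = np.linalg.qr(rng.standard_normal((p, r)))
H = U @ np.diag(np.linspace(1.0, 2.0, r)) @ U.T
u = U @ rng.standard_normal(r)
v_in = U @ rng.standard_normal(r)
v_out = rng.standard_normal(p)
v_out -= U @ (U.T @ v_out)            # project v_out onto range(H)^perp
v = v_in + v_out

# Unregularized influence via the pseudo-inverse: v_out contributes nothing.
exact = v @ np.linalg.pinv(H) @ u

# After sketching, P maps v_out into the sketch space, where it is no longer
# orthogonal to the range of the sketched curvature: it "leaks" into the estimate.
P = rng.standard_normal((k, p)) / np.sqrt(k)
Hs_pinv = np.linalg.pinv(P @ H @ P.T)
sketched_in = (P @ v_in) @ Hs_pinv @ (P @ u)
sketched_all = (P @ v) @ Hs_pinv @ (P @ u)
leakage = sketched_all - sketched_in
print(f"leakage = {leakage:.4f}")
```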
Overall, this work develops a unified theory of when projection provably preserves influence, together with principled, instance-adaptive guidance for choosing the sketch size $k$ in practice.
Citation
@misc{hu2026unified,
  title={A Unified Theory of Random Projection for Influence Functions},
  author={Pingbang Hu and Yuzheng Hu and Jiaqi W. Ma and Han Zhao},
  year={2026},
  eprint={2602.10449},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2602.10449},
}