Research

A survey on data attribution with a focus on generative AI.

SSRN

A Reliable Cryptographic Framework for Empirical Machine Unlearning Evaluation

Yiwen Tu*,

Pingbang Hu*,

Sep 18th 2025

NeurIPS 2025

#Trustworthy

#Unlearning

We design the first efficient machine unlearning evaluation metric with provable guarantees.

GraSS: Scalable Data Attribution with Gradient Sparsification and Sparse Projection

Pingbang Hu,

Joseph Melkonian,

Weijing Tang,

Han Zhao,

Sep 18th 2025

NeurIPS 2025

#Optimization

We propose an efficient gradient compression algorithm to accelerate and scale gradient-based data attribution methods to billion-scale models.

Adversarial Attack on Data Attribution

Xinhe Wang,

Pingbang Hu,

Junwei Deng,

Jan 22nd 2025

ICLR 2025

#Trustworthy

We study adversarial attacks on training data attribution methods.

Poster

dattri: A Library for Efficient Data Attribution

Junwei Deng*,

Ting-Wei Li*,

Shiyuan Zhang,

Yijun Pan,

Hao Huang,

Xinhe Wang,

Pingbang Hu,

Xingjian Zhang,

Sep 26th 2024

NeurIPS 2024 D&B (Spotlight)

#Library

We developed an efficient library for data attribution, aiming to streamline the development of data attribution algorithms.

Poster

Most Influential Subset Selection: Challenges, Promises, and Beyond

Yuzheng Hu,

Pingbang Hu,

Han Zhao,

Sep 25th 2024

NeurIPS 2024

#Learning Theory

We provide a comprehensive study of the common practices in the Most Influential Subset Selection (MISS) problem.

Poster

Pseudo-Non-Linear Data Augmentation via Energy Minimization

Pingbang Hu,

Mahito Sugiyama

Sep 7th 2024

In Submission

#Information Geometry

#Data Augmentation

We propose a new non-linear data augmentation framework powered by information geometry.

Travel the Same Path: A Novel TSP Solving Strategy

Pingbang Hu

Oct 12th 2022

Side Project

#Optimization

Exploring a novel approach that uses imitation learning to solve an NP-hard combinatorial optimization problem exactly.