
Removing private data from AI models? It can now be done without access to the original training datasets.
A team from the University of California, Riverside, has demonstrated a new way to remove private and copyrighted data from AI models without access to the original datasets. The work addresses a growing problem: personal and paid content can be reproduced almost verbatim in model responses, even after the sources have been deleted or locked behind passwords and paywalls.

The approach is called "source-free certified unlearning." Instead of the original training data, it uses a surrogate dataset that is statistically similar to it. The model's parameters are then adjusted so that the model behaves as if it had been retrained from scratch without the forgotten data, and carefully calibrated random noise is added to guarantee that the removed data's influence is erased. A key contribution is a novel noise calibration mechanism that accounts for the gap between the surrogate data and the original.
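To make the idea concrete, here is a minimal sketch of surrogate-based unlearning for a simple ridge-regression model, written in Python. It is not the authors' algorithm: the function names, the influence-style Newton correction, and the fixed noise scale are illustrative assumptions; in the published method the noise is calibrated to the statistical distance between the surrogate and the original data.

```python
# Illustrative sketch only -- not the UC Riverside implementation.
import numpy as np

def unlearn_with_surrogate(theta, X_forget, y_forget, X_surrogate, y_surrogate,
                           lam=1e-2, noise_scale=1e-3, rng=None):
    """Approximately remove the influence of (X_forget, y_forget) from a
    ridge-regression model theta, using only a statistically similar
    surrogate dataset to estimate curvature, then add calibrated noise."""
    rng = np.random.default_rng() if rng is None else rng

    # Gradient of the squared loss on the data to be forgotten.
    grad_forget = X_forget.T @ (X_forget @ theta - y_forget)

    # Curvature (Hessian) is estimated from the surrogate set, because the
    # original training data is assumed to be unavailable.
    H = X_surrogate.T @ X_surrogate + lam * np.eye(theta.shape[0])

    # Newton-style correction: nudge the parameters toward what retraining
    # without the forgotten points would have produced.
    theta_new = theta + np.linalg.solve(H, grad_forget)

    # Calibrated Gaussian noise masks any residual influence of the
    # forgotten data; here the scale is a fixed placeholder.
    theta_new += rng.normal(scale=noise_scale, size=theta_new.shape)
    return theta_new

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ np.arange(1.0, 6.0) + 0.1 * rng.normal(size=200)
    theta = np.linalg.solve(X.T @ X + 1e-2 * np.eye(5), X.T @ y)  # "trained" model
    X_sur = rng.normal(size=(200, 5))                             # surrogate data
    y_sur = X_sur @ np.arange(1.0, 6.0)
    theta_unlearned = unlearn_with_surrogate(theta, X[:20], y[:20], X_sur, y_sur, rng=rng)
    print(theta_unlearned)
```

The sketch captures the two ingredients described above: a parameter update that approximates retraining without the forgotten records, and added noise that hides whatever that approximation misses.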










