A.I.’s un-learning problem: Researchers say it’s virtually impossible to make an A.I. model ‘forget’ the things it learns from private user data

@[email protected] · 10 months ago

Pichu0102 · 10 months ago

I feel like one way to do this would be to split models and their training data into mini-models and mini-batches instead of one big model, and to restrict training data to material used with permission plus public domain sources. Whenever a company is required to take information out of a model because permission to use it was revoked or expired, it could identify the relevant training data in the mini-batches, remove it, and retrain only the corresponding mini-model, which would be quicker and more efficient than retraining the entire massive model.
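
A minimal sketch of that idea, assuming a toy "model" that just memorizes the most common label per feature within its shard. The names `train_mini_model` and `remove_record` and the sample records are made up for illustration; a real system would train an actual network per shard.

```python
from collections import Counter

def train_mini_model(shard):
    """Toy stand-in for training: most common label per feature in one shard."""
    counts = {}
    for _record_id, feature, label in shard:
        counts.setdefault(feature, Counter())[label] += 1
    return {feature: c.most_common(1)[0][0] for feature, c in counts.items()}

def remove_record(shards, models, record_id):
    """Drop one record and retrain only the shard(s) that contained it."""
    retrained = []
    for i, shard in enumerate(shards):
        kept = [r for r in shard if r[0] != record_id]
        if len(kept) != len(shard):
            shards[i] = kept
            models[i] = train_mini_model(kept)
            retrained.append(i)
    return retrained

# Hypothetical data: (record_id, feature, label), split into three shards.
shards = [
    [(1, "cat", "pet"), (2, "wolf", "wild")],
    [(3, "cat", "pet"), (4, "dog", "pet")],
    [(5, "dog", "wild"), (6, "fox", "wild")],
]
models = [train_mini_model(s) for s in shards]

# Permission for record 5 is revoked: only the third mini-model is retrained.
retrained = remove_record(shards, models, record_id=5)
print(retrained, models[2])
```

The other two mini-models are never touched, which is where the saving over a full retrain would come from.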

A major problem with this, though, would be figuring out how to efficiently query multiple mini-models and combine their answers into a single response. I’m not sure how you could do that, at least not very well…
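
For simple classification, one possible way to combine mini-models is a majority vote, sketched below. This is only an assumption about how it might work: the dict-based mini-models and the `ensemble_predict` helper are hypothetical, and voting does not carry over directly to models that generate free-form text.

```python
from collections import Counter

def ensemble_predict(mini_models, x):
    """Ask every mini-model, ignore the ones with no answer, return the top vote."""
    votes = [m.get(x) for m in mini_models]
    votes = [v for v in votes if v is not None]
    if not votes:
        return None  # no mini-model has seen this input
    return Counter(votes).most_common(1)[0][0]

# Hypothetical mini-models, each a lookup table learned from its own shard.
mini_models = [
    {"dog": "pet", "wolf": "wild"},
    {"dog": "pet", "cat": "pet"},
    {"dog": "wild", "fox": "wild"},
]

answer_dog = ensemble_predict(mini_models, "dog")    # two votes "pet", one "wild"
answer_fox = ensemble_predict(mini_models, "fox")    # only one mini-model answers
answer_emu = ensemble_predict(mini_models, "emu")    # nobody answers
print(answer_dog, answer_fox, answer_emu)
```

Every query still has to hit every mini-model, so the cost of answering grows with the number of shards.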

@[email protected] · 10 months ago

You could certainly break up the training data, but breaking up the model into mini-models based on which training data is used wouldn’t work with neural networks trained using gradient descent. Basically, whatever state the model is in depends on the totality of the training data it has been trained on (and the order). It isn’t possible to go back and remove the effect of a specific training data point without retraining on all of the data that followed that data point, and even that assumes you stored a snapshot of the model before every single training data point, which I doubt anyone does.

However, that’s no excuse. It is of course possible to retrain a network entirely on a clean dataset, and that is what these companies should do.
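
To illustrate the point about order and dependence on everything seen so far, here is a toy sketch, assuming a single weight `w` fitted by plain per-sample gradient descent on squared error. The `sgd` function, the learning rate and the three data points are made up for the example.

```python
def sgd(data, w=0.0, lr=0.1):
    """One pass of per-sample gradient descent on the loss (w*x - y)**2."""
    for x, y in data:
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

# Hypothetical training data: (input, target) pairs.
data = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0)]

w_full = sgd(data)                          # trained on everything, in order
w_reordered = sgd(list(reversed(data)))     # same data, different order
w_clean = sgd([data[0], data[2]])           # retrained without the second point

# Naive "forgetting": subtract the step the second point caused at the time.
step_from_second = sgd(data[:2]) - sgd(data[:1])
w_naive_undo = w_full - step_from_second

print(w_full, w_reordered, w_clean, w_naive_undo)
```

The naive undo does not land on the cleanly retrained weight, because the step taken for the third point was computed from a weight the second point had already shifted.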

@[email protected] · 10 months ago

Am I correct in assuming that sounds a bit like libraries used in programming?

eltimablo · 10 months ago

I believe this is how the Tesla FSD beta AI works.