Sunday 11 August 2024

The risk of AI model collapse / The death of generative AI

Model collapse refers to a phenomenon where machine learning models gradually degrade due to errors stemming from unchecked training on synthetic data, meaning the outputs of other models, including prior versions of the same model. There are two distinct stages of model collapse:

  1. Early Model Collapse: In the early stages, collapse can be hard to detect, as overall performance may appear to improve while the AI starts to lose its grasp on the smaller details.

  2. Late Model Collapse: This is where performance and accuracy both start to suffer greatly, with the AI becoming confused and losing much of its variance.

In a study by Duke University researcher Emily Wenger, an AI model was given the task of generating dog breeds. At first the AI would recreate the breeds most common in its training data, and it might start to over-represent a single group of breeds if that group made up a larger share of the data.

As each new generation is trained on the output of the previous one, the over-representation compounds until rare breeds disappear from the generated data altogether. Over time this would lead to a total collapse, where the new AI outputs only a single breed of dog.
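
To get a feel for how this compounding works, here is a rough toy simulation in Python. It is only a sketch and not the setup used in the study: each "model" is simply the breed frequencies of the data it was trained on, and the breed names, starting counts, and function names (`next_generation`, `simulate`) are all made up for illustration. The only source of error is random sampling, but because a breed that is missing from one generation's output cannot be learned by the next, rare breeds tend to drop out and never come back.

```python
import random
from collections import Counter


def next_generation(counts, sample_size, rng):
    """Train a 'model' on counts, then sample a new dataset from it.

    The model here is just the empirical breed frequencies (an assumption
    made to keep the sketch small).
    """
    breeds = list(counts)
    weights = [counts[b] for b in breeds]
    sample = rng.choices(breeds, weights=weights, k=sample_size)
    # Counter only keeps breeds that were actually generated, so a breed
    # with zero samples is gone for every later generation.
    return Counter(sample)


def simulate(initial_counts, generations, sample_size, seed=0):
    """Return a list of breed counts, one dict per generation."""
    rng = random.Random(seed)
    counts = Counter(initial_counts)
    history = [dict(counts)]
    for _ in range(generations):
        counts = next_generation(counts, sample_size, rng)
        history.append(dict(counts))
    return history


# Hypothetical starting data: one common breed and a few rarer ones.
initial = {"labrador": 60, "poodle": 25, "beagle": 10, "otterhound": 5}
history = simulate(initial, generations=200, sample_size=100)

for gen in (0, 10, 50, 200):
    print(gen, history[gen])
```

The exact numbers depend on the seed, but the set of surviving breeds can only shrink from one generation to the next. Using a smaller `sample_size` usually makes the rare breeds vanish sooner.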

This risk of collapse threatens generative AI as a useful tool. As the human-generated content available for AI training sets becomes more limited and AI-generated content is on the rise, are we heading towards a totally avoidable dumbing down of AI?

Should we not reframe our view of AI in this process and allow it the same freedom of access to online data as a human, allowing the growth of a tool that could change how we interact with information as a whole?

