DEEPTI’S RESEARCH DISPATCH
Categorizing the Risks and Concerns around Generative AI — Part 2
A roundup of the coverage: paranoia edition
--
This is Part 2 of my research roundup on the concerns surrounding generative AI. I find it useful to sort the risks into categories, to better understand the range and scope of potential issues and to think about potential solutions (especially regulation).
I give more background and context in Part 1, so I’d recommend starting there if you are new to the topic.
Since this is a research roundup, it is heavy on links and excerpts from other articles, which can make for a dense read. I encourage skimming, jumping to the sections you are interested in, and following the linked sources to aid your own research.
In Part 1, I covered the following categories of risks and concerns:
- People trusting false AI answers
- Misinformation and disinformation
- Perpetuating bias
- Manipulation and mental health concerns
Now, continuing on to the new categories:
5. Inappropriately sourced and mishandled training data
AI ‘learns’ by finding patterns in vast amounts of data from the internet, known as ‘training data.’ Training data includes the personal information and creative outputs of innumerable people.
These training data sets were typically not collected with the informed consent of their rightful owners, which creates concerns around privacy and misuse.
Privacy concerns
Personal data used in training sets is stored on servers owned by AI companies. If those companies' security practices are inadequate, data breaches can leave people vulnerable to identity theft, fraud, and harassment.
This article describes privacy concerns of AI (including generative AI): AI and Privacy: The privacy concerns surrounding AI, its potential impact on personal data — The Economic Times.
Beyond simply sitting on a server, the training data also influences the outputs of the AI program. If…