Amnesty International warns that AI models are built on privacy violations
Amnesty International has warned that major generative AI systems are powered by large-scale data pipelines rooted in mass invasions of privacy.
In a new briefing, ‘Unlawful by Design: Exposing the Human Rights Costs of Generative AI’, the organisation argues that companies developing generative AI tools rely on unlawful web scraping to collect vast amounts of online data, including personal information, often without the explicit consent of the people who created or appear in it.
The briefing examines the models behind widely used standalone generative AI tools, including OpenAI’s GPT-3, Google’s Gemini, Meta’s Llama, DeepSeek, Midjourney, and Stability AI’s Stable Diffusion. Amnesty says the design choices behind these systems create systemic human rights risks, particularly around privacy, discrimination, freedom of thought, and environmental harms.
Amnesty argues that large-scale scraping and processing of online posts, images, and other personal data infringes privacy by design. It also warns that training datasets drawn from the open web can reproduce and amplify discriminatory content, stereotypes, and prejudices, especially along racial and gender lines.
The organisation also highlights the environmental costs of generative AI development, pointing to rising demand for energy-intensive chips, data centres, electricity, and water. It says AI infrastructure can negatively affect historically marginalised communities when land and resources are used to build and operate data centres.
Amnesty said it wrote to Google, OpenAI, Meta, Stability AI, Midjourney, DeepSeek, Intel, VMware, Microsoft, and Amazon about the findings and related human rights concerns. At the time of publication, it said Microsoft, Amazon, Intel, OpenAI, and Meta had responded.
The organisation is calling on states to prohibit standalone generative AI systems built using unlawful web scraping and to hold companies accountable for human rights abuses linked to the design and deployment of AI systems.
Why does it matter?
The briefing brings a strong human rights framing to the debate over training data for generative AI. Rather than focusing only on copyright or competition, Amnesty argues that large-scale scraping of personal data raises privacy, discrimination, freedom of thought, and environmental concerns. Its recommendations would significantly raise the stakes for AI developers by treating non-consensual data extraction as a human rights issue requiring regulatory intervention.