Note: this article will be regularly updated with the latest events
In the last few years we’ve seen the rise of Generative Artificial Intelligence (GAI) into the mainstream, chiefly propelled by OpenAI in a move that took everyone by surprise, even inside the technology world^1. The initial capabilities of their flagship GAI product, ChatGPT, were impressive and promised even more impressive features in the next iterations. The product that engaged the curiosity of early followers has now turned into a widely used tool, with adoption rates higher than 70% in certain fields^2, and almost unparalleled exposure. Undoubtedly, the release of ChatGPT has been a landmark event in the history of technology. And we didn’t have to wait long to see a sea of alternative GAI products, like Google Gemini, Microsoft Copilot, Anthropic Claude, Meta AI or X Grok. Nowadays, everyone around the world seems to use or produce a GAI-related product… but at what cost?
Unethical acquisition of training data
GAI models require massive amounts of training data in order to output precise and clear responses, whether it be text, image or video. Some of the big companies in the field haven’t been afraid to resort to unethical practices to obtain valuable data, disregarding consent or intellectual property. Some of the best-known events:
- ChatGPT 3 was infamously trained on Reddit data without paying for it, which sparked an “API war” from Reddit^3 (although OpenAI has since struck a deal with Reddit to consume their data^4)
- Twitter found itself in a similar situation^5, having tweets scraped for training models, and although GAI wasn’t directly blamed, the timing of some drastic limits on Twitter’s API is self-explanatory
- Stability AI was sued for training on millions of pictures from Getty Images without consent^6
- Stability AI, Midjourney and Google were sued by artists for using copyrighted work without consent, credit or compensation^7^8^9
- OpenAI and Microsoft were sued by The New York Times for using millions of copyrighted articles to train chatbots that now compete directly with the popular newspaper^33
- The Authors Guild, a professional organization for writers in the United States, calls attention to the injustice of building lucrative generative AI technologies using copyrighted works and asks AI executives to obtain consent from, credit, and fairly compensate authors^28. Other authors have also decided to take legal action on their own^30.
- Anthropic was sued by Universal Music Group (UMG), Concord Publishing and ABKCO Music & Records for using copyrighted lyrics to train AI models^29
- Microsoft (GitHub Copilot) was sued by a collective of open-source developers for violating open-source license attribution requirements^31^32
- Perplexity AI’s crawler lies about its user agent in order to extract website data^24^25
- AI companies crawl websites disrespectfully, with no consideration for the costs such massive data extraction imposes on site operators, as reported by ReadTheDocs^10, OpenStreetMap^11, or the European Digital Rights association^12; and they often ignore standardised requests not to crawl^13, such as robots.txt (see the sketch after this list).
- An Nvidia whistleblower accuses the company of scraping “a human lifetime” of videos per day to train their AI products^37. This scraping is likely being rushed now, while copyright and training issues haven’t been settled yet, resulting in a massive legal gray area^38.
- Researchers have shown that it’s impossible to make GAI models “forget” what they learnt from private, stolen or unethically acquired data; instead, they advocate the use of smaller models for the sake of privacy^59. (Added on Dec 16th 2024).
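As an illustration of the standardised opt-out mechanism mentioned above, the following sketch shows how a well-behaved crawler would consult a site’s robots.txt with Python’s standard urllib.robotparser before fetching anything. The bot name and URLs are hypothetical placeholders; the point is that a crawler which spoofs its user agent or skips this check sidesteps the only signal most sites have.

```python
# Minimal sketch of a crawler honouring robots.txt.
# "ExampleAIBot" and example.com are placeholders, not real identifiers.
from urllib.robotparser import RobotFileParser

BOT_USER_AGENT = "ExampleAIBot"      # hypothetical crawler name
SITE = "https://example.com"         # placeholder site

# A site wishing to opt out of AI scraping would publish rules such as:
#   User-agent: ExampleAIBot
#   Disallow: /
robots = RobotFileParser()
robots.set_url(f"{SITE}/robots.txt")
robots.read()                        # fetch and parse the site's robots.txt

url = f"{SITE}/docs/page.html"
if robots.can_fetch(BOT_USER_AGENT, url):
    print(f"{BOT_USER_AGENT} may fetch {url}")
else:
    # A respectful crawler stops here. The complaints above concern crawlers
    # that ignore this answer, or that disguise their user agent so the
    # site's rule never matches them in the first place.
    print(f"robots.txt disallows {url} for {BOT_USER_AGENT}")
```

Compliance with robots.txt is entirely voluntary, which is precisely why spoofed user agents and ignored disallow rules leave site operators with no technical recourse.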
Whistleblowing and suppressed criticism
Many of the top AI companies impose abusive confidentiality clauses and financial incentives on their workers to prevent them from speaking freely about their work. By some accounts that can be understandable, given the rapid research and development efforts undertaken in the field.
- Workers in the industry cannot freely express their concerns about AI, as shown by Geoffrey Hinton (considered the godfather of AI) leaving Google in May 2023^26, or Timnit Gebru (former leader of Google’s ethical AI team) being fired in 2020 after writing a research paper about the risks of large language models^27.
- A group of whistleblowers warns on this website about the risks of AI in the hands of companies with strong financial incentives, and proposes a set of principles for AI companies to follow.
- An OpenAI whistleblower warns of a reckless race to be the first to reach AGI^43 (Artificial General Intelligence, a type of artificial intelligence spanning the spectrum of human-level intelligence^42), with companies willing to compromise on safety despite the existential risk this poses to humanity.
- Meredith Whittaker, former AI researcher at Google (Alphabet), received the 2024 Helmut Schmidt Future Prize and described how private companies face no restrictions on commercial surveillance, and how the current AI craze is a product of this toxic surveillance business model^57. (Added on Nov 9th 2024).
- Suchir Balaji, an OpenAI whistleblower, was found dead in his apartment. He had worked as a researcher gathering data from the internet for the company’s GPT-4 model. He alleged that his former employer illegally used copyrighted material to train its program, and claimed that “this is not a sustainable model for the internet ecosystem as a whole”^58. (Added on Dec 14th 2024).
Sustainability and resource consumption
- The International Energy Agency convened a global conference to discuss the increasing energy consumption involved in producing GAI models^14, driven especially by data centers. The energy consumption of the GAI industry is expected to grow by 2026 to as much as ten times its 2023 demand^15^16. Even OpenAI’s CEO Sam Altman has pointed out that GAI is consuming more energy than expected^17.
- There’s no transparency around the carbon dioxide emissions attributable to GAI. The carbon footprint is one of the least discussed issues in AI ethics^18^19. Researchers argue for a “Green AI” that focuses on increasing the efficiency of computation, rather than the current focus on what they describe as “Red AI”: accurate models trained without consideration of resource costs^20^21.
- Researchers estimate the “secret” water footprint of generative AI at roughly a bottle of water per 100 words^35 (a rough calculator follows this list). Up to 700,000 litres of clean freshwater may have been used in Microsoft’s US data centers for the training of ChatGPT 3 alone^36.
- Nvidia estimates that 80-90% of the cost of neural networks lies in inference processing^22 (e.g. when ChatGPT processes and replies to a prompt). The Massachusetts Institute of Technology estimates that training a single AI model can emit as much carbon as five cars over their entire lifetimes^23, and models are trained and retrained many times during research and development.
- Research and advisory firm Gartner predicts that 30% of generative AI projects will be abandoned after an initial proof of concept^34 due to poor data quality, inadequate risk controls, escalating costs, or unclear business value.
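To make the water estimate above more tangible, here is a back-of-the-envelope calculator built only on the cited figure of roughly one bottle of water per 100 generated words. The bottle size and the example workload are my own assumptions, not figures from the cited studies.

```python
# Back-of-the-envelope water footprint from the cited estimate of roughly
# one bottle of water per 100 generated words.
# Assumptions (mine, not from the cited studies): a 0.5-litre bottle and an
# illustrative workload of one million 250-word responses per day.
LITRES_PER_BOTTLE = 0.5
WORDS_PER_BOTTLE = 100

def water_litres(words: int) -> float:
    """Estimated freshwater use for generating `words` words of output."""
    return words / WORDS_PER_BOTTLE * LITRES_PER_BOTTLE

responses_per_day = 1_000_000
words_per_response = 250

daily = water_litres(responses_per_day * words_per_response)
print(f"~{daily:,.0f} litres of freshwater per day")        # ~1,250,000 litres

# For scale: the article cites up to 700,000 litres for training ChatGPT 3.
print(f"~{daily / 700_000:.1f}x that training estimate, every day")
```

Even under these deliberately rough assumptions, day-to-day inference quickly rivals the headline training figure, in line with the point above that most of the cost of neural networks lies in inference.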
False productivity
Generative AI can deliver useful pieces of information for a given inquiry, like bringing up a different perspective or uncovering some unknown unknowns. However, relying on its output without cross-referencing and validation can have unintended consequences for productivity.
- False sense of expertise: the almost-instant, personalised response to queries can produce a false sense of confidence in the subject matter. In environments without peer review or other methodologies for contrasting information, this can lead users to incorrectly believe that their knowledge is greater than it actually is.
- Lack of critical and creative thinking: average users won’t stop to consider the veracity of the GAI’s opaque output, which, stripped of reasoning or deliberation, will be regarded as a convenient truth. Even being aware of this bias might not be enough to prevent it. GAI robs people of the opportunity to practice making thoughtful and defensible decisions on their own^44.
- Erosion of trust: as more GAI-generated content makes its way onto the Internet, it’s harder to discern what’s real and what’s fake in a sea of slop^40, wasting users’ time and effort. The so-called “zombie internet” has also reached social networks, where bots, humans and accounts that were once human but no longer are blend into a disastrous website with little social connection at all^41.
- Deskilling: overreliance on AI can negatively impact users’ fundamental knowledge and skills, since those have to be practiced and exercised just like a muscle^46. A study showed that users value the AI’s help more when the task is harder, and value a simple explanation more than a complex one^45. Many will counter this argument by saying that the skills we need are changing, but that scenario leaves humans as uninspiring AI supervisors^47.
- Psychological dependence: chatbot interactions can create strong dependencies reminiscent of dystopian stories like the film Her (2013)^54: there are already reports of addictive relationships with role-playing chatbots like CharacterAI and ReplikaAI^55. (Added on Nov 9th 2024).
Dangerous misuses
- Entrenchment of existing biases and inequalities: as the Internet is ransacked for training data, content creators are starting to include AI output in their releases. New models will inevitably end up being trained on partially or fully AI-generated content, creating a cycle that reinforces real facts, but also biases, inequalities and misinformation (the toy simulation after this list illustrates the feedback loop).
- Massive surveillance and manipulation: GAI has unprecedented capabilities to generate information for a massive number of users, so holding power and control over it is a strategic interest for certain actors. It’s not difficult to imagine what that could mean if it ends up in the wrong hands, even seemingly legitimate ones.
- Weapon development:
- Researchers reported that an AI suggested 40,000 possible new biochemical weapons in just six hours, raising concerns about the misuse of AI in the scientific community^39. Having used only Google’s generative models and some open-source toxicity datasets, the researchers indicated that it would be fairly easy for someone to replicate their work.
- Anthropic^48, Meta^49, OpenAI^50 and Microsoft^51 supply AI and cloud solutions for military purposes. Among the beneficiaries are several US defence agencies, Palantir (which publicly promotes the use of AI for massive surveillance^52), and Chinese researchers linked to the People’s Liberation Army^53. (Added on Nov 9th 2024).
- The Israeli Army uses an AI system named Lavender in Gaza, which applies the pattern recognition-driven signature strikes popularized by the United States, combined with mass surveillance infrastructure and AI targeting techniques. Once a person is on the Lavender kill list, it’s not just them who is targeted: the building where they live (together with their family, neighbors, pets and whoever else) is subsequently marked for bombing^56. (Added on Nov 9th 2024).
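The feedback loop described in the first bullet of this list (entrenchment of existing biases) can be illustrated with a toy simulation: each generation of a “model” is fit only to samples produced by the previous generation. Everything here is synthetic and deliberately simplified; it sketches the mechanism, not how any real model is trained.

```python
# Toy illustration of a model-on-model training feedback loop.
# Synthetic numbers only; a sketch of the mechanism, not a real pipeline.
import random
import statistics

random.seed(0)

# Generation 0: "human" data with a wide spread of values.
data = [random.gauss(0.0, 1.0) for _ in range(10_000)]

for generation in range(1, 6):
    # Fit a trivial model (mean and spread) to the current data...
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    # ...then build the next training set from that model's own output,
    # slightly favouring its most typical samples (the 0.9 factor stands in
    # for popular outputs being republished more often than rare ones).
    data = [random.gauss(mu, sigma * 0.9) for _ in range(10_000)]
    print(f"generation {generation}: spread ~ {statistics.pstdev(data):.2f}")

# The spread shrinks every round: rare perspectives fade out, while whatever
# the earlier models emphasised (including their biases) is reinforced.
```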
Things might change in the future, but generative AI is currently tainted by unethical practices and reckless corporations governed by greed. Using any of these products would make me feel like a participant in something that goes strongly against my personal beliefs about technology, so I choose to stay far away from them.