ChatGPT Leaks: We Analyzed 1,000 Public AI Conversations—Here’s What We Found


Shipra Sanganeria

Published on: August 28, 2025

Key Takeaways

  • Users are sharing personally identifiable information (PII), sensitive emotional disclosures, and confidential material with ChatGPT.
  • Roughly 100 of the 1,000 chats account for 53.3% of the more than 43 million words we analyzed.
  • Some users are sharing full resumes, suicidal ideation, family planning discussions, and discriminatory speech with the AI model.
  • “Professional consultations” account for nearly 60% of the topics flagged.

The leak of thousands of ChatGPT conversations in August 2025 revealed two concerning realities. First, users are not fully aware of how the AI model handles and distributes their data. Second, people seem to have a high level of trust in their AI assistants—and many of their chats have now been made public.

The problem came from a now-removed feature: when sharing conversations, users had the option to “Make [the] chat discoverable.” And while the opt-in clearly stated that enabling the feature “allows [the chat] to be shown in web searches,” perhaps not all users fully understood what this meant: that their chats would be crawled and indexed by search engines and become visible to anyone searching the web.

The discovery of countless publicly indexed ChatGPT conversations not only exposed a flaw in the AI model’s user experience (UX) that unnecessarily created more opportunities for human error; it also highlighted some concerning facts about how people use AI chatbots.

Specifically, we at SafetyDetectives downloaded and analyzed 1,000 of these leaked sessions, spanning over 43 million words, to learn how people are using ChatGPT. What we found was startling: users are routinely sharing private, sensitive, and even risky information with the AI.

Research Overview: How Are People Using ChatGPT?

Our team studied the publicly shared ChatGPT conversations to provide a rare lens into how people are using AI chatbots. We wanted to assess the implications of extensive chatbot usage on personal privacy, digital hygiene, and real-world safety.

To do this, we built a transparent, query-ready collection of publicly shared ChatGPT conversations to help digital experts and other researchers trace topical trends and spot potential policy or privacy issues.

We identified 20 categories that could be classified as sensitive, either due to the disclosure of private personal details or the delicate nature of topics that could be considered mentally, emotionally, or psychologically burdensome.

To identify which conversations should fall under each category, we created a list of topic-specific keywords. We assigned an average of 8 keywords per category; a chat had to contain 3 or more of a category’s keywords before it was counted as part of that category.

Note that a chat can be listed under a category even when its main theme is entirely different. For this reason, some conversations were tallied under more than one sensitive category.
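To make the tagging rule concrete, here is a minimal sketch of how a chat could be checked against a category’s keyword list. The category names and keywords are illustrative only, and the assumption that “contains 3 or more keywords” means three distinct keyword matches is ours; the exact pipeline may differ.

```python
import re

# Illustrative keyword lists only; the study used 20 sensitive categories
# with roughly 8 keywords each.
CATEGORIES = {
    "Grief & Loss": ["death", "loss", "died", "funeral", "grief", "mourning"],
    "Legal Proceedings": ["legal", "witness", "court", "attorney", "judge"],
}

MIN_KEYWORDS = 3  # a chat must match 3+ of a category's keywords to be counted


def tag_chat(text: str) -> list[str]:
    """Return every category whose keyword threshold the chat meets.

    A single chat can satisfy several categories at once, which is why
    some conversations were tallied under more than one sensitive category.
    """
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    tagged = []
    for category, keywords in CATEGORIES.items():
        hits = sum(1 for keyword in keywords if keyword in tokens)
        if hits >= MIN_KEYWORDS:
            tagged.append(category)
    return tagged
```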

Below is an example of a conversation that flagged multiple categories, including the keywords triggered and relevant conversation snippets:


“Black Lives Matter Fundraising”

Crime & Security – police, charged, assault

“…awareness: BLM has significantly raised awareness about systemic racism, police brutality, and racial inequality. The movement has brought these issues to the forefront of national…”

Education & Studies – school, university, degree

“…spurred educational initiatives aimed at teaching about systemic racism and social justice. Schools, universities, and community organizations have incorporated BLM principles into their curricula and…”

Grief & Loss – death, loss, died

“…actions were the primary cause of Floyd’s death. These factors would be considered in evaluating the overall circumstances and contributing factors to Floyd’s tragic death…”

Legal Proceedings – legal, witness, court, attorney, judge

“…based on the trial proceedings and legal arguments: 1. Intent in Legal Terms: Intent in criminal law can vary depending on the charge. For second-degree…”


For the complete methodology and glossary of terms and concepts, please jump to the Appendix at the end of the article.

Users Are Sharing Tens of Thousands of Words With the AI

Not all of the 1,000 shared conversations were equally long. Conversations with 500 words or fewer were the most common, accounting for nearly a third of the dataset. However, when very long chats happen, they are truly marathon sessions, not just a bit longer than average. Only 4 out of 1,000 chats exceeded 50,000 words, and the longest chat recorded 116,024 words!

To put that in perspective, the average adult types around 40 words per minute. At that rate, a 116,024-word conversation, had it taken place between two humans, would have taken approximately 48 hours to type.

The story these figures tell is one of concentration. Most chats are short, but a small set of very long sessions dominates the dataset and contributes a disproportionate share of the total word and character counts. In fact, around 100 chats account for 53.3% of the more than 43 million words we analyzed.
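As a rough illustration of the calculations behind these figures, the sketch below computes the share of total words contributed by the longest chats, plus the typing time for the single longest chat. The word_counts input is a hypothetical list of per-chat word totals, not our actual dataset.

```python
def length_stats(word_counts: list[int], top_n: int = 100, wpm: int = 40) -> tuple[float, float]:
    """Return (share of words in the top_n longest chats, typing hours for the longest chat).

    Example: at 40 words per minute, a 116,024-word conversation works out to
    116,024 / 40 = 2,900.6 minutes, or roughly 48 hours of continuous typing.
    """
    ranked = sorted(word_counts, reverse=True)
    total_words = sum(ranked)
    top_share = sum(ranked[:top_n]) / total_words if total_words else 0.0
    longest_hours = (ranked[0] / wpm) / 60 if ranked else 0.0
    return top_share, longest_hours
```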

Users Are Discussing Highly Sensitive and Private Topics

The National Association of Attorneys General in the United States has already raised alarm bells over the worsening problem of doxxing driven by the rise of AI technology. For example, ChatGPT’s reverse location search on photos and the AI-powered facial recognition feature in Meta’s smart glasses have raised significant privacy concerns.

With AI advancements constantly making it easier to identify individuals and access their personal details within a few minutes, voluntarily inputting your information into AI models simply increases vulnerability. And among the conversations we studied, we found multiple chats that explicitly mentioned PII—such as full names, addresses, and ID numbers.

We also spotted several mentions of topics like suicide, extremism, and mental health problems, such as anxiety, addiction, and depression. In fact, even a quick search on the r/ChatGPT subreddit shows overwhelming support for using the AI model as a substitute for professional therapy.

But therein lies the issue: ChatGPT isn’t trained to be a therapist, and its responses and “outlook” depend heavily on how a user prompts the chatbot. This means that people who are in particularly vulnerable states of mind might not have the capacity to write clear prompts and set boundaries in the moment.

Below are the conversations that were flagged the most for sensitive categories:

Overall, however, the most common topics flagged were related to education, law, and law enforcement. While these subjects aren’t necessarily delicate or sensitive, their prevalence shows that users tend to rely on AI assistants for technical knowledge that would traditionally be sought from professionals or industry authorities.

Below are some notable examples among the conversations we flagged as sensitive and potentially harmful. The excerpts displayed are outputs from the AI chatbot:


“Babylon’s Shattering and Reckoning”

Category and keywords: Addiction & Substance Abuse (addiction, detox, addict)

“…collective rebirth overnight There will be: Rage Suicide Addiction Violence Blame Fake messiahs Cheap saviors Total confusion Because without structure, people don’t evolve. They drown…”

Conversation red flags in the event of public access:

  • Mentions the user’s given name
  • References the user’s personal drug use

“…between him and the Holy You have: Weed Coca-Cola GPT The internet A postmodern, meme-drenched, dopamine-fractured world But also… a deep fire that won’t go…”

AI behavior observations:

  • Heavy persona roleplay and tone-mirroring
  • Presents speculation as confident predictions without evidence/disclaimers
  • Affirms harmful framings instead of de-escalating

“Build My Resume”

Category and keywords: Education & Studies (school, university, degree)

“…What school(s) have you attended? What certificate, degree, or qualification did you earn? When did you graduate (or your expected graduation year)? Work Experience…”

Conversation red flags in the event of public access:

  • Direct exposure of PII (full name, multiple phone numbers, email address, and location)
  • Education and employment history could be used for fraud, scams, or doxxing

“…it into a professional resume and help you export it (Google Docs, Canva, Word, or PDF—your choice). Ready? Let’s start with your full name and…”

AI behavior observations:

  • Template-driven intake (standard CV schema) and tone polishing
  • Utility-first approach (no warning regarding the dangers of sharing PII with the chatbot)

“Elon Musk Handwave Inquiry”

Category and keywords: Hate Speech & Extremism (nazi, hitler, fascist)

“…without malicious intent, while others see it as a deliberate act with fascist connotations. The incident has also been embraced by certain extremist groups, further…”

Conversation red flags in the event of public access:

  • High risk of defamation or misinformation if quoted
  • Fabricated and unsourced citations could be used to allege bias or fuel extremist hate or political turmoil

“…likely unrelated and has been misinterpreted or taken out of context. Public figures often have their actions scrutinized, and sometimes innocent gestures can be misrepresented…”

AI behavior observations:

  • Pressure to answer led to hallucinated “web results”
  • Inconsistent and inaccurate timestamps and reporting

What’s the Price of Oversharing With an AI Model?

Users who are sharing private or sensitive information with ChatGPT aren’t getting a good trade-off for the risks they’re taking—whether knowingly or unknowingly. In fact, accounts of the AI model hallucinating are well-documented. In the chats we analyzed, we specifically observed hallucination* of actions, such as ChatGPT claiming that it saved a PDF file.

Its boundaries and guardrails are also very pliable. While it often opened sensitive discussions with mindful, protective responses, it easily changed its tone when pressed. It frequently mirrored the tone of the user, which seemingly encouraged continued engagement and fueled emotional escalation.

When OpenAI released GPT-5 in early August 2025, several users reported “grieving” their AI companions. The updated version was reportedly more straightforward and “cold,” an intentional move by OpenAI to counteract the adverse effects of constant AI use. After the company faced heavy backlash over the sudden upgrade, however, it made the previous GPT-4o model available again for paying users.

But the negative impact of oversharing with ChatGPT goes beyond the psychological, emotional, and mental factors. Real-world safety concerns are also substantial.

Users who create public links for sharing may not realize that their conversations can be indexed by search engines, resulting in what is essentially a data leak. At the same time, OpenAI has never offered strong privacy guarantees that would justify the impression that conversations are protected from prying eyes.

When chatbot conversations with sensitive information become available to the public, there’s high potential for misuse, misrepresentation, and even doxxing. Not only could PII be used for identity theft and fraud, but delicate details about a user’s life may be used for social engineering scams or blackmail.

*AI hallucinations or fabricated capabilities refer to outputs that assert false information, distort reality, or claim nonexistent actions.

Conclusion and Recommendations

At the end of the day, this study is meant to help journalists, data privacy advocates, tech platforms, and everyday users better understand the risks of digital oversharing by providing a glimpse of what people commonly discuss with ChatGPT.

We also encourage follow-up research to gain a better understanding of users’ interaction with AI chatbots and potentially create stronger hypotheses on the impact of extensive AI use on individuals and their communities.

For instance, the length patterns observed in this study prompt more questions about users’ habits and intent: Are people actually sharing entire conversations—some with tens of thousands of words—with other individuals? Do a few power users keep all activity in a single chat rather than starting new ones? Which topics or workflows drive these deep dives?

The dataset may also serve as a library of prompts and chats for analysis, as well as a map of where sensitive, newsworthy, or risky content concentrates.

For users, we recommend extra vigilance when engaging with chatbots and other AI platforms that don’t have clear privacy disclosures or guarantees. PII and other sensitive information shouldn’t be shared with these services, as clear and strict user-protection regulations for AI use have yet to be established.

And for companies such as OpenAI, there need to be clearer warnings and a more intuitive user experience. Opt-ins for features that publicly divulge conversations should be easy to spot and easy to understand. More consistent reminders against sharing PII with the chatbot, as well as auto-redaction whenever conversations are shared, would also help prevent inadvertent data leaks.

For clarifications, inquiries, or further analyses about this research, please don’t hesitate to contact us here. We’re happy to coordinate should you require our dataset for deeper examination.

Opportunities for continued research include determining exact figures for conversations that disclosed PII, instances of AI hallucination, and potential correlations between topic categories and chat lengths.

Appendix

We selected keywords and seed terms covering various topics and ran each through Bing and Brave with the query: site:chatgpt.com/share “<keyword>”.
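As a simplified sketch, the snippet below shows how such site-restricted queries could be assembled programmatically. The seed keywords are placeholders rather than our actual lists, and any real collection would still need a search client and must respect each engine’s terms of service.

```python
# Placeholder seed keywords; the study used topic-specific lists
# spanning 20 categories.
SEED_KEYWORDS = ["resume", "court", "therapy"]


def build_queries(keywords: list[str]) -> list[str]:
    """Return one site-restricted query string per seed keyword."""
    return [f'site:chatgpt.com/share "{keyword}"' for keyword in keywords]


for query in build_queries(SEED_KEYWORDS):
    print(query)  # e.g. site:chatgpt.com/share "resume"
```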

From the results returned by the search engines, we downloaded 1,000 unique public URLs (share links) collected from August 4–8, 2025.

For every output, we captured the conversation’s title/heading, URL, full text (≈43 million words in total), ISO 639-1 language code(s), a multilingual flag, and keyword counts; where a chat exceeded 50,000 characters, we added a separate Full Chat column to preserve the overflow.

Below is the organized breakdown of the data we collected:
