Garbage In, Garbage Out: Perplexity Spreads Misinformation From Spammy AI Blog Posts
🌈 Abstract
The article examines problems with Perplexity, an AI search engine that claims to provide summaries with citations to reliable sources but has been found citing AI-generated blogs and other content containing inaccurate, out-of-date, and contradictory information. It also covers allegations that Perplexity plagiarized journalistic work from several news outlets.
🙋 Q&A
[01] Perplexity's AI-Generated Sources
1. What did the study by GPTZero find about the sources Perplexity is citing?
- The study found that Perplexity's search engine is drawing information from and citing AI-generated posts on a wide variety of topics, including travel, sports, food, technology, and politics.
- On average, Perplexity users need to enter only three prompts before encountering an AI-generated source, according to the study.
- The study counted a source as AI-generated only if GPTZero detected with at least 95% certainty that it was written with AI.
2. How does this impact the quality of Perplexity's search results?
- If the sources Perplexity is citing are AI hallucinations, then the output from Perplexity's AI system will also be inaccurate or contradictory.
- The article states that "Perplexity is only as good as its sources. If the sources are AI hallucinations, then the output is too."
3. What examples are given of Perplexity citing AI-generated sources?
- Searches for "cultural festivals in Kyoto, Japan," "impact of AI on the healthcare industry," "street food must-tries in Bangkok Thailand," and "promising young tennis players to watch" returned answers that cited AI-generated materials.
- The search for "cultural festivals in Kyoto, Japan" yielded a summary whose only reference was an AI-generated LinkedIn post.
- A search for Vietnam's floating markets cited an AI-generated blog that included out-of-date information.
4. How does Perplexity respond to the issues with its sources?
- Perplexity's Chief Business Officer Dmitri Shevelenko acknowledged that their system is "not flawless" and said they continuously work to improve their source identification processes.
- Perplexity has developed internal algorithms to detect AI-generated content, but Shevelenko admitted these systems are not perfect and need to be continually refined.
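The 95% rule described under question 1 of this section can be sketched in a few lines of Python. This is only an illustration of the thresholding idea, not GPTZero's or Perplexity's actual code: the `Source` class, the `ai_probability` field, the function names, and the example URLs and scores are all hypothetical, and the score is assumed to be a detector's confidence between 0 and 1.

```python
from dataclasses import dataclass
from typing import List, Optional

# Threshold taken from the study's stated rule (at least 95% certainty).
AI_CERTAINTY_THRESHOLD = 0.95


@dataclass
class Source:
    url: str
    ai_probability: float  # hypothetical detector confidence in [0, 1]


def is_ai_generated(source: Source, threshold: float = AI_CERTAINTY_THRESHOLD) -> bool:
    """Count a source as AI-generated only at or above the threshold."""
    return source.ai_probability >= threshold


def prompts_until_first_ai_source(prompt_results: List[List[Source]]) -> Optional[int]:
    """Return the 1-based index of the first prompt whose cited sources
    include an AI-generated one, or None if no prompt cites one."""
    for prompt_number, sources in enumerate(prompt_results, start=1):
        if any(is_ai_generated(s) for s in sources):
            return prompt_number
    return None


# Made-up session: each inner list holds the sources cited for one prompt.
session = [
    [Source("https://example.com/a", 0.12), Source("https://example.com/b", 0.40)],
    [Source("https://example.com/c", 0.30), Source("https://example.com/d", 0.97)],
]
print(prompts_until_first_ai_source(session))  # prints 2
```

Averaging this count over many sessions would give a figure comparable to the "three prompts" statistic, assuming the study measured it in a similar way.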
[02] Perplexity's Plagiarism Issues
1. What allegations has Perplexity faced regarding plagiarism?
- Perplexity has come under scrutiny for allegations of plagiarizing journalistic work from multiple news outlets, including Forbes, CNBC, and Bloomberg.
- Forbes found that Perplexity had lifted sentences, crucial details, and custom art from an exclusive Forbes story about Eric Schmidt's secretive AI drone project without proper attribution.
2. How did Perplexity respond to the plagiarism allegations?
- Perplexity's CEO Aravind Srinivas denied the allegations, arguing that facts cannot be plagiarized, and said the company has not "'rewritten,' 'redistributed,' 'republished,' or otherwise inappropriately used Forbes content."
3. What other plagiarism issues has Perplexity faced?
- A Wired investigation found that Perplexity had accessed and scraped work from Wired and other Condé Nast publications through a secret IP address, even though Wired's engineers had attempted to block Perplexity's web crawler from stealing content.
- The search engine also tends to fabricate information and attribute fake quotes to real people.
[03] Perplexity's Efforts to Address the Issues
1. What revenue sharing program has Perplexity created?
- Perplexity has created a "first-of-its-kind" revenue sharing program that will compensate publishers in a limited capacity.
- The company plans to add an advertising layer on its platform that will allow brands to sponsor follow-up or "related" questions in its search and Pages products. For specific responses generated by its AI where Perplexity earns revenue, the publishers that are cited as a source in that answer will receive a cut, though the percentage is not specified.
2. What are Perplexity's plans for partnerships with publishers?
- Perplexity has been in talks with The Atlantic and other publishers about potential partnerships, as the company recognizes the crucial role that publishers play in creating the healthy information ecosystem its product depends on.
[04] Broader Implications and Challenges
1. What are the risks of relying on low-quality web sources for AI systems?
- Experts warn that if the real-time sources used by AI systems like Perplexity contain biases or inaccuracies, the AI model could start "spewing nonsense because there is no longer information, there is only bias."
- This can lead to a phenomenon called "model collapse," where an AI model trained on AI-generated data starts producing unreliable and misleading outputs.
2. Is this issue unique to Perplexity?
- The article states that relying on low-quality web sources is a widespread challenge for AI companies, many of which don't cite sources at all.
- It cites the example of Google's "AI overviews" feature producing misleading responses by pulling from unvetted sources like discussion forums and satirical sites.
- The article concludes that "Perplexity is only one case. It's a symptom, not the entire problem."