The consequences of generative AI for online knowledge communities - Scientific Reports

Abstract

The article examines the impact of generative artificial intelligence technologies, particularly large language models like ChatGPT, on participation and content production in online knowledge communities. The key findings are:

  • ChatGPT's release led to a significant decline in web traffic and question volumes on Stack Overflow, particularly for topics where ChatGPT excels.
  • In contrast, activity in Reddit developer communities showed no evidence of decline, suggesting the importance of social fabric as a buffer against the community-degrading effects of large language models.
  • The decline in participation on Stack Overflow was concentrated among newer users, indicating that more junior, less socially embedded users are particularly likely to exit.
  • The questions posted to Stack Overflow became more complex and sophisticated after ChatGPT's release.

Q&A

[01] Overall impact of LLMs on community engagement

1. What were the key findings regarding the impact of ChatGPT's release on web traffic and question volumes at Stack Overflow?

  • The authors found a significant decline in daily web traffic to Stack Overflow, estimated at roughly 1 million individuals per day, or about 12% of the site's daily traffic before ChatGPT's release.
  • They also found a marked decline in question posting volumes per topic on Stack Overflow following ChatGPT's release.

2. How did the effects of ChatGPT compare between Stack Overflow and Reddit developer communities?

  • The authors observed no evidence of declines in participation in Reddit developer communities, in contrast to the declines seen on Stack Overflow.
  • They attribute this difference to the stronger social fabric and connections in Reddit communities, whereas Stack Overflow is focused more narrowly on information exchange.

[02] LLMs' effect on user content production

1. What did the difference-in-differences analysis reveal about the impact of ChatGPT on question posting volumes at Stack Overflow?

  • The analysis showed that question posting volumes per topic on Stack Overflow declined markedly after ChatGPT's release.
  • This suggests that LLMs are replacing online communities as a source of knowledge for many users.
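
To make the difference-in-differences logic concrete, here is a minimal Python sketch of the basic two-group, two-period estimator. This summary does not state the authors' comparison group or regression specification, so the "control" series, the function name did_estimate, and all numbers below are hypothetical and for illustration only.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic difference-in-differences on group means.

    Returns the change in the treated series minus the change in the
    control series over the same window.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up daily question counts for one topic (not data from the paper).
# "treated" = the period spanning ChatGPT's release;
# "control" = a hypothetical comparison series not exposed to the release.
treated_pre = [120, 118, 122]
treated_post = [100, 98, 102]
control_pre = [119, 121, 120]
control_post = [117, 119, 118]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

In this toy example the treated series drops by 20 questions per day while the control drops by 2, so the estimate attributes a decline of 18 to the release. Subtracting the control change is what removes trends that would have occurred anyway.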

2. How did the effects differ across Reddit developer communities?

  • The authors found no evidence that ChatGPT had any effects on user engagement at Reddit developer communities.
  • This contrast with the declines observed on Stack Overflow reinforces the importance of social fabric as a buffer against the community-degrading effects of LLMs.

[03] Heterogeneity in ChatGPT's effect on Stack Overflow posting volumes by topic

1. What did the authors find regarding the heterogeneity of ChatGPT's effects across different topics on Stack Overflow?

  • The authors observed a great deal of heterogeneity in the effects across Stack Overflow topics.
  • The most heavily affected topics were those closely tied to concrete, self-contained software coding activities, where ChatGPT was likely to perform well due to the availability of accessible training data.
  • In contrast, topics involving more complex tasks and requiring contextual information beyond just syntax were less affected.

2. How did the authors explain this heterogeneity in terms of the availability of training data for ChatGPT?

  • The authors found that the magnitude of the observed effects for each topic correlated with that topic's volume of active GitHub repositories and subscribed subreddits.
  • This suggests that topics with more publicly available training data were more susceptible to the negative effects of ChatGPT on participation.
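
The kind of relationship described here can be checked with a simple correlation coefficient. The sketch below computes a Pearson correlation between a per-topic proxy for training data volume and a per-topic effect size. The summary does not report which correlation measure the authors used or its value, so the choice of Pearson, the variable names, and the numbers are all assumptions for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values for four hypothetical topics (not data from the paper).
# repo_counts: proxy for publicly available training data per topic.
# effects: estimated change in question volume (more negative = larger decline).
repo_counts = [10, 20, 30, 40]
effects = [-1.0, -2.0, -3.0, -4.0]

r = pearson(repo_counts, effects)
print(r)
```

With effects coded so that a larger decline is more negative, a negative coefficient means that topics with more public material saw larger declines, which is the pattern the authors describe.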

[04] ChatGPT's effect on average user account age and question complexity

1. What did the authors find regarding the change in average posting user account tenure on Stack Overflow after ChatGPT's release?

  • The authors found a systematic rise in the average tenure of posting users' accounts on Stack Overflow following ChatGPT's release.
  • This indicates that newer, less experienced users became less likely to participate in the Stack Overflow community after ChatGPT became available.

2. How did the complexity of questions posted on Stack Overflow change after ChatGPT's release?

  • The authors found that questions exhibited a systematic rise in complexity, as measured by the prevalence of longer words, following the release of ChatGPT.
  • This suggests that the questions that went unposted were more likely to be relatively simple ones that ChatGPT could have addressed.
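
A word-length-based complexity measure of this kind can be sketched as the share of words in a question that meet a minimum length. The summary says only that complexity was measured by the prevalence of longer words, so the 7-character threshold, the tokenization rule, and the function name long_word_share are assumptions, not the authors' definition.

```python
import re


def long_word_share(text, min_length=7):
    """Fraction of alphabetic words in `text` with at least `min_length` letters.

    The threshold is an illustrative assumption, not taken from the paper.
    """
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        return 0.0
    long_words = [w for w in words if len(w) >= min_length]
    return len(long_words) / len(words)


# Two made-up question titles (not from the Stack Overflow data).
simple_question = "How do I sort a list in Python"
complex_question = (
    "Why does asynchronous serialization deadlock under concurrent transactions"
)

print(long_word_share(simple_question))
print(long_word_share(complex_question))
```

Averaging such a score over all questions posted in a period would give a series whose rise after ChatGPT's release matches the pattern described above.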


