Xu says that while studies that received AI-generated responses have likely already been published, she doesn’t think LLM use is widespread enough to require researchers to issue corrections or retractions. Instead, she says, “I would say that it has probably caused scholars and researchers and editors to pay increased scrutiny to the quality of their data.”

“We don’t want to make the case that AI usage is unilaterally bad or wrong,” she says, adding that it depends on how it’s being used. Someone may use an LLM to help them express their opinion on a social issue, or they may borrow an LLM’s description of other people’s ideas about a topic. In the first scenario, AI is helping someone sharpen an existing idea, Xu says. The second scenario is more concerning “because it’s basically asking to generate a common tendency rather than reflecting the specific viewpoint of somebody who already knows what they think.”

If too many people use AI in that way, it could lead to the flattening or dilution of human responses. “What it means for diversity, what it means in terms of expressions of beliefs, ideas, identities – it’s a warning sign about the potential for homogenization,” Xu says.

This has implications beyond academia. If people use AI to fill out workplace surveys about diversity, for example, it could create a false sense of acceptance. “People could draw conclusions like, ‘Oh, discrimination’s not a problem at all, because people only have nice things to say about groups that we have historically thought were under threat of being discriminated against,’ or ‘Everybody just gets along and loves each other.’ ”

The authors note that directly asking survey participants to refrain from using AI can reduce its use. There are also higher-tech ways to discourage LLM use, such as code that blocks copying and pasting text. “One popular form of survey software has this function where you can ask to upload a voice recording instead of written text,” Xu says.

For survey creators, the paper’s results are a call to write concise, clear questions. “Many of the subjects in our study who reported using AI say that they do it when they don’t think that the instructions are clear,” Xu says. “When the participant gets confused or gets frustrated, or it’s just a lot of information to take in, they start to not pay full attention.” Designing studies with humans in mind may be the best way to prevent the boredom or burnout that could tempt someone to fire up ChatGPT. “A lot of the same general principles of good survey design still apply,” Xu says, “and if anything are more important than ever.”