
Using Generative AI To Do Research

Limitations of Using Generative AI for Research

Mistakes or Inaccurate Information: Be Aware of AI Hallucination

Generative AI tools, like ChatGPT (OpenAI), Gemini (Google), and Copilot (Microsoft), rely on prediction or probability to create text based on patterns they have learned from large amounts of data. They can sometimes generate information that is inaccurate or even false. This is sometimes referred to as AI hallucination.

Why is it hard to know if AI is right? Well, here are a couple of key reasons: 

  • Black box problems: We often don't know exactly how these tools work, so it's hard to verify how the information was produced.
  • Source training data: Much of the information these tools are trained on comes from the internet, and if that information isn't accurate, these tools may repeat the problem and generate misinformation.

Other Points to Consider: 

  • Some generative AI tools are better than others: Newer tools are often more accurate as the technology improves.
  • Integration of web search or external sources: Many AI tools now use outside sources like web search or external databases to check their information, i.e., they are grounded in external sources of knowledge. This can improve accuracy and help the tool tell you where its information came from (see the simplified sketch after this list). But remember, even if these tools use the internet, they can still find wrong or misleading information.
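
To make the idea of grounding concrete, here is a minimal sketch of the basic pattern: fetch an outside source, then ask a model to answer using only that text. This is a simplified illustration, not any particular tool's internals; the URL, model name, and prompt wording are assumptions for demonstration only.

```python
# Simplified sketch of "grounding": give the model an external source and
# restrict its answer to that source. All names here are illustrative.
import requests
from openai import OpenAI

SOURCE_URL = "https://example.com/article"  # hypothetical source page
source_text = requests.get(SOURCE_URL, timeout=10).text[:4000]  # trim to fit the prompt

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable is set

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any chat-capable model would do
    messages=[
        {
            "role": "system",
            "content": (
                "Answer using ONLY the source text below, and say where it came from.\n\n"
                f"Source ({SOURCE_URL}):\n{source_text}"
            ),
        },
        {"role": "user", "content": "What does this source say about the topic?"},
    ],
)
print(response.choices[0].message.content)
```

Even with grounding like this, the model is only as reliable as the source it was given, which is why verification still matters.
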
Fake, Incomplete or Inaccurate References/Citations: One Form of AI Hallucination

Generative AI tools may create problematic references or citations.

Here are things to watch out for: 

  • Fake citations that look real: Parts of the citation may look legitimate, e.g., a legitimate-sounding book title that doesn't actually exist. Or the AI tool might correctly identify the name of an expert on the topic but pair it with a made-up publication.
  • Incomplete citations: They may give you the name of a book and its author but leave out the other important citation details you need.
  • No citations: Sometimes AI will provide information that seems very convincing but not back it up with any sources at all, which is problematic in academic writing.

What can you do about this?

  • Cross-check any citations an AI tool gives you against the library catalogue or a database before relying on them; the sketch below shows one way to check a DOI.
  • Check out the Verifying What You Find & Use section of our guide to learn more.
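
If you want a quick technical check, here is a minimal sketch, assuming the citation includes a DOI, that looks the DOI up in Crossref's public REST API. The example DOI is hypothetical.

```python
# Minimal sketch: checking whether a DOI from an AI-generated citation is
# actually registered, using Crossref's public REST API.
import requests

def check_doi(doi: str) -> None:
    """Print the title Crossref has on record for a DOI, or flag it as unknown."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code == 404:
        print(f"{doi}: not found on Crossref -- the citation may be fabricated")
        return
    resp.raise_for_status()
    title = resp.json()["message"].get("title") or ["(no title on record)"]
    print(f"{doi}: registered as {title[0]!r} -- compare it with the AI's citation")

check_doi("10.1234/made-up-doi")  # hypothetical DOI for illustration
```

A matching title is still not proof the AI summarized the work accurately, so read the source itself whenever you can.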

Out-of-Date Information

Out-of-date information is not always an issue with generative AI tools, especially as more of them incorporate real-time web searching. Remember, though, that not all of them do this, and even those that do don't have access to every source of current information.

What can you do to verify currency?

  • It's always a good idea to check the documentation for the tool you are using to find out how up to date its information is.
  • If you are writing about current affairs, new discoveries, etc., always verify against other current sources of information, including credible news sources available via the web or the Libraries (see the Newspapers Guide for more help), or by using tools like OMNI, the library's academic discovery tool.

Biased Information

Generative AI tools learn from massive datasets pulled from the internet and beyond. This information can be biased.

Why can this information be biased?

  • AI learns from humans, and human bias = AI bias.
  • Bias is a reality in human-created information, and these tools inevitably reflect the biases inherent in the data they are trained on.

Note: Even though fine-tuning and reinforcement learning are used to help AI avoid mistakes or bias, these are not perfect systems.

What can you do about this issue?

  • Question the information that these tools generate and don't take it at face value. Ask yourself whether it appears to be fair and balanced.
  • Seek out diverse perspectives on the topic you are researching.
  • Compare and contrast the information these tools produce with what you can find elsewhere, including scholarly articles, books, or other sources that are contextually relevant for your research task.

Check out the Verifying What You Find & Use section of our guide to learn more.

Lack of Transparency: Black Box Problems

While we understand quite a lot about how generative AI works, many of these tools are created by commercial entities that don't want to disclose what data they are using and how they process it. This is partly because they want to keep their technology a secret so they can stay ahead of the competition.

What can you do about this?

  • Try to stay informed by keeping up to date on developments in AI transparency.
  • Use these tools wisely and in tandem with other information sources, including library sources, where transparency is stronger.
  • Try to validate what you are learning from AI tools. Check out the Verifying What You Find & Use section of our guide to learn more.

Information Ownership/Copyright

In an ideal world, generative AI tools would use information that is appropriately credited and legally shared. Just as students at university are asked to cite the sources they use, applying this practice to the outputs these tools generate would make them more credible and easier to verify.

The reality is that AI companies have been in hot water and faced lawsuits because they have not always sought clearance for information sources where they should have. For example, they have been sued for using the work of artists, musicians, or writers without permission, which raises issues around fairness, respect, author rights, and copyright.

Why does this matter to you?

  • You want to make sure the information you draw on is properly sourced and credited, as that is part of engaging in research and scholarship with integrity.
  • Understanding issues like this one helps you use AI tools more responsibly.

What can you do about this?

  • This is a complex issue, but one thing you can do is ask the AI tool to identify the sources for the information it gives you, or check to see if it is already doing this, and then cross-check that the citations are valid.
  • Check out the Verifying What You Find & Use section of our guide to learn more.

Lack of Privacy Protection

When using generative AI tools, think about how they handle your privacy. Some have faced criticism for privacy violations.

Before using these tools consider: 

  • Is your privacy being protected?
  • Is it possible that a document you share, which you don't want distributed more broadly, may be used as part of the tool's training process?
  • Could you use a local (non-online) AI tool to meet your purposes, e.g., LM Studio? While these do require more computing resources, as you are running a complex model on your own machine, there are distinct benefits in terms of privacy protection (see the sketch after this list).
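
As a concrete illustration of the local option, here is a minimal sketch assuming LM Studio is running its local server with a model loaded. LM Studio exposes an OpenAI-compatible endpoint (by default at http://localhost:1234/v1), so your prompts stay on your own machine; the model name below is a placeholder for whichever model you have loaded.

```python
# Minimal sketch: querying a local model through LM Studio's
# OpenAI-compatible server, so nothing is sent to an outside service.
# Assumes the LM Studio local server is running on its default port.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's default local endpoint
    api_key="not-needed",                 # the local server ignores the key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder; use the model you loaded in LM Studio
    messages=[{"role": "user", "content": "Summarize this confidential draft: ..."}],
)
print(response.choices[0].message.content)
```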

How to protect yourself here:

  • Check the fine print: Yes, really! It is here you can find the tool's privacy policy and terms of service. Look for terms like "data collection", "privacy", or "information usage".
  • Be careful what you share: If in any doubt, don't share anything you want to keep private, anything sensitive or confidential, or anything that simply isn't yours to share, e.g., where you don't own the copyright.