z-logo
Discover

From Boolean to Intelligent Search: A Librarian’s Guide to Smarter Information Retrieval

calendarOct 15, 2025 |clock6 Mins Read

As a librarian, you’ve always been the person people turn to when they need help finding answers. But the way we search for information is changing fast. Databases are growing, new tools keep appearing, and students expect instant results. Only then will you know the true benefit of AI for libraries, to help you make sense of it all.

From Boolean to Intelligent Search

Traditional search is still part of everyday library work. It depends on logic and structure, keywords, operators, and carefully built queries. But AI adds something new. It doesn’t just look for words; it tries to understand what someone means.

If a researcher searches for “climate change effects on migration,” an AI-powered tool doesn’t just pull results with those exact words. It also looks for studies about environmental displacement, regional challenges, and social impacts.

This means you can spend less time teaching people how to “speak database” and more time helping them understand the research they find.

The Evolution of Library Search

Traditional search engines focus on matching keywords, which often leads to long lists of results. With AI, search tools can now read queries in natural language, just the way people ask questions, and still find accurate, relevant material.

Natural language processing (NLP) and machine learning (ML) make it possible for search systems to connect related ideas, even when the exact words aren’t used. Features like semantic search and vector databases help AI recognise patterns and suggest other useful directions for exploration.

Examples of AI Tools Librarians Can Use

Tool / PlatformWhat It DoesWhy It Helps Librarians
ZendyA platform that combines literature discovery, AI summaries, keyphrase highlighting, and PDF analysisHelps librarians and researchers access, read, and understand academic papers more easily
ConsensusAn AI-powered academic search engine that summarises findings from peer-reviewed studiesHelps with literature reviews and citation management
Ex Libris PrimoUses AI to support discovery and manage metadataImproves record accuracy and helps users find what they need faster
MeilisearchA fast, flexible search engine that uses NLPMakes it easier to search large databases efficiently

The Ethics of Intelligent Search

Algorithms influence what users see and what they might miss. That’s why your role is so important. You can help users question why certain results appear on top, encourage critical thinking, and remind them that algorithms are not neutral.

Digital literacy today isn’t just about knowing how to search, it’s about understanding how the search works.

In Conclusion

AI tools for librarians are becoming easier to use and more helpful every day. Some platforms now include features like summarisation, citation analysis, and even plans to highlight retracted papers, something Zendy is working toward.

Trying out these tools can make your work smoother: faster reference responses, smarter cataloguing, and better guidance for researchers who often feel lost in the flood of information.

AI isn’t replacing your expertise, it’s helping you use it in new ways. And that’s what makes this moment exciting for librarians everywhere.

You might also like
Key Considerations for Training Library Teams on New Research Technologies
Nov 25, 202511 Mins ReadDiscover

Key Considerations for Training Library Teams on New Research Technologies

The integration of Generative AI into academic life appears to be a significant moment for university libraries. As trusted guides in the information ecosystem, librarians are positioned to help researchers explore this new terrain, but this transition requires developing a fresh set of skills. Training your library team on AI-powered research tools could move beyond technical instruction to focus on critical thinking, ethical understanding, and human judgment. Here is a proposed framework for a training program, organised by the new competencies your team might need to explore. Foundational: Understanding Access and Use This initial module establishes a baseline understanding of the technology itself. Accessing the Platform: Teach the technical steps for using the institution's approved AI tools, including authentication, subscription models, and any specific interfaces (e.g., vendor-integrated AI features in academic databases, institutional LLMs, etc.). Core Mechanics: Explain what a Generative AI platform (like a Large Language Model) is and, crucially, what it is not. Cover foundational concepts like: Training Data: Familiarise staff with how to access the institution’s chosen AI tools, noting any specific authentication requirements or limitations tied to vendor-integrated AI features in academic databases. Prompting Basics: Introduce basic prompt engineering, the art of crafting effective, clear queries to get useful outputs. Hallucinations: Directly address the concept of "hallucinations," or factually incorrect/fabricated outputs and citations, and emphasise the need for human verification. Conceptual: Critical Evaluation and Information Management This module focuses on the librarian's core competency: evaluating information in a new context. Locating and Organising: Train staff on how to use AI tools for practical, time-saving tasks, such as: Generating keywords for better traditional database searches. Summarising long articles to quickly grasp the core argument. Identifying common themes across a set of resources. Evaluating Information: This is perhaps the most critical skill. Teach a new layer of critical information literacy: Source Verification: Always cross-check AI-generated citations, summaries, and facts against reliable, academic sources (library databases, peer-reviewed journals). Bias Identification: Examine AI outputs for subtle biases, especially those related to algorithmic bias in the training data, and discuss how to mitigate this when consulting with researchers. Using and Repurposing: Demonstrate how AI-generated material should be treated—as a raw output that must be heavily edited, critiqued, and cited, not as a final product. Social: Communicating with AI as an Interlocutor The quality of AI output is often dependent on the user’s conversational ability. This module suggests treating the AI platform as a possible partner in a dialogue. Advanced Prompt Engineering: Move beyond basic queries to teach techniques for generating nuanced, high-quality results: Assigning the AI a role (such as a 'sceptical editor' or 'historical analyst') to potentially shape a more nuanced response. Practising iterative conversation, where librarians refine an output by providing feedback and further instructions, treating the interaction as an ongoing intellectual exchange. Shared Understanding: Practise using the platform to help users frame their research questions more effectively. Librarians can guide researchers in using the AI to clarify a vague topic or map out a conceptual framework, turning the tool into a catalyst for deeper thought rather than a final answer generator. Socio-Emotional Awareness: Recognising Impact and Building Confidence This module addresses the human factor, building resilience and confidence Recognising the Impact of Emotions: Acknowledge the possibility of emotional responses, such as uncertainty about shifting professional roles or discomfort with rapid technological change, and facilitate a safe space for dialogue. Knowing Strengths and Weaknesses: Reinforce the unique, human-centric value of the librarian: critical thinking, contextualising information, ethical judgment, and deep disciplinary knowledge, skills that AI cannot replicate. The AI could be seen as a means to automate lower-level tasks, allowing librarians to focus on high-value consultation. Developing Confidence: Implement hands-on, low-stakes practice sessions using real-world research scenarios. Confidence grows from successful interaction, not just theoretical knowledge. Encourage experimentation and a "fail-forward" mentality. Ethical: Acting Ethically as a Digital Citizen Ethical use is the cornerstone of responsible AI adoption in academia. Librarians must be the primary educators on responsible usage. Transparency and Disclosure: Discuss the importance of transparency when utilizing AI. Review institutional and journal guidelines that may require students and faculty to disclose how and when AI was used in their work, and offer guidance on how to properly cite these tools. Data Privacy and Security: Review the potential risks associated with uploading unpublished, proprietary, or personally identifiable information (PII) to public AI services. Establish and enforce clear library policies on what data should never be shared with external tools. Copyright and Intellectual Property (IP): Discuss the murky legal landscape of AI-generated content and IP. Emphasise that AI models are often trained on copyrighted material and that users are responsible for ensuring their outputs do not infringe on existing copyrights. Advocate for using library-licensed, trusted-source AI tools whenever possible. Combating Misinformation: Position the librarian as the essential arbiter against the spread of AI-generated misinformation. Training should include spotting common AI red flags, teaching users how to think sceptically, and promoting the library’s curated, authoritative resources as the gold standard. .wp-block-image img { max-width: 85% !important; margin-left: auto !important; margin-right: auto !important; }

Digital Information Literacy Guidelines for Academic Libraries
Nov 13, 202514 Mins ReadDiscover

Digital Information Literacy Guidelines for Academic Libraries

Information literacy is the skill of finding, evaluating, and using information effectively. Data literacy is the skill of understanding numbers and datasets, reading charts, checking how data was collected, and spotting mistakes. Critical thinking is the skill of analysing information, questioning assumptions, and making sound judgments. With so many digital tools today, students and researchers need all three skills, not just to find information, but also to make sense of it and communicate it clearly. Why Academic Libraries Should Offer Literacy Programs Let’s face it: research can be overwhelming. Over 5 million research papers are published every year. This information overload means researchers spend 25-30%1 of their time finding and reviewing academic literature, according to the International Study: Perceptions and Behavior of Researchers. Predatory journals, low-quality datasets, and confusing search results can make learning stressful. Libraries are more than book storage, they’re a place to build practical skills. Programs that teach information and data literacy help students think critically, save time, and feel more confident with research. Key Skills Students, Researchers, and Librarians Need Finding and Using Scholarly Content Knowing how to search a database efficiently is a big deal. Students should learn how to use filters, Boolean logic, subject headings and, of course, intelligent search. They should also know the difference between journal articles, conference papers, and open-access resources. Evaluating Sources and Data Not all information is equal. Programs should teach students how to check if sources are reliable, understand peer review, and spot bias in datasets. A few practical techniques, like cross-checking sources or looking for data provenance, can make research much stronger. Managing Information Ethically Citing sources properly, avoiding plagiarism, and respecting copyright are essentials. Tools like Zotero or Mendeley help keep references organised, so students spend less time managing files and more time on research. Sharing Findings Clearly Communicating is sharing, and sharing is caring. It’s one thing to collect information; it’s another to communicate it. Using infographics, slides, or storytelling techniques to make research more memorable. Ultimately, clear communication ensures that the work they’ve done can be understood, used, and appreciated by others. Frameworks That Guide Literacy Programs ACRL Framework: Provides six key concepts for teaching information literacy. EU DigComp / DigCompEdu: Covers digital skills for students and educators. Data Literacy Project: Helps students understand how to work with datasets, complementing traditional research skills. These frameworks help librarians structure programs so students get consistent, practical guidance. Steps to Build a Digital Literacy Program Audit Campus Needs: Talk to students and faculty, see what resources exist, and find gaps. Set Learning Goals: Decide what students should be able to do at the end, and make goals measurable. Select Content and Tools: Choose databases, software, and datasets that fit the library’s budget and tech setup. Create Short, Modular Lessons: Break skills into manageable pieces that build on each other. Launch and Improve: Introduce the program, gather feedback, and adjust lessons based on what works and what doesn’t. Teaching Strategies and Online Tools Flipped and Embedded Instruction Students watch a short video about search techniques at home, then practice in class. A librarian might join a research methods class, helping students build search strings live. Pre-class quizzes on topics like peer review versus predatory journals prepare students for hands-on exercises. Short Videos and Tutorials Quick videos (2–5 minutes) can teach one skill at a time, like citation management, evaluating sources, or basic data visualisation. Include captions, transcripts, and small practice exercises to reinforce learning. AI Summaries and Chatbots AI tools can summarise articles, suggest keywords, highlight main points, and even draft bibliographies. But they aren’t perfect, they can make mistakes, miss nuances, or misread complex tables. Human oversight is still important. Free Resources and Open Datasets Students can practice with free databases and datasets like DOAJ, arXiv, Kaggle, or Zenodo. Using one of the open-access resources keeps programs affordable while providing real-world examples. Checking if Students Are Learning Before and After Assessments: Simple quizzes or tasks to see how skills improve. Performance Rubrics: Compare beginner, developing, and advanced levels in searching, evaluating, and presenting data. Analytics: Track which videos or tools students use most to improve future lessons. Working With Faculty Embedded Workshops: Librarians teach skills directly tied to assignments. Joint Assignments: Faculty design research projects that naturally teach literacy skills. Faculty Training: Show instructors how to integrate digital literacy into their courses. Tackling Challenges Staff Training: Librarians may need extra help with data tools. Peer mentoring and workshops work well. Limited Budgets: Open access tools, collaborative licensing, and free platforms help make programs feasible. Distance Learners: Make videos and tutorials accessible anytime, account for different time zones and internet access. Looking Ahead AI, open science, and global collaboration are changing research. AI can personalise learning, but it still needs oversight. Open science and FAIR data principles (set of guidelines for making research dataFindable,Accessible,Interoperable, andReusable to both humans and machines) encourage transparency and reproducibility. Libraries can also connect with international partners to share resources and best practices. FAQs How long does a program take to launch?Basic services can start in six months; full programs usually take 1–2 years. Do humanities students need data skills?Yes, focus is more on qualitative analysis and digital humanities tools. Where can libraries find free datasets?Government repositories, Kaggle, Zenodo, and university archives. Can small libraries succeed without data specialists?Yes, faculty collaboration and online resources can cover most needs. .wp-block-image img { max-width: 75% !important; margin-left: auto !important; margin-right: auto !important; }

Why AI like ChatGPT still quotes retracted papers?
Oct 10, 202516 Mins ReadDiscover

Why AI like ChatGPT still quotes retracted papers?

AI models like ChatGPT are trained on massive datasets collected at specific moments in time, which means they lack awareness of papers retracted after their training cutoff. When a scientific paper gets retracted, whether due to errors, fraud, or ethical violations, most AI systems continue referencing it as if nothing happened. This creates a troubling scenario where researchers using AI assistants might unknowingly build their work on discredited foundations. In other words: retracted papers are the academic world's way of saying "we got this wrong, please disregard." Yet the AI tools designed to help us navigate research faster often can't tell the difference between solid science and work that's been officially debunked. ChatGPT and other assistants tested Recent studies examined how popular AI research tools handle retracted papers, and the results were concerning. Researchers tested ChatGPT, Google's Gemini, and similar language models by asking them about known retracted papers. In many cases, they not only failed to flag the retractions but actively praised the withdrawn studies. One investigation found that ChatGPT referenced retracted cancer imaging research without any warning to users, presenting the flawed findings as credible. The problem extends beyond chatbots to AI-powered literature review tools that researchers increasingly rely on for efficiency. Common failure scenarios The risks show up across different domains, each with its own consequences: Medical guidance: Healthcare professionals consulting AI for clinical information might receive recommendations based on studies withdrawn for data fabrication or patient safety concerns Literature reviews: Academic researchers face citation issues when AI assistants suggest retracted papers, damaging credibility and delaying peer review Policy decisions: Institutional leaders making evidence-based choices might rely on AI-summarised research without realising the underlying studies have been retracted A doctor asking about treatment protocols could unknowingly follow advice rooted in discredited research. Meanwhile, detecting retracted citations manually across hundreds of references proves nearly impossible for most researchers. How Often Retractions Slip Into AI Training Data The scale of retracted papers entering AI systems is larger than most people realise. Crossref, the scholarly metadata registry that tracks digital object identifiers (DOIs) for academic publications, reports thousands of retraction notices annually. Yet many AI models were trained on datasets harvested years ago, capturing papers before retraction notices appeared. Here's where timing becomes critical. A paper published in 2020 and included in an AI training dataset that same year might get retracted in 2023. If the model hasn't been retrained with updated data, it remains oblivious to the retraction. Some popular language models go years between major training updates, meaning their knowledge of the research landscape grows increasingly outdated. Lag between retraction and model update Training Large Language Models requires enormous computational resources and time, which explains why most AI companies don't continuously update their systems. Even when retraining occurs, the process of identifying and removing retracted papers from massive datasets presents technical challenges that many organisations haven't prioritised solving. The result is a growing gap between the current state of scientific knowledge and what AI assistants "know." You might think AI systems could simply check retraction databases in real-time before responding, but most don't. Instead, they generate responses based solely on their static training data, unaware that some information has been invalidated. Risks of Citing Retracted Papers in Practice The consequences of AI-recommended retracted papers extend beyond embarrassment. When flawed research influences decisions, the ripple effects can be substantial and long-lasting. Clinical decision errors Healthcare providers increasingly turn to AI tools for quick access to medical literature, especially when facing unfamiliar conditions or emerging treatments. If an AI assistant recommends a retracted study on drug efficacy or surgical techniques, clinicians might implement approaches that have been proven harmful or ineffective. The 2020 hydroxychloroquine controversy illustrated how quickly questionable research spreads. Imagine that dynamic accelerated by AI systems that can't distinguish between valid and retracted papers. Policy and funding implications Government agencies and research institutions often use AI tools to synthesise large bodies of literature when making funding decisions or setting research priorities. Basing these high-stakes choices on retracted work wastes resources and potentially misdirects entire fields of inquiry. A withdrawn climate study or economic analysis could influence policy for years before anyone discovers the AI-assisted review included discredited research. Academic reputation damage For individual researchers, citing retracted papers carries professional consequences. Journals may reject manuscripts, tenure committees question research rigour, and collaborators lose confidence. While honest mistakes happen, the frequency of such errors increases when researchers rely on AI tools that lack retraction awareness, and the responsibility still falls on the researcher, not the AI. Why Language Models Miss Retraction Signals The technical architecture of most AI research assistants makes them inherently vulnerable to the retraction problem. Understanding why helps explain what solutions might actually work. Corpus quality controls lacking AI models learn from their training corpus, the massive collection of text they analyse during development. Most organisations building these models prioritise breadth over curation, scraping academic databases, preprint servers, and publisher websites without rigorous quality checks. The assumption is that more data produces better models, but this approach treats all papers equally regardless of retraction status. Even when training data includes retraction notices, the AI might not recognise them as signals to discount the paper's content. A retraction notice is just another piece of text unless the model has been specifically trained to understand its significance. Sparse or inconsistent metadata Publishers handle retractions differently, creating inconsistencies that confuse automated systems: Some journals add "RETRACTED" to article titles Others publish separate retraction notices A few quietly remove papers entirely This lack of standardisation means AI systems trained to recognise one retraction format might miss others completely. Metadata، the structured information describing each paper, often fails to consistently flag retraction status across databases. A paper retracted in PubMed might still appear without warning in other indexes that AI training pipelines access. Hallucination and overconfidence AI hallucination occurs when models generate plausible-sounding but false information, and it exacerbates the retraction problem. Even if a model has no information about a topic, it might confidently fabricate citations or misremember details from its training data. This overconfidence means AI assistants rarely express uncertainty about the papers they recommend, leaving users with no indication that additional verification is needed. Real-Time Retraction Data Sources Researchers Should Trust While AI tools struggle with retractions, several authoritative databases exist for manual verification. Researchers concerned about citation integrity can cross-reference their sources against these resources. Retraction Watch Database Retraction Watch operates as an independent watchdog, tracking retractions across all academic disciplines and publishers. Their freely accessible database includes detailed explanations of why papers were withdrawn, from honest error to fraud. The organisation's blog also provides context about patterns in retractions and systemic issues in scholarly publishing. Crossref metadata service Crossref maintains the infrastructure that assigns DOIs to scholarly works, and publishers report retractions through this system. While coverage depends on publishers properly flagging retractions, Crossref offers a comprehensive view across multiple disciplines and publication types. Their API allows developers to build tools that automatically check retraction status, a capability that forward-thinking platforms are beginning to implement. PubMed retracted publication tag For medical and life sciences research, PubMed provides reliable retraction flagging with daily updates. The National Library of Medicine maintains this database with rigorous quality control, ensuring retracted papers receive prominent warning labels. However, this coverage is limited to biomedical literature, leaving researchers in other fields without equivalent resources. DatabaseCoverageUpdate SpeedAccessRetraction WatchAll disciplinesReal-timeFreeCrossrefPublisher-reportedVariableFree APIPubMedMedical/life sciencesDailyFree Responsible AI Starts with Licensing When AI systems access research papers, articles, or datasets, authors and publishers have legal and ethical rights that need protection. Ignoring these rights can undermine the sustainability of the research ecosystem and diminish trust between researchers and technology providers. One of the biggest reasons AI tools get it wrong is that they often cite retracted papers as if they’re still valid. When an article is retracted, e.g. due to peer review process not being conducted properly or failing to meet established standards, most AI systems don’t know, it simply remains part of their training data. This is where licensing plays a crucial role. Licensed data ensures that AI systems are connected to the right sources, continuously updated with accurate, publisher-verified information. It’s the foundation for what platforms like Zendy aim to achieve: making sure the content is clean and trustworthy.  Licensing ensures that content is used responsibly. Proper agreements between AI companies and copyright holders allow AI systems to access material legally while providing attribution and, when appropriate, compensation. This is especially important when AI tools generate insights or summaries that are distributed at scale, potentially creating value for commercial platforms without benefiting the sources of the content. in conclusion, consent-driven licensing helps build trust. Publishers and authors can choose whether and how their work is incorporated into AI systems, ensuring that content is included only when rights are respected. Advanced AI platforms, such as Zendy, can even track which licensed sources contributed to a particular output, providing accountability and a foundation for equitable revenue sharing. .wp-block-image img { max-width: 85% !important; margin-left: auto !important; margin-right: auto !important; }

Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom