Data, data everywhere… far too much to think

April 20, 2023
 - Tim Hardman

Every year, a huge amount of scientific data is released. It's out there, but how do you find it? At times it is hard to be sure that your own study question hasn't already been answered. One common reason journals decline to publish your work is that the reviewers think your data is already out there. For this reason alone, it's important to keep up with new literature in your field and to use a structured search strategy so that you don't miss anything relevant.

It seems reasonable to assume that all scientists are good at keeping up with new research. But not everyone works at the cutting edge. Sometimes we are asked to look outside our area of expertise, and we tend to stick to the methods we learned as young scientists. Meanwhile, the field of literature management, retrieval, and searching is changing, and we don't yet know where new technologies like ChatGPT and other AIs will lead us.

The scale of the literature is hard for anyone to grasp. There are already more than 50 million journal pages online, and another 2.5 million academic articles are added every year. At that rate, a new scientific paper (and let's not even talk about preprints) joins the body of knowledge roughly every 13 seconds.

You are not the only one spending more time and money looking for relevant information; a lot of other people are too. A friend recently asked me what the best way was to do a structured literature search. Of course, I told her to read our Insider's Insight on literature searching. The main point is this: for your research to be successful, you need quick, low-cost access to the scientific literature, whether you work in industry or in academia. What you do and how you do it may well matter more than the individual items you find.

If you get your search strategy right, you can avoid doing the same work twice, leave a clear search path that can be reworked if you find any gaps or omissions, and report on your approach. As more and more old drugs are repurposed against new targets, published trial data is increasingly used in regulatory submissions. This means you need to give a structured account of where the data you use came from.

There are also many search engines and methods that, combined with research technology to access and sort your results, will make your search of the scientific literature a lot easier. Bear in mind that databases usually have their own ways of retrieving data and of choosing which journal articles to index. Unlike books and papers, databases and software tools are always changing, and so are the metadata descriptions that go with them. Current practice in literature searching, on the other hand, includes standards and best practices to make sure that your search results can be reproduced and that article content and links stay stable.

I told my friend that writing a procedure is the best way to make sure the search goes well. How useful your results are will depend on how easily your search approach can be repeated, so it's important to keep subjectivity to a minimum. A method that makes as many of the variables explicit as possible helps when managing a process with many parts. The procedure should take the form of an objective protocol that explains the brief, your proposed search strategy, and the criteria for review. Write down the exact search terms, together with details of any filters and the search engine(s) used, so that the search can be run again. You can get our search methodology template from the 'Resources' page on the Niche website.
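
As a rough illustration of what such a record might contain, here is a minimal Python sketch. It is not the Niche template: the class names, field names, database, query and hit count are all made up for the example, and a real protocol would hold more detail.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class SearchRecord:
    """One executed search, with enough detail to run it again."""
    database: str   # search engine or database used
    query: str      # exact search string as entered
    filters: dict   # e.g. date range, language, article type
    run_on: str     # ISO date on which the search was run
    hits: int       # number of results returned


@dataclass
class SearchProtocol:
    """Hypothetical container for the brief, review criteria and search log."""
    brief: str
    review_criteria: list
    searches: list = field(default_factory=list)

    def log(self, database, query, filters, hits, run_on=None):
        # Default to today's date when no run date is supplied.
        run_on = run_on or date.today().isoformat()
        record = SearchRecord(database, query, filters, run_on, hits)
        self.searches.append(record)
        return record

    def to_json(self):
        # asdict() also converts the nested SearchRecord entries.
        return json.dumps(asdict(self), indent=2)


# Illustrative values only; none of these come from a real search.
protocol = SearchProtocol(
    brief="Published trial data on an old drug used against a new target",
    review_criteria=["primary study", "published 2010 or later"],
)
protocol.log(
    database="PubMed",
    query='"drug repurposing" AND hypertension',
    filters={"language": "English", "years": "2010-2023"},
    hits=412,
    run_on="2023-04-20",
)
print(protocol.to_json())
```

Saving the JSON output alongside the results means anyone can see exactly which terms, filters and search engine produced them.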

How will you record or save the results of the literature search? What information will you keep on each citation? The protocol should also say how you will review each source and score it. By going through the titles and/or abstracts by hand, you can make sure that all the results match your search criteria and that you've gathered all the relevant literature. Using strict criteria to select studies reduces bias, which in turn makes your results more reliable. It also helps to write a formal report explaining how the papers were chosen, including how many studies were screened, why some were excluded, and the final count of articles included.
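
The counting side of that report can be sketched in a few lines of Python. This is only one possible approach: the citation fields, the two criteria and the three example records are invented for illustration, and in practice the criteria would come from your protocol.

```python
from collections import Counter


def screen(citations, criteria):
    """Apply each criterion in order; record the first reason a citation fails."""
    included, excluded = [], Counter()
    for citation in citations:
        for reason, passes in criteria:
            if not passes(citation):
                excluded[reason] += 1
                break
        else:
            # No criterion failed, so the citation is kept.
            included.append(citation)
    return included, excluded


def report(total, included, excluded):
    """Summarise how many records were screened, excluded (and why), and kept."""
    lines = [f"Records screened: {total}"]
    for reason, count in excluded.items():
        lines.append(f"Excluded ({reason}): {count}")
    lines.append(f"Included: {len(included)}")
    return "\n".join(lines)


# Hypothetical citations and criteria, for illustration only.
citations = [
    {"title": "Trial A", "year": 2021, "type": "trial"},
    {"title": "Trial B", "year": 1998, "type": "trial"},
    {"title": "Editorial C", "year": 2020, "type": "editorial"},
]
criteria = [
    ("published before 2010", lambda c: c["year"] >= 2010),
    ("not a primary study", lambda c: c["type"] == "trial"),
]

included, excluded = screen(citations, criteria)
print(report(len(citations), included, excluded))
# Records screened: 3
# Excluded (published before 2010): 1
# Excluded (not a primary study): 1
# Included: 1
```

Writing the criteria down as explicit tests, each with a named reason, is what keeps the selection objective and the exclusion counts easy to report.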

Remember to:

  • Define your keywords: Break the topic being researched into specific components and define keywords for each. Expand the list by writing down synonyms and alternative phrasings for each keyword. Also, search on terms that you hope to include in your work; it may give you some idea of how relevant they are.
  • Create a checklist for defining keywords: What alternative vocabulary is used in discussion of the topic? Are there American and British variants of spelling or vocabulary? Are there any word stems suitable for truncation (e.g., child$ to find child, children, or childish)? What common abbreviations, acronyms or formulae are there? Are there any categories to exclude?
  • Interrogate relevant citations: Once you have identified relevant journal articles, a simple way to find more studies is to look through their reference lists (backward searching, but make sure you document it). Studies referenced in an article may be quite relevant. Also, check which papers have cited the articles since they were published (forward searching).
  • Record everything: There is no excuse in the electronic age NOT to keep a record of all your searches, the strategy you used and the results they produced. Things were different in the days of Index Medicus. Keeping track will allow you to refine your search strategy logically, enriching the final data set and reducing the amount of manual confirmation you will need to do.
  • Rank: At some point you will need to review individual publications to determine their relevance and scientific ‘value.’ This can be challenging when content you want to review is locked behind paywalls. If an article seems relevant you can always let your peers guide you: keep the simple altmetric widget on your desktop and check an article's score to see whether other researchers felt the work made a useful contribution to the science.
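
The keyword steps in this list can be sketched in Python. The function names and example terms below are hypothetical, and the sketch assumes a database that joins terms with AND/OR, quotes phrases, and uses $ for truncation; the exact syntax differs between databases, so check the help pages of the one you use.

```python
import re


def build_query(concepts):
    """OR the synonyms within each concept, then AND the concepts together."""
    groups = []
    for terms in concepts:
        # Quote multi-word phrases so they are searched as a unit.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


def truncation_to_regex(term, symbol="$"):
    """Turn a truncated stem such as child$ into a regex matching its word forms."""
    stem = term[:-len(symbol)] if term.endswith(symbol) else term
    return re.compile(rf"\b{re.escape(stem)}\w*", re.IGNORECASE)


# One list per component of the topic: synonyms, spelling variants, abbreviations.
concepts = [
    ["child$", "paediatric", "pediatric"],
    ["asthma"],
    ["inhaled corticosteroid", "ICS"],
]
print(build_query(concepts))
# (child$ OR paediatric OR pediatric) AND (asthma) AND ("inhaled corticosteroid" OR ICS)

# Check locally which word forms a truncated stem would pick up.
pattern = truncation_to_regex("child$")
print(pattern.findall("A child, two children, and a childish remark"))
# ['child', 'children', 'childish']
```

Keeping the concept lists in a file, rather than typing queries straight into a search box, also gives you the record of your strategy for free.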

With so many papers to read and keep track of, you can no longer compile references by hand. A reference manager can help you keep track of your search results, and some even let you download and save papers straight to a library on your computer.

Though books aren't as useful as they used to be, they can be a good place to start if you're new to a topic, because they give you a broad picture of your subject. 'Grey' literature is likewise losing its importance, because it contains information that is hard to find using regular search engines, databases, and library catalogues. Even so, grey literature can still yield useful information that points to current hot research topics. Conference papers can tell you about the newest research and discussions on your subject, and hint at papers that might be published soon. Registries such as ClinicalTrials.gov, which list unpublished clinical trials, carry information about studies that have already been done (and maybe even their results). Theses, dissertations, and working papers can help you find other researchers doing work related to yours. Still, desk-based (online) research has mostly taken the place of wandering around a library.

One note of caution for those who do wander into the library: you need to be extra careful when citing grey literature. Most database content has already had a certain level of peer consideration prior to inclusion, whereas grey literature may not have. In today's fast-paced digital landscape, researchers who can successfully exploit the goldmine of content published online, through the skilful use of search tools and automated retrieval solutions, will enjoy a significant advantage in the race to scientific invention and discovery. Who knows where AI will take us in the future, but as the sadly missed, late, great Carl Sagan famously put it, “You have to know the past to understand the present.”

About the author

Tim Hardman
Managing Director
Dr Tim Hardman is Managing Director of Niche Science & Technology Ltd., a bespoke services CRO based in the UK. He is also Chairman of the Association of Human Pharmacology in the Pharmaceutical Industry, the representative industry body for early phase clinical studies in the UK, and President of its sister organisation, the European Federation for Exploratory Medicines Development. Dr Hardman is a keen scientist and an occasional commentator on all aspects of medicine, business and the process of drug development.


© 2025 Niche.org.uk     All rights reserved
