Systematic reviews: Remove duplicates

Removing duplicates from search results for Systematic Reviews

Searching multiple databases leads to a LOT of duplicate records 

  • We currently recommend that SVHM staff remove duplicates using EndNote software, but with a tweak to the standard dedupe process (explained below).
  • If you have access to different tools outside SVHM you may prefer to use those.
  • Regardless of the system used, you still need to check the suggested duplicates (yes, that's tiresome ... sigh ...)

Why use EndNote for deduping?

  • EndNote can be set up to better spot duplicates next to each other (instructions follow)
  • As at October 2021 we are not aware of a freely available tool that reliably removes duplicates more efficiently.

How can SVHM staff access EndNote?

  • Onsite, EndNote should already be installed on your hospital computer. If not, contact the IT department for installation.
  • Offsite, library staff can provide installation files for your home computer. You will need an SVHM login. Please email us for more information. 
  • Library staff can also provide individual EndNote tutorials and extensive online resources.

Note: These instructions and screenshots are for EndNote X9.3.3 for Windows. Your version may look a little different.

Create a new EndNote Library for your search results

BEFORE running your searches create or open the library where you want your results to be saved

  • Why? If you export search results to EndNote it will automatically put them into the last library you were using. It will be annoying if it is the wrong one!
  • How? Create a new EndNote library: File > New. 
  • Name it along the lines of Your topic - search results DD MM YY.
  • Avoid saving it in the cloud because this can create syncing issues.
  • You will save other versions later.
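If you like to keep your file names consistent, the naming pattern above can be sketched in a few lines of Python. This is only an illustration: the function name `library_name` and the example topic are made up, and EndNote itself does not need or use this script.

```python
from datetime import date


def library_name(topic: str, stage: str, on: date) -> str:
    """Build a name in the pattern 'Your topic - stage DD MM YY'."""
    # %d, %m and %y give a zero-padded day, month and two-digit year.
    return f"{topic} - {stage} {on.strftime('%d %m %y')}"


# Hypothetical topic and date, for illustration only.
print(library_name("Falls prevention", "search results", date(2021, 3, 23)))
```

The same function covers the later versions too, by changing the stage to "dedupe process" or "after dedupe".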

Set up a group for each set of search results

BEFORE exporting search results to EndNote, create groups for each database eg Medline, Embase, Emcare etc.

  • Why? It helps keep track of which results you have imported, so that you don't forget any. 
  • How? Right click on My groups > Create Group
  • You can remove these groups later

Export search results to EndNote

Run your searches and export results to EndNote in each database, starting with Medline

  • Run your search and export ALL results as RIS/EndNote files.
  • In Ovid databases we have a limit of 1000 results at a time, so this may involve more than one download for each database eg 1-1000, 1001-2000.
  • Don't forget to record how many results you retrieved, the date, search strategy and the exact database searched.
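To plan how many downloads a large result set will take, the 1000-record limit above can be turned into a list of ranges. A minimal sketch, assuming the limit is 1000 per export; the function name `export_batches` is hypothetical and this is not part of Ovid or EndNote.

```python
def export_batches(total: int, batch_size: int = 1000) -> list[tuple[int, int]]:
    """Return inclusive (first, last) record ranges that cover 1..total."""
    return [
        (start, min(start + batch_size - 1, total))
        for start in range(1, total + 1, batch_size)
    ]


# Example: a search that retrieved 2350 results needs three downloads.
for first, last in export_batches(2350):
    print(f"{first}-{last}")
```

Ticking off each range as you download it is a simple way to make sure no batch is missed.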

Move search results straight into EndNote groups as you import them

  • As you import each set of results into EndNote, move them into the relevant group you prepared earlier eg Medline results into the Medline group.
  • This helps track which database you are up to.

Label your search results in EndNote

Label results - it can help to identify preferred records

Why?

  • When choosing which duplicate to keep it can help to know which database it came from.
  • Sometimes the database indicates the record quality. For example I prefer Medline records unless another version is more complete.

How?

  • You can add labels/notes in bulk.
  • Select all the records in one group (such as all the Medline records)
  • Tools > Change/Move/Copy Fields > Change Fields > Research Notes (in EndNote X9.3.3 for Windows)
  • Briefly write the name of the database and the date eg Medline 23 03 21
  • Click OK
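The reasoning behind "prefer Medline unless another version is more complete" can be written out as a small rule. This is a sketch of one possible way to express that preference, not something EndNote does for you: the field names (`research_notes`, `abstract` and so on), the function names and the sample records are all assumptions made up for illustration.

```python
def completeness(record: dict) -> int:
    """Count the fields that actually contain something."""
    return sum(1 for value in record.values() if str(value).strip())


def pick_preferred(duplicates: list[dict], preferred_source: str = "Medline") -> dict:
    """Keep the most complete record; on a tie, favour the preferred database."""
    return max(
        duplicates,
        key=lambda r: (
            completeness(r),
            r.get("research_notes", "").startswith(preferred_source),
        ),
    )


# Made-up records: the Embase copy has an abstract, the Medline copy does not.
medline = {"title": "Falls in older adults", "abstract": "",
           "pages": "1-9", "research_notes": "Medline 23 03 21"}
embase = {"title": "Falls in older adults", "abstract": "Background and aims",
          "pages": "1-9", "research_notes": "Embase 23 03 21"}

print(pick_preferred([medline, embase])["research_notes"])
```

Here the Embase record wins because it is more complete. If both copies had the same number of filled fields, the Medline label in Research Notes would decide it.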

BACKUP!!! Before removing duplicates create a fresh EndNote library

BEFORE starting to remove duplicates, create a fresh EndNote Library.

Why?

  • Creating a fresh library is like creating a backup .... just in case
  • If you accidentally lose something you should have kept while deduping, it will still be in your initial search results library.

How?

  • Have your existing search results library open
  • File > Compressed Library (.enlx)
  • Select:
    • Create
    • Without file attachments
    • All references in library
    • Next
  • Save in format Your topic - dedupe process DD MM YY
  • This creates a compressed (zipped) file. 
  • Clicking on the saved .enlx file will open the full library. This is where you will remove your duplicates.
  • After all your duplicates have been removed you can repeat the process, creating an after dedupe compressed file with a new date.

Select your deduplication preferences in EndNote

BEFORE starting to dedupe in EndNote check your deduplication preferences

Why?

  • It is important to know how EndNote (or any other deduplicating software) is choosing duplicates
  • The fewer fields that "match", the more likely it is that the records are not actually duplicates
  • For example two articles with the same title might be from different journals, different authors, published in a different year etc.

How?

  • Edit > Preferences > Duplicates 
  • Select which fields you want matched eg Author, Year, Title, Secondary Title (Journal), Pages
  • The more fields you select to compare, the fewer records will be matched, but the higher the likelihood that the matches are true duplicates.
  • Do NOT select Automatically discard duplicates! Once gone, you won't get them back.
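To see why the choice of fields matters, here is a minimal sketch of field-based matching. It is not EndNote's actual algorithm (which is not described in this guide); it simply groups records whose chosen fields are identical after light tidying. The function names, field names and sample records are all made up for illustration.

```python
from collections import defaultdict


def normalise(value) -> str:
    """Lower-case and keep only letters and digits, so 'Title.' matches 'title'."""
    return "".join(ch for ch in str(value).lower() if ch.isalnum())


def suggest_duplicates(records: list[dict], fields: list[str]) -> list[list[dict]]:
    """Group records whose chosen fields all match after normalisation."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(normalise(record.get(field, "")) for field in fields)
        groups[key].append(record)
    # Only groups with more than one record are suggested duplicates.
    return [group for group in groups.values() if len(group) > 1]


# Made-up records: the first two are the same article, the third is not.
records = [
    {"author": "Smith, J", "year": 2019, "title": "Falls in older adults",
     "journal": "J Example Med"},
    {"author": "Smith, J.", "year": 2019, "title": "Falls in older adults.",
     "journal": "Journal of Example Medicine"},
    {"author": "Nguyen, T", "year": 2020, "title": "Falls in older adults",
     "journal": "Example Review"},
]

print(len(suggest_duplicates(records, ["title"])[0]))
print(len(suggest_duplicates(records, ["title", "author", "year"])[0]))
print(suggest_duplicates(records, ["title", "author", "year", "journal"]))
```

Matching on title alone suggests all three records, including the one that is not a duplicate. Adding author and year narrows it to the true pair. Adding journal as well finds nothing, because the journal name is abbreviated in one copy. That is the trade-off: stricter matching gives fewer false suggestions but misses some real duplicates, which is why you still need to check by eye.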

Set up your columns in EndNote to easily compare records

BEFORE deduping set up your columns to easily compare records.

Why? 

  • Columns help you to compare like with like at a glance
  • For example if you sort by title you can easily see which have different authors, are in different journals or on different page numbers

How do you organise columns?

  • Right click on the column heading line.
  • Choose columns to view eg: Year, Author, Title, Journal, Pages, Research Notes, Abstract
  • You can choose any field you like, as well as changing the order and width of any column

Find Duplicates in EndNote

Bypassing the standard deduplication process

  • Run the dedupe process via References > Find Duplicates
  • The first screen compares two suggested duplicates side by side. This is a very slow way to compare records, and after 5 minutes your eyes will glaze over and you'll just click on records without looking ... so instead click CANCEL at the top right of the pop-up window.

  • This creates a temporary folder called Duplicate References.
  • In the main column view you will see a list with suggested duplicate references highlighted. Remember that this is never perfect, and you need to check that they are actually duplicates. 
  • If you don't have too many duplicates to compare (say 500 or so), check through results from this screen, using Ctrl (or Command on a Mac) to deselect records you do not want to delete.
  • When you are finished, drag the highlighted duplicates into the Trash on the left.

 

If you have a lot of records to compare, break them into smaller groups for easier handling

  • Run the Find Duplicates command on a smaller group so you don't get lost.
  • For example you could create a group called Dedupe and drag a set of records in eg all titles starting with A, then B etc. Work logically.
  • With that Dedupe group selected, click References > Find Duplicates and it will run the command on just the smaller set.
  • Same process as before: click Cancel, check the accuracy of the suggested duplicates in the column view, then move the duplicates to Trash
  • Repeat.
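The "A, then B" batching idea above can be sketched as follows. This is an illustration only, with a hypothetical function name and made-up titles; in EndNote you would do the same thing by sorting on title and dragging records into the Dedupe group.

```python
from collections import defaultdict


def batches_by_initial(records: list[dict]) -> dict[str, list[dict]]:
    """Bucket records by the first letter or digit of the title (upper-cased)."""
    buckets = defaultdict(list)
    for record in records:
        title = record.get("title", "")
        # Skip leading punctuation such as quote marks; '#' collects empty titles.
        first = next((ch.upper() for ch in title if ch.isalnum()), "#")
        buckets[first].append(record)
    return dict(sorted(buckets.items()))


# Made-up titles, for illustration only.
records = [
    {"title": "Anxiety in adolescents"},
    {"title": '"Brief" interventions for anxiety'},
    {"title": "A review of falls"},
    {"title": ""},
]

for initial, batch in batches_by_initial(records).items():
    print(initial, len(batch))
```

One thing to keep in mind: two copies of the same article whose titles start differently (for example one with a leading "The") will land in different batches, so they will only be caught in a later pass over the whole library.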

Yet more duplicates? Adjust your deduping preferences

  • There will almost certainly be more duplicates not picked up by this system, but it will get easier as your numbers go down.
  • You can adjust the Deduplication Preferences as described earlier (for example by selecting fewer fields to match) so that it picks up more results to compare.

Finally - sort by title and clean up most of the remaining duplicates

  • When the numbers are (hopefully) manageable, stop using the automated dedupe function
  • In the column view, sort remaining records in alphabetical order by title
  • Check through them manually looking for stray duplicates - you can just click Delete or drag to Trash
  • You can also select bulk duplicates to remove, using Click and Ctrl to select.

Even then, a few more duplicates will probably turn up when you import your records into Covidence, but these should be easy to deal with.
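The final sort-by-title pass works because sorting puts likely duplicates next to each other. A minimal sketch of that idea, assuming records are simple dictionaries with a `title` field; the function names and sample titles are made up, and this only flags pairs for a human to check rather than deleting anything.

```python
def normalise_title(title: str) -> str:
    """Lower-case and drop spaces and punctuation before comparing."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def adjacent_title_matches(records: list[dict]) -> list[tuple[dict, dict]]:
    """Sort by title and return neighbouring pairs whose titles match."""
    ordered = sorted(records, key=lambda r: normalise_title(r.get("title", "")))
    pairs = []
    for current, following in zip(ordered, ordered[1:]):
        key = normalise_title(current.get("title", ""))
        # Ignore records with no title at all, which would all "match".
        if key and key == normalise_title(following.get("title", "")):
            pairs.append((current, following))
    return pairs


# Made-up records: two copies of one title with different punctuation.
records = [
    {"title": "Falls in older adults: a review", "year": 2019},
    {"title": "Sleep and shift work", "year": 2020},
    {"title": "Falls in older adults - a review.", "year": 2019},
]

for first, second in adjacent_title_matches(records):
    print(first["title"], "|", second["title"])
```

As with EndNote's own suggestions, a matching title is only a prompt to look at the authors, journal and pages before deciding which record to delete.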

BACKUP AGAIN!! After removing duplicates save to a fresh EndNote library

Why create another backup?

  • Because things can go wrong, so back up, back up, back up ...
  • It will be easier to find the final EndNote library of deduped results later if it is named clearly.
  • This is the compressed library you are likely to share with fellow reviewers.

How?

  • Follow previous instructions to create a compressed EndNote library titled something like Your topic - after dedupe DD MM YY.

Guide Author

Helen Wilding, Senior Research Librarian

Carl de Gruchy Library, St Vincent's Hospital Melbourne
Helen.Wilding@svha.org.au

Literature Searching, Systematic Reviews, Mental Health liaison 
Thursdays, Fridays & alternate Wednesdays