There are two forms of deduplication to consider when conducting a systematic review. The first is the removal of identical records retrieved from multiple databases. The second is the issue of multiple articles published from the same data set. If undetected, either could create bias in the conclusions of your review.
Identifying and removing duplicate records is necessary because multiple databases often index overlapping journals. Your method of deduplication may depend on the number of articles included in your review: manual deduplication is more realistic with smaller numbers, whereas larger numbers may require automatic tools. Automatic tools are not perfect, so both methods should be used for accurate deduplication. Whichever process you decide to follow, document it and report it accurately in your article.
Identifying multiple articles published from the same data set is a bit more complicated. The Cochrane Handbook offers some good suggestions for authors here: https://training.cochrane.org/handbook/current/chapter-04#section-4-6-2, and recommends that studies (not reports) serve as the reporting unit (https://training.cochrane.org/handbook/current/chapter-04#_Ref531774783). This requires careful analysis because you don't want to leave out important articles.
Track the number of duplicate articles you remove for either reason; you will need these counts for your PRISMA flow diagram.
You may want to export the entire list of articles from each database to a citation manager such as EndNote, Sciwheel, Zotero, or Mendeley (including both citation and abstract in your file) and remove the duplicates there. If you are using Covidence for your review, also add the number of duplicates identified in Covidence to the number removed in your citation manager.
When you search a single EBSCOhost database, the results are deduplicated automatically. When you search more than one EBSCOhost database at once, however, the result count shown on the first page includes duplicate articles (unless it states that duplicates have been removed).
To remove duplicates from any ProQuest database, scroll down on the Advanced Search page, click the Result page options link, and check the box next to Exclude duplicate documents.
Export your references to a CSV or Excel file. In most cases, you will need to first use conditional formatting in Excel to identify duplicates, then do a final scan manually.
Conditional formatting
Sort the column alphabetically. (Start with titles, though you can use this same process for any other columns you choose, such as DOI.)
Select conditional formatting from the Home ribbon, go to Highlight Cells Rules, then Duplicate Values.
Replace punctuation (dashes, periods, question marks, semicolons, colons) in titles with spaces using the Find and Replace tool.
For titles, truncating (to 30 characters, for example, though this number is arbitrary) will sometimes find more duplicates:
Insert a blank column.
Use the formula =LEFT(C2,30), where C2 is the cell you are truncating.
Copy the formula down the length of the column to truncate every title.
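The Excel steps above (replace punctuation with spaces, truncate titles, flag duplicates) can be sketched in Python for those working from a CSV export. This is a minimal illustration, not a validated method; the "Title" column name is an assumption about your export:

```python
import re

def normalize_title(title, length=30):
    """Replace punctuation with spaces, collapse whitespace, lowercase,
    and truncate to the first `length` characters (mirroring =LEFT(C2,30))."""
    cleaned = re.sub(r"[-.?;:,]", " ", title)
    cleaned = re.sub(r"\s+", " ", cleaned).strip().lower()
    return cleaned[:length]

def flag_duplicates(rows, length=30):
    """Return (unique_rows, duplicate_rows).
    The first record with a given normalized title is kept as unique;
    later matches are set aside rather than deleted, so they can be counted."""
    seen = set()
    unique, duplicates = [], []
    for row in rows:
        key = normalize_title(row["Title"], length)
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
            unique.append(row)
    return unique, duplicates
```

Note that, as with the Excel process, the flagged records are kept in a separate list rather than discarded, so the count is available for the PRISMA diagram.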
Manual scan
Sort by title
Scan through the list, looking for duplicate titles
Check the additional information (author, journal, volume, page number) to make sure it matches before designating a duplicate
DO NOT delete duplicate records. Instead, move them to a separate sheet for duplicates, to track numbers.
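The verification step above (checking supporting fields before designating a duplicate) can also be expressed as a small helper. The field names here are assumptions about your export, and exact string comparison is only a starting point; real records often need human judgment:

```python
def confirm_duplicate(a, b):
    """Before designating two records with matching titles as duplicates,
    verify that the supporting bibliographic fields also agree.
    Comparison is case-insensitive and ignores surrounding whitespace."""
    fields = ("Authors", "Journal", "Volume", "Pages")
    return all(
        a.get(f, "").strip().lower() == b.get(f, "").strip().lower()
        for f in fields
    )
```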
This process was adapted from Kwon et al. (2015).
Most bibliographic management software includes a deduplication option. You might consider uploading your references to EndNote, for example, removing the duplicates, and then reviewing the remaining list manually. Qi et al. (2013) found that one method of automatic deduplication was inadequate. See the following YouTube video on integrating PRISMA with EndNote.
Before deduplicating, if you previously exported the records from each database search into separate EndNote libraries, merge them into a single EndNote library:
Earlier version of "Bramer method" for deduplicating, with steps provided in Word document format:
After deduplication, create a compressed library for backup once you have removed as many duplicates as possible, with a filename like SearchTerms-yyyymmdd-Deduplicated-xRecords.enlx. This will be the library used for screening.
These tutorials will get you started on the deduplication process:
Zotero (free open access)
Covidence also includes automatic deduplication. One limitation of using Covidence for deduplication is that you cannot easily review duplicates manually after automatic deduplication is complete, because Covidence limits the number of records displayed per page. APU does not have a subscription to this resource; however, you can set up an individual trial by contacting the company.
Removing duplicate references obtained from different databases is an essential step when conducting and updating systematic literature reviews. ASySD is a tool that automatically identifies and removes duplicate records. Hair et al. (2021) compared ASySD deduplication to SRA-DM and EndNote. They found that "ASySD identified more duplicates than either SRA-DM or Endnote, with a sensitivity in different datasets of 0.95 to 0.99. The false-positive rate was comparable to human performance, with a specificity of 0.94-0.99. The tool took less than 1 hour to deduplicate all datasets" (Hair et al., 2021).
The tool is written in R and is available online as a Shiny web app. For very large datasets (>50,000 records), it is advisable to download the code and run it locally as a Shiny app within RStudio.
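Tools like ASySD work by matching records on several fields at once rather than on title alone. The sketch below illustrates that general idea in Python; it is NOT the ASySD algorithm, and the field names and 0.9 similarity threshold are arbitrary assumptions for illustration:

```python
from difflib import SequenceMatcher

def likely_duplicate(rec_a, rec_b, threshold=0.9):
    """Illustrative multi-field matching: when both records carry a DOI,
    trust the DOI comparison; otherwise require matching years and
    highly similar titles. Not the ASySD algorithm."""
    if rec_a.get("doi") and rec_b.get("doi"):
        return rec_a["doi"].lower() == rec_b["doi"].lower()
    if rec_a.get("year") != rec_b.get("year"):
        return False
    ratio = SequenceMatcher(
        None, rec_a["title"].lower(), rec_b["title"].lower()
    ).ratio()
    return ratio >= threshold
```

The trade-off such tools tune is exactly the one Hair et al. (2021) measure: a looser threshold raises sensitivity (more true duplicates found) at the cost of specificity (more false positives to review).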
Removing duplicate records with the IEBH SR-Accelerator Deduplicator
Large sets of records
Bramer, W. M., Giustini, D., de Jonge, G. B., Holland, L., & Bekhuis, T. (2016). De-duplication of database search results for systematic reviews in EndNote. Journal of the Medical Library Association, 104(3), 240–243. https://doi.org/10.3163/1536-5050.104.3.014
Hair, K., Bahor, Z., Macleod, M. R., Liao, J., & Sena, E. S. (2021). The Automated Systematic Search Deduplicator (ASySD): A rapid, open-source, interoperable tool to remove duplicate citations in biomedical systematic reviews. bioRxiv. https://doi.org/10.1101/2021.05.04.442412
Kwon, Y., Lemieux, M., McTavish, J., & Wathen, N. (2015). Identifying and removing duplicate records from systematic review searches. Journal of the Medical Library Association, 103(4), 184–188. https://doi.org/10.3163/1536-5050.103.4.004
Qi, X., Yang, M., Ren, W., & Jia, J. (2013). Find duplicates among the PubMed, EMBASE, and Cochrane Library databases in systematic review. PLoS ONE, 8(8), e71838. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0071838
Rathbone, J., Carter, M., Hoffmann, T., & Glasziou, P. (2015). Better duplicate detection for systematic reviewers: Evaluation of Systematic Review Assistant-Deduplication Module. Systematic Reviews, 4(6), 1–6. https://doi.org/10.1186/2046-4053-4-6