
...


As an alternative to BlueBee, we recommend the publicly available UMI-Tools package, which is hosted on GitHub here. Detailed documentation can be found on ReadTheDocs. The following command extracts the 6 nt UMI sequence from the start of each read and removes the adjacent 4 nt TATA spacer:

umi_tools extract --extract-method=regex --bc-pattern "(?P<umi_1>.{6})(?P<discard_1>.{4}).*" -L "/path/to/my_outputlog.txt" -I "/path/to/my_input.fastq.gz" -S "/path/to/my_output.fastq.gz"
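To make the pattern concrete, the sketch below splits a single read the way the two named groups do: the first 6 nt form the UMI (`umi_1`), the next 4 nt are the spacer (`discard_1`), and the rest is kept as the read. The read sequence is a made-up example and the variable names are illustrative only. It uses plain shell tools and does not call UMI-Tools, which additionally appends the UMI to the read name and trims the quality string to match.

```bash
# Illustration only: mimics the split requested by the regex pattern above
# for one read. This is not a replacement for umi_tools extract.

read_seq="ACGTTGTATAGGCCTTAAGG"   # hypothetical read: 6 nt UMI + TATA + insert

umi=$(printf '%s' "$read_seq" | cut -c1-6)       # positions 1-6: group umi_1
spacer=$(printf '%s' "$read_seq" | cut -c7-10)   # positions 7-10: group discard_1
insert=$(printf '%s' "$read_seq" | cut -c11-)    # position 11 onward: retained read

echo "UMI=$umi spacer=$spacer read=$insert"
```

Because the discard group matches any 4 nt (`.{4}`), a spacer that contains a sequencing error is still removed.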


After alignment, reads can be deduplicated with the following command:

umi_tools dedup -I example.bam --output-stats=deduplicated -S deduplicated.bam
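`umi_tools dedup` expects a coordinate-sorted, indexed BAM file. As a minimal pre-flight sketch, the snippet below only checks whether an index file sits next to the BAM before deduplication is started. `has_index` is a hypothetical helper, not part of UMI-Tools, and the suggestion it prints assumes samtools is installed.

```bash
# Hypothetical helper: succeeds if an index file is found next to the given BAM.
# Checks the common index names example.bam.bai, example.bai and example.bam.csi.
has_index() {
    [ -f "$1.bai" ] || [ -f "${1%.bam}.bai" ] || [ -f "$1.csi" ]
}

bam="example.bam"   # same file name as in the dedup command above

if has_index "$bam"; then
    echo "index found for $bam"
else
    echo "no index for $bam; consider running samtools sort, then samtools index"
fi
```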

...

The deduplication method of UMI-Tools has been published here.

NOTE: The current implementation of this method can take some time and can consume significant memory. If you experience issues with run time or memory usage, please refer to these FAQs.

...