How do I analyze my QuantSeq FWD-UMI Sequencing Data?

BlueBee Data Analysis Pipeline

Automated data analysis for QuantSeq FWD-UMI libraries is available on the BlueBee® Genomics Platform.
To analyze data from QuantSeq FWD libraries that contain UMIs, simply use the activation code included with your QuantSeq FWD library prep kit and select the respective "FWD-UMI" pipeline when setting up your data analysis run.

Be sure to select a "FWD-UMI" pipeline for UMI-containing libraries; otherwise, duplicate reads will not be collapsed.

For further details, see: "Can I analyze my QuantSeq data using the Data Analysis Pipelines on the BlueBee® Genomics Platform?"

UMI-Tools

As an alternative to BlueBee, we recommend the publicly available UMI-Tools package, which is hosted on GitHub. Detailed documentation can be found on ReadTheDocs. The following command extracts the 6 nt UMI sequence from the start of each read and removes the adjacent 4 nt TATA spacer:

umi_tools extract --extract-method=regex --bc-pattern "(?P<umi_1>.{6})(?P<discard_1>.{4}).*" -L "/path/to/my_outputlog.txt" -I "/path/to/my_input.fastq.gz" -S "/path/to/my_output.fastq.gz"
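To make the effect of the pattern more concrete, the following Python sketch applies the same regular expression to a single read. It is only an illustration, not the UMI-Tools implementation: the function name extract_umi and the example read are hypothetical, and appending the UMI to the read name with an underscore reflects our understanding of the default UMI-Tools behaviour.

```python
import re

# Same pattern as passed to --bc-pattern: 6 nt UMI, 4 nt spacer, then the insert.
PATTERN = re.compile(r"(?P<umi_1>.{6})(?P<discard_1>.{4}).*")


def extract_umi(name, seq, qual):
    """Illustrative helper (hypothetical, not part of UMI-Tools).

    Returns (new_name, trimmed_seq, trimmed_qual), or None if the read is
    too short to contain both the UMI and the spacer.
    """
    match = PATTERN.match(seq)
    if match is None:
        return None
    umi = match.group("umi_1")
    # Everything up to the end of the discarded spacer is removed from the read.
    start = match.end("discard_1")
    # Assumption: the UMI is appended to the read name with "_" as separator.
    return f"{name}_{umi}", seq[start:], qual[start:]


example = extract_umi("read1", "ACGTACTATAGGGCCCAAATTT", "I" * 22)
print(example)
```

In this example the first six bases (ACGTAC) are moved to the read name, the TATA spacer is dropped, and only the remaining insert sequence is kept for alignment.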

After alignment, reads can be deduplicated with the following command (the input BAM file must be coordinate-sorted and indexed):

umi_tools dedup -I example.bam --output-stats=deduplicated -S deduplicated.bam

The deduplication method of UMI-Tools has been published here.

NOTE: The current implementation of this method can take some time and can consume significant memory. If you experience issues with run time or memory usage, please refer to these FAQs.

If you would like to use a simpler deduplication method, you can add the following parameter to the dedup command:

 --method=unique

This will collapse only reads that share the same mapping position and have identical UMI sequences. UMIs that differ by a sequencing error are not merged.
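The idea behind collapsing only identical UMIs can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the UMI-Tools code: the function dedup_unique and the dictionary-based read records (contig, pos, strand, umi) are hypothetical stand-ins for aligned BAM records.

```python
def dedup_unique(reads):
    """Keep one read per (contig, position, strand, UMI) combination.

    Hypothetical sketch: UMIs are compared as exact strings, so a UMI that
    differs by a single base is treated as a separate molecule.
    """
    kept = {}
    for read in reads:
        key = (read["contig"], read["pos"], read["strand"], read["umi"])
        # The first read seen for each key is retained; later ones are duplicates.
        kept.setdefault(key, read)
    return list(kept.values())


reads = [
    {"name": "r1", "contig": "chr1", "pos": 100, "strand": "+", "umi": "ACGTAC"},
    {"name": "r2", "contig": "chr1", "pos": 100, "strand": "+", "umi": "ACGTAC"},  # duplicate of r1
    {"name": "r3", "contig": "chr1", "pos": 100, "strand": "+", "umi": "ACGTAA"},  # one mismatch, kept
    {"name": "r4", "contig": "chr1", "pos": 250, "strand": "+", "umi": "ACGTAC"},  # other position, kept
]
deduplicated = dedup_unique(reads)
print([r["name"] for r in deduplicated])
```

Because no UMI error correction is attempted, this approach needs less time and memory, but UMIs altered by sequencing errors are counted as additional molecules.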

For further information contact us at support@lexogen.com.