What is the difference between a Read and a Fragment in RNA-seq?

January 10, 2025

In RNA-seq (and other sequencing applications such as whole-genome sequencing), the terms Read and Fragment come up constantly, and they mean different things. Here’s a detailed explanation of the difference between the two, along with how they relate to single-end and paired-end sequencing:

1. Fragment

A fragment refers to a physical piece of DNA or RNA that is generated during the library preparation process.
In RNA-seq, RNA molecules are fragmented and converted into complementary DNA (cDNA); depending on the protocol, fragmentation happens either before or after cDNA synthesis. The resulting pieces are called fragments.
The size of these fragments is often controlled during library preparation and is referred to as the insert size (the length of the DNA/RNA between the sequencing adapters).
Fragments are the templates for sequencing.

2. Read

A read is the sequenced output of a fragment (or part of a fragment) generated by the sequencing machine.
Each read corresponds to the sequence of nucleotides determined by the sequencer.
Reads can be generated from one or both ends of a fragment, depending on the sequencing strategy (single-end or paired-end).

3. Single-End vs. Paired-End Sequencing

Single-End Sequencing:
- Only one end of the fragment is sequenced.
- This produces one read per fragment.
- Example: If a fragment is 300 bp long, and the read length is 100 bp, you will get a single 100 bp read from one end of the fragment.
Paired-End Sequencing:
- Both ends of the fragment are sequenced.
- This produces two reads per fragment (one from each end).
- Example: If a fragment is 300 bp long, and the read length is 100 bp, you will get two 100 bp reads: one starting from each end of the fragment. The two reads come from opposite strands, so they point toward each other.
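The two modes above can be sketched in a few lines of Python. This is only an idealized illustration, not how a sequencer or any particular tool works: the function names `reverse_complement` and `sequence_fragment` are made up for this example, and the sketch assumes error-free reads and the common layout in which the second read is the reverse complement of the far end of the fragment.

```python
# Illustrative sketch: derive idealized reads from one fragment sequence.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def sequence_fragment(fragment: str, read_length: int, paired: bool) -> list[str]:
    """Return the read(s) obtained from one fragment (hypothetical helper)."""
    # Read 1 covers the first read_length bases of the fragment.
    read1 = fragment[:read_length]
    if not paired:
        return [read1]
    # Read 2 starts at the opposite end and comes from the other strand
    # (assumption: standard inward-facing paired-end layout).
    read2 = reverse_complement(fragment[-read_length:])
    return [read1, read2]


fragment = "ACGT" * 75  # a 300 bp fragment
single = sequence_fragment(fragment, 100, paired=False)
pair = sequence_fragment(fragment, 100, paired=True)
print(len(single), [len(r) for r in single])  # 1 [100]
print(len(pair), [len(r) for r in pair])      # 2 [100, 100]
```

Either way there is exactly one fragment; only the number of reads taken from it changes.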

4. Relationship Between Fragment and Read

A fragment is the original piece of DNA/RNA, while a read is the sequenced output of that fragment.
In single-end sequencing, one read corresponds to one end of a fragment.
In paired-end sequencing, two reads correspond to the two ends of the same fragment.

5. Example in RNA-seq

Suppose you have an RNA molecule that is converted into cDNA and fragmented into pieces of 300 bp.
During sequencing:
- In single-end mode, you might get a 100 bp read from one end of the fragment.
- In paired-end mode, you might get two 100 bp reads: one from the forward end and one from the reverse end of the same fragment.

6. Why Does This Matter?

Fragment Length vs. Read Length:
- The fragment length (insert size) is the actual size of the DNA/RNA piece.
- The read length is the number of bases sequenced from the fragment.
- For example, a 300 bp fragment might produce two 100 bp reads in paired-end sequencing, leaving a 100 bp unsequenced gap in the middle.
Applications:
- Paired-end sequencing provides more information about the fragment, such as the distance between the two reads, which can help with alignment and structural variant detection.
- Single-end sequencing is simpler and cheaper but provides less information.
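The arithmetic behind the 100 bp gap can be written out as a small sketch. The helper names `inner_distance` and `sequenced_bases` are hypothetical and chosen only for this illustration; the sketch also assumes both reads of a pair have the same length.

```python
def inner_distance(fragment_length: int, read_length: int) -> int:
    """Bases between the two reads of a pair.

    A negative value means the fragment is shorter than two read lengths,
    so the reads overlap instead of leaving a gap.
    """
    return fragment_length - 2 * read_length


def sequenced_bases(fragment_length: int, read_length: int, paired: bool) -> int:
    """Number of distinct fragment positions covered by the read(s)."""
    n_reads = 2 if paired else 1
    # Reads cannot cover more positions than the fragment contains.
    return min(fragment_length, n_reads * read_length)


print(inner_distance(300, 100))                 # 100 (unsequenced gap)
print(sequenced_bases(300, 100, paired=True))   # 200
print(sequenced_bases(300, 100, paired=False))  # 100
print(inner_distance(150, 100))                 # -50 (reads overlap by 50 bp)
```

With a 300 bp fragment and 100 bp reads, paired-end mode covers 200 of the 300 bases, while single-end mode covers only 100.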

7. Key Points

A fragment is the physical piece of DNA/RNA being sequenced.
A read is the sequenced output of a fragment.
Single-end sequencing produces one read per fragment.
Paired-end sequencing produces two reads per fragment (one from each end).
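To make the read-versus-fragment bookkeeping concrete, here is a minimal sketch that counts both from a list of read names. It assumes, purely for illustration, that the two mates of a pair share a name apart from a trailing "/1" or "/2"; real data may use other naming schemes, and `count_reads_and_fragments` is a hypothetical helper, not a function from any library.

```python
def count_reads_and_fragments(read_names: list[str]) -> tuple[int, int]:
    """Return (number of reads, number of distinct fragments).

    Assumption: mates of one fragment share a name except for an
    optional '/1' or '/2' suffix; single-end reads have no suffix.
    """
    fragments = set()
    for name in read_names:
        # Strip the mate suffix so both reads map to the same fragment name.
        base = name[:-2] if name.endswith(("/1", "/2")) else name
        fragments.add(base)
    return len(read_names), len(fragments)


paired_names = ["frag1/1", "frag1/2", "frag2/1", "frag2/2"]
single_names = ["frag1", "frag2", "frag3"]
print(count_reads_and_fragments(paired_names))  # (4, 2)
print(count_reads_and_fragments(single_names))  # (3, 3)
```

In the paired-end case four reads represent only two fragments, which is why counting reads and counting fragments give different numbers for the same library.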

8. Visualization

Here’s a simple visualization:

Fragment:   ------------------------------ (300 bp)
Single-End: ----------                     (100 bp read from one end)
Paired-End: ----------                     (100 bp read from one end)
                                ---------- (100 bp read from the other end)

Understanding the difference between fragment and read is crucial for interpreting RNA-seq data, designing experiments, and analyzing sequencing results.