What is the difference between a Read and a Fragment in RNA-seq?
January 10, 2025
In RNA-seq (and other sequencing technologies like whole-genome sequencing), the terms Read and Fragment are often used, and they have distinct meanings. Here’s a detailed explanation of the difference between the two, along with how they relate to single-end and paired-end sequencing:
1. Fragment
- A fragment refers to a physical piece of DNA or RNA that is generated during the library preparation process.
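- In RNA-seq, RNA molecules are fragmented into smaller pieces and converted into complementary DNA (cDNA); whether fragmentation happens before or after cDNA synthesis depends on the library preparation protocol. The resulting cDNA pieces are called fragments.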
- The size of these fragments is often controlled during library preparation and is referred to as the insert size (the length of the DNA/RNA between the sequencing adapters).
- Fragments are the templates for sequencing.
2. Read
- A read is the sequenced output of a fragment (or part of a fragment) generated by the sequencing machine.
- Each read corresponds to the sequence of nucleotides determined by the sequencer.
- Reads can be generated from one or both ends of a fragment, depending on the sequencing strategy (single-end or paired-end).
3. Single-End vs. Paired-End Sequencing
- Single-End Sequencing:
  - Only one end of the fragment is sequenced.
  - This produces one read per fragment.
  - Example: If a fragment is 300 bp long and the read length is 100 bp, you will get a single 100 bp read from one end of the fragment.
- Paired-End Sequencing:
  - Both ends of the fragment are sequenced.
  - This produces two reads per fragment (one from each end).
  - Example: If a fragment is 300 bp long and the read length is 100 bp, you will get two 100 bp reads: read 1 from one end of the fragment and read 2 from the opposite end, sequenced from the complementary strand.
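The 300 bp example can be sketched in a few lines of Python. This is only an illustration, not how a sequencer works internally: the fragment is a random string, the function names are made up for this post, and it assumes the common Illumina-style convention in which read 2 is the reverse complement of the fragment's far end.

```python
import random
from typing import Tuple


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def single_end_read(fragment: str, read_length: int) -> str:
    """Single-end: one read taken from one end of the fragment."""
    return fragment[:read_length]


def paired_end_reads(fragment: str, read_length: int) -> Tuple[str, str]:
    """Paired-end: read 1 from one end, read 2 from the opposite end.

    Assumption: read 2 is reported as the reverse complement of the
    fragment's last `read_length` bases.
    """
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment)[:read_length]
    return read1, read2


# Hypothetical 300 bp fragment made of random bases.
random.seed(0)
fragment = "".join(random.choice("ACGT") for _ in range(300))

se = single_end_read(fragment, 100)
r1, r2 = paired_end_reads(fragment, 100)
print(len(se), len(r1), len(r2))  # 100 100 100
```

In both modes the fragment is the same 300 bp template; only the number of reads taken from it changes.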
4. Relationship Between Fragment and Read
- A fragment is the original piece of DNA/RNA, while a read is the sequenced output of that fragment.
- In single-end sequencing, one read corresponds to one end of a fragment.
- In paired-end sequencing, two reads correspond to the two ends of the same fragment.
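One practical consequence of this relationship is that the number of reads and the number of fragments can differ. Here is a minimal sketch that groups reads by the fragment they came from. The record layout (fragment ID, mate number, sequence) is a hypothetical one chosen for illustration; real files encode this information differently.

```python
from collections import defaultdict

# Hypothetical read records: (fragment_id, mate_number, sequence).
reads = [
    ("frag1", 1, "ACGT"),
    ("frag1", 2, "TTGA"),
    ("frag2", 1, "GGCA"),
    ("frag2", 2, "CCAT"),
    ("frag3", 1, "ATAT"),  # only one mate present in this toy data
]


def group_reads_by_fragment(records):
    """Map each fragment ID to the list of (mate, sequence) reads from it."""
    fragments = defaultdict(list)
    for fragment_id, mate, sequence in records:
        fragments[fragment_id].append((mate, sequence))
    return dict(fragments)


fragments = group_reads_by_fragment(reads)
print(len(reads), "reads from", len(fragments), "fragments")
# prints: 5 reads from 3 fragments
```

Counting fragments rather than reads avoids counting the same piece of cDNA twice in paired-end data.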
5. Example in RNA-seq
- Suppose you have an RNA molecule that is converted into cDNA and fragmented into pieces of 300 bp.
- During sequencing:
  - In single-end mode, you might get a 100 bp read from one end of the fragment.
  - In paired-end mode, you might get two 100 bp reads, one from each end of the same fragment.
6. Why Does This Matter?
- Fragment Length vs. Read Length:
  - The fragment length (insert size) is the actual size of the DNA/RNA piece.
  - The read length is the number of bases sequenced from the fragment.
  - For example, a 300 bp fragment might produce two 100 bp reads in paired-end sequencing, leaving a 100 bp unsequenced gap in the middle.
- Applications:
  - Paired-end sequencing provides more information about the fragment, such as the distance between the two reads, which can help with alignment and structural variant detection.
  - Single-end sequencing is simpler and cheaper but provides less information.
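The arithmetic behind the fragment length vs. read length example is simple enough to write down. The helper below is a hypothetical function for illustration: it subtracts the two read lengths from the fragment length, so a positive result is the unsequenced gap and a negative result means the two reads overlap in the middle of the fragment.

```python
def paired_end_gap(fragment_length: int, read_length: int) -> int:
    """Bases left unsequenced between the two reads of a pair.

    A negative value means the reads overlap by that many bases.
    """
    return fragment_length - 2 * read_length


print(paired_end_gap(300, 100))  # 100 -> 100 bp unsequenced gap
print(paired_end_gap(150, 100))  # -50 -> reads overlap by 50 bp
```

The second call shows a case the 300 bp example does not cover: when the fragment is shorter than twice the read length, the two reads sequence some of the same bases.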
7. Key Points
- A fragment is the physical piece of DNA/RNA being sequenced.
- A read is the sequenced output of a fragment.
- Single-end sequencing produces one read per fragment.
- Paired-end sequencing produces two reads per fragment (one from each end).
8. Visualization
Here’s a simple visualization:
Fragment:    ------------------------------ (300 bp)

Single-End:  ----------                     (100 bp read from one end)

Paired-End:  ----------                     (100 bp read from one end)
             ------------------------------ (300 bp)
                                 ---------- (100 bp read from the other end)

Understanding the difference between fragment and read is crucial for interpreting RNA-seq data, designing experiments, and analyzing sequencing results.