
Understanding Samtools View Output
January 3, 2025
This step-by-step guide explains the output of samtools view. It covers the fields in the output, provides example scripts for processing, and lists tools and software for interpreting the data.
Step 1: Command Overview
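The samtools view command prints alignments from a SAM or BAM file. When a region is given, only alignments overlapping that region are returned, which requires an indexed file (for example, a coordinate-sorted BAM with a .bai index).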
Example:
This command extracts alignments from BAMFILE in the region spanning 1,000,000 to 2,000,000 on chromosome 2.
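A region is written as NAME:START-END with 1-based, inclusive coordinates. The following is a minimal sketch of how such a command line could be assembled in Python; build_view_command is a hypothetical helper, BAMFILE is a placeholder path, and the contig name is an assumption (it may be "2" or "chr2" depending on the names in the BAM header).

```python
import shlex


def build_view_command(bam_path, contig, start, end):
    """Return the argument list for `samtools view` restricted to one region.

    start and end are 1-based and inclusive, as in samtools region strings.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    region = f"{contig}:{start}-{end}"
    return ["samtools", "view", bam_path, region]


# Placeholder file name; the contig name must match the BAM header.
cmd = build_view_command("BAMFILE", "chr2", 1_000_000, 2_000_000)
print(shlex.join(cmd))  # prints: samtools view BAMFILE chr2:1000000-2000000
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems with unusual file names.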
Step 2: Output Format Explanation
The output of samtools view is tab-delimited text in SAM format. Each row represents a single alignment and starts with the following mandatory columns:
- QNAME: Query name (e.g., read identifier).
- FLAG: Bitwise flag indicating alignment information (e.g., strand, paired-end, etc.).
- RNAME: Reference sequence name (e.g., chromosome).
- POS: Leftmost 1-based position of the alignment on the reference.
- MAPQ: Mapping quality (Phred scale).
- CIGAR: Encoded representation of alignment (e.g., matches, insertions, deletions).
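- MRNM: Mate reference sequence (= if the same as RNAME). Called RNEXT in the current SAM specification.
- MPOS: 1-based position of the mate. Called PNEXT in the current SAM specification.
- ISIZE: Inferred insert size. Called TLEN (template length) in the current SAM specification.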
- SEQ: Aligned sequence.
- QUAL: ASCII-encoded quality score for each base in the sequence (Phred score plus 33).
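To make the column layout concrete, here is a minimal sketch that splits one output line into the eleven mandatory fields and decodes the FLAG bits. The names SamRecord, parse_mandatory, and decode_flag are illustrative, and the sample record is made up; the bit values follow the SAM specification.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str  # MRNM in older documentation
    pnext: int  # MPOS
    tlen: int   # ISIZE
    seq: str
    qual: str


# FLAG bit values as defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired", 0x2: "proper_pair", 0x4: "unmapped", 0x8: "mate_unmapped",
    0x10: "reverse", 0x20: "mate_reverse", 0x40: "read1", 0x80: "read2",
    0x100: "secondary", 0x200: "qc_fail", 0x400: "duplicate",
    0x800: "supplementary",
}


def parse_mandatory(line):
    """Parse the first 11 tab-separated columns of one alignment line."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError("a SAM alignment line has at least 11 columns")
    q, f, r, p, m, c, rn, pn, tl, s, ql = cols[:11]
    return SamRecord(q, int(f), r, int(p), int(m), c, rn, int(pn), int(tl), s, ql)


def decode_flag(flag):
    """Return the names of all bits set in FLAG, lowest bit first."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


# Made-up record for illustration only.
line = "read1\t99\tchr2\t1000100\t60\t10M\t=\t1000300\t210\tACGTACGTAC\tIIIIIIIIII"
rec = parse_mandatory(line)
print(rec.rname, rec.pos, decode_flag(rec.flag))
# prints: chr2 1000100 ['paired', 'proper_pair', 'mate_reverse', 'read1']
```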
Optional fields (tags):
- RG: Read group.
- NM: Edit distance.
- OQ: Original base qualities, usually from before recalibration.
- E2: Second most likely base calls.
- Additional tags may vary.
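Optional fields appear from column 12 onward in the form TAG:TYPE:VALUE. The sketch below converts them into a dictionary; parse_tags is a hypothetical helper, the sample line is made up, and only the integer (i) and float (f) types are converted while all other types are left as strings.

```python
def parse_tags(columns):
    """Map tag names to values for the optional columns of one alignment."""
    tags = {}
    for col in columns:
        # Split at most twice: values of type Z may themselves contain colons.
        tag, typ, value = col.split(":", 2)
        if typ == "i":
            value = int(value)
        elif typ == "f":
            value = float(value)
        tags[tag] = value
    return tags


# Made-up record for illustration only.
line = "read1\t0\tchr2\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:sample1\tNM:i:2"
tags = parse_tags(line.split("\t")[11:])
print(tags)  # prints: {'RG': 'sample1', 'NM': 2}
```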
Step 3: Script for Parsing SAM/BAM
Using Python (pysam):
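A minimal sketch of reading a region with pysam, assuming an indexed BAM file. The helper names iter_region and summarize, the BAMFILE path, and the contig name are illustrative; pysam is imported inside iter_region so that summarize can be tried without it.

```python
def iter_region(bam_path, contig, start, end):
    """Yield alignments overlapping contig:start-end (1-based, inclusive)."""
    import pysam  # third-party package, installed separately

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        # fetch() uses 0-based, half-open coordinates and requires an index.
        yield from bam.fetch(contig, start - 1, end)


def summarize(reads):
    """Count reads and average the mapping quality of the mapped ones."""
    total = mapped = reverse = mapq_sum = 0
    for read in reads:
        total += 1
        if read.is_unmapped:
            continue
        mapped += 1
        if read.is_reverse:
            reverse += 1
        mapq_sum += read.mapping_quality
    mean_mapq = mapq_sum / mapped if mapped else 0.0
    return {"total": total, "mapped": mapped,
            "reverse": reverse, "mean_mapq": mean_mapq}


# Example call (placeholder path and contig name):
# print(summarize(iter_region("BAMFILE", "chr2", 1_000_000, 2_000_000)))
```

Each object yielded by fetch() also exposes the SAM columns as attributes, such as query_name, flag, reference_name, reference_start (0-based), mapping_quality, and cigarstring.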
Using Unix:
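Because the output is plain tab-delimited text, it can be piped into any program that reads standard input. As a sketch of that pattern, the filter below counts mapped alignments per reference sequence; count_by_reference and the script name count_refs.py are hypothetical, and the demo lines are made up.

```python
import sys
from collections import Counter


def count_by_reference(lines):
    """Count mapped alignments per RNAME from SAM-formatted text lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):  # header lines, present with samtools view -h
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 11:
            continue  # not a complete alignment line
        if int(cols[1]) & 0x4:  # FLAG bit 0x4: read is unmapped
            continue
        counts[cols[2]] += 1
    return counts


def main():
    """Read SAM text from standard input and print one count per reference."""
    for name, n in count_by_reference(sys.stdin).most_common():
        print(f"{name}\t{n}")


# Made-up lines standing in for piped input.
demo = [
    "@HD\tVN:1.6",
    "r1\t0\tchr2\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr2\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(count_by_reference(demo))  # prints: Counter({'chr2': 2})
```

Saved as count_refs.py with a call to main() at the end, it could be used as: samtools view BAMFILE | python count_refs.py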
Using R (Rsamtools):
Step 4: Online Tools for SAM/BAM Interpretation
- IGV (Integrative Genomics Viewer)
Visualize BAM/SAM files alongside the genome.
Website: IGV
- Galaxy
Offers tools for processing and interpreting BAM/SAM files.
Website: Galaxy Project
- SAMStat
Generates summaries and quality control metrics for SAM/BAM files.
Website: SAMStat
- BamTools
Comprehensive toolkit for BAM file analysis.
Website: BamTools