
Step-by-Step Manual: Converting GFF3 to GTF
January 9, 2025
Converting GFF3 (General Feature Format version 3) to GTF (Gene Transfer Format) is a common task in bioinformatics, especially for downstream analyses like RNA-seq. Below is a detailed guide on how to perform this conversion using popular tools.
1. Understand the Formats
- GFF3: A flexible format for describing genomic features. It has 9 columns: seqid, source, type, start, end, score, strand, phase, and attributes.
- GTF: A stricter format derived from GFF2, commonly used for gene annotation. It also has 9 columns but requires specific attributes like gene_id and transcript_id.
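Both formats are tab-separated and differ mainly in the ninth column: GFF3 writes attributes as key=value pairs (for example ID=gene1;Name=abc), while GTF writes them as key "value"; pairs (for example gene_id "gene1"; transcript_id "tx1";). As a rough sanity check before converting, the sketch below reports feature lines that do not have exactly 9 columns. The function name check_nine_columns is hypothetical, and the check stops at a ##FASTA directive on the assumption that everything after it is embedded sequence.

```shell
# Hypothetical helper: print any feature line that does not have exactly
# 9 tab-separated columns and return non-zero if one is found.
# Comment lines and blank lines are skipped; scanning stops at ##FASTA.
check_nine_columns() {
    awk -F '\t' '
        /^##FASTA/ { exit }                      # jump to END, skip sequence data
        /^#/ || /^[[:space:]]*$/ { next }        # comments, directives, blank lines
        NF != 9 { printf "line %d: %d columns\n", NR, NF; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

Usage: check_nine_columns input.gff3 && echo "column count OK"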
2. Use gffread from Cufflinks
gffread is a widely used tool for converting GFF3 to GTF.
Step 2.1: Install Cufflinks
If you don’t have gffread, install Cufflinks:
# Download Cufflinks
wget http://cole-trapnell-lab.github.io/cufflinks/assets/downloads/cufflinks-2.2.1.Linux_x86_64.tar.gz
# Extract the tarball
tar -xzvf cufflinks-2.2.1.Linux_x86_64.tar.gz
# Add to PATH
export PATH=$PATH:/path/to/cufflinks-2.2.1.Linux_x86_64
Step 2.2: Convert GFF3 to GTF
Run gffread to convert your GFF3 file:
gffread input.gff3 -T -o output.gtf
- input.gff3: Your input GFF3 file.
- -T: Writes the output as GTF instead of GFF3.
- output.gtf: The output GTF file.
3. Use AGAT (Another GFF Analysis Toolkit)
AGAT is a powerful toolkit for working with GFF/GTF files.
Step 3.1: Install AGAT
Install AGAT using Conda:
conda install -c bioconda agat
Step 3.2: Convert GFF3 to GTF
Run the following command:
agat_convert_sp_gff2gtf.pl --gff input.gff3 -o output.gtf
- input.gff3: Your input GFF3 file.
- output.gtf: The output GTF file.
4. Use rtracklayer in R
If you prefer working in R, you can use the rtracklayer package from Bioconductor.
Step 4.1: Install rtracklayer
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("rtracklayer")
Step 4.2: Convert GFF3 to GTF
Run the following R script:
library(rtracklayer)
# Import GFF3 file
gff3_file <- "input.gff3"
gff3_data <- import(gff3_file)
# Export as GTF
gtf_file <- "output.gtf"
export(gff3_data, gtf_file, format = "gtf")
5. Use GenomeTools
GenomeTools is another tool for working with GFF/GTF files.
Step 5.1: Install GenomeTools
# On Ubuntu/Debian
sudo apt-get install genometools
# On macOS
brew install genometools
Step 5.2: Convert GFF3 to GTF
Run the following command:
gt gff3_to_gtf input.gff3 > output.gtf
6. Validate the Output
After conversion, validate the GTF file to ensure it meets the required format:
- Check for mandatory attributes like gene_id and transcript_id.
- Run the file through a tool like gtf2bed, or load it in IGV, to confirm that it parses and displays as expected.
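The attribute check above can be sketched with awk. This is a minimal example, not a full GTF validator: the function name check_gtf_attributes is hypothetical, and it assumes that rows of type "gene" need only gene_id (as in Ensembl-style GTF files) while every other feature row needs both gene_id and transcript_id. Adjust that rule if your downstream tool is stricter.

```shell
# Hypothetical helper: list GTF feature lines that lack a required attribute.
# Assumption: "gene" rows need only gene_id; all other rows need
# both gene_id and transcript_id with a non-empty quoted value.
check_gtf_attributes() {
    awk -F '\t' '
        /^#/ || NF < 9 { next }                  # skip comments and short lines
        $9 !~ /(^|;) *gene_id "[^"]+"/ {
            printf "line %d: missing gene_id\n", NR; bad = 1
        }
        $3 != "gene" && $9 !~ /(^|;) *transcript_id "[^"]+"/ {
            printf "line %d: missing transcript_id\n", NR; bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Usage: check_gtf_attributes output.gtf || echo "output.gtf needs attention"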
7. Automate the Workflow
If you frequently convert GFF3 to GTF, consider automating the process using a script or workflow manager like Snakemake or Nextflow.
Example Snakemake Workflow
rule all:
    input:
        "output.gtf"

rule convert_gff3_to_gtf:
    input:
        "input.gff3"
    output:
        "output.gtf"
    shell:
        "gffread {input} -T -o {output}"
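For an occasional batch where a workflow manager feels like too much, a plain shell loop may be enough. The sketch below reuses the gffread command from Step 2.2; the function name convert_dir and the choice to write each .gtf next to its .gff3 input are assumptions for illustration.

```shell
# Hypothetical helper: convert every *.gff3 file in a directory to GTF,
# writing each result next to its input (sample.gff3 -> sample.gtf).
convert_dir() {
    dir=${1:-.}
    for gff in "$dir"/*.gff3; do
        [ -e "$gff" ] || continue        # no matches: the glob stays literal
        gtf=${gff%.gff3}.gtf             # swap the extension
        gffread "$gff" -T -o "$gtf" || return 1
    done
}
```

Usage: convert_dir /path/to/annotations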
Recent Tools and Tips
- AGAT: A comprehensive toolkit for GFF/GTF manipulation.
- gffread: Fast and reliable for GFF3-to-GTF conversion.
- rtracklayer: Ideal for R users working with genomic data.
- GenomeTools: A versatile tool for GFF/GTF manipulation.
Tips for Conversion
- Check Attribute Consistency: Ensure mandatory attributes like gene_id and transcript_id are present in the GTF file.
- Handle Large Files: Use tools like AGAT or gffread for efficient processing of large GFF3 files.
- Validate Output: Always validate the converted GTF file to ensure it meets the required format.
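One simple way to act on the validation tip is to tally feature types (column 3) before and after conversion and compare the two lists. The helper below is a sketch with a hypothetical name, feature_counts. Some differences can be legitimate, since converters may drop or rename feature types, so treat a mismatch as a prompt to look closer rather than as proof of an error.

```shell
# Hypothetical helper: count features per type (column 3), ignoring
# comment lines, and print "type<TAB>count" sorted by type.
feature_counts() {
    awk -F '\t' '
        !/^#/ && NF >= 9 { n[$3]++ }
        END { for (t in n) print t "\t" n[t] }
    ' "$1" | LC_ALL=C sort
}
```

Usage: feature_counts input.gff3; feature_counts output.gtf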
By following this guide, you can efficiently convert GFF3 files to GTF format using the latest tools and best practices.


















