Submitting High-Throughput Sequence Data to GEO (Gene Expression Omnibus)
December 3, 2024
This tutorial provides a step-by-step guide to submitting high-throughput sequence data to GEO (Gene Expression Omnibus), a public repository for gene expression data and other functional genomics datasets.
Step 1: Understand Submission Guidelines
- Visit the GEO Home Page: Navigate to GEO’s home page.
- Review Submission Guidelines: Go to the submission guidelines at www.ncbi.nlm.nih.gov/geo/info/seq.html.
- These guidelines provide details on required files and submission procedures.
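- Required Components for Submission:
- Metadata spreadsheet (see Step 2).
- Raw data files (see Step 3).
- Processed data files (see Step 3).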
Step 2: Prepare Metadata Spreadsheet
- Download the Metadata Template:
- Click on Metadata spreadsheet and then Download metadata spreadsheet on the submission guidelines page.
- Fill in Metadata:
- Metadata Tab: Provide detailed information for your study, samples, and protocols.
- Fields marked with an asterisk (*) are required. Ensure all required fields are completed.
- Hover over a field header in the spreadsheet to see tips for that field.
- Review Instructions:
- Use the Instructions tab in the spreadsheet for guidance.
- Refer to example worksheets for different experiment types.
- Special Notes:
- For paired-end fastq files, list both R1 and R2 files in the same row under the Samples section.
- Ensure filenames are unique and match submitted files exactly. Avoid whitespace or special characters in filenames.
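The filename rules above can be checked locally before uploading. The following is a minimal sketch, not an official GEO tool: the function name `check_filenames` is hypothetical, and the definition of a "safe" filename (letters, digits, dot, underscore, and hyphen only) is an assumption, since the guidelines only say to avoid whitespace and special characters.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: a "safe" name contains only letters, digits, dot, underscore, hyphen.
SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def check_filenames(listed, data_dir):
    """Compare filenames listed in the metadata against files in data_dir.

    Returns a list of human-readable problems; an empty list means that
    none of the checks below found an issue.
    """
    problems = []

    # Each filename should appear only once in the metadata.
    for name, count in sorted(Counter(listed).items()):
        if count > 1:
            problems.append(f"duplicate in metadata: {name} (listed {count} times)")
        if not SAFE_NAME.match(name):
            problems.append(f"unsafe characters: {name!r}")

    # Listed names must match the files on disk exactly (case-sensitive).
    on_disk = {p.name for p in Path(data_dir).iterdir() if p.is_file()}
    for name in sorted(set(listed) - on_disk):
        problems.append(f"listed but not found on disk: {name}")
    for name in sorted(on_disk - set(listed)):
        problems.append(f"on disk but not listed: {name}")

    return problems
```

To use it, copy the filenames from the Samples section of your spreadsheet into a list, pass that list together with the path of your data directory, and fix anything the function reports before moving on.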
Step 3: Format Data Files
- Raw Data:
- Typically provided in fastq or bam formats.
- Check the SRA documentation for its list of accepted file formats.
- Processed Data:
- Include final quantified data (e.g., normalized read counts for RNA-seq studies).
- Do not submit only differentially expressed genes or read alignment files as processed data.
- Organize Files:
- Place raw and processed data files in a single directory for each experiment.
- Do not compress fastq or bam files into .tar or .zip archives.
- Avoid subdirectories with identically named files.
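The two organization rules (no .tar or .zip archives, no identically named files across subdirectories) can be checked with a short script before transfer. This is a sketch under stated assumptions: the function name `scan_upload_dir` is hypothetical, and flagging any file with `.tar` or `.zip` anywhere in its suffixes (which also catches names such as `raw.tar.gz`) is my interpretation of the rule.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: any file whose suffixes include one of these counts as an archive.
ARCHIVE_SUFFIXES = {".tar", ".zip"}


def scan_upload_dir(root):
    """Walk root recursively and report archives and repeated filenames.

    Returns (archives, duplicates):
      archives   - relative paths of files that look like .tar/.zip archives
      duplicates - {filename: [relative paths]} for names found more than once
    """
    root = Path(root)
    archives = []
    by_name = defaultdict(list)

    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        suffixes = {s.lower() for s in path.suffixes}
        if suffixes & ARCHIVE_SUFFIXES:
            archives.append(rel)
        by_name[path.name].append(rel)

    duplicates = {name: paths for name, paths in by_name.items() if len(paths) > 1}
    return archives, duplicates
```

If both return values are empty, the directory passes these two checks. The script does not verify that the file formats themselves are acceptable.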
Step 4: Transfer Files to GEO
- Set Up FTP Transfer:
- Log in to your My NCBI account.
- Go to the GEO FTP submission page at geo/info/submissionftp.
- Click the Transfer Files button to create your personalized upload space.
- Upload Files:
- Transfer your folder containing all data files to the FTP upload space.
- Follow the instructions provided on the GEO FTP page.
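If you prefer to script the transfer, Python's standard `ftplib` module can do it. The sketch below is illustrative only: the function names `plan_upload` and `upload` are hypothetical, and the host, username, password, and remote folder must all come from your personalized GEO FTP page. It also assumes the remote folder already exists; follow the GEO instructions if you need to create it first.

```python
from ftplib import FTP
from pathlib import Path


def plan_upload(local_dir, remote_dir):
    """Pair each file in local_dir with its destination path on the server.

    Only files directly inside local_dir are included; subdirectories are skipped.
    """
    local_dir = Path(local_dir)
    remote_dir = remote_dir.rstrip("/")
    return [
        (path, f"{remote_dir}/{path.name}")
        for path in sorted(local_dir.iterdir())
        if path.is_file()
    ]


def upload(host, user, password, local_dir, remote_dir):
    """Upload every planned file in binary mode.

    host, user, password, and remote_dir are placeholders here; take the
    real values from your personalized GEO FTP page.
    """
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        for local_path, remote_path in plan_upload(local_dir, remote_dir):
            with open(local_path, "rb") as handle:
                ftp.storbinary(f"STOR {remote_path}", handle)
```

Separating the plan from the transfer lets you print and review the list of destination paths before any data is sent. A dedicated FTP client is an equally valid choice, particularly for very large files.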
Step 5: Upload Metadata File
- Access Metadata Upload Page:
- Use the Upload Metadata button to navigate to the metadata submission page.
- Select FTP Subfolder:
- Choose the FTP subfolder that holds your uploaded raw and processed data files; this links those files to the metadata file.
- Upload Metadata:
- Select and upload your local metadata file.
- Specify the release date for public access (maximum four years from the upload date).
- Provide additional comments if needed in the Comment to GEO staff box.
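To check a planned release date against the four-year limit mentioned above, a small date calculation is enough. This sketch assumes "four years" means four calendar years from the upload date; the function names are hypothetical, and the upload form itself remains the authority on which dates it accepts.

```python
from datetime import date

# Limit as stated in this tutorial; assumed to mean calendar years.
MAX_YEARS = 4


def latest_release_date(upload_date):
    """Return the same calendar day MAX_YEARS after upload_date."""
    target_year = upload_date.year + MAX_YEARS
    try:
        return upload_date.replace(year=target_year)
    except ValueError:
        # February 29 with no counterpart in the target year: fall back to February 28.
        return upload_date.replace(year=target_year, day=28)


def is_release_date_allowed(upload_date, release_date):
    """True if release_date is on or after upload_date and within the limit."""
    return upload_date <= release_date <= latest_release_date(upload_date)
```

For example, `latest_release_date(date(2024, 12, 3))` returns `date(2028, 12, 3)`.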
Step 6: Validate Submission
- Metadata Validation:
- After submission, the metadata file is assessed for missing fields or errors.
- Address any identified issues by uploading a revised metadata file.
- Receive Confirmation:
- Upon successful validation, you’ll receive a confirmation message and an email summarizing your submission.
Step 7: Process and Access Records
- Processing:
- Your submission enters the GEO processing queue.
- Once your submission has been processed, you’ll receive a GEO accession letter by email listing your accession numbers.
- Reviewer Access:
- The email will also include instructions for creating a reviewer access token.