Step-by-Step Manual to Find Disease-Associated SNPs
January 9, 2025
This guide covers tools, databases, and tips for identifying disease-associated SNPs, along with the supporting PMIDs, case/control numbers, and population studies.
Step 1: Define Your Research Goal
- Objective: Identify SNPs associated with a specific disease (e.g., diabetes, Alzheimer’s).
- Required Information:
- Disease name or trait.
- Gene(s) of interest (optional).
- Chromosomal region (optional).
- Population or study details (optional).
Step 2: Use the NHGRI-EBI GWAS Catalog
- Purpose: Find curated SNP-disease associations from published genome-wide association studies (GWAS).
- Website: NHGRI-EBI GWAS Catalog
- Steps:
- Go to the GWAS Catalog website.
- Use the search bar to enter your disease/trait (e.g., “Type 2 Diabetes”).
- Filter results by:
- P-value threshold (e.g., < 1 × 10⁻⁵, the Catalog's inclusion cutoff, or < 5 × 10⁻⁸ for genome-wide significance).
- Population (e.g., European, Asian).
- Study size (cases/controls).
- Review the results:
- SNP ID (e.g., rs7903146).
- Associated gene(s).
- PMID (PubMed ID) for the study.
- Odds ratio, confidence intervals, and p-values.
- Export results as a CSV/TSV file for further analysis.
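The p-value filter above can also be applied to the exported file. A minimal sketch, assuming a tab-separated export with columns named SNPS, P-VALUE, and PUBMEDID (check the header of your own download, since the names may differ). The rows are made-up placeholders, not real associations.

```python
import csv
import io


def filter_associations(tsv_text, max_p=5e-8, p_col="P-VALUE"):
    """Return rows whose p-value is at or below max_p."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        try:
            p_value = float(row[p_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or unparsable p-value
        if p_value <= max_p:
            kept.append(row)
    return kept


# Placeholder rows for illustration only; these are not real associations.
SAMPLE = (
    "SNPS\tP-VALUE\tPUBMEDID\n"
    "rs0000001\t3E-12\t00000001\n"
    "rs0000002\t2E-6\t00000002\n"
    "rs0000003\tNR\t00000003\n"
)

hits = filter_associations(SAMPLE)
print([row["SNPS"] for row in hits])
```

Passing max_p=1e-5 instead keeps every row that meets the Catalog's own inclusion cutoff.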
Step 3: Cross-Reference with dbSNP
- Purpose: Retrieve detailed SNP information, including functional annotations and population frequencies.
- Website: dbSNP
- Steps:
- Enter the SNP ID (e.g., rs7903146) in the search bar.
- Review the SNP summary:
- Genomic location (chromosome, position).
- Functional consequences (e.g., missense, intronic).
- Allele frequencies in different populations.
- Check the “Clinical Significance” section for disease associations.
- Use the “PubMed” link to find related studies.
- Export data using the “Send to” option (e.g., file, clipboard).
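The same lookup can be scripted through NCBI E-utilities. A minimal sketch that only builds the ESummary request URL for one rsID; the helper name dbsnp_summary_url is hypothetical, and the structure of the JSON response should be inspected before relying on any particular field.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def dbsnp_summary_url(rsid):
    """Build an ESummary URL for one rsID such as 'rs7903146'."""
    rsid = rsid.strip().lower()
    if not (rsid.startswith("rs") and rsid[2:].isdigit()):
        raise ValueError(f"not an rsID: {rsid!r}")
    # The snp database is queried by the numeric part of the rsID.
    params = {"db": "snp", "id": rsid[2:], "retmode": "json"}
    return f"{EUTILS}/esummary.fcgi?{urlencode(params)}"


print(dbsnp_summary_url("rs7903146"))
```

The URL can be opened in a browser or fetched with urllib.request.urlopen.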
Step 4: Explore SNPedia for Curated SNP-Disease Associations
- Purpose: Access manually curated SNP-disease associations and additional annotations.
- Website: SNPedia
- Steps:
- Search for your disease (e.g., “Type 2 Diabetes”) or SNP ID (e.g., rs7903146).
- Review the SNP page for:
- Disease associations.
- Genotype risks (e.g., TT, CT, CC).
- Links to external databases (e.g., ClinVar, OMIM).
- Use Promethease for personalized SNP analysis:
- Upload your raw genetic data (e.g., 23andMe, AncestryDNA).
- Generate a report with disease-associated SNPs.
- Export the report as a CSV/TSV file.
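Before uploading raw data, it can help to check which SNPs of interest are actually present in the file. A minimal sketch, assuming the 23andMe-style raw format (tab-separated rsid, chromosome, position, genotype, with comment lines starting with #); other providers use different layouts. The sample rows are placeholders.

```python
def genotypes_for(raw_text, rsids):
    """Return {rsid: genotype} for the requested rsIDs found in a raw data file."""
    wanted = set(rsids)
    found = {}
    for line in raw_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # not the assumed four-column layout
        rsid, _chrom, _pos, genotype = fields
        if rsid in wanted:
            found[rsid] = genotype
    return found


# Placeholder content for illustration only.
SAMPLE_RAW = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tCT\n"
    "rs0000002\t2\t2000\t--\n"
)

print(genotypes_for(SAMPLE_RAW, ["rs0000001", "rs9999999"]))
```

SNPs that are absent from the file are simply missing from the returned dictionary.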
Step 5: Use OMIM for Gene-Disease Relationships
- Purpose: Identify genes associated with your disease and their SNPs.
- Website: OMIM
- Steps:
- Search for your disease (e.g., “Alzheimer’s Disease”).
- Review the gene entries (e.g., APP, PSEN1, PSEN2).
- Check the “Allelic Variants” section for SNPs and their clinical significance.
- Use the “PubMed” links to find related studies.
- Export gene and SNP data for further analysis.
Step 6: Analyze Genomic Context with UCSC Genome Browser
- Purpose: Visualize SNP locations, nearby genes, and functional annotations.
- Website: UCSC Genome Browser
- Steps:
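- Enter your SNP ID or genomic coordinates (e.g., chr10:114758349-114758349, the GRCh37/hg19 position of rs7903146). Make sure the selected assembly matches your coordinates.
- Add tracks for SNPs, nearby genes, and functional annotations.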
- Use the “Table Browser” to extract SNP data:
- Select a dbSNP table (e.g., snp151 or dbSnp153; check the browser for newer builds).
- Filter by chromosome, position, or gene.
- Export results as a BED or text file.
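Browser-style positions such as chr10:114758349-114758349 are 1-based and inclusive, while BED files use 0-based, half-open intervals, so the start coordinate shifts by one when converting. A minimal sketch of that conversion (the function name region_to_bed is hypothetical):

```python
def region_to_bed(region, name):
    """Convert 'chr:start-end' (1-based, inclusive) into a 4-column BED line."""
    chrom, span = region.split(":")
    start_text, end_text = span.replace(",", "").split("-")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid region: {region!r}")
    # BED start is 0-based; BED end is exclusive, so it equals the 1-based end.
    return f"{chrom}\t{start - 1}\t{end}\t{name}"


print(region_to_bed("chr10:114758349-114758349", "rs7903146"))
# chr10	114758348	114758349	rs7903146
```

A single-base SNP therefore becomes an interval of length one in BED.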
Step 7: Use ClinVar for Clinical Significance
- Purpose: Find SNPs with clinical significance and disease associations.
- Website: ClinVar
- Steps:
- Search for your disease or SNP ID.
- Review the clinical significance (e.g., pathogenic, benign).
- Check the associated conditions and supporting evidence (e.g., PMIDs).
- Export data for further analysis.
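The ClinVar search above can also be run through NCBI E-utilities. A minimal sketch that builds an ESearch URL and extracts the record IDs from a JSON response. The sample response is a placeholder showing only the fields the code reads, not real ClinVar output.

```python
import json
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def clinvar_search_url(term, retmax=20):
    """Build an ESearch URL for a disease name or rsID in ClinVar."""
    params = {"db": "clinvar", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


def ids_from_esearch(json_text):
    """Return the list of record IDs from an ESearch JSON response."""
    result = json.loads(json_text).get("esearchresult", {})
    return result.get("idlist", [])


# Placeholder response for illustration only.
SAMPLE_RESPONSE = '{"esearchresult": {"count": "2", "idlist": ["1", "2"]}}'

print(clinvar_search_url("rs7903146"))
print(ids_from_esearch(SAMPLE_RESPONSE))
```

The returned IDs can then be passed to ESummary with db=clinvar to retrieve the record details.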
Step 8: Perform Advanced Queries with BioMart
- Purpose: Extract SNP-disease associations from Ensembl.
- Website: Ensembl BioMart
- Steps:
- Select the “Ensembl Variation” database for SNP-centric queries (or “Ensembl Genes” for gene-centric ones).
- Choose your species (e.g., human).
- Apply filters:
- Chromosome/region.
- Gene name(s).
- SNP consequences (e.g., missense, synonymous).
- Add attributes:
- SNP IDs.
- Associated phenotypes.
- External references (e.g., dbSNP, ClinVar).
- Export results as a CSV/TSV file.
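BioMart queries built in the web interface can be saved as XML and reused. A minimal sketch that assembles such a query in Python. The dataset name (hsapiens_snp), filter name (snp_filter), and attribute names are assumptions and should be checked against the XML that BioMart itself generates for your query.

```python
import xml.etree.ElementTree as ET


def biomart_query(dataset, filters, attributes):
    """Assemble a BioMart XML query string from filters and attributes."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


# Names below are assumptions; verify them in the BioMart interface.
xml_query = biomart_query(
    "hsapiens_snp",
    {"snp_filter": "rs7903146"},
    ["refsnp_id", "chr_name", "chrom_start"],
)
print(xml_query)
```

The resulting string is meant to be sent as the query parameter of a BioMart martservice request.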
Step 9: Validate Findings with Literature
- Purpose: Confirm SNP-disease associations using published studies.
- Tools:
- PubMed: Search for PMIDs from GWAS Catalog or ClinVar.
- Google Scholar: Look for additional studies.
- Zotero/Mendeley: Organize and annotate references.
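When a paper reports case/control counts, recomputing the odds ratio is a quick consistency check against the published value. A minimal sketch using the standard 2×2 odds ratio with a Wald-type 95% confidence interval on the log scale. The counts are placeholders, and published analyses often adjust for covariates, so small differences from a reported value are expected.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  ctrl_carriers, ctrl_noncarriers, z=1.96):
    """Return (odds_ratio, ci_low, ci_high) for a 2x2 table of counts."""
    a, b, c, d = case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se)
    high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, low, high


# Placeholder counts for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```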
Step 10: Automate with APIs and Scripts
- Purpose: Streamline data retrieval and analysis.
- Tools:
- NCBI E-Utilities: Programmatically access dbSNP, ClinVar, and PubMed.
- UCSC REST API: Automate genomic data extraction.
- Python/R Scripts: Use libraries such as Biopython (Python) or biomaRt (R) for data integration.
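For larger ID lists, requests are usually batched and spaced out. A minimal sketch that groups IDs into ESummary URLs and pauses between requests. The batch size and delay are assumptions chosen to stay under NCBI's limit of three requests per second without an API key, and fetch_all is defined here but not called.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def esummary_urls(db, ids, batch_size=100):
    """Yield one ESummary URL per batch of IDs."""
    for chunk in chunks(ids, batch_size):
        params = {"db": db, "id": ",".join(chunk), "retmode": "json"}
        yield f"{ESUMMARY}?{urlencode(params)}"


def fetch_all(urls, delay=0.34):
    """Fetch each URL in turn, sleeping between requests (network access needed)."""
    responses = []
    for url in urls:
        with urlopen(url) as resp:
            responses.append(resp.read().decode("utf-8"))
        time.sleep(delay)
    return responses


urls = list(esummary_urls("snp", [str(n) for n in range(1, 251)]))
print(len(urls))
```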
Tips for Success
- Combine Multiple Databases: Cross-reference results from GWAS Catalog, dbSNP, and ClinVar for robust findings.
- Check Population Specificity: Ensure SNP associations are relevant to your target population.
- Use Latest Data: Check which releases are current before you start (e.g., the dbSNP build number and the GWAS Catalog release date), and record the versions you used.
- Leverage Visualization Tools: Use tools like UCSC Genome Browser or IGV for genomic context.
- Stay Updated: Follow updates from NHGRI, EBI, and SNPedia for new associations and tools.
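The cross-referencing tip can be sketched as a simple set operation over the SNP lists exported from each database. The source names and rsIDs below are placeholders, and requiring at least two supporting sources is an arbitrary choice for illustration.

```python
def cross_reference(sources):
    """Map each rsID to the sorted list of sources that report it."""
    support = {}
    for source_name, rsids in sources.items():
        for rsid in set(rsids):
            support.setdefault(rsid, set()).add(source_name)
    return {rsid: sorted(names) for rsid, names in support.items()}


def supported_by(sources, min_sources=2):
    """Return rsIDs reported by at least `min_sources` sources, sorted."""
    support = cross_reference(sources)
    return sorted(r for r, names in support.items() if len(names) >= min_sources)


# Placeholder exports for illustration only.
exports = {
    "gwas_catalog": ["rs0000001", "rs0000002"],
    "dbsnp": ["rs0000001", "rs0000002", "rs0000003"],
    "clinvar": ["rs0000001"],
}

print(supported_by(exports))
print(supported_by(exports, min_sources=3))
```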
By following this manual, you can systematically identify and validate disease-associated SNPs and document the evidence behind each one.