
ChatGPT Cheat Sheet for Data Science (100 Prompts)

December 5, 2024 By admin

General Coding Workflows

  1. Debugging Python Code
    I want you to be a Python programmer. Here is a piece of Python code containing {problem}: {insert code snippet}. I am getting the following error: {insert error}. What is the reason for the bug?
  2. Debugging R Code
    I want you to be an R programmer. Here is a piece of R code containing {problem}: {insert code snippet}. I am getting the following error: {insert error}. What is the reason for the bug?
  3. Debugging SQL Code
    I want you to be a SQL programmer. Here is a piece of SQL code containing {problem}: {insert code snippet}. I am getting the following error: {insert error}. What is the reason for the bug?
  4. Python Code Explanation
    I want you to act as a code explainer in Python. I don’t understand this function. Can you explain what it does and provide an example? {Insert function}
  5. R Code Explanation
    I want you to act as a code explainer in R. I don’t understand this function. Can you explain what it does and provide an example? {Insert function}
  6. SQL Code Explanation
    I want you to act as a code explainer in SQL. I don’t understand this snippet. Can you explain what it does and provide an example? {Insert SQL query}
  7. Python Code Optimization
    I want you to act as a code optimizer in Python. {Describe problem with current code, if possible}. Can you make the code {more Pythonic/cleaner/more efficient/run faster/more readable}? {Insert Code}
  8. R Code Optimization
    I want you to act as a code optimizer in R. {Describe problem with current code, if possible}. Can you make the code {cleaner/more efficient/run faster/more readable}? {Insert Code}
  9. SQL Code Optimization
    I want you to act as a query optimizer in SQL. {Describe problem with current code, if possible}. Can you suggest ways to make the query {run faster/more readable/simpler}? {Insert Code}
  10. Python Code Simplification
    I want you to act as a programmer in Python. Please simplify this code while ensuring that it is {efficient/easy to read/Pythonic}. {Insert Code}
  11. R Code Simplification
    I want you to act as a programmer in R. Please simplify this code while ensuring that it is {efficient/easy to read}. {Insert Code}
  12. SQL Code Simplification
    I want you to act as a SQL programmer. I am running {PostgreSQL 14/MySQL 8/SQLite 3.4/other versions}. Can you please simplify this query {while ensuring that it is efficient/easy to read/insert any additional requirements}? {Insert query}
  13. Translate R to Python
    I want you to act as a programmer in R. Please translate this code to Python. {Insert code}
  14. Translate Python to R
    I want you to act as a programmer in Python. Please translate this code to R. {Insert code}
  15. Compare Python Function Speeds
    I want you to act as a Python programmer. Can you write code that compares the speed of two functions {functionname} and {functionname}? {Insert functions}
  16. Write Unit Tests in R
    I want you to act as an R programmer. Can you please write unit tests for the function {functionname}? {Insert requirements for unit tests, if any} {Insert code}
  17. Write Unit Tests in Python
    I want you to act as a Python programmer. Can you please write unit tests for the function {functionname}? {Insert requirements for unit tests, if any} {Insert code}
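
A response to the "Compare Python Function Speeds" prompt might resemble the sketch below, which uses the standard-library `timeit` module. The two summing functions and the `compare_speed` helper are hypothetical names invented for illustration; substitute your own functions and arguments.

```python
import timeit


def sum_with_loop(n):
    """Hypothetical function A: add the integers 0..n-1 with an explicit loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def sum_with_builtin(n):
    """Hypothetical function B: the same sum using the built-in sum()."""
    return sum(range(n))


def compare_speed(funcs, arg, repeat=5, number=200):
    """Return the best observed per-call time in seconds for each function."""
    results = {}
    for func in funcs:
        timer = timeit.Timer(lambda: func(arg))
        # repeat() returns one total per round; the minimum is the least noisy.
        best_total = min(timer.repeat(repeat=repeat, number=number))
        results[func.__name__] = best_total / number
    return results


timings = compare_speed([sum_with_loop, sum_with_builtin], 10_000)
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
```

Absolute timings depend on the machine and the Python version, so only the relative ordering is worth comparing.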

Data Analysis Workflows

  1. SQL Data Generation
    I want you to act as a data generator. Can you write SQL queries in {database version} that create a table {table name} with the columns {column name}? Include relevant constraints and indexes.
  2. Common Table Expressions in SQL
    I want you to act as a SQL code programmer. I am running {database version}. Can you rewrite this query using a common table expression (CTE)? {Insert query}
  3. SQL Aggregation Example
    I want you to act as a data scientist. {Insert description of tables}. Can you compute the {count/sum/average} of {value} where {insert filters}?
  4. 7-Day Running Average in SQL
    I want you to act as a data scientist. I am running {PostgreSQL 14/MySQL 8/SQLite 3.4/other versions}. I have the table {table_name}, which contains {table description}. The table consists of the columns {column names}. Can you please write a query that finds the 7-day running average of {quantity}?
  5. Window Functions in SQL
    I want you to act as a data scientist. I am running {PostgreSQL 14/MySQL 8/SQLite 3.4/other versions}. I have the table {table_name}, which contains {table description}. The table consists of the columns {column names}. Can you please write a query that finds {required window function}?
  6. Generate Markdown in Python
    I want you to act as a data generator in Python. Can you generate a Markdown file that contains {data requirement}? Save the file to {filename}.
  7. Generate CSV in Python
    I want you to act as a data generator in Python. Can you generate a CSV file that contains {data requirement}? Save the file to {filename}.
  8. Generate JSON in Python
    I want you to act as a data generator in Python. Can you generate a JSON file that contains {data requirement}? Save the file to {filename}.
  9. Clean Data with Pandas
    I want you to act as a data scientist programming in Python Pandas. Given a CSV file that contains data of {dataframe name} with the columns {column names} for {dataset context}, write code to clean the data. {Insert requirements for data}
  10. Data Aggregation in Pandas
    I want you to act as a data scientist programming in Python Pandas. Given a table {table name} that consists of the columns {column names}, can you please write code that finds {requirement}?
  11. Merge Data in Pandas
    I want you to act as a data scientist programming in Python Pandas. Given a table {table 1 name} that consists of the columns {column names} and another table {table 2 name} with the columns {column names}, please merge the two tables. {Insert additional requirement, if any}
  12. Data Reshaping in Pandas (Long to Wide)
    I want you to act as a data scientist programming in Python Pandas. Given a table {table name} that consists of the columns {column names}, can you aggregate the {value} by {column} and convert it from long to wide format?
  13. Generate Markdown in R
    I want you to act as a data generator in R. Can you generate a Markdown file that contains {data requirement}? Save the file to {filename}.
  14. Generate CSV in R
    I want you to act as a data generator in R. Can you generate a CSV file that contains {data requirement}? Save the file to {filename}.
  15. Generate JSON in R
    I want you to act as a data generator in R. Can you generate a JSON file that contains {data requirement}? Save the file to {filename}.
  16. Data Cleaning in R (tidyr)
    I want you to act as a data scientist programming in R with tidyr. You are given the {dataframe name} dataframe containing the columns {column name}. {Insert requirement}
  17. Data Aggregation in R (dplyr)
    I want you to act as a data scientist programming in R with dplyr. You are given the {dataframe name} dataframe containing the columns {column name}. {Insert requirement}
  18. Merge Data in R (dplyr)
    I want you to act as a data scientist programming in R with dplyr. You are given the {dataframe 1 name} dataframe containing the columns {column name}. You also have a {dataframe 2 name} dataframe containing the columns {column name}. Find the {required output}.
  19. Reshape Data (Long to Wide) in R (tidyr)
    I want you to act as a data scientist programming in R with tidyr. You are given the {dataframe name} dataframe containing the columns {column name}. Please convert the data to wide format.
  20. Reshape Data (Wide to Long) in R (tidyr)
    I want you to act as a data scientist programming in R with tidyr. You are given the {dataframe name} dataframe containing the columns {column name}. Please convert the data to long format.
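
For the "7-Day Running Average in SQL" prompt, it helps to know roughly what a correct answer looks like before judging the model's output. The sketch below runs one possible query against an in-memory SQLite database through Python's standard-library `sqlite3` module. The `sales` table, its columns, and the data are made up for illustration, and the query assumes exactly one row per day and an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical table and column names; the prompt leaves them as placeholders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_date TEXT PRIMARY KEY, quantity INTEGER NOT NULL)"
)

# Ten consecutive days with quantities 10, 20, ..., 100 (made-up data).
rows = [(f"2024-01-{day:02d}", day * 10) for day in range(1, 11)]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# The frame covers the current row plus the six rows before it, so it spans
# seven calendar days only when every day has exactly one row.
query = """
SELECT sale_date,
       quantity,
       AVG(quantity) OVER (
           ORDER BY sale_date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS running_avg_7d
FROM sales
ORDER BY sale_date
"""
result = conn.execute(query).fetchall()
for sale_date, quantity, running_avg in result:
    print(sale_date, quantity, round(running_avg, 2))
```

The first six rows average over fewer than seven days because no earlier data exists. If the real table has gaps or several rows per day, aggregate to one row per day first or ask the model for a date-based frame.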

Data Visualization Workflows

  1. Create Plots in ggplot2
    I want you to act as a data scientist coding in R. Given a dataframe {dataframe name} containing the columns {column names}, use ggplot2 to plot a {chart type and requirement}.
  2. Gridplot Visualizations in ggplot2
    I want you to act as a data scientist coding in R. Given a dataframe {dataframe name} containing the columns {column names}, please create a gridplot using ggplot2 with {number of rows/columns}.
  3. Subplot Visualizations in Python
    I want you to act as a data scientist programming in Python. I have a dataset {dataset name} with columns {column names}. Please create a subplot with {rows/columns and visualization type}.
  4. Heatmap Visualization in Python
    I want you to act as a data scientist programming in Python. Please generate a heatmap for a dataset {dataset name} containing columns {column names}. Adjust the color palette according to {requirement}.
  5. Interactive Plots in Python (Plotly)
    I want you to act as a data scientist programming in Python. Please generate an interactive plot using {Plotly/Bokeh/other tool} for {data context}.
  6. Boxplot Visualization in R
    I want you to act as a data scientist coding in R. Please create a boxplot visualization for {dataset name} with {parameters/variables for visualization}.
  7. Barplot Visualization in Python
    I want you to act as a data scientist programming in Python. I have a dataset {dataset name} with columns {column names}. Please generate a barplot visualization.
  8. Pie Chart Visualization in Python
    I want you to act as a data scientist programming in Python. I have a dataset {dataset name}. Please generate a pie chart visualization showing the distribution of {column name}.
  9. Scatter Plot Visualization in R
    I want you to act as a data scientist coding in R. Please create a scatter plot with {dataset name} using {x-axis} and {y-axis} for the plot.
  10. Line Plot Visualization in R
    I want you to act as a data scientist coding in R. Please create a line plot using {dataset name} with {x-axis and y-axis}.
  11. Pairplot Visualization in Python (Seaborn)
    I want you to act as a data scientist programming in Python. Given a dataset {dataset name}, please create a pairplot to show relationships between all numerical variables.
  12. Histogram Visualization in R
    I want you to act as a data scientist coding in R. Please create a histogram using {dataset name} with {column name} on the x-axis.
  13. Violin Plot in Python (Seaborn)
    I want you to act as a data scientist programming in Python. Please create a violin plot to show the distribution of {column name} from {dataset name}.
  14. Facet Grid Visualization in R (ggplot2)
    I want you to act as a data scientist coding in R. Please create a facet grid using ggplot2 to visualize {variables} from {dataset name}.
  15. Time Series Plot in Python
    I want you to act as a data scientist programming in Python. Please create a time series plot with {dataset name} showing {time variable} on the x-axis and {data variable} on the y-axis.
  16. Density Plot Visualization in R
    I want you to act as a data scientist coding in R. Please create a density plot using {dataset name} for {column name} to visualize the distribution.
  17. 3D Plot in Python (Matplotlib)
    I want you to act as a data scientist programming in Python. Given a dataset {dataset name}, please create a 3D scatter plot using {x-axis, y-axis, z-axis}.
  18. Radar Chart in Python (Plotly)
    I want you to act as a data scientist programming in Python. Please create a radar chart to visualize the values of {features} from {dataset name}.
  19. Gantt Chart in Python (Plotly)
    I want you to act as a data scientist programming in Python. Please create a Gantt chart to visualize {project/task information}.
  20. Bubble Plot in Python (Plotly)
    I want you to act as a data scientist programming in Python. Please create a bubble plot to visualize {data variables} with size proportional to {another variable}.
  21. Stacked Bar Chart in Python (Matplotlib)
    I want you to act as a data scientist programming in Python. Please create a stacked bar chart using {dataset name} to show {category comparisons over time}.
  22. Tree Map in Python (Plotly)
    I want you to act as a data scientist programming in Python. Please create a tree map using {dataset name} to visualize hierarchical data with {category names}.
  23. Sunburst Plot in Python (Plotly)
    I want you to act as a data scientist programming in Python. Please create a sunburst plot to visualize hierarchical data using {dataset name}.

Machine Learning Workflows

  1. Train a Linear Regression Model in Python
    I want you to act as a machine learning engineer. Please create and train a linear regression model in Python using {dataset name} to predict {target variable}.
  2. Train a Random Forest Model in Python
    I want you to act as a machine learning engineer. Please create and train a random forest model in Python using {dataset name} to predict {target variable}.
  3. Train a Decision Tree Model in Python
    I want you to act as a machine learning engineer. Please create and train a decision tree model in Python using {dataset name} to predict {target variable}.
  4. Train an SVM Classifier in Python
    I want you to act as a machine learning engineer. Please create and train a Support Vector Machine (SVM) classifier in Python using {dataset name} to classify {target variable}.
  5. Train a K-Nearest Neighbors (KNN) Model in Python
    I want you to act as a machine learning engineer. Please create and train a K-Nearest Neighbors (KNN) model in Python using {dataset name} to predict {target variable}.
  6. Model Evaluation in Python
    I want you to act as a machine learning engineer. Please evaluate the performance of the {model type} using {accuracy/precision/recall/F1 score} on {test dataset}.
  7. Cross-Validation in Python
    I want you to act as a machine learning engineer. Please perform cross-validation on the {model type} with {dataset name} using {number of folds}.
  8. Hyperparameter Tuning with GridSearchCV in Python
    I want you to act as a machine learning engineer. Please perform hyperparameter tuning for the {model type} using GridSearchCV on {dataset name} with the parameters {list of parameters}.
  9. Feature Selection in Python
    I want you to act as a data scientist in Python. Given the {dataset name} containing {features}, please perform feature selection and explain which features should be retained for {target variable}.
  10. Feature Engineering in Python
    I want you to act as a data scientist in Python. Given {dataset name} and {target variable}, please suggest and implement relevant feature engineering techniques.
  11. Ensemble Learning in Python
    I want you to act as a machine learning engineer. Please combine the predictions from {model 1}, {model 2}, and {model 3} using an ensemble method (e.g., voting/stacking) in Python.
  12. Dimensionality Reduction in Python (PCA)
    I want you to act as a machine learning engineer. Given {dataset name}, please apply Principal Component Analysis (PCA) to reduce the dimensionality and explain the result.
  13. Clustering with K-Means in Python
    I want you to act as a data scientist. Please apply the K-Means clustering algorithm to {dataset name} to identify {number of clusters}.
  14. Clustering with DBSCAN in Python
    I want you to act as a data scientist. Please apply the DBSCAN clustering algorithm to {dataset name} to identify clusters and visualize the results.
  15. Train a Neural Network in Python (Keras)
    I want you to act as a machine learning engineer. Please build and train a simple neural network in Python using Keras to classify {target variable} from {dataset name}.
  16. Train a Convolutional Neural Network (CNN) in Python (Keras)
    I want you to act as a machine learning engineer. Please build and train a Convolutional Neural Network (CNN) in Python using Keras to classify images from {image dataset}.
  17. Train a Recurrent Neural Network (RNN) in Python (Keras)
    I want you to act as a machine learning engineer. Please build and train a Recurrent Neural Network (RNN) in Python using Keras to predict {target variable} from {time-series dataset}.
  18. Train an LSTM Model in Python
    I want you to act as a machine learning engineer. Please build and train a Long Short-Term Memory (LSTM) network in Python to predict {target variable} from {time-series dataset}.
  19. Transfer Learning with Pretrained Models in Python
    I want you to act as a machine learning engineer. Please implement transfer learning using a pretrained {model type, e.g., ResNet} to classify {image dataset} in Python.
  20. Natural Language Processing (NLP) Pipeline in Python
    I want you to act as a data scientist. Please create an NLP pipeline in Python that performs {text preprocessing, tokenization, sentiment analysis, etc.} on {text data}.
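
The "Model Evaluation in Python" prompt names accuracy, precision, recall, and F1 score. To sanity-check whatever code the model returns, the sketch below computes all four for a binary classifier in plain Python. The function name `classification_metrics` and the sample labels are hypothetical; in practice the equivalent functions in `sklearn.metrics` would normally be used instead.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Fall back to 0.0 when a denominator is zero (no positive predictions,
    # or no positive labels), rather than raising ZeroDivisionError.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels: 3 true positives, 1 false positive, 1 false negative,
# 3 true negatives.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```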

Model Deployment and Production

  1. Deploy a Model with Flask
    I want you to act as a machine learning engineer. Please deploy the trained {model type} to a Flask API for predictions with {input format}. Include code for {input preprocessing, output formatting}.
  2. Deploy a Model with FastAPI
    I want you to act as a machine learning engineer. Please deploy the trained {model type} to a FastAPI API for predictions with {input format}. Include code for {input preprocessing, output formatting}.
  3. Containerize a Model with Docker
    I want you to act as a machine learning engineer. Please containerize the {model type} and its dependencies using Docker. Include instructions for running the container.
  4. Model Monitoring in Production
    I want you to act as a machine learning engineer. Please explain how to monitor the performance of {model type} in a production environment, including methods for {error logging, performance tracking, model drift detection}.
  5. Automated Retraining Pipeline
    I want you to act as a machine learning engineer. Please design an automated retraining pipeline that triggers when {data drift/accuracy drop} occurs in production for the {model type}. Include steps for {data collection, model evaluation, retraining}.

Big Data Workflows

  1. Set Up Apache Spark Cluster
    I want you to act as a data engineer. Please explain how to set up an Apache Spark cluster for distributed data processing using {cloud platform/technology}.
  2. Process Big Data with Apache Spark
    I want you to act as a data engineer. Given a large dataset {dataset name}, please write a Spark job in Python/Scala to perform {data transformation, aggregation, filtering}.
  3. Data Pipeline with Apache Kafka
    I want you to act as a data engineer. Please create a simple data pipeline using Apache Kafka to stream data from {source} to {destination}.
  4. Querying Big Data with Apache Hive
    I want you to act as a data engineer. Please write a HiveQL query to extract {data fields} from the big data stored in {HDFS} using Apache Hive.
  5. ETL Process for Big Data
    I want you to act as a data engineer. Please explain and implement an ETL process for {dataset name}, including {data extraction, transformation, and loading} into {data warehouse}.
  6. Data Warehousing with Amazon Redshift
    I want you to act as a data engineer. Please create a schema and load {dataset name} into Amazon Redshift. Write an SQL query to analyze {specific data insights}.
  7. Big Data Processing with Google BigQuery
    I want you to act as a data engineer. Please use Google BigQuery to process {dataset name} and write a query to extract {specific data insights}.
  8. Streaming Data Analysis with Apache Flink
    I want you to act as a data engineer. Please write a job in Apache Flink to process streaming data from {source} and output results to {destination}.
  9. NoSQL Database Querying (MongoDB)
    I want you to act as a database engineer. Please write a MongoDB query to extract {data fields} from a collection where {specific conditions} are met.
  10. Data Lake Implementation on AWS S3
    I want you to act as a data engineer. Please explain how to implement a data lake on AWS S3 to store and query large datasets, including {data cataloging with AWS Glue}.
  11. Distributed Machine Learning with Spark MLlib
    I want you to act as a machine learning engineer. Please train a machine learning model using Apache Spark’s MLlib to predict {target variable} from {dataset name}.
  12. Data Visualization in Tableau
    I want you to act as a data analyst. Please create a dashboard in Tableau to visualize {specific data insights} from {dataset name}.
  13. Interactive Dashboards with Power BI
    I want you to act as a data analyst. Please create an interactive dashboard in Power BI to visualize {specific data insights} from {dataset name}.
  14. Pipeline Orchestration with Apache Airflow
    I want you to act as a data engineer. Please design and implement a data pipeline workflow using Apache Airflow to perform {data extraction, transformation, and loading}.
  15. Scalable Data Processing with AWS Glue
    I want you to act as a data engineer. Please explain how to use AWS Glue to automate the ETL process for {dataset name} and prepare data for analysis.
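
Every prompt above marks its variable parts with `{placeholder}` slots. If you reuse the prompts from a script, a small helper can fill those slots for you. The sketch below is one possible approach; `fill_prompt` and the sample values are hypothetical. It uses a regular expression rather than `str.format` because the slot names contain spaces and punctuation, and because inserted values such as code snippets may themselves contain braces.

```python
import re

# Matches one {placeholder}; the name may contain spaces, slashes, or commas.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def fill_prompt(template, values):
    """Replace each {placeholder} in template with values[placeholder]."""
    missing = [
        name for name in PLACEHOLDER.findall(template) if name not in values
    ]
    if missing:
        raise KeyError(f"no value supplied for: {', '.join(missing)}")
    # A function replacement is inserted literally, so braces or backslashes
    # inside the supplied values are left untouched.
    return PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), template)


template = (
    "I want you to act as a data scientist. Please apply the K-Means "
    "clustering algorithm to {dataset name} to identify {number of clusters}."
)
prompt = fill_prompt(
    template,
    {"dataset name": "the customers table", "number of clusters": "4 clusters"},
)
print(prompt)
```

A missing value raises `KeyError` instead of silently sending a prompt that still contains an unfilled slot.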