Hi, my name is

Varun Deepak Gudhe

A passionate Data Scientist specializing in ML, AI, and Web Development.

About Me

I’m Varun Deepak Gudhe, a graduate student at North Carolina State University pursuing my Master’s in Computer Science. I’m passionate about data science: I have built numerous machine learning projects and like to deploy them in the cloud to bring them to a wider audience. Currently, I’m exploring machine learning on DNA and protein sequences. I’m also diving into web development.

Here's a peek at the programming languages and tools I've been working with:

Python, R, Java, C++, HTML, CSS, JavaScript, Bootstrap, Rails, Ruby, React, TensorFlow, Git, Bash, Linux, PyTorch, regex, Postman, AWS, Kubernetes, MySQL, VS Code, Docker, Netlify

Experience

Graduate Research Assistant - Ashes Lab NCSU
Sep 2023 - present

I am currently working as a graduate research assistant under the supervision of Dr. Carter Clinton.

  • Analyzing metagenomic data from the historic New York African Burial Ground using Next Generation Sequencing and tools like Qiime2 and Kraken to reconstruct the history of the enslaved African population.
  • Categorizing DNA sequences (human, bacterial, animal, etc.) using Bowtie2 and SAMtools, and comparing human DNA with public databases for genealogical links and disease markers, utilizing Bash, CUDA, and HPC.
Graduate Research Assistant - North Carolina State University
Sep 2023 - present

I am currently working as a graduate research assistant under the supervision of Dr. Rafael Guerrero.

  • Analyzing protein sequences to modify their thermostability through amino acid changes, opening up the path for advanced applications.
  • Extracted protein sequence and taxonomic data from the NCBI and PDB databases, cleaned the data to fix inconsistencies and improve data integrity, transformed the raw data with merges and joins to facilitate further analysis, and loaded the optimized datasets for subsequent computational processes.
  • Used the ete3 toolkit for phylogenetic tree processing: aligning leaf labels with protein sequence data, pruning trees, and examining the evolutionary significance of unique sister clades. Also ran statistical analyses to find correlations between amino acid changes and T growth, visualized the results with box plots, and designed a linear regression model for predicting thermostability from protein sequences.
Graduate Student Assistant - NC State Data Science Academy
May 2023 - Aug 2023
  • Mentored 25+ students in Data Science projects for rural works organizations. Enabled hands-on experience for students using real data from these organizations, leading to valuable insights that helped the nonprofits improve their operations.
  • Guided students in delivering impactful presentations, showcasing their findings and providing actionable recommendations to stakeholders.
Course Collaborative Leader - NC State Data Science Academy
Aug 2023 - present
  • Developing and maintaining self-updating datasets sourced from real-time public data sources. Using APIs to connect with public data and create dynamic datasets.
  • Documenting codebooks for each dataset and maintaining workflow notes to help future progress by others.
  • Proposing ideas for extracting insights and projects from the datasets to guide research and match course goals.
  • Leading regular meetings to share updates on project status, milestones, and future plans.
  • Publishing these datasets and their insights for the wider academic community.
Teaching Assistant - NC State Data Science Academy
Oct 2023 - present
  • Assisting NC DHHS employees in the course “Data at Work: Data Analytics in Excel and Beyond” with key topics including ETL Tools, Data Warehousing, Microsoft Excel, SQL, PowerBI, Statistics, and Data Visualization.
  • Holding regular office hours to provide guidance on course concepts and lab techniques and to assist participants with their capstone projects.
  • Grading lab assignments and student capstone projects.
Graduate Research Assistant - IEC Lab NCSU
Dec 2022 - May 2023
  • Collaborated with Tasmia Shahriar on the AI-based application Simstudent, using data analysis skills.
  • Enhanced model accuracy through data coding, mirroring middle school perspectives.
  • Improved Simstudent’s performance, benefiting many middle school students.
Teaching Assistant - SRM University AP
Jul 2019 - Jun 2020
  • Tutored and evaluated 60+ students in a Python course using the Minerva platform.
  • Served as a teaching assistant for the Probability and Statistics course, answering student queries and grading student assignments and final projects to support the professor.

Education

2022 - 2024
Master of Computer Science
North Carolina State University
GPA: 3.78 out of 4.0

Course Work:

  • Design and Analysis of Algorithms, Neural Networks Deep Learning, Automated Learning and Data Analysis, Experimental Stats for Engineers, Cloud Computing, DBMS, Software Engineering, Object Oriented Programming.
2018 - 2022
Bachelor of Science in Computer Science
SRM University AP
GPA: 9.54 out of 10

Course Work:

  • Artificial Intelligence, Machine Learning, Big Data, Data Mining, DBMS, Data Structures, Software Engineering, Computer Networks, Operating Systems, Object Oriented Programming.

Extracurricular Activities:

  • Volunteered as an infra-tech team head for the tech fest and organised events for the cultural fest.
  • Worked as a Campus Ambassador for SRM-AP to represent SmartKnower.

Projects

Sync-Ends Library
Python, API, GitHub, GitHub Actions
A Python library that detects changes across Postman Collection APIs and instantly sends notifications via Slack, Teams, and email.
Prot_pgls
Python, R, Bioinformatics, Phylogenetics, ete3, Data Wrangling, Statistical Analysis, Parallel Processing, Large-scale Data Processing, Database Querying, GLM (Generalized Linear Models), Data Visualization
Bioinformatics-driven optimization of protein thermostability through sequence analysis, phylogenetic tree processing, and predictive modeling using data from NCBI and PDB databases.
NYABG Metagenomic Analysis
Metagenomic Analysis, Next Generation Sequencing, Qiime2, GWAS, Kraken, Bowtie2, SAMtools, Bash, Python, HPC
Metagenomic analysis of the historic New York African Burial Ground using Next Generation Sequencing, with tools like Qiime2, Kraken, Bowtie2, and HPC assisting in DNA categorization and genealogical exploration.
IMU Terrain Classification
Python, TensorFlow, Keras, CNN, Time Series Analysis
A deep learning model that identifies and classifies different terrains from an IMU time-series dataset.
Histopathologic Cancer Detection
Python, Transfer Learning, TensorFlow, CNN, VGG19
A deep learning model (CNN, VGG19) that identifies metastatic cancer in color image patches, using transfer learning, neural networks, and image segmentation.
Road Accident Dashboard
Tableau, KPIs, Data Analysis, Data Visualization
An interactive Tableau dashboard for exploring road accident trends, giving immediate insight into casualty statistics through dynamic KPIs, trend analyses, and geographic views to support informed decision-making.

Achievements

Get in Touch

Open to exciting roles and collaborations! Got an opportunity? I’m all ears. Let’s connect and discuss the possibilities!