I'm a Master's student in the Department of Computer Science at the University of Illinois at Urbana-Champaign (UIUC).
I like to code, and I like to reason with algorithms. I am comfortable working in any software development pipeline, and I have done academic research and professional work in database systems in the cloud and in data mining for the healthcare and security domains.
I received my B.Tech. degree in Computer Science and Engineering from Symbiosis International University where I was advised by Dr. Madhura Ingalhalikar and Dr. Rahee Walambe.

  • University of Illinois, Urbana-Champaign, U.S.A. (Aug 2019 - Dec 2020)
    GPA: 3.94/4.00
    Coursework:
      • Distributed Systems
      • Database Systems
      • Computer Security
      • Software Engineering
      • Advanced Social and Information Networks
      • Design Thinking
      • Machine Learning with Signal Processing
      • Intro to Bioinformatics
  • Symbiosis International University, Pune, India (Jul 2015 - May 2019)
    GPA: 8.94/10.0
    Awarded “Symbiosis Award for Academic Excellence” (Departmental Rank: 2nd)

  • S/W Tools: Java; Python; Go; C; C++; Selenium; Git
  • Databases: SQL; PostgreSQL; Lucene; NoSQL (MongoDB, Neo4J)
  • Data Eng.: Scikit-learn; Scipy; Pandas; Numpy
  • Comp. Vision: Radiomics; OpenCV; Matplotlib
  • Cloud Platform Engineer Intern, American Family Insurance (May 2020 - Present)
    Madison, Wisconsin, U.S.A.


  • Graduate Teaching Assistant, University of Illinois at Urbana-Champaign (Jan 2020 - May 2020)
    Urbana-Champaign, Illinois, U.S.A.
    CS225: Data Structures
    Designed and graded machine problems, assignments, and exams for a class of more than 700 students.
    Conducted weekly labs with more than 60 students, teaching them to code different data structures efficiently from scratch.

  • Graduate Research Assistant, University of Illinois at Urbana-Champaign (Aug 2019 - Dec 2019) [Project Link]
    Urbana-Champaign, Illinois, U.S.A.
    Everything Search: An Information Retrieval System
    Designed a prototype of an entity-aware web indexing system to search by and for keywords, entities, and documents; Java.
    • Added functionality to convert a stream of tokens into an efficient index structure optimized for read-only queries.
    • Performed experiments on how to efficiently populate a database and store materialized views using late materialization.
    • Built indexes, generated materialized views, and extracted frequency and position posting lists on Lucene for benchmarking.
    Web Information Extraction Project
    Mined unstructured content on online shopping pages by extracting price, image, and title elements for various products using a web scraper, and generated a labeled dataset for further ML training; Python, Selenium.

  • Research Assistant, Symbiosis Centre for Medical Image Analysis (Mar 2018 - May 2019)
    Pune, Maharashtra, India

    Predictive Markers for Parkinson’s Disease
    Paper published in Elsevier’s NeuroImage: Clinical journal (Vol. 22, 2019, 101748)
    Research lay at the intersection of ML, radiomics, and neuroimaging, under the guidance of Dr. Ingalhalikar, Head, SCMIA.
    • Preprocessed data of 80 candidates by creating a robust pipeline that converts a 3D MRI image into a set of feature vectors generated by radiomics and image filtering techniques.
    • Ran three models for comparison: CNN, XGBoost with contrast ratios (CR-ML), and XGBoost with radiomics features (RA-ML). The CNN performed best, achieving a cross-validation accuracy of 83.7% (AUC of ROC = 0.90).
  • Software Engineer Intern, Symantec (Jan 2018 - Jun 2018) [Project Link]
    Pune, Maharashtra, India

    Security Analysis and Response (STAR) Team
    Researched and developed an anti-phishing solution to identify phishing websites in real time for Norton browser extensions such as ‘Norton Safe Web’ and ‘Norton Safe Search’.
    • Wrote an automation script to scrape and extract >100 features for >100k phishing sites using data mining and computer vision.
    • Ran classification models on the feature-engineered data for two use cases:
      • All User: Binary Classifier – Phishing or Legitimate; Supervised; XGBoost model (AUC of ROC = 0.9825).
      • Company as a User: Multiclass Classifier – Number of Companies to Detect; Semi-Supervised; DBSCAN Clustering (TPR = 81% and FPR = 0%). With its zero false-positive rate, this helped avert almost every phishing attempt made on Symantec’s client companies.
  • MapleJuice – a simplified Hadoop: A from-scratch implementation of a distributed computing framework featuring protocols for group membership, leader election, and failure recovery, along with a distributed file system and a map-reduce engine for applications ranging from simple word count to recommending friends based on a social network graph. For testing, a distributed grep was built to query terabytes of log data distributed over several machines; Golang.
  • Applications of Lucene: Performed operations on Lucene, a full-text search library in Java, for building indexes, extracting frequency and position posting lists, and generating materialized views; Java, Lucene.
  • Oracle: Oracle Certified Associate Java SE 8 Programmer I (2017)
  • IBM: Hadoop Foundation and Programming; Applied Data Science with Python; Big Data Foundation (2017-18)
  • Coursera: Programming Frameworks by Duke University; Neural Networks and Deep Learning by deeplearning.ai (2018)
  • Overall Advisory Head, Technical Festival of Symbiosis (2019)
  • Head, The Editorial Board, Computer Science and Information Technology Department, SIU (2016–2019)
  • Head, Software Development Team, Civil Engineering Society of Symbiosis (CESS) (2017)
  • Head, Code-Wars Team, Technical Festival of Symbiosis (2015)
  • Consultant, Illinois Business Consulting (IBC) (2019)
  • Senior Student Mentor, Symbiosis’ Mentor – Mentee Committee (2017 – 2018)
  • Drama Team, Skit on “Importance of Inter-Disciplinary Skills”, Symbiosis Inauguration Programme, Pune, India
  • Dance Team, Second Place at Flash Mob Competition at Season's Mall, Pune, India