
Projects and Academic Work
Skills put into action
To provide a tangible showcase of my work, I’ve included a portfolio of my recent academic projects. Take a moment to explore the academic and practical projects below.
Ranked 1st in class and 11th out of 464 participants in a school-wide Kaggle competition to build the best machine learning model for predicting Airbnb rental prices, using a dataset of 40,000 observations across 90 features. The model had to be built in R, with techniques limited to Gradient Boosting, Ranger, Lasso, Ridge, Stepwise Selection, Regression, Bagging, and Bootstrapping.
Grade: A+
Click below for in-class presentation. (Subject to copyright)
Analyzed FreshDirect LLC (USA) and proposed a new market segmentation plan and a profitable supply chain optimization schedule. Using the provided data (sales figures, customer database, weekly trends, zip codes, items sold, delivery schedules, etc.), performed a comprehensive analysis with Excel Pivot Tables, IBM Watson, and Tableau to identify the most profitable customer segments to target, build a geographical segmentation profile and heat map analysis, and suggest profit and delivery optimizations.
Grade: A+
Click below for in-class submission. (Subject to copyright)
Project 4 - NYPD - Murder Rate Research Proposal
Research Design, Hypothesis, Sampling, Population
MS. Applied Analytics (Columbia University)
Designed a detailed research study addressing the NYPD's concerns about increasing murder rates in New York City. The design covers the management dilemma, research questions, benefits of the research, the methodology and type of research study, the population and sample selection, threats to validity, and the analytical methodology.
Grade: A+
Click below for in-class submission. (Subject to copyright)
Project 5 - YouTube - New Revenue Model Proposal
MS PowerPoint, Canva
MS. Applied Analytics (Columbia University)
Created an entirely new ad-based revenue model for YouTube as part of the Storytelling with Data class. Analyzed top videos to list the characteristics of good videos and modeled a new earnings section for the YouTube website called 'YouTube Preferred'. The data presented may not reflect actual figures.
Grade: A+
Click below for in-class presentation. (Subject to copyright)
Project 6 - Women's Clothing - Rating Prediction
R, Unsupervised Learning, Sentiment Analysis, Clustering
MS. Applied Analytics (Columbia University)
Ranked 1st in class for an Unsupervised Learning project. Performed detailed analysis and predictive modeling in R on a dataset of Women's Clothing Reviews with 20,000 records and 11 variables. The analysis techniques and models included word clouds, histograms, sentiment analysis, lexicon analysis, clustering analysis, TF and TF-IDF models, and predictions using tree and linear regression models.
Grade: A+
Click below for in-class presentation. (Subject to copyright)
Project 8 - eBay Inc. - Market Expansion Analysis
Blue Ocean, ERRC, Strategy Analytics, MS Excel
MS. Applied Analytics (Columbia University)
Performed an in-depth case study analysis of eBay Inc.'s current business model. Built a strategic plan for a multi-modal auction e-platform for eBay Inc. using tools such as the Blue Ocean strategy approach and an ERRC model. Analyzed the competitive environment using SWOT, PESTLE, and Porter's Five Forces, and clearly defined the roadmap and path to implementation, adaptive planning, analytics application, and risk management for the suggested strategy.
Grade: A+
Click below for in-class submission. (Subject to copyright)
Project 7 - Google Playstore App - Rating Prediction
Python, Jupyter Notebook, pandas, numpy, sklearn, nltk
MS. Applied Analytics (Columbia University)
Built a ratings prediction model for the Google Playstore apps market, using a large dataset from Kaggle containing fields such as size, date, free, installs, category, and others. Performed in-depth data cleaning, data transformation, exploratory analysis, sentiment analysis, and clustering to develop meaningful, actionable insights for app developers, and suggested a future outlook for the app market.
Grade: A+
Click below for Jupyter Notebook file. (Subject to copyright)
Project 9 - Lyft Inc. - Market Expansion Analysis
Blue Ocean, ERRC, Strategy Canvas, Strategy Analytics, MS Excel
MS. Applied Analytics (Columbia University)
Built a new strategic roadmap for Lyft covering its expansion into other areas, sectors, regions, and industries, as well as revenue growth and customer satisfaction. Evaluated the company's current standing, its competitive environment, and the relevance of different strategic frameworks such as SWOT, Porter's Five Forces, PESTLE, and the ERRC model, and suggested a timeline and implementation plan for the recommendation.
Grade: A+
Click below for in-class submission. (Subject to copyright)
Project 10 - Predicting Loan Defaults
Feature Engineering, EDA, Random Forest, h2o, Sampling, Lift
MS. Applied Analytics (Columbia University)
Used a real loan default dataset (company name withheld) containing 80,000 rows of data over 89 variables to perform an in-depth Exploratory Data Analysis and Feature Engineering. Further built multiple random forest models using the H2O package to arrive at a stable and acceptable loan default prediction machine learning model. The final LIFT score was 3.01 with a Precision-Recall score of 0.50, and AUC of 0.79.
Grade: A+
Click below for in-class submission. (Subject to copyright)
Project 11 - Healthcare Outlier Detection 1
Python Outlier Detection PyOD, kNN, PCA Clustering
MS. Applied Analytics (Columbia University)
Used a healthcare industry dataset of 163k rows containing fields such as DRG, hospital name, location, average charges, Medicare payments, total discharges, and others. Performed an in-depth exploratory data analysis to understand each feature, and then performed thorough feature engineering to build many new meaningful features to help in hospital fraud detection. Further performed clustering analysis using the PyOD modules to identify anomalous or potentially fraudulent clusters with the help of the average summary statistics table for each cluster.
Grade: A+
Click below for in-class submission. (Subject to copyright)
Project 12 - Healthcare Outlier Detection 2
Python Outlier Detection PyOD, Autoencoder, iForest Clustering
MS. Applied Analytics (Columbia University)
In this project, I continued from Project 11 and took it a step further by exploring more PyOD modules such as Autoencoder and Isolation Forest. I further performed clustering analysis using the PyOD modules to identify anomalous or potentially fraudulent clusters with the help of the average summary statistics table for each cluster.
Grade: A+
Click below for in-class submission. (Subject to copyright)
Project 13 - ML Model Monitoring Dashboard Proposal for Loan Default Model built in Project 10
Model Performance Metrics, System Usage Indicators, Service Response Metrics, Production Cost Monitoring
MS. Applied Analytics (Columbia University)
Built a draft of a Model Monitoring Dashboard, which is used to track usage, set thresholds, and set parameters to accept/reject a machine learning model and decide its validity and stability. The dashboard is built on 4 distinct segments: Model Performance, System Usage, Production Cost, and Service Response.
Grade: A+
Click below for in-class submission. (Subject to copyright)
Project 3 - Johnson & Johnson - Sales Analysis
MS Excel, MS Word
MS. Applied Analytics (Columbia University)
Designed a research study on the declining sales of Johnson & Johnson's baby powder. Analyzed market trends, past patterns, future forecasts, and supply chain stock-outs, and reported key profitability metrics. Further analyzed marketing activities and loyalty programs, and studied competitors' offerings to develop a more effective marketing campaign. Also analyzed the resignation of the CMO and succession planning.
Grade: A+
Click below for in-class submission. (Subject to copyright)
For more examples and a fuller picture of my work, don’t hesitate to reach out. Keep exploring for my co-curricular activities.