Recognizing Digits
This page is currently under construction. Please check again soon!
Data Scientist
This page is currently under construction. Please check again soon!
There are about 225,000 new cases of lung cancer every year in the United States, with a cost of $12 Billion. Although doctors try to do everything they can to accurately diagnose patients, even they are prone to mistakes and human error. By combining the dataset provided by Kaggle and machine learning algorithms, we will try to accurately predict whether a person will develop lung cancer within the next year based solely on their CT Scans.
Netflix, one of the world’s biggest provides of movies and tv shows, is no stranger to machine learning algorithms. In 2009, they gave away the biggest prize ($1,000,000!) for improving a movie recommendation algorithm. Although Netflix is great at movie recommendations and unsupervised algorithms to determine those movies, they have not been collecting much data on the top movies of all time. In this project, we will try to find which factors contribute the most to pushing a movie to the top of the list and making the IMDB Top 250.
The crashing of the Titanic has been one of the most tragic events in human history, with a total of 1,503 deaths. It’s difficult to imagine what you would have done if you were in that situation. Aboard the ship, some people were more likely to survive than others. In this project, I am going to use a predictive model to find out who was more likely to survive, and who was not so fortunate.
Finding the optimum salary in which both employee and company can agree on is a necessity in any job market. We will apply our knowledge of web scraping and logistic regression to come up with a model that correctly predicts what a data scientist’s salary should be based on the job results of Indeed.com.
The real estate market in Ames, Iowa constantly fluctuates in sale price. Therefore, it’s important to figure out what causes these fluctuations and if it’s possible to predict which neighborhoods and types of houses will increase in sale price. This article attempts to find which features, if any, significantly contribute to the changes in sale price and if there are any other useful insights we can gleam from the data.
The dataset we are going to explore is the Billboard’s Top 100, a list of the most popular songs in America. Our purpose for this project is to analyze the data and see which factors made major hits and stayed at the top of the charts the longest. The underlying question of this is: Why do we like what we like? Are there specific genres the public prefers? Does release date matter? We will examine and attempt to answer these questions.