Solving Kaggle’s Titanic Machine Learning Dataset

Image by S.Herman and F. Richter from Pixabay

Kaggle’s Titanic Machine Learning Dataset is a classic open-source introduction to the realm of machine learning. Although this is a beginner project, it still has a leaderboard, and it is captivating to see yourself rank up as you continue to work on your code.

To start, download the “train.csv” and “test.csv” files directly from Kaggle. Make sure the files are placed in a designated location so that they can be loaded later.
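As a rough sketch of that loading step, the two files could be read as shown below. This version uses only Python's standard csv module; the function names and the idea of passing a single data directory are assumptions made for illustration, and a library such as pandas would work equally well.

```python
import csv
from pathlib import Path


def load_rows(csv_path):
    """Read one CSV file into a list of dicts keyed by column name."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_titanic(data_dir):
    """Load train.csv and test.csv from data_dir (a hypothetical folder)."""
    data_dir = Path(data_dir)
    train_rows = load_rows(data_dir / "train.csv")
    test_rows = load_rows(data_dir / "test.csv")
    return train_rows, test_rows


# Example call, assuming the files were saved in a folder named "data":
# train_rows, test_rows = load_titanic("data")
```

Every value is read as a string here, so numeric columns would still need to be converted before any modeling.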

In this competition, a large train dataset describes various characteristics of those whose lives were cut…

Image by Lorenzo Cafaro from Pixabay

Kaggle, a Google subsidiary, is a community of machine learning enthusiasts. This particular Kaggle project, California Housing Prices, is a dataset that serves as an introduction to implementing machine learning algorithms. The main focus of this project is organizing and understanding data and graphs.

This article will discuss how to graph, organize, and set up data using sklearn, pandas, and NumPy for this Kaggle project.

I will be using JupyterLab, and the code is written for that environment.

Sklearn: Sklearn (scikit-learn) is a machine learning library for Python. The main features…

(Image by Author)

Background Information

Q-Learning is generally deemed to be the “most simple” reinforcement learning algorithm. I find myself agreeing with this statement.

In another paper, I compared Q-Learning with Deep Q-Networks. I will reuse that paper's explanation of what Q-Learning is, its positives and negatives, and its general equation, since I believe this is crucial background information.

Q-Learning is one of the more basic reinforcement learning algorithms because it is “model-free.” A model-free algorithm, as opposed to a model-based algorithm, has the agent learn directly from its interaction with the environment rather than from a model of the environment's dynamics. Like many of…
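As a minimal sketch of the general equation mentioned above, the standard tabular Q-Learning update can be written as follows. The function name, the dictionary-based Q-table, the toy states and actions, and the values of alpha (learning rate) and gamma (discount factor) are all assumptions chosen for illustration, not code from the original paper.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-Learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # Value of the best action available from the next state.
    best_next = max(q_table[(next_state, a)] for a in actions)
    # Target the current estimate is nudged toward.
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# Toy usage: unseen state-action pairs default to a Q-value of 0.0.
q_table = defaultdict(float)
new_value = q_learning_update(
    q_table, state=0, action="right", reward=1.0,
    next_state=1, actions=["left", "right"],
)
```

No transition model of the environment appears anywhere in the update; the agent only needs the observed reward and next state, which is what makes the method model-free.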

Kevin Ku on Unsplash

For image recognition tasks, pre-trained models are a great choice. For one, they are easier to use, as they give you the architecture for “free.” Additionally, they typically produce better results and require less training.

To see a real application of this theory, I will use Kaggle’s CatVSDogs dataset to compare the results of the different methods.

The steps will be as follows:

1) Imports
2) Download and Unzip Files
3) Organize the Files
4) Set Up and Train Classic CNN Model
5) Test the CNN Model
6) Set Up and Train Pre-Trained Model
7) Test…

A Deep Q-Network (DQN) was created to solve both CarRacing-v0, a classic 2D autonomous vehicle environment from OpenAI, and a custom modification of that environment. Two models were used to solve the classic and modified environments: a ResNet18 pre-trained architecture and a custom-made convolutional neural network. Overall, the custom environment did not allow for free movement, which ultimately caused catastrophic forgetting, making the classic environment more suitable for training. Additionally, the pre-trained model produced more randomized results, while the custom-made CNN architecture resulted in…

Ali Fakhry

Ali Fakhry is a high school senior with passions that relate to the field of machine learning and computer science.
