Introduction: Random Forest Now that we have an idea of how decision trees work, we can go a step further and improve our decision tree models with a simple but very effective extension, popularly known as “Random Forest”. To understand decision trees in detail, you…

# Category: R

## All you want to know about Decision Tree Part 3

Decision Tree Part 3 This is the third article in the decision tree series; you can access the other two here: Part 1: All you need to know about Decision Tree Part 1, and Part 2: All you need to know about Decision Tree Part 2. In the previous article, I tried to construct a decision tree using R. For this, I have considered…

## Text Analytics: Mining Enron Emails

Mining Enron Emails You might have heard about the Enron scandal, which came to light in 2001 and eventually led to the bankruptcy of the Enron Corporation. At the time, it was the largest corporate fraud that had happened so far. Enron's top executives used what is called mark-to-market accounting to make up their financial statements. They used this accounting and financial shenanigan to…

## Implementing Logistic Regression using Titanic dataset in R

Introduction In my last post, “Understanding mathematics behind Logistic Regression”, I explained the basic maths behind logistic regression. In this post, I implement a logistic regression model in R using the Titanic dataset. The target variable is ‘Survived’, which takes two values, 0 and 1. Data Dictionary Variable Definition Key…

## Data Visualisation in R (Part-3)

Data Visualisation in R (Part-3) Introduction In this report, I will plot some more advanced charts using the ggplot2 package. If you want to learn more about some basic plots, you can refer to my earlier articles, Data Visualization in R (Part 1) and Data Visualization in R (Part 2). library(Hmisc) library(dplyr) library(ggplot2) library(ggplot2movies) library(RColorBrewer) library(PerformanceAnalytics) library(GGally) Boxplots and Variable Transformation…

## Data Visualization in R (Part-2)

Introduction In this report, I will plot some more advanced charts using the ggplot2 package. If you want to learn more about some basic plots, you can refer to my earlier article, Data Visualization in R (Part 1). Also, you can view other posts related to visualizations here. library(ggplot2) library(RColorBrewer) Data Smoothing in plots Smoothing means using algorithms to remove noise…

## Data Visualization-R (Part-1)

Data Visualisation – R (Part-1) Introduction In this report, I will plot different datasets using the ggplot2 package to gain some meaningful insights. There is one more post which explains how to visualize maps in R using the ggmap package; you can read more about it here. This post covers the basics of data visualisation in R. Some basic plots First load the…

## Understanding Softmax Regression with an example in R

Introduction to Softmax Regression We have commonly used many classification algorithms for binary classification. Now we will see a classification technique that is used to classify observations into k classes. This technique is called softmax regression. Softmax regression is also called multinomial logistic regression, and it is a generalization of logistic regression. Softmax regression is used to model categorical dependent variables…
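
As a rough illustration of the idea in the excerpt above (it is not taken from the linked post), the softmax function at the heart of softmax regression can be sketched in a few lines of base R. The function name `softmax` and the example scores are assumptions made up for this sketch. It turns k real-valued class scores into k probabilities, and with k = 2 it reduces to the familiar logistic function, which is the sense in which softmax regression generalizes logistic regression.

```r
# Minimal sketch of the softmax function (hypothetical helper, base R only).
# Maps a vector of k real-valued scores to k probabilities that sum to 1.
softmax <- function(z) {
  z <- z - max(z)      # shift so the largest score is 0; avoids overflow in exp()
  exp_z <- exp(z)
  exp_z / sum(exp_z)
}

scores <- c(2, 1, 0.1)         # made-up scores for k = 3 classes
probs <- softmax(scores)
total <- sum(probs)            # probabilities sum to 1
predicted <- which.max(probs)  # index of the most probable class

# With k = 2 and one score fixed at 0, softmax equals the logistic function.
two_class <- softmax(c(1.5, 0))[1]
logistic <- 1 / (1 + exp(-1.5))
```

In a full softmax regression the scores would come from a linear combination of the predictors, with one coefficient vector per class; the sketch above only covers the final step of turning scores into class probabilities.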