Introduction: In natural language processing, particularly text mining, topic modeling is an important technique commonly used to identify topics in a text source and so enable informed decision making. Topic modeling is an unsupervised statistical modeling technique for finding groups of words that collectively represent a topic in a large collection of documents. The article focuses on…

# Author: Abhay Padda

## Introduction to Random Forest

Introduction: Random Forest. Now that we have an idea of what decision trees are and how they work, I think we can go a step further and try to improve our decision tree models with a basic but very effective extension of decision trees, popularly known as “Random Forest”. To understand decision trees in detail, you…

## Implementing Logistic Regression using Titanic dataset in R

Introduction: In my last post, “Understanding mathematics behind Logistic Regression”, I explained the basic maths behind logistic regression. In this post, I implement a logistic regression model in R using the Titanic dataset. The target variable is ‘Survived’, which takes two values, 0 and 1. Data Dictionary: Variable, Definition, Key…

## Understanding mathematics behind Logistic Regression

Introduction to Logistic Regression: Logistic regression is a type of regression that returns the probability of occurrence of an event by fitting the data to a mathematical function called the ‘logit function’. It is essentially a classification algorithm and is used mostly when the dependent variable is categorical; the independent variables can be discrete or continuous. Generalized Linear Models: Before starting with…
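For reference, the relationship this excerpt alludes to is usually written as below. This is a sketch in standard textbook notation, not taken from the post itself: the symbols $p$ (probability of the event), $x_1, \dots, x_k$ (independent variables) and $\beta_0, \dots, \beta_k$ (coefficients) are assumptions introduced here for illustration.

$$
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
\quad\Longleftrightarrow\quad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}
$$

The left-hand form shows the logit (log-odds) as a linear function of the inputs; the right-hand form is the same relationship solved for $p$, which always lies between 0 and 1.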

## Top Trends in Social Media Analytics for 2017

Introduction: Social Media Analytics. Social media platforms have attracted new users every year, and the rate at which new users join has kept increasing ever since these platforms emerged. In 2016 alone, a massive 219 million active users were added across various social media platforms, which is a 10%…

## Geospatial Analytics for Boosting Sales

Geospatial analytics: The power of big data analytics has been widely acknowledged by decision makers and analysts worldwide. Still, analysts have not used big data to its full potential, especially location data. Location data, also known as geospatial data or geographical information, has been on the rise with the advancement of technology. The…

## Data Visualization-R (Part-1)

Introduction: In this report, I will plot different datasets using the ggplot2 package to gain some meaningful insights. Another post explains how to visualize maps in R using the ggmap package; you can read more about it here. This post covers the basics of data visualisation in R. Some basic plots: First load the…

## Plotting maps in R using ggmap

Introduction: The objective is to explore the ‘ggmap’ package in R and use it to plot points on a map. You can also view other posts related to visualizations here. For this post, I’ll be using the map of India. I’ll first explain some of the basic functions in ggmap and then demonstrate them by plotting different airports…