LDA visualization using pyLDAvis
Basics, Text Mining, Unsupervised Learning

Introduction to topic modeling using LDA (Latent Dirichlet Allocation)

Introduction: In natural language processing, particularly text mining, topic modeling is a widely used technique for identifying topics in a text source to support informed decision making. Topic modeling is an unsupervised statistical modeling technique for finding groups of words that collectively represent a topic in a large collection of documents. The article focusses on…

Random Forests
Basics, R, Supervised Learning

Introduction to Random Forest

Introduction: Random Forest. Now that we have an idea about decision trees and how they work, we can go a step further and try to improve our decision tree models with a basic but very effective extension of decision trees, popularly known as “Random Forest”. To understand decision trees in detail, you…

R, Text Mining

Text Analytics: Mining Enron Emails

Mining Enron Emails: You might have heard about the Enron scandal, which came to light in 2001 and eventually led to the bankruptcy of the Enron Corporation. It was among the largest corporate frauds uncovered up to that time. Enron's top executives used what is called mark-to-market accounting to dress up their financial statements. They used this accounting and financial shenanigan to…

Neural Networks

Activation Functions in ANNs (Conclusion)

In my last article, Activation Functions in ANNs, we discussed a few activation functions; now let’s explore some of the others. Tanh Function: This is a scaled sigmoid function, so it behaves much like the sigmoid. It is nonlinear, so we can have more than one layer of neurons depending upon the requirement. Its range is (-1, 1).…
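To make the "scaled sigmoid" remark concrete, here is a minimal sketch in Python. It relies on the standard identity tanh(x) = 2·sigmoid(2x) − 1, which is an assumption about what the full article means by "scaled"; the function names `sigmoid` and `tanh_from_sigmoid` are illustrative and not taken from the post.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh_from_sigmoid(x: float) -> float:
    """Tanh expressed as a rescaled, recentred sigmoid (assumed identity)."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


if __name__ == "__main__":
    # Compare the rescaled sigmoid against the library tanh on a few inputs.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(x, tanh_from_sigmoid(x), math.tanh(x))
```

The sigmoid maps inputs into (0, 1); doubling it and subtracting 1 stretches that interval to (-1, 1), which matches the range stated above.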

Basics, Supervised Learning

All you need to know about Decision Tree (Part-1)

Introduction: As the title suggests, I’ll try to cover the necessary information on decision trees in this article. However, fitting everything into one post would be difficult and could leave you lost, so I’ve split the article into three parts. Part 1 (this post): introduction and definitions. Part 2: advanced topics related to decision…

Logistic Regression, R, Regression

Implementing Logistic Regression using Titanic dataset in R

Introduction: In my last post, “Understanding mathematics behind Logistic Regression”, I explained the basic maths behind logistic regression. In this post, I implement a logistic regression model in R using the Titanic dataset, where the target variable is ‘Survived’, which takes two values, 0 and 1. Data Dictionary: Variable, Definition, Key…
