Data Science

Data Analysis: Smart Phones & Other Trends In Will Creation

Writing a last will and testament is not usually an activity associated with millennials. However, young people are thinking differently about protecting their families and, in turn, are "disrupting" the will industry like many others. And in doi...

The Beautiful Binomial Logistic Regression

The Logistic Regression is an important classification model to understand in all its complexity. There are a few reasons to consider it: It is faster to train than some other classification algorithms like Support Vector Machines and Random For...

The Worst Kind of Data: Missing Data

Most publicly available datasets or datasets at the workplace are complete. However, from time to time we encounter datasets where some or many entries are missing. The problem of missing data exists on a spectrum; only a few entries missing among mi...

How to Overcome the Curse of Dimensionality

Dimensionality reduction is an important technique to overcome the curse of dimensionality in data science and machine learning. As the number of predictors (or dimensions or features) in the dataset increases, it becomes computationally more expensiv...

K-Means Clustering: All You Need to Know

In machine learning, we are often in the realm of “function approximation”. That is, we have a certain ground truth (y) and associated variables (X), and our aim is to identify a function of those variables that does a good job in approx...