## Hadoop 2.0 & YARN Architecture

*By Joshua Burkhow On September 15, 2014 · 1 Comment · In Big Data, Hadoop, YARN*

I’ve been spending quite a bit of time learning Hadoop, and I’ve realized that a ton of books, videos, and blogs out there still teach Hadoop from the point of view of version 1.x, even though Hadoop 2.0 has been out since May 2012 and is radically different. I […]

## 5 Great Resources For Learning Linear Algebra

*By Joshua Burkhow On June 1, 2014 · Leave a Comment · In Mathematics*

Why Do You Have to Learn Linear Algebra?

I am always learning new things, and as I take on more and more things to learn, I constantly force myself to first ask “Why is this important?” because if I can’t find a reason clearly and easily enough, then I usually won’t go down that […]

## Big Data Analytics and Netflix’s House of Cards

*By Joshua Burkhow On February 15, 2014 · 2 Comments · In Analytics, Big Data*

The start of a bad joke goes like this: What do Kevin Spacey, politics, Netflix, and Big Data all have in common? Yeah, well, they actually do have a lot in common: more than anything, they are extremely addictive. I haven’t met anyone yet (I know I am about to eat my […]

## 5 Tools You Need To Know To Work With Big Data

*By Joshua Burkhow On February 7, 2014 · 6 Comments · In Analytics, Big Data, Data Science, Programming, Statistics*

I could easily study and write solely on the topic of Big Data. I could dive deep into every single Apache project and all the other software offerings, white papers, and technologies around big data, and I’d have a lot to write about. The challenge is that we are not robots; we can’t […]

## Linear Programming: The Gateway to Analytics

*By Joshua Burkhow On February 1, 2014 · 5 Comments · In Analytics, Mathematics*

Reporting, BI, and the lack of Mathematics

I have done reporting, data warehousing, and overall BI development for a while now, and the one thing I have always struggled with is how easy and common it is to present numbers without being very mathematical at all. What I mean is the most […]

## What is the Central Limit Theorem?

*By Joshua Burkhow On December 31, 2013 · Leave a Comment · In Statistics*

The Central Limit Theorem (CLT) is a critical topic in statistics. Here are a handful of links and a video from Khan Academy to get you up to speed on the CLT.

## The Anscombe Quartet and Why Data Visualization is Critical

*By Joshua Burkhow On December 28, 2013 · 1 Comment · In Data Science, Data Visualization, Statistics*

I love examples that are so blatantly clear that even people who aren’t as crazy about data as I am can get them. Anscombe’s Quartet is exactly that: it shows that we must look at data visually to really ensure we understand it.

Anscombe’s quartet comprises four datasets that have nearly identical simple statistical […]

## Skills of Top 10 Kaggle Competitors

*By Joshua Burkhow On December 21, 2013 · Leave a Comment · In Data Science*

I am slowly starting to get into Kaggle and want to eventually be one of the top competitors. I wondered what skill sets the top 10 competitors have. Well, here are the top 10 as of December 21st, 2013:

You can find the up-to-date list […]

## Steps to Developing a Usable Algorithm

*By Joshua Burkhow On September 22, 2013 · Leave a Comment · In Algorithms, Computer Science, Data Science*

I am a huge fan of trying to turn the complex, detailed, and unmanageable into a concise, usable, and understandable list or outline. I love seeing subjects like data architecture management, data modeling, programming languages, and even Data Science put into visual models that make it easier for others to see conceptually how they fit together. […]

## Things to Learn in R

*By Joshua Burkhow On September 1, 2013 · 1 Comment · In Analytics, Data Visualization, Programming, Statistics*

Are you up for learning something new? Or maybe, like me, you want to “do” statistics? If you are looking to do data analysis, then R is a great tool/language to learn.

What is R?

R is a system for statistical computation and graphics. It consists of a language plus a run-time environment with graphics, a […]
