I’ve been spending quite a bit of time learning Hadoop, and I’ve realized there are a ton of books, videos, and blogs out there that still teach Hadoop from the point of view of version 1.x. Yet Hadoop 2.0 has really been out since May 2012 and, more importantly, is radically different. I [...]
Why Do You Have to Learn Linear Algebra?
I am always learning new things, and as I take on more and more of them I constantly force myself to first ask, “Why is this important?” If I can’t find a reason clearly and easily enough, then I usually won’t go down that [...]
The start of a bad joke goes like this: What do Kevin Spacey, politics, Netflix, and Big Data all have in common? Well, they actually do have a lot in common: more than anything, they are extremely addictive. I haven’t met anyone yet (I know I am about to eat my [...]
I could easily study and write solely on the topic of Big Data. I could dive deep into every single Apache project and all the other software offerings, white papers, and technologies around big data, and I’d have a lot to write about. The challenge is that we are not robots; we can’t [...]
Reporting, BI, and the lack of Mathematics
I have done reporting, data warehousing, and overall BI development for a while now, and the one thing I have always struggled with is how easy and common it is to present numbers without being very mathematical at all. What I mean is the most [...]
The Central Limit Theorem (CLT) is a critical topic in statistics. Here are a handful of links and a video from Khan Academy to get you up to speed on the CLT.
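To make the CLT a little more concrete, here is a minimal simulation sketch. It is my own illustration, not something taken from the linked material: the choice of an exponential distribution, the sample size of 50, the 2,000 trials, and the helper name `sample_means` are all assumptions picked for the example. The idea is that even though the underlying data is heavily skewed, the means of repeated samples cluster around the population mean with a spread close to sigma / sqrt(n).

```python
import math
import random
import statistics


def sample_means(n, trials, rng):
    """Return `trials` sample means, each computed from `n` draws of Exponential(rate=1)."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]


rng = random.Random(0)   # fixed seed so the run is repeatable
n, trials = 50, 2000     # illustrative values, not from the original post
means = sample_means(n, trials, rng)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

# Exponential(1) has mean 1 and standard deviation 1, so the CLT
# suggests the sample means have a spread of about 1 / sqrt(n).
predicted_sd = 1.0 / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean is 1.0)")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) is {predicted_sd:.3f})")
```

Plotting a histogram of `means` should show the familiar bell shape; try a smaller `n` to see how the skew of the original distribution creeps back in.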
I love examples that are so blatantly clear that even people who aren’t as crazy about data as I am can get them. Anscombe’s Quartet is exactly that: it shows that we must look at data visually to really be sure we understand it.
Anscombe’s quartet comprises four datasets that have nearly identical simple statistical [...]
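Here is a small sketch of that idea in code, using only the Python standard library. The data values are the commonly published ones for the quartet as best I know them, so check them against the original source before relying on them. The helper `pearson_r` and the `summary` dictionary are names I made up for this example.

```python
import math
import statistics

# The first three datasets share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed by hand to avoid extra dependencies."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# For each dataset: (mean of x, variance of x, mean of y, variance of y, correlation)
summary = {}
for name, (xs, ys) in quartet.items():
    summary[name] = (
        statistics.fmean(xs),
        statistics.variance(xs),
        statistics.fmean(ys),
        statistics.variance(ys),
        pearson_r(xs, ys),
    )
    mx, vx, my, vy, r = summary[name]
    print(f"{name:>3}: mean_x={mx:.2f} var_x={vx:.2f} mean_y={my:.2f} var_y={vy:.2f} r={r:.3f}")
```

The four printed rows come out almost the same, which is the whole point: only a scatter plot of each dataset reveals how different they really are.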
I am slowly starting to get into Kaggle and eventually want to be one of the top competitors. I thought to myself, “I wonder what the top 10 competitors are like in terms of skill set?” Well, here are the top 10 as of December 21st, 2013:
You can find the up to date list [...]
I am a huge fan of trying to turn the complex, detailed, and unmanageable into a concise, usable, and understandable list or outline. I love seeing subjects like data architecture management, data modeling, programming languages, and even Data Science put into visual models that make it easier for others to see conceptually how they fit together. [...]
Are you up for learning something new? Or maybe, like me, you want to “do” statistics? If you are looking to do data analysis, then R is a great tool and language to learn.
What is R?
R is a system for statistical computation and graphics. It consists of a language plus a run-time environment with graphics, a [...]