Browsing all 13 articles

Apache Hadoop Developer Training Helps Query Massive Telecom Data

I was given the opportunity to write a guest post for the Cloudera Blog. You can read it here: Apache Hadoop Developer Training Helps Query Massive Telecom Data.

View Article


[HOW TO] Install Apache Hive

Important: I have made a complete screencast demonstrating the installation of Apache Hive. You can find it at Hadoop Screencasts – Episode 4: Installing Apache Hive. You can ignore the post below and...

View Article


Introducing Hadoop Screencasts

Apache Hadoop has gained a lot of attention over the past few years, and many organizations now use it to process their large (terabytes to petabytes) datasets. This change has also got...

View Article

Apache Hive – Getting Started

The Apache Hive™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the...

View Article

Apache Hadoop Streaming

You can find my screencast for Apache Hadoop Streaming at www.hadoopscreencasts.com. Apache Hadoop Streaming is a feature that allows developers to write MapReduce applications using languages like...

View Article


Apache Sqoop

"Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases." (from http://sqoop.apache.org/) In this post we...

View Article

Apache Pig Tutorial – Part 1

Apache Pig is a tool used to analyze large amounts of data by representing them as data flows. Using the Pig Latin scripting language, operations like ETL (Extract, Transform and Load), ad hoc data...

View Article

Apache Pig Tutorial – Part 2

Let’s have a quick look at the FILTER command from our Part 1: grunt> movies_greater_than_four = FILTER movies BY (float)rating>4.0; Here, we see a (float) keyword placed before the column...

View Article


Apache Oozie Installation

In this post we will go through the steps to install the Apache Oozie server and client. These instructions assume that you have Hadoop installed and running. My Hadoop location: /home/hduser/hadoop...

View Article


Cloudera Administration Handbook

I am really happy to share that my book, Cloudera Administration Handbook, has been published and is available at Cloudera Administration Handbook.

View Article

Frequently Asked Questions (Hadoop)

It has been a while since my last post, and over that period I have received several questions via comments on my different posts. Almost all of the questions are related to Hadoop and I thought of...

View Article

Introducing Apache Kafka – Part One

What is Apache Kafka? The Apache Kafka project page (http://kafka.apache.org/) defines Apache Kafka as publish-subscribe messaging rethought as a distributed commit log. Kafka is a high-throughput...

View Article
