Preface
Hadoop has grown into a large, community-backed open source project and is the base for several other open source projects. An ecosystem of Apache projects, such as Pig and Hive, makes Hadoop easier to use.
This half-day, hands-on workshop is aimed at Hadoopers. It gives a high-level introduction to Hadoop, compares the effectiveness of the different Hadoop-based platforms, and focuses primarily on effective implementation using Pig.
The Pig platform lets Hadoopers focus on analyzing large data sets and spend less time writing mapper and reducer programs. Notably, Pig can operate on any kind of data.
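To give a sense of the mapper and reducer code that Pig spares you from writing, here is a minimal word-count sketch in plain Python. It is an illustration only and does not use the Hadoop API: the names `mapper`, `reducer`, and `run_job` are hypothetical, and the grouping step merely stands in for Hadoop's shuffle phase.

```python
from collections import defaultdict


def mapper(line):
    # Emit a (word, 1) pair for every word in one input line.
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    # Sum all the counts collected for a single word.
    return word, sum(counts)


def run_job(lines):
    # Stand-in for the shuffle phase: group mapper output by key,
    # then hand each group to the reducer.
    grouped = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            grouped[word].append(count)
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(run_job(["pig eats anything", "pig flies"]))
```

In Pig Latin the same job is typically a handful of statements (load, split into words, group, count), with the mapper and reducer plumbing generated for you.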
Rajesh Balamohan
Rajesh Balamohan has been working on the Hadoop ecosystem since 2009. His main interest is performance tuning, and he has recently been chasing performance problems in Hive/Tez within the Hadoop ecosystem.
Target Audience for workshop
Beginner / Intermediate
Agenda of workshop
- High-level introduction to Hadoop
- MapReduce
- HDFS
- Introduction to Pig
- Features of Pig Latin
- Working session on Pig with examples
- Best Practices
- Q/A
Requirements
- Laptop