Course Overview

Real World Hadoop in the Enterprise is aimed at Java developers and assumes a working knowledge of Java programming in Eclipse and comfort in a Unix shell environment. We will go well beyond the "Hello World" word-count example into practical, applied uses of Hadoop in large-scale real-world scenarios, including fraud detection, algorithmic trading, and data mining. Students will develop on a real Hadoop/Drools cluster, in an environment architected for a dynamically changing, business-rule-driven infrastructure with multiple disparate data sources and large-scale datasets.

Apache Hadoop is an open-source framework for building reliable, distributed compute clusters. Hadoop was part of the technology stack behind IBM Watson's Jeopardy! win in 2011, and it can be used (with other related frameworks) to process large unstructured or semi-structured data sets from multiple sources in order to dissect, classify, learn from, and make suggestions for business analytics, decision support, and other advanced forms of machine intelligence.

Key Learning Areas

  • Learn practical, advanced uses of Hadoop, targeted at the developer experience.
  • Explore real-world scenarios such as fraud detection and data mining with concrete examples.
  • Learn by doing: Develop a Hadoop architecture on a real Hadoop/Drools cluster.
  • Reinforce each topic with hands-on labs throughout the course.

Course Outline

  • Map/Reduce
  • Hadoop Architecture
  • Retrieving and Localizing Data
  • Feeding Hadoop in the Enterprise
  • Scheduling with YARN
  • Machine Learning with Mahout
  • Applying Business Rules with Drools
  • Pig and Pig Pipelines
  • Working with Hive
  • Testing, Performance, and Troubleshooting
  • Other Optional Topics: Apache Storm, Apache Kafka, Cassandra Bolt

Who Benefits

This class is focused on the Hadoop 2.6 release, but most features can be used on earlier 2.x releases.

This hands-on class is approximately 40% lab and 60% lecture, combining engaging lecture, demos, group activities, and discussions with comprehensive machine-based practical programming labs and project work.

Attending students should have practical, hands-on experience developing Java applications in Eclipse.