Apache Hadoop YARN changes the game for Hadoop applications, enabling a multi-application, multi-workload, general-purpose data operating system. YARN is:
- Flexible. Store data once and interact with it in multiple ways from batch to interactive to real time and streaming. Architected to enable new workloads.
- Shared. Re-use key platform services for reliability, redundancy and security across multiple workloads. Multi-tenant architecture shares core resources while isolating services and data.
- Efficient. Do more with less: 30%+ increased efficiency on existing resource utilization. Share and segment applications based on cluster resource management.
This set of resources is intended to get you up and running developing apps for YARN.
STEP 1. Understand the motivations and architecture for YARN.
Apache Hadoop YARN is the data operating system for Hadoop 2.0. YARN enables a user to interact with all data in multiple ways simultaneously, making Hadoop a true multi-use data platform and allowing it to take its place in a modern data architecture. Find out more about the concepts and specifics of YARN.
Get an overview of Apache Hadoop YARN concepts in this slide deck.
Concepts
- Introducing Apache Hadoop YARN
- Apache Hadoop YARN – Background and an Overview
- Apache Hadoop YARN – Concepts and Applications
- Apache Hadoop YARN – ResourceManager
- Apache Hadoop YARN – NodeManager
Building Apps
- Running existing applications on Hadoop 2 YARN
- Stabilizing YARN APIs for Apache Hadoop 2
- Management of Application Dependencies
- Resource Localization in YARN: Deep Dive
- Simplifying user-logs management and access in YARN
STEP 2. Explore example applications on YARN.
The applications in this section show how to build and deploy apps against the YARN APIs and are a simple way to get started. They can be easily replicated in the Hortonworks Sandbox VM environment.
- Simple YARN App. This ‘Hello World’ app for YARN runs n copies of a unix command.
- Distributed Shell. This fuller example implements a distributed shell on YARN.
- MemcacheD on YARN. A tutorial showing how to deploy the very popular MemcacheD framework on YARN.
STEP 3. Examine real-world applications on YARN.
These richer applications are built on YARN and demonstrate real-world use and deployment.
- MapReduce on YARN. The official codebase for Apache Hadoop MapReduce on YARN (MR2).
- HBase on YARN. Efforts to deploy HBase on YARN.
FURTHER RESOURCES
The following resources can also assist with developing Hadoop-based Apps on YARN.
- Get Started with Hadoop 2.0 with this reference presentation
TRAINING
Hortonworks also provides training and certification for Hadoop.