Sunday 19 October 2014

Functional Overview of YARN




The three main YARN components (the ResourceManager, the ApplicationMaster, and the NodeManagers) work together to deliver a new level of functionality to Apache Hadoop.

  • The ResourceManager acts as a pure scheduler controlling the use of cluster resources in the form of resource containers (e.g., CPUs, memory). 
  • User applications are under the control of an application-specific ApplicationMaster (itself a container) that must negotiate the use of additional containers with the ResourceManager at run time.
  • Once the ApplicationMaster has been given resources, it works with the per-node NodeManagers to start and monitor containers on the cluster nodes. 
  • Containers are flexible and can be released and requested as the application progresses.
  • The ResourceManager supports three scheduling options (FIFO, capacity, and fair share) so that user needs can be matched against the available cluster resources.
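
To make the difference between the scheduling options concrete, here is a minimal sketch of FIFO versus fair-share allocation. It is a toy model, not the actual Hadoop scheduler code: the function names, the idea of counting capacity in whole "container slots", and the max-min fairness rule used for the fair-share case are all assumptions made for illustration. The capacity scheduler, which divides the cluster into queues, is left out.

```python
def fifo_allocate(demands, capacity):
    """Grant container slots strictly in arrival order (toy FIFO policy).

    demands: list of (app_name, slots_wanted) in arrival order.
    capacity: total container slots in the (hypothetical) cluster.
    """
    grants = {}
    remaining = capacity
    for app, demand in demands:
        granted = min(demand, remaining)
        grants[app] = granted
        remaining -= granted
    return grants


def fair_share_allocate(demands, capacity):
    """Split slots evenly among apps that still want more (max-min fairness).

    This is an illustrative simplification, not the real Fair Scheduler.
    """
    grants = {app: 0 for app, _ in demands}
    unmet = dict(demands)
    remaining = capacity
    while remaining > 0 and unmet:
        # Equal share of what is left, at least one slot per round.
        share = max(remaining // len(unmet), 1)
        for app in list(unmet):
            give = min(share, unmet[app], remaining)
            grants[app] += give
            unmet[app] -= give
            remaining -= give
            if unmet[app] == 0:
                del unmet[app]  # fully satisfied, drops out of later rounds
            if remaining == 0:
                break
    return grants


demands = [("a", 8), ("b", 4), ("c", 2)]
print(fifo_allocate(demands, 10))        # first arrival is served first
print(fair_share_allocate(demands, 10))  # small jobs are not starved
```

With ten slots, the FIFO policy gives application "a" everything it asks for and leaves "c" with nothing, while the fair-share policy satisfies the small request from "c" and splits the rest between "a" and "b".
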
A container represents a set of resources (memory, CPU) on a single node in a given cluster. It is supervised by that node's NodeManager and scheduled by the ResourceManager.
Each application starts out as an ApplicationMaster, which is itself a container (often referred to as container 0). Once it is running, the ApplicationMaster must negotiate with the ResourceManager for any further containers.
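
The negotiation described above can be sketched as a small simulation. This is only a toy model under stated assumptions: `ToyResourceManager`, `ToyApplicationMaster`, `Node`, the node names, and the resource sizes are all hypothetical and are not part of the Hadoop API. The sketch shows the ApplicationMaster occupying container 0 and then requesting, and later releasing, worker containers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    container_id: int
    node: str
    memory_mb: int
    vcores: int


@dataclass
class Node:
    name: str
    free_memory_mb: int
    free_vcores: int


class ToyResourceManager:
    """Hypothetical stand-in for the ResourceManager: tracks free capacity only."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self.next_id = 0

    def allocate(self, memory_mb, vcores):
        # Place the container on the first node with enough free resources.
        for node in self.nodes.values():
            if node.free_memory_mb >= memory_mb and node.free_vcores >= vcores:
                node.free_memory_mb -= memory_mb
                node.free_vcores -= vcores
                container = Container(self.next_id, node.name, memory_mb, vcores)
                self.next_id += 1
                return container
        return None  # nothing fits right now

    def release(self, container):
        node = self.nodes[container.node]
        node.free_memory_mb += container.memory_mb
        node.free_vcores += container.vcores


class ToyApplicationMaster:
    """Hypothetical ApplicationMaster: it is itself a container (container 0)."""

    def __init__(self, rm, memory_mb=512, vcores=1):
        self.rm = rm
        self.own = rm.allocate(memory_mb, vcores)
        if self.own is None:
            raise RuntimeError("no room for the ApplicationMaster container")
        self.workers = []

    def request_workers(self, count, memory_mb, vcores):
        granted = []
        for _ in range(count):
            container = self.rm.allocate(memory_mb, vcores)
            if container is None:
                break  # cluster is full; the rest of the request is not met
            granted.append(container)
        self.workers.extend(granted)
        return granted

    def release_workers(self):
        for container in self.workers:
            self.rm.release(container)
        self.workers.clear()


rm = ToyResourceManager([Node("node1", 4096, 4), Node("node2", 4096, 4)])
am = ToyApplicationMaster(rm)
granted = am.request_workers(3, memory_mb=2048, vcores=1)
print(am.own.container_id, [(c.container_id, c.node) for c in granted])
```

In a real cluster the ApplicationMaster would also contact the NodeManager on each granted node to launch and monitor the container; the sketch omits that step and only models the bookkeeping between the ApplicationMaster and the ResourceManager.
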


