Learning YARN

Moving beyond MapReduce - learn resource management and big data processing using YARN

Akhil Arora, Shrey Mehrotra

Book Details

ISBN 13: 9781784393960
Paperback: 278 pages

Book Description

Today, enterprises generate huge volumes of data. To provide effective services and make smarter decisions from that data, they use big data analytics. In recent years, Hadoop has been used for massive data storage and efficient distributed processing. The Yet Another Resource Negotiator (YARN) framework solves the resource management design problems of the Hadoop 1.x framework by providing a more scalable, efficient, flexible, and highly available resource management framework for distributed data processing.

This book starts with an overview of YARN's features and explains how YARN provides a business solution for growing big data needs. You will learn to provision and manage single-node as well as multi-node Hadoop-YARN clusters. You will walk through YARN administration, life cycle management, application execution, REST APIs, schedulers, and the security framework. You will gain insights into YARN components and features such as ResourceManager, NodeManager, ApplicationMaster, Container, Timeline Server, High Availability, and Resource Localization.

The book explains Hadoop-YARN commands and the configuration of each component, and explores topics such as High Availability, Resource Localization, and log aggregation. You will then be ready to develop your own ApplicationMaster and execute it over a Hadoop-YARN cluster.

Towards the end of the book, you will learn about the security architecture and the integration of YARN with big data technologies such as Spark and Storm. This book offers conceptual as well as practical knowledge of resource management using YARN.

Table of Contents

Chapter 1: Starting with YARN Basics
  • Introduction to MapReduce v1
  • Shortcomings of MapReduce v1
  • An overview of YARN components
  • The YARN architecture
  • How YARN satisfies big data needs
  • Projects powered by YARN
  • Summary
Chapter 2: Setting up a Hadoop-YARN Cluster
  • Starting with the basics
  • The Hadoop-YARN single node installation
  • An overview of web user interfaces
  • The Hadoop-YARN multi-node installation
  • An overview of the Hortonworks and Cloudera installations
  • Summary
Chapter 3: Administering a Hadoop-YARN Cluster
  • Using the Hadoop-YARN commands
  • Configuring the Hadoop-YARN services
  • Managing the Hadoop-YARN services
  • Monitoring the YARN services
  • Understanding ResourceManager's High Availability
  • Monitoring NodeManager's health
  • Summary
Chapter 4: Executing Applications Using YARN
  • Understanding application execution flow
  • Submitting a sample MapReduce application
  • Handling failures in YARN
  • YARN application logging
  • Summary
Chapter 5: Understanding YARN Life Cycle Management
  • An introduction to state management analogy
  • The ResourceManager's view
  • The NodeManager's view
  • Analyzing transitions through logs
  • Summary
Chapter 6: Migrating from MRv1 to MRv2
  • Introducing MRv1 and MRv2
  • High-level changes from MRv1 to MRv2
  • The migration steps from MRv1 to MRv2
  • Running and monitoring MRv1 apps on YARN
  • Summary
Chapter 7: Writing Your Own YARN Applications
  • An introduction to the YARN API
  • Writing your own application
  • Summary
Chapter 8: Dive Deep into YARN Components
  • Understanding ResourceManager
  • Understanding NodeManager
  • The YARN Timeline server
  • The web application proxy server
  • YARN Scheduler Load Simulator (SLS)
  • Handling resource localization in YARN
  • Summary
Chapter 9: Exploring YARN REST Services
  • Introduction to YARN REST services
  • ResourceManager REST APIs
  • NodeManager REST APIs
  • MapReduce ApplicationMaster REST APIs
  • MapReduce HistoryServer REST APIs
  • How to access REST services
  • Summary
Chapter 10: Scheduling YARN Applications
  • An introduction to scheduling in YARN
  • An overview of queues
  • Types of queues
  • An introduction to schedulers
  • Summary
Chapter 11: Enabling Security in YARN
  • Adding security to a YARN cluster
  • Working with ACLs
  • Other security frameworks
  • Summary
Chapter 12: Real-time Data Analytics Using YARN
  • The integration of Spark with YARN
  • The integration of Storm with YARN
  • The integration of HAMA and Giraph with YARN
  • Summary

What You Will Learn

  • Explore YARN features and offerings
  • Manage big data clusters efficiently using the YARN framework
  • Create single-node as well as multi-node Hadoop-YARN clusters on Linux machines
  • Understand YARN components and their administration
  • Gain insights into application execution flow over a YARN cluster
  • Write your own distributed application and execute it over a YARN cluster
  • Work with schedulers and queues for efficient scheduling of applications
  • Integrate big data projects like Spark and Storm with YARN
