
CONFIGURATION MANAGEMENT VIA THE GRAPHICAL USER INTERFACE


User Management



BigBox has two types of users and groups: local and LDAP/AD. When no LDAP server is defined, BigBox automatically sets up FreeIPA as the LDAP server and syncs all locally created users to it.

When an external LDAP server exists, the BigBox server must be configured to communicate with that LDAP server or an Active Directory domain in order to access LDAP/AD users and groups.

In the user management menu, administrators can choose whether to use an external LDAP server or FreeIPA as the internal LDAP server. If FreeIPA is selected, the user is also created in FreeIPA. If an external LDAP server is used, the user must be created there first.

Users and groups are created, managed, and deleted by BigBox administrators and maintained in BigBox User Management. An administrator assigns permissions to users and groups. These permissions determine user and group privileges. Privileges determine what a user or group is allowed to see or do when using the BigBox suite.



Ambari User and Group Permissions


Ambari features four levels of user and group permissions: None, Read-Only, Operator, and Admin. These permissions determine how a user can use Ambari to interact with Hadoop services and configurations. Services include HDFS, YARN, Hive, and so on; services can, for example, be stopped and started. Configurations affect running services and the cluster topology.

The default for newly created Ambari users and groups is no permissions. These users may log in to Ambari but cannot see any service or cluster information, or perform any actions.

Users or groups with Read-Only permission may log in to Ambari and view, but not modify, service and configuration information.

Those users and groups with Operator permission may log in to Ambari and view service and configuration information. They may also start, restart, and stop existing services as well as add new services. They may also modify current configurations or revert to previous configurations.

Ambari users with Admin permission may do anything. They may log in, view information, and manage services and configurations. In addition, they may also create new users and groups, manage group membership, and assign permissions to users and groups. They may even create other Ambari users with Admin permission. By default during installation, an Ambari admin user account is created and assigned Admin permission.

In addition to the four permission levels, users and groups may be assigned access to one or more Ambari Views. An Ambari View is an optional, custom “plug-in” tool added to the Ambari Web UI and designed to perform one or more specific tasks.


Admin Privileges for Local Versus LDAP/AD Users



Admin Privileges for Local Versus LDAP/AD Users

As an Ambari user with Admin permission, you can create new local users, delete local users, change local user passwords, and edit user settings. You can also control certain privileges for local and LDAP/AD users. However, many user and group management functions are not available to an Ambari administrator for LDAP/AD users and groups. This is because you need an LDAP/AD user or group management tool to manage LDAP/AD users and groups. Ambari can import these users and groups and can assign them Ambari permissions, but Ambari cannot administer the users and groups themselves.

The table shown here lists the privileges available as well as those not available to an Ambari administrator user.



Managing Users, Groups and Permissions


Select Manage Ambari on the admin Menu

Select Manage Ambari from the admin menu to open the browser page for managing users, groups, and permissions.


Click the Users button or the Users link to manage users. Click the Groups button or the Groups link to manage groups. Click the Manage Permissions button or the Permissions link to manage user and group permissions.


Listing, Modifying, or Creating a User



Working with Users

Clicking the Users link opens the Users page. This page lists any existing users. You may click any existing user to open the page to modify that user’s settings. This includes changing the user’s password.

Click the Create Local User button to create a new local user.


Creating a New User




Creating a Local User


The Create Local User page is used to create a new local user. LDAP/AD is not a choice because LDAP/AD users and groups must be created using LDAP/AD management tools.

Type the new user name.

The new user may be configured as Active or Inactive. Only active users may log in to Ambari. Click Active to switch to Inactive. Click Inactive to switch to Active.

The user may also be assigned Ambari Admin permission. The default is No. Click No to switch to Yes. Click Yes to switch to No.

Provide a password for the new user. Users may not change their own passwords. Only a user with Admin permissions may change passwords.

Click Save to add the new user.
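For scripted administration, a local user can also be created through Ambari's REST API. The sketch below only builds the request body; the endpoint, field names, and the example user name are assumptions based on Ambari's /api/v1/users resource and should be verified against your Ambari version.

```python
import json

def local_user_payload(user_name, password, active=True, admin=False):
    # Field names follow Ambari's /api/v1/users resource; 'active' and
    # 'admin' mirror the Active and Ambari Admin toggles described above.
    return {
        "Users/user_name": user_name,
        "Users/password": password,
        "Users/active": active,
        "Users/admin": admin,
    }

# POST this body to http://<ambari-host>:8080/api/v1/users as an Ambari
# admin, with the 'X-Requested-By: ambari' header. "analyst1" is a
# hypothetical example user.
print(json.dumps(local_user_payload("analyst1", "changeme")))
```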


Listing, Modifying, or Creating a Group




Working with Groups


Clicking either the Groups button or the Groups link opens the Groups page. This page lists any existing groups. You may click any existing group to open the page to add or remove members or to delete the group.

Click the Create Local Group button to create a new local group.


Creating a New Group



Creating a Local Group

Use the Create Local Group page to create a new local group. LDAP/AD is not a choice because LDAP/AD users and groups must be created using LDAP/AD management tools.

Type the new group name.

Click Save to complete adding the new group.

The new group will appear on the lists of groups displayed in the browser window. To add users as members of the group, click the group name. A browser page opens that enables you to add user names as group members.
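Group creation and membership can likewise be scripted against the REST API. This is a hedged sketch of the request bodies only; the endpoints and field names are assumptions based on Ambari's /api/v1/groups resource, and the group and user names are hypothetical examples.

```python
import json

def group_payload(group_name):
    # Body for POST /api/v1/groups
    return {"Groups/group_name": group_name}

def member_payload(group_name, user_name):
    # Body for POST /api/v1/groups/<group_name>/members, which adds a
    # user as a member of an existing local group.
    return {"MemberInfo/group_name": group_name,
            "MemberInfo/user_name": user_name}

# "etl-operators" and "analyst1" are hypothetical example names.
print(json.dumps(group_payload("etl-operators")))
print(json.dumps(member_payload("etl-operators", "analyst1")))
```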

Managing User and Group Permissions




Managing User and Group Permissions

Clicking the Permissions link opens the cluster’s Permissions page. Use this page to assign Operator or Read-Only permissions to any user or group.

Permissions are additive. A user assigned Read-Only permission who belongs to a group with Operator permission will get Operator privileges. More restrictive user-level permissions do not override more generous group-level permissions.
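The additive rule amounts to taking the most generous level across the user's own permission and those of all their groups. A minimal sketch, using the four permission levels described above:

```python
# Permission levels ordered from least to most privileged.
LEVELS = ["None", "Read-Only", "Operator", "Admin"]

def effective_permission(user_level, group_levels):
    # Rank each named level by how privileged it is, then take the most
    # generous level across the user and all of their groups.
    ranks = {name: i for i, name in enumerate(LEVELS)}
    return max([user_level, *group_levels], key=lambda level: ranks[level])

# A Read-Only user in an Operator group gets Operator privileges.
print(effective_permission("Read-Only", ["Operator"]))  # → Operator
```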

To add the first user or group to a field, click the Add User or Add Group box. To add a second user or group, click your mouse pointer in the gray area of a box and the pencil icon will be displayed. Click the pencil icon to add another user or group name. The X buttons are used to cancel an add operation or delete an existing user or group. The check button is used as a save button.

NODE, CLUSTER, AND MULTI-CLUSTER MANAGEMENT


Monitoring and Managing Your Cluster


Cluster Dashboard

Use the Dashboard to view the operating status of your cluster. Each metrics widget displays status information for a single service in your BigLake cluster. The Dashboard displays all metrics for the HDFS, YARN, HBase, and Storm services, and cluster-wide metrics by default. You can add and remove individual widgets, and rearrange the Dashboard by dragging and dropping each widget to a new location in the dashboard.





Note
Each service installed in your cluster also has a service-specific dashboard. Refer to the Modifying the Service Dashboard section under Managing Services for more information.


Widget Descriptions



The Dashboard includes metrics for the following services:

Ambari Service Metrics and Descriptions

Figure 7.a. Service Metrics and Descriptions



Ambari Service Metrics and Descriptions

Figure 7.b. Service Metrics and Descriptions


Viewing Widget Details

To see more detailed information about a service, hover your cursor over a Metrics widget.
More detailed information about the service displays, as shown in the following example:



Figure 8. Widget Details

  • To remove a widget from the dashboard, click the white X.

  • To edit the display of information in a widget, click the pencil icon.


Widget Details

Some widgets, such as HDFS Links, provide links to more metrics information, such as thread stacks, logs, and native component UIs. For example, you can link to NameNode, Secondary NameNode, and DataNode components for HDFS, using the links shown in the following example:



Figure 9. Linking to Service UIs

Choose the More drop-down to select from the list of links available for each service. The Dashboard includes additional links to metrics for the following services:

Figure 10. Link to More Metrics for BigLake Services


Cluster-Wide Metrics

Cluster-wide metrics display information that represents your whole cluster. The Dashboard shows the following cluster-wide metrics:


Figure: Viewing Cluster-Wide Metrics


Ambari Cluster-Wide Metrics and Descriptions

Figure: Cluster-Wide Metrics and Descriptions

  • To remove a widget from the dashboard, click the white X.

  • Hover your cursor over each cluster-wide metric to magnify the chart or itemize the widget display.

  • To remove or add metric items from each cluster-wide metric widget, select the item on the widget legend.

  • To see a larger view of the chart, select the magnifying glass icon.

BigLake displays a larger version of the widget in a pop-out window, as shown in the following example:



Figure: Metrics Cluster-Wide Widgets Large Version

Use the pop-up window in the same ways that you use cluster-wide metric widgets on the dashboard.
To close the widget pop-up window, choose OK.



Modifying the Cluster Dashboard

You can customize the Dashboard in the following ways:


Adding a Widget to the Dashboard

To replace a widget that has been removed from the dashboard:

  1. Select the Metrics drop-down, as shown in the following example:




Figure: Adding a Widget to the Dashboard


  2. Choose Add.

  3. Select a metric.

  4. Choose Apply.


Customizing Widget Display

To customize the way a service widget displays metrics information:


Figure: Customizing Widget Display


  1. Follow the instructions in the Customize Widget pop-up to customize widget appearance. In this example, you can adjust the thresholds at which the HDFS Capacity bar chart changes color, from green to orange to red.

  2. To save your changes and close the editor, choose Apply.

  3. To close the editor without saving any changes, choose Cancel.


Note
Not all widgets support editing.



Viewing Cluster Heatmaps

Heatmaps provide a graphical representation of your overall cluster utilization using simple color coding.


Figure: Viewing Cluster Heatmaps

A colored block represents each host in your cluster. To see more information about a specific host, hover over the block representing the host in which you are interested. A pop-up window displays metrics about BigLake components installed on that host. Colors displayed in the block represent usage in a unit appropriate for the selected set of metrics. If any data necessary to determine state is not available, the block displays “Invalid Data”. Changing the default maximum values for the heatmap lets you fine tune the representation. Use the Select Metric drop-down to select the metric type.



Figure: Cluster Metrics mode heatmaps

Heatmaps supports the following metrics:


Cluster Heatmaps Supports Metrics

Figure 19. Cluster Heatmaps Supports Metrics


Managing Hosts

Use Ambari Hosts to manage multiple BigLake components such as DataNodes, NameNodes, NodeManagers and RegionServers, running on hosts throughout your cluster. For example, you can restart all DataNode components, optionally controlling that task with rolling restarts. Ambari Hosts supports filtering your selection of host components, based on operating status, host health, and defined host groupings.


Performing Host-Level Actions

Use the Actions control to act on one or more hosts in your cluster. The Actions control comprises three menus:

  • Hosts
    lists selected, filtered, or all hosts options, based on your selections made using Hosts home and Filters.

  • Objects
    lists component objects that match your host selection criteria.

  • Operations
    lists all operations available for the component objects you selected.

For example, to restart DataNodes on one host:

  1. In Hosts, select a host running at least one DataNode.

  2. In Actions, choose Selected Hosts > DataNodes > Restart, as shown in the following image.


Figure 21. Performing Host-Level Actions


  3. Choose OK to confirm starting the selected operation.

  4. Optionally, use Monitoring Background Operations to follow, diagnose, or troubleshoot the restart operation.


Decommissioning Masters and Slaves

Decommissioning is a process that supports removing a component from the cluster. You must decommission a master or slave running on a host before removing the component or host from service. Decommissioning helps prevent potential loss of data or service disruption. Decommissioning is available for the following component types:

  • DataNodes

  • NodeManagers

  • RegionServers

Decommissioning executes the following tasks:

  • For DataNodes, safely replicates the HDFS data to other DataNodes in the cluster.

  • For NodeManagers, stops accepting new job requests from the masters and stops the component.

  • For RegionServers, turns on drain mode and stops the component.


How to Decommission a Component

To decommission a component using Ambari Web, browse Hosts to find the host FQDN on which the component resides.
Using Actions, select Hosts > Component Type, then choose Decommission.

For example:


Figure 22. Decommission a Component

The UI shows “Decommissioning” status while steps process, then “Decommissioned” when complete.

UI Status

Figure 23. UI Status
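Behind the UI, Decommission is submitted as a bulk-operations request. The sketch below builds a plausible request body only; its shape and field names are assumptions based on Ambari's /api/v1/clusters/<cluster>/requests API, and the worker host name is hypothetical — verify against your Ambari version.

```python
import json

def decommission_datanode_request(host_fqdn):
    # The DECOMMISSION command is routed through the master component
    # (the NameNode for HDFS); the host to drain is listed in
    # 'excluded_hosts'.
    return {
        "RequestInfo": {
            "context": "Decommission DataNode",
            "command": "DECOMMISSION",
            "parameters": {
                "slave_type": "DATANODE",
                "excluded_hosts": host_fqdn,
            },
        },
        "Requests/resource_filters": [
            {"service_name": "HDFS", "component_name": "NAMENODE"}
        ],
    }

# POST to http://<ambari-host>:8080/api/v1/clusters/<cluster>/requests;
# the worker host name below is hypothetical.
print(json.dumps(decommission_datanode_request("worker01.example.com")))
```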


How to Delete a Component

To delete a component using Ambari Web, on Hosts choose the host FQDN on which the component resides.

  1. In Components, find a decommissioned component.

  2. Stop the component, if necessary.
    Note: A decommissioned slave component may restart in the decommissioned state.

  3. For a decommissioned component, choose Delete from the component drop-down menu.
    Note: Restarting services enables Ambari to recognize and monitor the correct number of components.

Deleting a slave component, such as a DataNode, does not automatically inform a master component, such as a NameNode, to remove the slave component from its exclusion list. Adding a deleted slave component back into the cluster presents the following issue: the added slave remains decommissioned from the master’s perspective. As a workaround, restart the master component.


Deleting a Host from a Cluster

Deleting a host removes the host from the cluster. Before deleting a host, you must complete the following prerequisites:

  • Stop all components running on the host.

  • Decommission any DataNodes running on the host.

  • Move any master components, such as NameNode or ResourceManager, off the host.

  • Turn Off Maintenance Mode, if necessary, for the host.


How to Delete a Host from a Cluster

  1. In Hosts, click on a host name.

  2. On the Host-Details page, select the Host Actions drop-down menu.

  3. Choose Delete.

If you have not completed prerequisite steps, a warning message similar to the following one appears:


Delete a Host from a Cluster

Figure 24. Delete a Host from a Cluster


Setting Maintenance Mode

Maintenance Mode supports suppressing alerts and skipping bulk operations for specific services, components and hosts in an Ambari-managed cluster. You typically turn on Maintenance Mode when performing hardware or software maintenance, changing configuration settings, troubleshooting, decommissioning, or removing cluster nodes. You may place a service, component, or host object in Maintenance Mode before you perform necessary maintenance or troubleshooting tasks.

Maintenance Mode affects a service, component, or host object in the following two ways:

  • Maintenance Mode suppresses alerts, warnings and status change indicators generated for the object

  • Maintenance Mode exempts an object from host-level or service-level bulk operations

Explicitly turning on Maintenance Mode for a service implicitly turns on Maintenance Mode for components and hosts that run the service. While Maintenance Mode On prevents bulk operations being performed on the service, component, or host, you may explicitly start and stop a service, component, or host having Maintenance Mode On.
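Maintenance Mode can also be toggled through the REST API. This is a hedged sketch of the request body only; the endpoint and field names are assumptions based on Ambari's services API (ServiceInfo schema) and should be verified against your Ambari version.

```python
import json

def maintenance_payload(state="ON", context="Turn On Maintenance Mode"):
    # 'maintenance_state' accepts ON or OFF; the context string is only
    # a human-readable label shown in Background Operations.
    return {
        "RequestInfo": {"context": context},
        "Body": {"ServiceInfo": {"maintenance_state": state}},
    }

# PUT to .../api/v1/clusters/<cluster>/services/HDFS to place the HDFS
# service (and implicitly its components and hosts) in Maintenance Mode.
print(json.dumps(maintenance_payload()))
```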


Setting Maintenance Mode for Services, Components, and Hosts

For example, consider using Maintenance Mode in a three-node, Ambari-managed cluster installed using default options. This cluster has one DataNode, on host c6403. This example describes how to explicitly turn on Maintenance Mode for the HDFS service, alternative procedures for explicitly turning on Maintenance Mode for a host, and the implicit effects of turning on Maintenance Mode for a service, a component, and a host.


Adding Hosts to a Cluster

To add new hosts to your cluster, browse to the Hosts page and select Actions > +Add New Hosts. The Add Host Wizard provides a sequence of prompts similar to those in the Ambari Install Wizard. Follow the prompts, providing information similar to that provided to define the first set of hosts in your cluster.


Adding Hosts to a Cluster

Figure 25. Adding Hosts to a Cluster


Rack Awareness

Ambari can manage rack information for hosts. By setting the Rack ID, Ambari can display the hosts in heatmaps by Rack ID, and users can filter and find hosts by Rack ID on the Hosts page.


Figure: Rack Awareness


If HDFS is installed in your cluster, Ambari will pass this Rack ID information to HDFS via a topology script. Ambari generates a topology script at /etc/hadoop/conf/topology.py and sets the net.topology.script.file.name property in core-site automatically. This topology script reads a mappings file /etc/hadoop/conf/topology_mappings.data that Ambari automatically generates. When you make changes to Rack ID assignment in Ambari, this mappings file will be updated when you push out the HDFS configuration.
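Conceptually, the generated topology script maps each host (or IP) to its rack and falls back to a default rack for unknown hosts. The sketch below is a simplified illustration of that lookup; the host names and rack IDs are hypothetical, and the real script reads the mappings from /etc/hadoop/conf/topology_mappings.data rather than an inline dictionary.

```python
# Hypothetical host-to-rack mappings; the real data comes from the
# Ambari-generated /etc/hadoop/conf/topology_mappings.data file.
MAPPINGS = {
    "worker01.example.com": "/rack-01",
    "worker02.example.com": "/rack-02",
}

def resolve_rack(host, default="/default-rack"):
    # HDFS calls the topology script with host names or IPs and expects
    # a rack path back; unknown hosts fall back to a default rack.
    return MAPPINGS.get(host, default)

print(resolve_rack("worker01.example.com"))  # → /rack-01
print(resolve_rack("10.0.0.99"))             # → /default-rack
```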



Managing Services

Use Services to monitor and manage selected services running in your Hadoop cluster. All services installed in your cluster are listed in the leftmost Services panel.


Starting and Stopping All Services

To start or stop all listed services at once, select Actions, then choose Start All or Stop All, as shown in the following example:


Starting and Stopping All Services

Figure 27. Starting and Stopping All Services
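In API terms, Start All and Stop All amount to setting the desired state of every service at once. This is a hedged sketch of the request body only; the endpoint and field names are assumptions based on Ambari's services API and should be verified against your Ambari version.

```python
import json

def all_services_payload(start=True):
    # Ambari models a stopped service as desired state INSTALLED and a
    # running one as STARTED; Start All / Stop All set this for every
    # service at once.
    state = "STARTED" if start else "INSTALLED"
    context = "Start All Services" if start else "Stop All Services"
    return {
        "RequestInfo": {"context": context},
        "Body": {"ServiceInfo": {"state": state}},
    }

# PUT to http://<ambari-host>:8080/api/v1/clusters/<cluster>/services
print(json.dumps(all_services_payload(start=False)))
```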


Adding a Service

The Ambari install wizard installs all available Hadoop services by default. You may choose to deploy only some services initially, then add other services at later times. For example, many customers deploy only core Hadoop services initially. Add Service supports deploying additional services without interrupting operations in your Hadoop cluster. When you have deployed all available services, the Add Service option is disabled.

Adding a Service to your Hadoop cluster

This example shows the Falcon service selected for addition.

  1. Choose Services.
    Choose an available service. Alternatively, choose all to add all available services to your cluster. Then, choose Next. The Add Service wizard displays installed services highlighted in green and check-marked; these are not available for selection.


Choose Services

Figure 28. Choose Services


Example: Adding Storm Service


Adding Storm Service

Figure 29. Adding Storm Service


  2. In Assign Masters, confirm the default host assignment. Alternatively, choose a different host machine to which master components for your selected service will be added. Then, choose Next. The Add Services Wizard indicates hosts on which the master components for a chosen service will be installed. A service chosen for addition shows a grey check mark.
    Using the drop-down, choose an alternate host name, if necessary. Each host shows either:

  • A green label located on the host to which its master components will be added, or

  • An active drop-down list on which available host names appear.


Figure: Assign Master


  3. In Assign Slaves and Clients, accept the default assignment of slave and client components to hosts. Then, choose Next.
    Alternatively, select hosts on which you want to install slave and client components. You must select at least one host for the slave of each service being added.


Host Role Required For Adding Service

Figure: Host Role Required For Adding Service


The Add Service Wizard skips and disables the Assign Slaves and Clients step for a service that requires neither slave nor client assignment.


Figure: Assign Slaves and Clients


  4. In Customize Services, accept the default configuration properties.
    Alternatively, edit the default values for configuration properties, if necessary. Choose Override to create a configuration group for this service. Then, choose Next.

Figure: Customize Services


  5. In Review, make sure the configuration settings match your intentions. Then, choose Deploy.


Figure 34. Review Services


  6. Monitor the progress of installing, starting, and testing the service. When the service installs and starts successfully, choose Next.


 Install, Start and Test

Figure 35. Install, Start and Test


  7. Summary displays the results of installing the service. Choose Complete.


Summary

Figure 36. Summary


  8. Restart any other components having stale configurations.


Editing Service Config Properties

Select a service, then select Configs to view and update configuration properties for the selected service. For example, select MapReduce2, then select Configs. Expand a config category to view configurable service properties. For example, select General to configure Default virtual memory for a job’s map task.




Figure: Editing Service Config Properties



Viewing Service Summary and Alerts

After you select a service, the Summary tab displays basic information about the selected service.


Figure: Viewing Service Summary and Alerts


Select one of the View Host links, as shown in the following example, to view components and the host on which the selected service is running.


Service Condition In Running

Figure 39. Service Condition In Running


Alerts and Health Checks

On each Service page, in the Summary area, click Alerts to see a list of all health checks and their status for the selected service. Critical alerts are shown first. Click the text title of each alert message in the list to see the alert definition. For example, on Services > HBase, click Alerts. Then, in Alerts for HBase, click HBase Master Process.


Alerts for HBase

Figure 40. Alerts for HBase



Modifying the Service Dashboard

Depending on the service, the Summary tab includes a Metrics section that is populated by default with important service metrics to monitor.


Figure: Modifying the Service Dashboard


This Metrics section is customizable. You can add and remove widgets from the dashboard as well as create new widgets. Widgets can be private to you and your dashboard, or shared in a Widget Browser library from which other Ambari users can add or remove them on their own dashboards.

Important
You must have the Ambari Metrics service installed to be able to view, create, and customize the Service Dashboard. Only HDFS, Hive, HBase, and YARN have customizable service dashboards.


ADDING OR REMOVING A WIDGET
  1. Click the “+” to launch the Widget Browser. Alternatively, you can choose the Actions menu in the Metrics header to Browse Widgets.

  2. The Widget Browser displays the available widgets to add to your Service Dashboard. This is a combination of shared widgets and widgets you have created.


Figure: Adding or Removing a Widget


  3. If a widget is not already added, you can click Add.



Monitoring Background Operations

Use Background Operations to monitor progress and completion of bulk operations such as rolling restarts.
Background Operations opens by default when you run a job that executes bulk operations.


Figure: Background Operation Running


RESOURCE AND WORKLOAD MANAGEMENT


Planning a Hadoop Cluster Deployment


Cost versus Performance

Planning for a new cluster is not trivial. The core design challenge is balancing cost against performance goals: sufficient hardware is required to meet performance goals, but too much hardware needlessly increases costs. Hardware sizing help is available.

Customized help is available by engaging the Professional Services team.

Be sure to consider:

  • Workload Type

  • Storage

  • Hardware

  • Operating Systems

  • Software

  • Databases


Consider Processing Workload Type


Processing Workload Type

Hadoop clusters commonly support three different types of distributed processing workloads: interactive, batch, and real-time. Each of these has unique characteristics and places a different type of load on the cluster resources. Understanding these workload types can be helpful when attempting to size hardware for a cluster.

Interactive processing involves the processing of data with the continual exchange of information between the cluster and a user. It is commonly used for the analysis of existing historical data—data older than 15 minutes—or for data entry. It places sporadic loads on the cluster resources that are often difficult to predict in advance.

Batch processing is commonly used to analyze historical data. Historical data is primarily stored on disk and is moved to memory for processing. Batch jobs are run sporadically or periodically and run from a few seconds to multiple hours. They place sporadic to periodic loads on the cluster resources. It is much easier to predict the resource loads for those batch jobs that are repeated at regular time intervals.

Real-time processing is commonly used to ingest a continuous stream of data, process it, and output it to storage. Real-time data is typically initially ingested into memory and then moved to disk after processing. Real-time jobs are always running unless manually stopped. They are primarily used by automated systems to prevent certain outcomes or to optimize operations. Real-time processing can consume both computational and I/O resources depending on the application and job.


Consider Processing Workload Patterns


Cluster Workload Patterns

Computational power, network I/O bandwidth, disk I/O bandwidth, and disk space are the most important parameters to consider for accurate hardware sizing. Cluster workloads often fall into one of three categories: compute intensive, I/O intensive, or balanced.

A compute intensive workload is typically bound by CPU or memory constraints and is characterized by the need for a large number of CPUs and large amounts of memory. Examples of compute intensive workloads commonly include HBase, Spark, or Storm jobs.

An I/O intensive workload is often bound by disk I/O bandwidth or perhaps network I/O bandwidth. Examples of I/O intensive workloads commonly include MapReduce jobs. Because historically Hive and Pig jobs have been converted to MapReduce jobs, Hive and Pig jobs have also been I/O intensive. With the current HDP release, Hive and Pig jobs are now converted to Tez jobs by default. Tez is more disk I/O efficient than MapReduce, which reduces but does not eliminate the overall disk I/O bandwidth requirements.

A balanced workload is characterized by an even blend of both computational and I/O bandwidth needs. A cluster running a mix of batch, interactive, and real-time data processing jobs often has a balanced workload. If you are unsure what your workload will be, plan for a balanced workload.


Planning for Workloads


Hardware Sizing Guidelines for a Balanced Workload


Storage Calculator

The storage calculator is designed to provide a rough estimate of the amount of HDFS storage space required when designing a cluster. It accounts for not only the initial dataset size and any expected year-over-year growth, but also accounts for data that is temporarily created as a result of data processing.
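As a rough illustration of the kind of estimate the storage calculator produces, raw capacity must cover the grown dataset, its replication, and headroom for temporary data created during processing. The 3x replication factor, 20% yearly growth, and 25% intermediate-data overhead below are illustrative assumptions, not fixed recommendations:

```python
def raw_storage_tb(initial_tb, replication=3, intermediate=0.25,
                   yearly_growth=0.20, years=3):
    # Grow the initial dataset year over year, replicate it, then add
    # headroom for temporary data created during processing.
    grown = initial_tb * (1 + yearly_growth) ** years
    return grown * replication * (1 + intermediate)

# 100 TB today, 3x replication, 20% yearly growth planned over 3 years:
print(round(raw_storage_tb(100), 1))  # → 648.0
```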


Hardware Sizing

Achieving optimal results from a Hadoop implementation begins with choosing the correct-sized hardware. The effort involved in the planning stages can pay off dramatically in terms of the improved performance and decreased total cost of ownership associated with the installed cluster.

Consider Deploying a Small Pilot Cluster

Because it is difficult to predict hardware resource usage without benchmarking and testing, we recommend installing a small pilot cluster for the testing and benchmarking of applications. Such a pilot cluster can yield valuable and reliable information used to determine hardware sizing requirements.

The best results are obtained by benchmarking your own applications in the pilot cluster. If that is not possible, run standard benchmark applications in the pilot cluster and extrapolate from the results how your applications will perform. TeraSort, DFSIO, and HiBench are commonly used, but there are other benchmark utilities.

Even if a cluster is not initially optimally sized, the modular nature of Hadoop components makes future cluster modification easier. A cluster can be resized by adding or removing nodes.


Hardware Guidelines for Master Nodes

Master nodes run master service components. As a result, availability is a primary concern. Availability is enhanced through redundancy. Where possible, all hardware components should be configured for redundancy.

Use RAID 10 storage for both the operating system and all data disks. Configure dual, bonded Ethernet NICs. Consider dual power supplies and cooling fans. Also use ECC-protected memory.

Some Hadoop master service components support high availability configurations across multiple hosts. You may even consider virtualizing the master servers to gain the benefits of live virtual machine migration or virtual machine high availability solutions like VMware HA.

While not all master service components have the same CPU, memory, storage, or network hardware requirements, consider using the same hardware specifications for all master nodes. This enables an administrator to more easily migrate master service components to any master node as a result of maintenance requirements or following a system failure.


Hardware Guidelines for Worker Nodes

Worker nodes perform data processing so throughput is more important than availability. Worker nodes are already redundant within the cluster so hardware redundancy in the individual nodes is not as necessary. If a worker node fails, Hadoop automatically takes action to protect data and restart any failed processing jobs. For example, by default, HDFS maintains three copies of all data blocks and ensures that not all copies reside on the same worker node. If a worker node fails, HDFS automatically makes new copies of all data blocks that resided on the failed machine.

Because throughput is so important for performance, plan for parallel computing and data paths. This includes using dual-CPU socket servers, using multiple disk drives (as many as 8-12) and disk controllers, using fast drives with a fast disk interconnect, and using multiple, bonded Ethernet NICs.

To simplify cluster configuration, try to use the same hardware specification for all worker nodes. Many Hadoop configuration properties are tied to the number of CPUs, the amount of memory, or the number of disks on the worker node. If different groups of worker nodes have different specifications, then these groups of worker nodes must use separate configuration files with different configuration property settings. The good news is that the Ambari Configuration Groups feature accommodates this, but there is still some additional administrative effort involved.


Network Design Guidelines



Network availability and sufficient network bandwidth are critical for cluster operation.

To help avoid cluster failure, avoid single points of network failure. This means using dual, bonded network ports, dual top-of-the-rack switches, and dual core switches.

Network bandwidth is the most challenging parameter to estimate because Hadoop workloads vary greatly from cluster to cluster, and even within the same cluster at different times. Dual 1 Gb Ethernet ports on the worker nodes have been typical, but you might need more. Using 10 Gb Ethernet ports helps ensure that your network bandwidth remains sufficient well into the future, but is more expensive to purchase. In any case, to help ensure that your cluster receives all available network bandwidth, dedicate the network switches to the cluster.

You should also consider the effect of a worker node failure. HDFS maintains three copies of all data blocks and ensures that not all copies reside on the same worker node. If a worker node fails, HDFS automatically makes additional copies of all data blocks that resided on the failed machine. This can result in significant additional network traffic as many of these data blocks will have to be copied across the network. For example, if a worker node with 10 terabytes of data fails, the cluster will produce approximately 10 terabytes of network traffic to recover.
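The recovery traffic in that example can be put into rough numbers; the bandwidth figure below is an assumption for illustration:

```shell
# Estimate how long re-replication after a worker node failure might take.
lost_tb=10     # data that resided on the failed node (the example above)
link_gbps=10   # aggregate bandwidth available for recovery, Gb/s (assumed)

lost_gbit=$(( lost_tb * 1000 * 8 ))   # TB -> gigabits (decimal units)
seconds=$(( lost_gbit / link_gbps ))
echo "~${lost_gbit} Gbit to re-copy, roughly $(( seconds / 60 )) minutes at ${link_gbps} Gb/s"
```

In practice the recovery traffic competes with normal workload traffic, so real recovery times are longer than this idealized figure.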


Ambari and Metrics Collector Hardware Guidelines



Memory and Disk Requirements Based on Cluster Size

The Ambari Server host itself does not require a large amount of memory; the minimum requirement is 1 gigabyte. However, starting with Ambari 2.0, the standalone Ambari Metrics Collector must also be accounted for.

The memory and storage requirements for the Metrics Collector are based on the number of cluster nodes. The table illustrates the memory and storage space requirements for various cluster sizes. The Ambari Server and Ambari Metrics Collector can be co-located on the same machine.


Hardware Testing

When purchasing a large number of systems, there is always the possibility that some hardware will fail early—often within a few hours of operation. If the cluster is already installed then the hardware failure will disrupt cluster operation. To help avoid this, hardware should always be thoroughly tested before being placed into service. Testing the hardware sufficiently takes many hours to complete so have your hardware provider do it, if possible.

If you have to perform the testing, then use a vendor-supplied diagnostic test utility where available. Another option is to install an operating system and use operating system utilities.


A few examples of Linux commands include fio, dd, and hdparm. fio is the flexible I/O tester, dd is a disk-to-disk copy utility, and hdparm gets or sets SATA/IDE device parameters. Depending on the command and options used, these utilities can test various hardware subsystems. Use the operating system documentation to learn how to run these or other utilities.
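As a minimal illustration of the dd approach (a quick smoke test only — real burn-in uses fio against the actual data disks for hours), the following writes a small file and reports throughput; the file size and path are arbitrary:

```shell
# Quick sequential-write check with dd; not a substitute for proper burn-in.
testfile=$(mktemp /tmp/ddtest.XXXXXX)
dd if=/dev/zero of="$testfile" bs=1M count=64 conv=fsync 2>&1 | tail -n 1
size=$(wc -c < "$testfile")   # confirm the full 64 MiB landed on disk
rm -f "$testfile"
echo "wrote ${size} bytes"
```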


Supported Operating Systems

The following 64-bit operating systems are tested and supported:

  • CentOS 6, 7

  • Red Hat Enterprise Linux (RHEL) 6, 7

  • Oracle Linux 6, 7

  • SUSE Linux Enterprise Server (SLES) 11 SP3

  • Debian 6, 7 (support in HDP 2.3.2 maintenance release)

  • Ubuntu Precise 12.04, 14.04 (support in HDP 2.3.2 maintenance release)

  • Windows Server 2008, 2012


Required Software Packages

The following software packages are required for a Hadoop cluster:

  • yum (CentOS or RHEL)

  • zypper (SLES)

  • php_curl (SLES)

  • apt-get (Ubuntu)

  • reposync

  • rpm (CentOS, RHEL, or SLES)

  • scp

  • curl

  • wget

  • unzip

  • chkconfig

  • tar

  • Java software, one of the following:

    • Oracle JDK 1.8

    • Oracle JDK 1.7 u51 or higher

    • OpenJDK 1.8

    • OpenJDK 1.7 u51 or higher


OS Pre-Configuration

Required

There are several required configuration changes that must be completed before installing HDP.

  • Configure NTP on all cluster nodes to ensure synchronized time.

  • Configure all cluster nodes for forward and reverse DNS lookups.

  • Configure the system that will run Ambari for password-less SSH access to cluster nodes. If this is not possible for security or other reasons, manually install and register the Ambari agents before HDP installation.

  • Open HDP-specific network ports or disable the firewall.

  • Disable IPv6 on all cluster nodes.

  • For the duration of the installation process, disable Security-Enhanced Linux (SELinux). It can be re-enabled following installation.
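On a RHEL/CentOS 7-family node, the required steps above might be sketched as follows (run as root on every node; package and service names vary by distribution, and the worker hostname is hypothetical):

```shell
# Required pre-install configuration -- a sketch, adapt to your environment.
yum install -y ntp
systemctl enable --now ntpd                   # keep cluster time synchronized
hostname -f                                   # verify forward DNS resolution
host "$(hostname -i)"                         # verify reverse DNS resolution
ssh-copy-id root@worker01.example.com         # password-less SSH from the Ambari host
systemctl disable --now firewalld             # or open the HDP-specific ports instead
sysctl -w net.ipv6.conf.all.disable_ipv6=1    # disable IPv6
setenforce 0                                  # disable SELinux for the install only
```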


Recommended

There are a number of ways to tune operating systems to enhance cluster performance.

  • Linux file systems record the last access time for all files but there is a small performance cost associated with this. To disable last access time recording, use the noatime option when mounting a file system. Use the instructions in your vendor documentation to add the noatime mount option.

  • The ext3 and ext4 file systems normally reserve five percent of their disk space for the exclusive use of the root user. This can lead to an excessive waste of space with multiple terabyte-sized storage. You may disable or lower this reservation when creating a file system, or afterwards by tuning the file system. Use the instructions in your vendor documentation to change the root-reserved space.

  • Linux kernels have a feature named transparent huge pages that is not recommended for Hadoop workloads. Use the instructions in your operating system documentation to disable it.

  • Ethernet jumbo frames increase an Ethernet packet’s maximum payload from 1500 bytes to approximately 9000 bytes. This payload increase increases network performance. Use the instructions in your vendor documentation to enable jumbo frames.

  • BIOS-based power management commonly has the ability to increase or decrease CPU clock speeds under certain conditions. Because Hadoop operates as a cluster of machines, having some machines running slower clock speeds can have an adverse effect on total cluster processing throughput. To avoid this, use the instructions in your vendor documentation to disable BIOS-based power management.

  • Linux systems place limits on the total number of files that a process may have open at the same time, and on the total number of processes that a user may run at the same time. These limits could interfere with cluster operation. Use the instructions in your vendor documentation to increase these limits, as necessary.
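The recommendations above might look like this on a RHEL/CentOS 7-family node (run as root; the device name, mount point, and limit values are assumptions — adjust for your hardware):

```shell
# Recommended OS tunings -- a sketch, not a definitive procedure.
mount -o remount,noatime /data01          # stop recording file access times
tune2fs -m 0 /dev/sdb1                    # reclaim the 5% root-reserved space on a data disk
echo never > /sys/kernel/mm/transparent_hugepage/enabled   # disable THP
ip link set dev eth0 mtu 9000             # jumbo frames (switch support required)
cat >> /etc/security/limits.conf <<'EOF'
hdfs  -  nofile  65536
hdfs  -  nproc   32768
EOF
```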


Supported Databases



Supported Databases

Some Apache frameworks require a database to maintain configuration information and metadata. The table lists the default database for each framework along with any other databases that are supported by the framework. Default databases are automatically installed when installing HDP using Ambari. If you have an existing database and you would like to connect to it during HDP installation, you must perform specific installation configuration steps. These steps are listed in the online HDP manual installation documentation.

To ease administration, consider choosing a single database type for all frameworks. Database administrators should implement high availability and regularly back up the databases. Heavy use of Falcon plus Oozie, or of Ambari, might require dedicated database instances.


MONITORING AND NOTIFICATIONS IN THE FORM OF ALERTS OR MESSAGING


Monitoring and Alerts

Ambari monitors cluster health and can alert you when certain situations occur, helping you identify and troubleshoot problems. You manage how alerts are organized, under which conditions notifications are sent, and by which method. This section provides information on:

  1. Managing Alerts

  2. Configuring Notifications

  3. List of Predefined Alerts


Managing Alerts

Ambari predefines a set of alerts that monitor the cluster components and hosts. Each alert is defined by an Alert Definition, which specifies the alert type, check interval, and thresholds. When a cluster is created or modified, Ambari reads the Alert Definitions and creates Alert Instances for the specific items to watch in the cluster. For example, if your cluster includes HDFS, there is an alert definition to watch the “DataNode Process”. An instance of that alert definition is created for each DataNode in the cluster.

TERMS AND DEFINITIONS

The following basic terms help explain the concepts associated with Ambari alerts:


Figure 66. Terms And Definitions

Alert Definitions and Alert Instances

Alert thresholds and threshold units depend on the alert type. The following table lists the types of alerts, their possible statuses, and whether their thresholds are configurable:


Figure 67. Alert Types

Modifying an Alert

  1. Browse to the Alerts section in Ambari Web.

  2. Find the alert definition and click to view the definition details.

  3. Click Edit to modify the name, description, check interval and thresholds (as applicable).

  4. Click Save.

  5. Changes will take effect on all alert instances at the next check interval.

Viewing the List of Alert Instances

  1. Browse to the Alerts section in Ambari Web.

  2. Find the alert definition and click to view the definition details.

  3. The list of alert instances for that definition is displayed.

  4. Alternatively, find a specific host in Ambari Web to see the list of alert instances running on that host.

Enabling or Disabling Alerts

  1. Browse to the Alerts section in Ambari Web.

  2. Find the alert definition. Click the Enabled or Disabled text to enable/disable the alert.

  3. Alternatively, you can click on the alert to view the definition details and click Enabled or Disabled to enable/disable the alert.

  4. You will be prompted to confirm enable/disable.


Configuring Notifications

With Alert Groups and Notifications, you can create groups of alerts and set up notification targets for each group. This way, you can notify different parties interested in certain sets of alerts via different methods. For example, you might want your Hadoop Operations team to receive all alerts via EMAIL, regardless of status, while your System Administration team receives only Critical RPC and CPU related alerts, via SNMP. To achieve this scenario, you would have one Alert Notification that handles email for all alert groups at all severity levels, and a second Alert Notification that handles SNMP at Critical severity for an Alert Group containing the RPC and CPU alerts.

Ambari defines a set of default Alert Groups for each service installed in the cluster. For example, you will see a group for HDFS Default. These groups cannot be deleted and the alerts in these groups are not modifiable. If you choose not to use these groups, just do not set a notification target for them.

Creating or Editing Notifications

  1. Browse to the Alerts section in Ambari Web.

  2. Under the Actions menu, click Manage Notifications.

  3. The list of existing notifications is shown.

  4. Click + to “Create new Alert Notification”. The Create Alert Notification dialog is displayed.

  5. Enter the notification name, select the groups to which the notification should be assigned (all or a specific set), select the Severity levels that this notification responds to, include a description, and choose the method for notification (EMAIL or SNMP).

  • For EMAIL: provide information about your SMTP infrastructure such as the SMTP server, port, to/from addresses, and whether authentication is required to relay messages through the server. You can add custom properties to the SMTP configuration based on the JavaMail SMTP options.


Figure 68. SMTP JavaMail options


  • For SNMP: select the SNMP version, Community, Host, and Port where the SNMP trap should be sent. Also, the OID parameter must be configured properly for SNMP trap context. If no custom, or enterprise-specific OID will be used, we recommend the following:

Figure 69. Enterprise-Specific OID


Note
Only SNMPv1 and SNMPv2c should be chosen for the SNMP version. SNMPv3 is not supported at this time.

  6. After completing the notification configuration, click Save.

Creating or Editing Alert Groups

  1. Browse to the Alerts section in Ambari Web.

  2. From the Actions menu, choose Manage Alert Groups

  3. The list of existing groups (default and custom) is shown.

  4. Choose + to “Create Alert Group”. Enter a name for the group and click Save.

  5. By clicking on the custom group in the list, you can add or delete alert definitions from this group, and change the notification targets for the group.

Dispatching Notifications

When an alert is enabled and the alert status changes (for example, from OK to CRITICAL or CRITICAL to OK), Ambari will send a notification (depending on how the user has configured notifications). For EMAIL notifications: Ambari will send an email digest that includes all alert status changes.

For example, if two alerts go CRITICAL, Ambari sends one email that says “Alert A is CRITICAL and Alert B is CRITICAL”. Ambari will not send another email notification until the status has changed again.

For SNMP notifications: Ambari will fire an SNMP trap per alert status change. For example: if two alerts go CRITICAL, Ambari will fire two SNMP traps, one for each alert going OK -> CRITICAL. When the alert changes status from CRITICAL -> OK, another trap is sent.

Viewing Alert Status Log

In addition to dispatching alert notifications, Ambari writes alert status changes to a log on the Ambari Server host. Alert status changes are written to the log regardless of whether EMAIL or SNMP notifications are configured.

  1. On the Ambari Server host, browse to the log directory:
    cd /var/log/ambari-server/

  2. View the ambari-alerts.log file.

  3. Log entries will include the time of the status change, the alert status, the alert definition name and the response text.
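Because each log entry carries the new status, simple text tools can filter the log. The two sample lines below are invented so the sketch is self-contained; real entries follow the same general shape (timestamp, status, alert definition name, response text):

```shell
# Count CRITICAL transitions in an alert log; the sample file stands in for
# /var/log/ambari-server/ambari-alerts.log and its lines are invented.
log=$(mktemp)
cat > "$log" <<'EOF'
2023-01-05 10:02:11,402 [OK] [HDFS] [datanode_process] (DataNode Process) TCP OK - 0.001s response on port 8010
2023-01-05 10:07:11,455 [CRITICAL] [HDFS] [datanode_process] (DataNode Process) Connection failed to 0.0.0.0:8010
EOF
crit=$(grep -c '\[CRITICAL\]' "$log")
rm -f "$log"
echo "${crit} CRITICAL transition(s)"
```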


List of Predefined Alerts

HDFS Service Alerts


Figure 70. HDFS Service Alert


NameNode HA Alert


Figure 71. NameNode HA Alert


YARN Alert


Figure 72. YARN Alert


MapReduce2 Alert


Figure 73. MapReduce2 Alert


HBase Service Alert


Figure 74. HBase Service Alert


Hive Alert


Figure 75. Hive Alert


Oozie Alert


Figure 76. Oozie Alert


ZooKeeper Alert


Figure 77. ZooKeeper Alert


Ambari Alert


Figure 78. Ambari Alert



JOB ACTIVITY








RESOURCE AND AVAILABILITY





SYSTEM ERROR/WARNING

Alert Definitions and Alert Instances


Ambari Alert Definitions and Alert Instances

Ambari predefines a set of Alert Definitions designed to monitor the cluster and hosts. Ambari uses an Alert Definition to create Alert Instances that perform the actual service component or host checks. A single Alert Definition might be used to create one or hundreds of Alert Instances.

For example, if there were 60 DataNodes, the DataNode Process Alert Definition would be used to create 60 DataNode Process Alert Instances. The number of Alert Instances also depends on the services installed in the cluster. For example, if HBase is not installed, there would not be any HBase Alert Instances based on the HBase Alert Definitions. The status of such an alert would be UNKNOWN.

Each Alert Definition includes several attributes:

  • The service name being monitored. Services include HDFS, YARN, MapReduce2, HBase, Hive, Oozie, ZooKeeper, and Ambari Alerts.

  • The state of the Alert Definition. It can be enabled or disabled. When a cluster is created or modified, Ambari reads the Alert Definitions and creates Alert Instances for the specific components to watch. If an Alert Definition is enabled, Ambari creates Alert Instances for each appropriate component or host. If an Alert Definition is disabled, all its associated Alert Instances are disabled.

  • The pre-defined name of the Alert Definition. The name of each Alert Instance is the same as its Alert Definition.

  • A brief description of the Alert Definition.

  • The alert type. There are five Alert Definition types: PORT, METRIC, AGGREGATE, WEB, and SCRIPT. The characteristics of each are described later in this lesson.

  • One or more thresholds. Alert Definition thresholds determine the status of an Alert Instance. For example, two different thresholds might determine when the status of an Alert Instance might transition from OK to WARNING, or from WARNING to CRITICAL.

  • The check interval defines how often, in minutes, Ambari checks the status of an alert.

  • Alert Definition groups enable an administrator to create groups of alerts and configure notifications for each group. This determines how, and to whom, alerts are sent. Notifications are sent by either email or SNMP when an Alert Instance transitions from one status to another. There are no configured notifications, by default.


Alert Types



Ambari Alert Types

The table lists and describes the characteristics of the five alert types. Only PORT, METRIC, and AGGREGATE alerts have configurable WARNING and CRITICAL thresholds. They are configurable in the Ambari Web UI when editing an Alert Definition.

PORT alerts transition from a status of OK to either WARNING or CRITICAL based on the number of seconds it takes to receive a response from a port. The response time is checked at the configured check interval.

METRIC alerts transition from a status of OK to either WARNING or CRITICAL based on a change in a measurement unit. Measurement units can be such things as a percentage, time, the number of HDFS directories, the number of DataNodes, and so on. Whatever the measurement unit is, it is checked at the configured check interval.

AGGREGATE alerts transition from a status of OK to either WARNING or CRITICAL based on a percentage measured during the most recent check at the configured check interval.

WEB and SCRIPT alerts do not have configurable thresholds in the Ambari Web UI.

For WEB alerts, the alert attempts to access a URL and the alert status is determined based on the HTTP response. The status is OK if the HTTP response is less than 400. The status is WARNING if the response is 400 or above. The status is CRITICAL when the alert cannot connect to the URL. Because the status depends on a non-configurable HTTP response, the thresholds cannot be changed.

However, the text message displayed in the Response column can be modified.
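The WEB-alert rule described above can be restated as a small classifier (a sketch; `0` stands in for curl's `000` code when no connection could be made):

```shell
# Map an HTTP response code to a WEB alert status:
#   no response -> CRITICAL, < 400 -> OK, >= 400 -> WARNING
web_alert_status() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo CRITICAL       # could not connect to the URL at all
  elif [ "$code" -lt 400 ]; then
    echo OK
  else
    echo WARNING
  fi
}
web_alert_status 200   # -> OK
web_alert_status 503   # -> WARNING
web_alert_status 0     # -> CRITICAL
```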


For SCRIPT alerts, the alert executes a script and the script determines a status of OK or CRITICAL. The thresholds and response text message are built into the script, so neither can be modified in the Ambari Web UI. This is why the descriptions of SCRIPT alerts include information about their thresholds; this information is not visible otherwise.


Viewing Alert Definitions




Viewing Alert Definitions

Starting with Ambari 2.0, there is a new Alerts page in the Ambari Web UI. It displays all the Alert Definitions.

Each Alert Definition can have one or more Alert Instances. Which Alert Definitions result in actual Alert Instances depends on the installed services. How many Alert Instances are created depends on the cluster’s size and service configuration. Click an Alert Definition name to view or edit the Alert Definition.

If any Alert Instance has transitioned to a WARNING or CRITICAL status, the Status column will display WARN or CRIT for that Alert Definition.

The State column enables an administrator to disable or enable all Alert Instances associated with a particular Alert Definition. Click the current state to toggle between enabled and disabled.

Editing an Alert Definition



Editing an Alert Definition

While viewing an Alerts Definition, click Edit to modify its configuration. You can modify the check Interval, the WARNING and CRITICAL thresholds, and enable or disable all Alert Instances. You can also modify the text message to be displayed in the Response column.

In the example, the alert will check the response time from the ZooKeeper Service port every minute. If the response time is over 1.5 seconds the status changes to WARNING. If the response time is over 5 seconds the status changes to CRITICAL.
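Those two thresholds amount to a simple classification of the measured response time, sketched here with awk handling the fractional comparison:

```shell
# Classify a port response time (seconds) against the example's thresholds:
# over 1.5 s -> WARNING, over 5 s -> CRITICAL.
port_alert_status() {
  awk -v t="$1" 'BEGIN {
    s = (t > 5) ? "CRITICAL" : (t > 1.5) ? "WARNING" : "OK"
    print s
  }'
}
port_alert_status 0.2   # -> OK
port_alert_status 2.0   # -> WARNING
port_alert_status 6.1   # -> CRITICAL
```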

The text message template that appears next to the OK, WARNING, and CRITICAL severity levels generates the message in the Response column when viewing Alert Instances. The variables in the text templates are based on Python string formatting. Viewing Alert Instances is described and illustrated later in this lesson.

Click the Save button (not shown) to save any modifications.


Viewing All Alert Instances for an Alert Definition



Viewing Alert Instances for an Alert Definition

Displaying an Alert Definition also displays all of its Alert Instances. The service name, the host the Instance is running on, and the status of each Instance is also displayed along with the current Response message.


Viewing All Alert Instances for a Host


Viewing All Alert Instances for a Host

An administrator can also view all the Alert Instances running on a specific host by opening the Hosts page in the Ambari Web UI. On the Hosts page, click a specific host and select its Alerts tab. The Alerts tab lists all the Alert Instances running on that host.


Viewing Alerts in an Alert Group

Viewing Alerts in an Alert Group

The Groups drop-down menu is used to filter the list of Alert Definitions displayed on the Alerts page. An administrator can select All to display all Alert Definitions or they can select a specific Alert Group and display only its Alert Definitions. An administrator can also filter the display using the Alert Definition Name text box.


Default and custom groups appear on the Groups drop-down menu. There are default service groups for HDFS, YARN, MapReduce2, HBase, Hive, Oozie, ZooKeeper, and Ambari Alerts.


Alerts, Alert Groups, and Notifications


Alerts, Alert Groups and Notifications

Notifications are configured to determine to whom an alert is sent, and how it is sent. Alerts can be sent using email (SMTP) or SNMP. By default, there are no configured notifications.

An administrator can create custom groups of alerts and configure notifications. An alert group can be added as a member to one or more notifications. Ambari comes with default alert groups for Hadoop services like HDFS, YARN, ZooKeeper, and others. These default alert groups cannot be modified or deleted, but they can be duplicated. A duplicate can be modified.

Notifications determine which users will be notified of an alert by email or SNMP. Because an alert group can belong to multiple notifications, when an alert is triggered some recipients might receive a notification via SNMP while others might receive the notification via email. Specific email and SNMP configuration information is configured using the Ambari Web UI.

A notification’s configuration also determines the alert status that will trigger a notification. For example, a transition from OK to WARNING could be sent to one group of users via SNMP while a transition to CRITICAL could be sent to another group of users via email.



Adding and Configuring an Ambari Alert




Managing Alert Groups and Notifications


An administrator uses the Alerts > Actions menu button to create, modify, or delete Alert Groups and Notifications. The default alert groups cannot be deleted or modified.


Creating a New Alert Group



Creating an Alert Group

Use Manage Alert Groups to create a new Alert Group. The plus (+) and minus (-) buttons create or delete a group. The gear button renames or duplicates a group. The New button adds a group to an existing Notification. You have to know the name of the existing Notification, or at least the first few characters of its name, in order to add it.


Naming an Alert Group



Naming an Alert Group

In the Create Alert Group dialog box, type a unique name for the new Alert Group and click OK. The new Alert Group will appear in the Manage Alert Groups window.

Click the plus (+) button to add new Alert Definitions to the group.


In this example, the new Alert Group named DataNodeAlerts will contain only alerts associated with HDFS DataNodes.


Adding Alert Definitions to an Alert Group



Adding Alert Definitions to an Alert Group

Use the Select Alert Group Definitions window to add Alert Definitions to an Alert Group. There are many Alert Definitions to choose from so use the Service or Component menus to filter the list of displayed Alert Definitions. The Service menu enables you to filter the list of alerts by specific service types. The Component menu enables you to filter the list of alerts by specific service component types.

To add an Alert Definition to an Alert Group, click the Alert Definition’s check box. When the desired Alert Definitions have been selected, click OK.

In this example, only alerts associated with HDFS DataNodes are selected for inclusion in the DataNodeAlerts group.


Viewing and Saving an Alert Group



Viewing and Saving an Alert Group

View and confirm the configuration of the new Alert Group. If there is an existing Notification that will work for this group, add it here. To add a Notification click New and type its name.

When finished with this window, click Save.


Creating a New Notification



Creating a New Notification

Use Manage Notifications to create a new Notification. The plus (+) and minus (-) buttons create or delete a Notification. The gear button edits or duplicates a Notification.


Configure Notification Details




Notification Details

Type a unique name for the Notification. The name is used to identify the Notification.

Add one or more Alert Groups to the Notification. You can use the Shift key in concert with the mouse pointer to select more than one group. Then select one or more severity levels. All alerts in the selected groups at the selected severity levels will be sent to users using the selected EMAIL or SNMP method.


The method can be either EMAIL or SNMP. The window is context sensitive and will change depending upon which method you select. Selecting EMAIL displays SMTP and email delivery configuration settings. Selecting SNMP displays SNMP configuration settings. Both are shown in the screen captures.

A notification is sent only once when an alert status has changed. The alert status can change only when the measured metric is checked at the alert’s configured check interval. This policy is designed to avoid sending dozens or even hundreds of notifications.

For SNMP, a trap is sent for each alert status change. For example, if two alerts transition to CRITICAL then two traps are sent.

For email, an email digest is sent that includes all status changes. For example, if two alerts transition to CRITICAL then a single email is sent that reports that X alert is CRITICAL and Y alert is CRITICAL.


Email and SNMP Configuration



Email and SNMP Configuration


View and Confirm the Notification



View and Confirm the Results

The final window displays the name and configuration of the new Notification. View and confirm the result.


Add a Notification to an Alert Group


Add a Notification to an Alert Group

Use Alerts > Actions > Manage Alert Groups to add the new notification to an alert group.


ADDING NODES (SCALING OUT) / APPLYING PATCHES WITHOUT CAUSING DOWNTIME

Adding, Deleting, and Replacing Worker Nodes


Lesson Objectives

After completing this lesson, students should be able to:

  • Identify reasons to add, replace, and delete worker nodes

  • Add a worker node

  • Configure and run the HDFS Balancer

  • Delete a worker node

  • Move a master component


Working with Cluster Nodes


Reasons to Add, Delete or Replace Worker Nodes

There are many reasons to add, delete, or replace a worker node. Several reasons are illustrated here.

  • A cluster might need to grow over time to meet increased resource demand. A cluster can grow by adding more nodes or by adding additional hardware resources to existing nodes.

  • Hardware can and will fail. Failed hardware needs to be repaired or replaced. In either case, the node must be removed from the cluster. A new or repaired node must be added back in to the cluster.

  • Hardware becomes obsolete over time and must be replaced by newer hardware. The obsolete hardware will need to be removed and replaced by the newer hardware.

  • Software must be periodically upgraded for a variety of reasons. Reasons include improved reliability, stronger security, and additional functionality. To perform a software upgrade, a node must be removed from the cluster, upgraded, and then added back to the cluster.

  • Hardware can be upgraded to add additional CPU, memory, storage, and network resources. To perform a hardware upgrade, a node must be removed from the cluster, upgraded, and then added back in to the cluster.

Adding, deleting, or replacing a node will most often involve the worker nodes due to their larger number and the fact that the worker nodes provide most of the cluster’s computational and storage resources.


Recommendations When Adding or Replacing Nodes

Use these recommendations when adding or replacing worker nodes:

  • First, do not skip proper hardware burn-in testing. Detecting hardware problems before adding a node to a cluster helps to avoid the additional inconvenience and downtime associated with removing a failing or failed node from a cluster.

  • Hardware models from a manufacturer improve over time. Motherboards support more and faster CPUs, more memory is possible, and network and disk controllers change. The result is that newer hardware purchased is often different from previous hardware purchases, even if they are the same model.

  • As system hardware resources change, the cluster configuration settings must typically change as well. As new sets of worker nodes are added to a cluster, the existing systems might require one set of configuration settings while the newer systems require a different set. Ambari configuration groups were designed to address this scenario.

  • Because of these configuration differences, the recommendation is to purchase groups of identical new systems and then use Ambari configuration groups. Ambari configuration groups distribute different sets of configuration files to different sets of systems as directed by the administrator. Using Ambari configuration groups was described in another lesson.

  • Optionally it is also possible to use YARN node labels to ensure that an application runs on only certain nodes. For example, a Spark application uses memory extensively. If newer nodes have significantly more and faster memory, then YARN node labels could be used to ensure that the Spark application runs only on the newer nodes. YARN node labels were described in another lesson.


Adding a Worker Node Using the Ambari Web UI




Add New Hosts - Ambari Web UI

The Ambari Web UI can be used to add new nodes to a cluster. The process is wizard driven. To start the process, log in to the Ambari Web UI and click the Hosts page, then the Actions menu button, and then select Add New Hosts.


Select Nodes for Ambari Agents



Installing an Ambari Agent

Ambari agents must be installed on each node and then registered with the Ambari Server before the Ambari Server can install HDP on the nodes. Type the fully-qualified domain name (FQDN) of each node in the Target Hosts text box. If you choose to use short host names instead, the installer will open a warning window when you click Register and Confirm. Use of FQDNs is highly recommended to ensure proper cluster operation in all situations.

The Host Registration Information section includes two radio buttons. Which radio button is selected depends on whether or not you have pre-configured the Ambari Server machine with password-less SSH access to the cluster nodes.

  • If password-less SSH has been pre-configured, select the first radio button and click Browse. Use the Browse window to locate the pre-configured SSH RSA private key file that enables the Ambari Server to log in to each node.

  • If password-less SSH log in has not been pre-configured, select the second radio button. This opens a window warning you that you must manually install and configure Ambari agent software on each node. Once installed and properly configured, an agent will automatically register with the Ambari Server.

After the choices have been made, click Register and Confirm.


Install and Register Ambari Agents



Confirm Hosts Window

The Confirm Hosts window enables you to monitor the progress of the Ambari agent installation and registration process. Each node with a successful agent registration displays Success and a green progress bar.


If you decide not to include one or more nodes in the cluster, you may click Remove next to that host before proceeding.

Ambari performs configuration checks on each node once an agent has been installed.


If Ambari detects any potential configuration problems, they can be viewed by clicking the link Click here to see the warnings. A window opens with detailed information about potential issues. Some warnings are serious and must be resolved before proceeding, while others can potentially be ignored. In each case, a warning must be evaluated for its impact on cluster operation.

When ready to proceed with the installation process, click Next.

Assign Slaves and Clients


Add Host Wizard

Use the Assign Slaves and Clients window to choose where to run service worker components and Hadoop client software.

The DataNode is the HDFS service worker component.


An HDFS NFS Gateway machine can be used as a method to access HDFS. The NodeManager is the YARN service worker component.

Client represents the Hadoop client software. Client software is used by user and application clients to access cluster services and resources. For example, client software includes the HDFS Shell that is used to access HDFS storage.

Make the required selections and click Next.

Choose Configuration Groups



Choosing Configuration Groups

The configuration settings that should be applied to a new DataNode might vary based on the size and amount of the DataNode’s compute, storage, and network resources. For this reason, you are offered the choice of choosing a specific configuration group for each service. Configuration groups are described in another lesson.

Review and Deploy



Review and Deploy

The Review window enables you to print the configuration and deploy the nodes to the cluster. Click Deploy when ready to install the software.

Install, Start, and Test



Install, Start and Test

The Install, Start and Test window displays the installation progress of each node.

If any node fails to install properly, the error displayed in the Message column is a hypertext link. Click the link to get more information about the error. Once you resolve the issue, you can use Ambari to install the node again.


The HDFS Balancer


HDFS Balancer

When a new DataNode is added to a cluster, it will not have any data blocks on its disk drives. This DataNode will be severely under-utilized during HDFS read operations. The primary method of moving some of the existing data blocks to the new DataNode is to run the HDFS balancer utility.


The HDFS balancer utility moves data blocks from over-utilized to under-utilized DataNodes. The balancer is programmed to run with a default threshold of ten percent. This means that the balancer will move data blocks if the disk usage on any DataNode differs from the overall HDFS usage by more than ten percent in either direction. The default threshold can be overridden when the administrator runs the balancer.
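The threshold test above can be sketched with simple arithmetic. This is illustrative only, not the balancer's actual implementation, and the usage figures are hypothetical:

```shell
# Illustrative arithmetic only, not the balancer's real code.
# A DataNode is a balancing candidate when its usage differs from the
# overall HDFS usage by more than the threshold.
overall=50      # overall HDFS usage, percent (hypothetical)
threshold=10    # balancer threshold, percent
candidates=""
for usage in 85 55 5; do              # per-DataNode usage, percent
  diff=$(( usage - overall ))
  [ "$diff" -lt 0 ] && diff=$(( -diff ))
  if [ "$diff" -gt "$threshold" ]; then
    candidates="$candidates $usage"
  fi
done
echo "usage levels outside the band:$candidates"   # 85 and 5, but not 55
```

Here the nodes at 85% and 5% fall outside the ten percent band around the 50% cluster average, so the balancer would move blocks off the first and onto the second.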


Rebalancing HDFS storage can move a significant amount of data across the network. For this reason, the amount of network bandwidth consumed by the balancer on each DataNode is configurable. The limit is set by the dfs.datanode.balance.bandwidthPerSec property in the hdfs-site.xml file, which defaults to 6,250,000 bytes per second.
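As a sketch, the bandwidth limit could be raised in hdfs-site.xml like this; the 12,500,000-byte value is a hypothetical example, not a recommendation:

```xml
<!-- hdfs-site.xml: cap balancer traffic per DataNode (hypothetical value) -->
<property>
  <name>dfs.datanode.balance.bandwidthPerSec</name>
  <value>12500000</value> <!-- 12.5 MB/s instead of the 6.25 MB/s default -->
</property>
```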


Running the Balancer Using Ambari



Running the Balancer in Ambari Web UI

The balancer can be run by an administrator or operator from the Ambari Web UI or from a command-line prompt. When launching the balancer using Ambari, the Web UI prompts the user for the balancer threshold. The default is ten percent, but another value can be chosen.


Running the Balancer Using CLI

  • From the command line as the HDFS superuser (su - hdfs)

  • Using the default threshold:

hdfs balancer

  • The balancer will exit when done.

  • Changing the threshold to 5 percent:

hdfs balancer -threshold 5

  • Display other options:

hdfs balancer -help

The balancer can also be run from a command-line prompt. It must be run with HDFS superuser privileges.

  • To run the balancer using the default settings, type the command hdfs balancer. The balancer will run until complete.

  • To run the balancer with a threshold other than the default, type the command hdfs balancer -threshold 5.

The balancer will run with a threshold of five percent rather than ten percent.

The balancer does have other, less common options. To view these options, type the command hdfs balancer -help.


Monitoring DataNode Balance



NameNode UI Datanodes Tab

An administrator must be able to view DataNode balance in order to determine whether they should run the balancer. The NameNode UI’s Datanodes tab is used to view balance information. The relevant columns are highlighted here.

Similar information is available by becoming the HDFS superuser and generating an HDFS report using hdfs dfsadmin -report.

You can also examine the HDFS Heatmaps in the Ambari Web UI. These can visually indicate under or over-utilized DataNodes.

Decommissioning and Re-commissioning a Worker Node


Decommissioning and Recommissioning a Worker Node

Decommissioning and recommissioning is a multi-step process. Worker nodes normally run both a DataNode and a NodeManager, and both are typically commissioned or decommissioned together.


DataNode and NodeManager Termination Differences

With the replication level set to three, HDFS is resilient to individual DataNode failures. However, there is a high chance of data loss if you terminate multiple DataNodes without decommissioning them first. Decommissioning multiple DataNodes should be done on a schedule that allows the data blocks residing on the nodes being taken out of service to be re-replicated. For additional data safety, consider decommissioning one DataNode at a time.

Decommissioning a NodeManager is different. If a NodeManager is shut down, the ResourceManager will reschedule its tasks on other NodeManagers in the cluster. However, decommissioning a NodeManager might be required when you want it to stop accepting new tasks, or when tasks take time to execute but you still want to be agile in your cluster management.


Selecting a Worker Node



The Hosts Tab

Use the Hosts page to select the worker node to decommission. In a large cluster, the list of worker nodes could be very long. Use the Name text box or the Filter button to select a filter to reduce the number of nodes displayed.

Once the node to be decommissioned has been found in the list, click the node’s name.


Turning on Maintenance Mode



Host Actions - Turn on Maintenance Mode

Although it is optional, we recommend turning on Maintenance Mode. Maintenance Mode affects a service, component, or host object in the following two ways:

  • Maintenance mode suppresses alerts, warnings, and status change indicators generated for the object

  • Maintenance mode exempts an object from host-level or service-level bulk operations

You typically turn on Maintenance Mode when performing hardware or software maintenance, changing configuration settings, troubleshooting, decommissioning, or removing cluster nodes.


Decommissioning

Decommissioning is a process that supports removing a worker service component from a cluster. You must decommission the worker component running on a host before removing the component or host from service in order to avoid potential loss of data or processing disruption.


Decommissioning a Data Node



Decommissioning a DataNode

Decommissioning a DataNode safely replicates the HDFS data to other DataNodes in the cluster.

Ambari updates the /etc/hadoop/conf/dfs.exclude file with the hostname of the DataNode when an administrator decommissions a DataNode. This file is defined by the dfs.hosts.exclude property in the hdfs-site.xml file.
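As a sketch, the property that ties the NameNode to that exclude file looks like this in hdfs-site.xml (the path matches the file named above):

```xml
<!-- hdfs-site.xml: points the NameNode at the decommission exclude file -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
```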


Decommissioning a Node Manager



Decommissioning a NodeManager

Decommissioning a NodeManager stops a NodeManager from accepting new job requests from the ResourceManager.

Ambari updates the /etc/hadoop/conf/yarn.exclude file with the hostname of the NodeManager when an administrator decommissions a NodeManager. This file is defined by the yarn.resourcemanager.nodes.exclude-path property in the yarn-site.xml file.
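The YARN side is symmetric. A sketch of the property in yarn-site.xml, using the path named above:

```xml
<!-- yarn-site.xml: points the ResourceManager at its exclude file -->
<property>
  <name>yarn.resourcemanager.nodes.exclude-path</name>
  <value>/etc/hadoop/conf/yarn.exclude</value>
</property>
```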

Monitoring Decommissioning



Monitoring Decommissioning

DataNode status can be live or dead. A live DataNode can be decommissioning or decommissioned. A dead DataNode can also be decommissioned. These states are reported using different terms in different management interfaces.

The primary ways to view DataNode status are to use:

  • Ambari Web UI’s Hosts page

  • NameNode UI's Data Nodes tab


Performing Maintenance

Stopping All Components

Once all cluster components are shut down, you can perform scheduled maintenance activities.



Stopping All Components

Starting All Components

Once you have fixed the hardware, upgraded the hardware or software, or addressed whatever the issue was, you can restart the node’s service components.



Starting All Components


Re-commissioning

Re-commissioning is a process that re-enables a worker service component in a cluster.

Re-commissioning a NodeManager

Recommissioning the NodeManager enables the ResourceManager to assign jobs to the worker node.



Re-commissioning a NodeManager

When an administrator recommissions a NodeManager, Ambari updates the /etc/hadoop/conf/yarn.exclude file to remove the hostname of the NodeManager. This file is defined by the yarn.resourcemanager.nodes.exclude-path property in the yarn-site.xml file.


Re-commissioning a DataNode

Recommissioning the DataNode enables the NameNode to access storage on the worker node.



Re-commissioning a DataNode

When an administrator recommissions a DataNode, Ambari updates the /etc/hadoop/conf/dfs.exclude file to remove the hostname of the DataNode. This file is defined by the dfs.hosts.exclude property in the hdfs-site.xml file.


Rebalancing HDFS

Once a DataNode has been recommissioned, run the balancer to ensure that it is not under-utilized during HDFS read operations. The balancer was described earlier in this lesson.

The balancer can be launched from the Ambari Web UI or from a command-line prompt. Use the NameNode UI or an hdfs dfsadmin -report to verify DataNode balance.


Deleting Worker Nodes


Deleting Worker Nodes

Deleting a worker node is a multi-step process even though Ambari automates much of the actual work. The general steps are illustrated in the diagram above.

The Ambari Web UI includes different ways to accomplish this task using different drop-down menus in various UI locations.


Selecting a Node to Delete



Selecting a Worker Node to Delete

Select the worker node to delete. In a large cluster, the list of worker nodes could be very long. Use the Filter button or the Name text box to select a filter to reduce the number of nodes displayed.

Once the node to be deleted has been found in the list, click the node name.


Turning On Maintenance Mode



Turning on Maintenance Mode

Although it is optional, we recommend turning on Maintenance Mode. Maintenance Mode affects a service, component, or host object in the following two ways:

  • Maintenance mode suppresses alerts, warnings, and status change indicators generated for the object

  • Maintenance mode exempts an object from host-level or service-level bulk operations

You typically turn on Maintenance Mode when performing hardware or software maintenance, changing configuration settings, troubleshooting, decommissioning, or removing cluster nodes.


Decommissioning

Decommissioning is a process that supports removing a worker service component from a cluster. You must decommission the worker component running on a host before removing the component or host from service in order to avoid potential loss of data or processing disruption.


Decommissioning the Data Node



Decommissioning the DataNode

Decommissioning a DataNode safely replicates the HDFS data to other DataNodes in the cluster.


Decommissioning the NodeManager


Decommissioning the NodeManager

Decommissioning a NodeManager stops a NodeManager from accepting new job requests from the ResourceManager.

Stopping All Components



Stopping the DataNode, the NodeManager and Ambari Metrics

Stop any components running on the node. If you do not, Ambari displays warning messages when you attempt to delete the node.


Stopping the Ambari Agent



ambari-agent stop Command

Stop the Ambari agent running on the node. If you do not, Ambari displays warning messages when you attempt to delete the node. Use the command ambari-agent stop to shut down the Ambari agent.


Deleting the Node



Deleting a Worker Node

The final step is to actually delete the node. This removes the node from the Ambari database. Ambari will no longer expect to manage this host.


Confirm the Deletion

When you delete a host, Ambari displays this important information.



Confirming the Worker Node Deletion

Read the information and click OK to confirm the deletion.


Manual Decommissioning and Recommissioning

DataNodes and NodeManagers can be manually decommissioned and recommissioned using command-line commands.

  • To decommission nodes manually:

Add the worker node’s hostname to the dfs.exclude and yarn.exclude files. Then run the hdfs dfsadmin -refreshNodes and yarn rmadmin -refreshNodes commands.

  • To recommission nodes manually:

Remove the worker node’s hostname from the dfs.exclude and yarn.exclude files. Then run the hdfs dfsadmin -refreshNodes and yarn rmadmin -refreshNodes commands again.
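A minimal sketch of those manual steps, using a scratch directory in place of /etc/hadoop/conf and a hypothetical hostname. The refresh commands are shown as comments because they require a running cluster:

```shell
# Manual decommission sketch; the directory and hostname are stand-ins.
CONF=$(mktemp -d)
echo "worker03.example.com" >> "$CONF/dfs.exclude"
echo "worker03.example.com" >> "$CONF/yarn.exclude"
# On a real cluster, tell the services to re-read the exclude files:
#   hdfs dfsadmin -refreshNodes
#   yarn rmadmin -refreshNodes
# To recommission, delete the hostname from both files and refresh again:
sed -i '/worker03.example.com/d' "$CONF/yarn.exclude"
wc -l < "$CONF/yarn.exclude"    # no entries remain in yarn.exclude
```

Keep the IMPORTANT note below in mind: on an Ambari-managed cluster these file edits are overwritten on the next service restart, so this workflow applies to manually managed clusters only.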


IMPORTANT

Manually editing configuration files is not compatible with Ambari administration. Manually modified configuration files are overwritten by the information in the Ambari database when the HDFS or YARN services are restarted. Manual decommissioning and recommissioning is not correctly recognized by Ambari.


Moving a Master Component



Examples of Moving Master Components

The Ambari Web UI can be used to move many of the master service components. The ability to move a master service component creates an opportunity to perform hardware or software maintenance on existing nodes with limited cluster downtime.

To determine whether Ambari can move a specific master component:

  • Browse to the Services page

  • Click the Service Actions menu button

Data Zone


Add Zone Page


Add Zone Template. This is the page shown when a user first accesses the Data Lake menu. Here, the user can click the Add Zone button to add a zone.


Data Lake Page


Default zone template view. This is the state after the user has added the zone templates, but before any directories have been added to the zones.


Data Lake Page


View shown once the user has added a directory to each zone.


New Zone Modal

Modal for adding the details of a new zone.


Edit Zone Modal


Modal for editing zone details.


New Directory Modal


Modal for adding the details of a new directory.


Edit Directory Modal


Modal for editing a directory.


Directory Modal


Modal listing the directories.


Tooltip Menu on Directory list



Directory Detail


Directory detail modal. Displays the details of each directory.



Copyright © 2021 BigBox. All rights reserved. Various trademarks held by their respective owners.