
Install and configure Apache Hadoop

  1. Prerequisites
  2. Add hadoop user with password hadoop
  3. Configure passwordless SSH
  4. Download Hadoop
  5. Setting up the environment variable
  6. Configure Hadoop
    1. Setting up the Java environment variable
    2. Configure core-site.xml
    3. Configure hdfs-site.xml
    4. Configure mapred-site.xml
    5. Configure yarn-site.xml
  7. Format HDFS NameNode
  8. Start the Hadoop Cluster
  9. Check if all components work correctly
    1. Check Hadoop components
    2. Check HDFS
    3. Access Hadoop Web Interface
  10. Stop the Hadoop components


Prerequisites

Update and upgrade your system:

To install Hadoop you need Java.

You also need both the OpenSSH server and the OpenSSH client packages. Install them with this command:
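Before moving on, it can help to confirm that the prerequisites are really available. The following is a minimal sketch, not part of the original steps: the function name `check_prereqs` is made up for illustration, and the package names in the comment assume a Debian or Ubuntu system, which the text does not state.

```shell
# Hypothetical helper: report which of the given commands are on the PATH.
# On Debian/Ubuntu (an assumption) the SSH packages would be installed with:
#   sudo apt install openssh-server openssh-client
check_prereqs() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# java is needed by Hadoop itself, ssh and ssh-keygen by the passwordless login.
# sshd usually lives in /usr/sbin and may not be on a normal user's PATH,
# so it is not checked here.
check_prereqs java ssh ssh-keygen || echo "Install the missing packages before continuing."
```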


Add hadoop user with password hadoop


Configure passwordless SSH
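Hadoop's start and stop scripts log in to each node over SSH, so the hadoop user needs a key that works without a password prompt. A minimal sketch of the usual approach is shown below. The function name `setup_passwordless_ssh` is hypothetical, and the RSA key type and file names are assumptions, not something the text prescribes.

```shell
# Hypothetical helper: create a passphrase-less key pair (if none exists)
# and authorize it for logins to the same account.
# $1 is the SSH directory; it defaults to ~/.ssh of the current user.
setup_passwordless_ssh() {
  ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir" || return 1
  chmod 700 "$ssh_dir" || return 1
  if [ ! -f "$ssh_dir/id_rsa" ]; then
    # -N '' sets an empty passphrase, -q suppresses the progress output.
    ssh-keygen -q -t rsa -N '' -f "$ssh_dir/id_rsa" || return 1
  fi
  key=$(cat "$ssh_dir/id_rsa.pub") || return 1
  # Append the public key only if it is not already authorized.
  if ! grep -qxF -- "$key" "$ssh_dir/authorized_keys" 2>/dev/null; then
    printf '%s\n' "$key" >> "$ssh_dir/authorized_keys" || return 1
  fi
  chmod 600 "$ssh_dir/authorized_keys"
}
```

Run `setup_passwordless_ssh` as the hadoop user. Afterwards `ssh localhost` should open a session without asking for a password.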


Download Hadoop


Setting up the environment variable

From now on, you work as a normal user.

Run a text editor

and paste
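The lines to paste are typically of the following kind. This is a sketch under assumptions: the text does not name the file (a shell startup file such as `~/.bashrc` is the usual choice), and the location `~/hadoop` is a placeholder for wherever Hadoop was unpacked.

```shell
# Assumption: Hadoop was unpacked into ~/hadoop; adjust to the real location.
export HADOOP_HOME="$HOME/hadoop"
# Directory holding core-site.xml, hdfs-site.xml and the other config files.
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
# bin holds hdfs, yarn and hadoop; sbin holds the start-*.sh and stop-*.sh scripts.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```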

Activate the environment variables with the following command:


Configure Hadoop


Setting up the Java environment variable

Run a text editor

and paste at the end of the file:
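Hadoop reads `JAVA_HOME` from its own environment script (usually `hadoop-env.sh` in the configuration directory; the text does not name the file). A minimal sketch of the line follows. The JDK path is only an example for OpenJDK 11 on Debian or Ubuntu and has to be replaced with the path on your machine.

```shell
# Assumption: OpenJDK 11 in the usual Debian/Ubuntu location.
# `readlink -f "$(command -v java)"` shows where java really lives on your system.
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
```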


Configure core-site.xml

Run a text editor:

and add the following code to the configuration section:
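For a single-node setup, the property that usually goes here is `fs.defaultFS`, which tells clients where the NameNode listens. The sketch below prints such a block so it can be pasted between `<configuration>` and `</configuration>`. The function name `core_site_property` is made up, and port 9000 is a common choice, not a value taken from the text.

```shell
# Hypothetical helper: print a property block for core-site.xml.
core_site_property() {
cat <<'EOF'
<property>
  <!-- Address of the NameNode; localhost:9000 is an assumed single-node value. -->
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>
EOF
}

core_site_property
```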


Configure hdfs-site.xml

Run a text editor:

and add the following code to the configuration section:
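On a machine with a single DataNode, the replication factor is normally lowered to 1, because the default of 3 cannot be satisfied. The sketch below prints that property. The function name `hdfs_site_property` is hypothetical, and the original page may have set further properties (for example storage directories) that are not reproduced here.

```shell
# Hypothetical helper: print a property block for hdfs-site.xml.
hdfs_site_property() {
cat <<'EOF'
<property>
  <!-- One copy of each block is enough when there is only one DataNode. -->
  <name>dfs.replication</name>
  <value>1</value>
</property>
EOF
}

hdfs_site_property
```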


Configure mapred-site.xml

Run a text editor:

and add the following code to the configuration section:
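To run MapReduce jobs on YARN instead of the local runner, `mapreduce.framework.name` is set to `yarn`. The following sketch prints that property; the function name `mapred_site_property` is made up for illustration, and any other settings from the original page are not known.

```shell
# Hypothetical helper: print a property block for mapred-site.xml.
mapred_site_property() {
cat <<'EOF'
<property>
  <!-- Submit MapReduce jobs to YARN rather than running them locally. -->
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
EOF
}

mapred_site_property
```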


Configure yarn-site.xml

Run a text editor:

and add the following code to the configuration section:
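MapReduce on YARN needs the shuffle service enabled on the NodeManager. A minimal sketch of that property is printed below. The function name `yarn_site_property` is hypothetical, and this is the commonly used minimum rather than a confirmed copy of the original configuration.

```shell
# Hypothetical helper: print a property block for yarn-site.xml.
yarn_site_property() {
cat <<'EOF'
<property>
  <!-- Auxiliary service that moves map output to the reducers. -->
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
EOF
}

yarn_site_property
```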


Format HDFS NameNode
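Formatting is done with `hdfs namenode -format` and is needed only once, because formatting again erases the existing HDFS metadata. The sketch below guards against that. The function name `format_namenode_once` is made up, and the default directory assumes Hadoop's stock `dfs.namenode.name.dir` under `/tmp`, which may differ from your configuration.

```shell
# Hypothetical helper: format the NameNode only if it has not been formatted yet.
# $1 is the NameNode metadata directory (dfs.namenode.name.dir).
format_namenode_once() {
  name_dir="${1:-/tmp/hadoop-$(id -un)/dfs/name}"
  # A formatted NameNode directory contains current/VERSION.
  if [ -f "$name_dir/current/VERSION" ]; then
    echo "NameNode already formatted in $name_dir, skipping."
    return 0
  fi
  # -nonInteractive aborts instead of prompting if the directory already exists.
  hdfs namenode -format -nonInteractive
}
```

Run `format_namenode_once` as the hadoop user, passing the metadata directory if you changed it in `hdfs-site.xml`.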


Start the Hadoop Cluster
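The cluster is started with the scripts shipped in Hadoop's `sbin` directory: `start-dfs.sh` for HDFS and `start-yarn.sh` for YARN. A minimal sketch follows; the wrapper name `start_hadoop` is hypothetical, and it assumes the `sbin` directory is on the PATH.

```shell
# Hypothetical wrapper: start HDFS first, then YARN, and stop at the first failure.
start_hadoop() {
  start-dfs.sh && start-yarn.sh
}
```

Call `start_hadoop` as the hadoop user; each script logs in over SSH, so passwordless SSH has to work first.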


Check if all components work correctly


Check Hadoop components
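The running Java processes can be listed with `jps`, which ships with the JDK. On a single node you would expect NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The sketch below reads `jps` output and prints whichever of these are absent; the function name `missing_daemons` is made up for illustration.

```shell
# Hypothetical helper: read `jps` output on stdin ("<pid> <name>" per line)
# and print every expected Hadoop daemon that is not listed.
missing_daemons() {
  jps_output=$(cat)
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the whole process name so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$jps_output" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' || echo "$d"
  done
}
```

Use it as `jps | missing_daemons`. No output means all five daemons are running.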


Check HDFS
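A quick way to check HDFS as the hadoop user is to create a directory and list the root of the file system. The following is a minimal sketch: the function name `hdfs_smoke_test` and the path `/smoke-test` are made up for this example.

```shell
# Hypothetical smoke test: create a directory in HDFS, then list the root.
hdfs_smoke_test() {
  hdfs dfs -mkdir -p /smoke-test &&
  hdfs dfs -ls /
}
```

If both commands succeed, the listing should include `/smoke-test`.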

However, the nosql user is not allowed to perform any action on HDFS:


Access Hadoop Web Interface

  • Access the Hadoop NameNode using the URL http://localhost:9870. You will see the following screen:
  • Access the individual DataNodes using the URL http://localhost:9864. You will see the following screen:
  • Access the YARN Resource Manager using the URL http://localhost:8088. You will see the following screen:
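The three web interfaces above can also be checked from the command line, which is handy on a server without a browser. This is a sketch that assumes `curl` is installed; the function name `check_ui` is hypothetical.

```shell
# Hypothetical helper: report whether a web UI on localhost answers on a port.
check_ui() {
  # -f treats HTTP errors (4xx/5xx) as failures, --max-time limits the wait.
  if curl -s -f -o /dev/null --max-time 5 "http://localhost:$1"; then
    echo "port $1: reachable"
  else
    echo "port $1: unreachable"
  fi
}

# NameNode, DataNode and ResourceManager ports from the list above.
for port in 9870 9864 8088; do
  check_ui "$port"
done
```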


Stop the Hadoop components
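The components are stopped with the counterparts of the start scripts, `stop-yarn.sh` and `stop-dfs.sh`. A minimal sketch, assuming Hadoop's `sbin` directory is on the PATH; the wrapper name `stop_hadoop` is made up, and stopping YARN before HDFS is a common convention rather than something the text prescribes.

```shell
# Hypothetical wrapper: stop YARN first, then HDFS.
# Both scripts are run even if the first one reports an error.
stop_hadoop() {
  stop-yarn.sh
  stop-dfs.sh
}
```

Call `stop_hadoop` as the hadoop user. Afterwards `jps` should no longer list any Hadoop daemons.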

Again, the nosql user is not allowed to perform any action on Hadoop: