Complete Hadoop installation (with network disk resources)

## Hadoop installation

(This is my first learning write-up; please leave a comment pointing out any inadequacies.)

Environment and tools: VMware, MobaXterm, CentOS 7, JDK 8, Hadoop 2.9.2

The required tools and environment can be downloaded from the link below.
Link: https://pan.baidu.com/s/1VCXtS6fm6YvHMtBFrgNx6Q
Extraction code: xpgu
1, VMware installation
The installation process is detailed in the link above.
As a supplement, here is how to set the IP after installing the system image (I use dynamic IP acquisition here).
After configuration, install the MobaXterm tool to operate the virtual machine.
Create a session and enter the IP address you obtained.
2, JDK installation
1. Prepare the JDK installation package

We can install the JDK under our /usr directory.

2. Drag the RPM into /usr and install it

[root@localhost usr]# rpm -ivh jdk-8u171-linux-x64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:jdk1.8-2000:1.8.0_171-fcs        ################################# [100%]
Unpacking JAR files...
        tools.jar...
        plugin.jar...
        javaws.jar...
        deploy.jar...
        rt.jar...
        jsse.jar...
        charsets.jar...
        localedata.jar...
[root@localhost usr]#

By default, the JDK is installed under the /usr/java path.
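
As a quick optional check, you can confirm the RPM is registered and see what the installer put under /usr/java (directory names assume the 8u171 RPM from the link above):

[root@localhost usr]# rpm -qa | grep jdk
# Should list the jdk1.8 package installed above
[root@localhost usr]# ls /usr/java/
# Should show a jdk1.8.0_171 directory plus the default and latest symlinks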

3. Configure environment variables

Here, we configure the environment variables in the current user's settings (~/.bashrc).

[root@localhost usr]# vi ~/.bashrc
# Press i to enter insert mode for editing
JAVA_HOME=/usr/java/latest
PATH=$PATH:$JAVA_HOME/bin
CLASSPATH=.
export JAVA_HOME
export PATH
export CLASSPATH
# After adding the lines above, press Esc and then Shift+ZZ to save and exit
# Reload the variables using the source command
[root@localhost usr]# source ~/.bashrc
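
To verify that the variables took effect, a quick check (the exact version string depends on the build installed above):

[root@localhost usr]# echo $JAVA_HOME
/usr/java/latest
[root@localhost usr]# java -version
# Should report java version "1.8.0_171"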

4. Configure host name

[root@localhost ~]# vi /etc/hostname
# After opening the file, delete the existing contents and replace them with the following
CentOS
# After the modification, the machine must be rebooted for the new host name to take effect
[root@localhost ~]# reboot
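
On CentOS 7 you can alternatively set the host name with hostnamectl, which takes effect without a reboot (an optional alternative to editing /etc/hostname by hand):

[root@localhost ~]# hostnamectl set-hostname CentOS
# Confirm the change (a new shell will also show the new prompt)
[root@localhost ~]# hostname
CentOS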

5. Configure the mapping relationship between host name and IP

(1) View the IP address

[root@CentOS ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:1b:63:cd brd ff:ff:ff:ff:ff:ff
    inet 192.168.154.137/24 brd 192.168.154.255 scope global noprefixroute dynamic ens33
       valid_lft 1780sec preferred_lft 1780sec
    inet6 fe80::fc9c:25a3:3d54:b9e6/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

(2) Set the mapping relationship between host name and IP

[root@CentOS ~]# vi /etc/hosts
# At the bottom, add the IP address and host name you saw above
192.168.154.137 CentOS
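
To confirm the mapping resolves correctly, a quick ping works (the reply should come from the address you just added):

[root@CentOS ~]# ping -c 1 CentOS
# Expect a reply from 192.168.154.137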

6. Configure SSH password-free login (passwordless login between Linux systems)
(1) Generate the public/private key pair required for authentication

[root@CentOS ~]#  ssh-keygen -t rsa
# After entering the command above, several prompts will appear; just press Enter at each one
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:xQZsB1qmLTmdShXT9XWNDV9F8hWmgLe8o1gK7NY8m58 root@CentOS
The key's randomart image is:
+---[RSA 2048]----+
|       .O+.o. oB@|
|       Xo*o...+=*|
|      B.+.* ... o|
|     . + o o     |
|    . . S   .    |
|     o   . o     |
|    . + + . .    |
|     o *...      |
|    .  o+E       |
+----[SHA256]-----+
[root@CentOS ~]#

(2) Add the public key to the trust list to enable password-free authentication

[root@CentOS ~]# ssh-copy-id CentOS
# The first time you run this, you will be asked yes/no; enter yes and press Enter

(3) Test whether password-free SSH was set up successfully

[root@CentOS ~]# ssh root@CentOS
# If no password is required, the ssh connection is successful
Last failed login: Fri Sep 25 14:19:39 CST 2020 from centos on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Fri Sep 25 11:58:52 2020 from 192.168.73.1
# To log out, just enter the exit command
# Note!!! When making connections, be careful: be sure to turn off the firewall first
# Stop the firewall service
[root@CentOS ~]# systemctl stop firewalld.service
# Disable automatic startup at boot
[root@CentOS ~]# systemctl disable firewalld.service
# Check the firewall status
[root@CentOS ~]# firewall-cmd --state
not running
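
Optionally, you can also confirm the disable took hold, so the firewall stays off after a reboot:

[root@CentOS ~]# systemctl is-enabled firewalld.service
disabled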

At this point, the JDK and the base environment are installed.

3, Install Hadoop

In the network disk link above, I have prepared hadoop-2.9.2.tar.gz.
This is also the version I am learning with. If you have better suggestions, please send a private message or leave a comment.

1. Upload hadoop-2.9.2.tar.gz directly into root's home directory

2. Extract hadoop-2.9.2.tar.gz

# Extract hadoop-2.9.2.tar.gz into /usr
[root@CentOS ~]# tar -zxf hadoop-2.9.2.tar.gz -C /usr/
# The command may appear to hang for ten seconds or more; just wait patiently
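
To confirm the extraction, list the new directory; the Hadoop 2.9.2 tarball unpacks into subdirectories such as bin, etc, sbin, and share:

[root@CentOS ~]# ls /usr/hadoop-2.9.2
bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share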

3. Configure the environment variable HADOOP_HOME

[root@CentOS ~]# vi ~/.bashrc
# Configure it like this (note that HADOOP_HOME must be set before it is used in PATH)
JAVA_HOME=/usr/java/latest
HADOOP_HOME=/usr/hadoop-2.9.2
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
CLASSPATH=.
export JAVA_HOME
export PATH
export CLASSPATH
export HADOOP_HOME
# Reload the HADOOP_HOME environment variable
[root@CentOS ~]# source ~/.bashrc
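
A quick check that the Hadoop binaries are now on the PATH (output abbreviated to the first line):

[root@CentOS ~]# hadoop version
Hadoop 2.9.2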

4. Configuration file
(1) Configure core-site.xml

[root@CentOS ~]# cd /usr/hadoop-2.9.2/ 
[root@CentOS hadoop-2.9.2]# vi etc/hadoop/core-site.xml
# Edit after opening the file
<configuration>
    <!-- NameNode access entry point -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://CentOS:9000</value>
    </property>
    <!-- HDFS working base directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop-2.9.2/hadoop-${user.name}</value>
    </property>
</configuration>

(2) Configure hdfs-site.xml

[root@CentOS ~]# cd /usr/hadoop-2.9.2/ 
[root@CentOS hadoop-2.9.2]# vi etc/hadoop/hdfs-site.xml
# Edit after opening the file
<configuration>
    <!-- Block replication factor -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <!-- Secondary NameNode physical host -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>CentOS:50090</value>
    </property>
</configuration>

(3) Configure the slaves text file

[root@CentOS ~]# vi /usr/hadoop-2.9.2/etc/hadoop/slaves
# Replace localhost with CentOS
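
After editing, the file should contain just the one host name:

[root@CentOS ~]# cat /usr/hadoop-2.9.2/etc/hadoop/slaves
CentOS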

At this point, we have basically completed the installation and configuration of Hadoop.
One small step remains: the HDFS system must be formatted once before it is started.

4, Start the HDFS system (formatting it first)

(1) Before starting the HDFS system for the first time, you must format it to prepare for subsequent startups. Note that this is required only on the very first start; skip this step whenever you start HDFS again later!

[root@CentOS ~]# hdfs namenode -format
# The command prints a stream of INFO log lines; a "successfully formatted" message near the end indicates success

Formatting creates the image file (fsimage) that the NameNode service loads when HDFS starts.
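
Given the hadoop.tmp.dir we set in core-site.xml, the formatted NameNode metadata should land under a path like the one below (dfs/name is the default subdirectory Hadoop creates under that base, and hadoop-root comes from ${user.name} since we run as root):

[root@CentOS ~]# ls /usr/hadoop-2.9.2/hadoop-root/dfs/name/current
# Expect files such as fsimage_*, seen_txid and VERSION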
(2) Start the HDFS service
The startup scripts live in the sbin directory. Because we added the sbin directory to PATH, we can start HDFS directly with the start-dfs.sh script. To shut down the HDFS system, use stop-dfs.sh.

[root@CentOS ~]# start-dfs.sh
# If a yes/no prompt appears during startup, enter yes
# After a successful start, use the jps command that ships with the JDK to check the Java processes;
# you should see three services: DataNode, NameNode, and SecondaryNameNode
[root@CentOS hadoop-2.9.2]# jps
8912 DataNode
8769 NameNode
9276 SecondaryNameNode
9903 Jps

Finally, you can open the web page embedded in the NameNode service to view the running status of HDFS. By default, the service listens on port 50070, so browse to http://192.168.154.137:50070 (your NameNode's address).
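
If you would rather check from the shell than from a browser, a curl against the port works too (curl ships with a default CentOS 7 install):

# Print just the HTTP status code of the NameNode web UI; 200 means it is up
[root@CentOS ~]# curl -s -o /dev/null -w "%{http_code}\n" http://CentOS:50070
200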
With that, our Hadoop installation is complete.

Tags: Java Linux Big Data CentOS Hadoop

Posted by CrusaderSean on Sun, 15 May 2022 07:54:17 +0300