Installing Elasticsearch 7.9.0 on CDH 6.3.2 (very detailed, with my own packages attached)

As a beginner installing custom services on CDH for the first time, I ran into a lot of pitfalls along the way. Many posts on the Internet are identical, as if everyone copies, pastes, and republishes without testing. Here I summarize the pitfalls I hit and how I solved them, and I hope this post helps you. I have also attached the ready-made parcel package and CSD file.

Build the Parcel package and CSD file

This part refers to

However, I hit a lot of pitfalls in the process and ended up reading the shell scripts carefully.

1. Download cm_ext

Cloudera provides the cm_ext tool to validate the generated CSD and parcel.

mkdir -p ~/github/cloudera
cd ~/github/cloudera
git clone cm_ext
cd cm_ext
mvn package

Note: the build scripts (including the CSD build script) run the validator jar from a path under ~/github/cloudera/, so clone cm_ext into that directory.

2. Configure Java, Maven, and other environment variables

Check the environment variables from inside the cm_ext directory: echo JAVA_HOME and MAVEN_HOME, and confirm that javac runs.

Pitfall: the environment variables were configured in /etc/profile, but echo $JAVA_HOME in the cm_ext directory still returned an empty value.
Solution: export JAVA_HOME directly on the command line in that directory…
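
One likely cause of the empty value is that /etc/profile is only read by login shells, so a shell that was already open does not see the change until you source the file or export the variable by hand. The check can be sketched as a small helper; report_env is a name I made up for illustration, and the JDK path in the comment is a placeholder:

```shell
#!/bin/sh
# report_env NAME...: print each variable's value, or flag it as unset or empty.
# Returns 1 if at least one variable is missing. (Hypothetical helper.)
report_env() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"          # indirect lookup of the variable called $name
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set\n' "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use from inside the cm_ext directory:
#   export JAVA_HOME=/path/to/your/jdk    # placeholder path
report_env JAVA_HOME MAVEN_HOME || echo "export the missing variables in this shell before running mvn"
```

If a variable shows up as missing, exporting it in the same shell (or running source /etc/profile) before mvn package should be enough.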

3. Download the Elasticsearch installation package

ES Download:

cd /root/github/cloudera/
mkdir elasticsearch
cd elasticsearch

4. Download the scripts for building the Parcel package and CSD file

cd /root/github/
git clone

5. Build and validate the Parcel package and CSD file for Elasticsearch

$ cd elasticsearch-parcel
$ POINT_VERSION=5 VALIDATOR_DIR=/root/github/cloudera/cm_ext OS_VER=el7 PARCEL_NAME=ElasticSearch ./ /root/github/cloudera/elasticsearch/elasticsearch-7.9.0-linux-x86_64.tar.gz
$ VALIDATOR_DIR=/root/github/cloudera/cm_ext CSD_NAME=ElasticSearch ./

Several issues need to be dealt with before building:
(I fixed 1 and 2 and left 3 alone. If 1 and 2 are not fixed, ES can be installed on CDH, but it either fails to start or shows a gray "unknown health" status after starting.)

Pitfall 1: could not find java in JAVA_HOME or bundled at /usr/java/latest/bin/java
Fix: comment out the line export JAVA_HOME=/usr/java/latest

*This must be changed before building the parcel package and CSD file; otherwise the CSD file has to be built again.
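
The fix above can also be scripted. This is only a sketch: comment_out_java_home is a hypothetical helper, and the file you pass to it is whichever script in your checkout contains that export line. The demo runs on a throwaway file, not on the real script.

```shell
#!/bin/sh
# comment_out_java_home FILE: prefix the hard-coded JAVA_HOME export with "# ".
# Indentation is preserved, and running it twice changes nothing the second time.
comment_out_java_home() {
    script="$1"
    tmp="$(mktemp)"
    sed 's|^\([[:space:]]*\)\(export JAVA_HOME=/usr/java/latest\)|\1# \2|' "$script" > "$tmp" \
        && cat "$tmp" > "$script"         # overwrite in place, keeping file permissions
    rm -f "$tmp"
}

# Demo on a throwaway file
demo="$(mktemp)"
printf '#!/bin/bash\nexport JAVA_HOME=/usr/java/latest\necho start\n' > "$demo"
comment_out_java_home "$demo"
cat "$demo"
rm -f "$demo"
```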

Pitfall 2: incompatible versions reported during installation on CDH
In the elasticsearch-parcel directory:

  • parcel-src/meta/parcel.json :
    Modified as: "dependencies": "CDH (>= 5.0), CDH (< 7.0)",

  • csd-src/descriptor/service.sdl :
    Modified as: "cdhVersion": {"min": 6}

Pitfall 3: ES configuration problem (optional; the old-version parameters still work for now)
In the elasticsearch-parcel directory:
The script's config() function generates the elasticsearch.yml configuration file.

# original
        echo ""
        while IFS= read -r line; do hosts=$hosts`echo $line | awk -F':' '{print $1}'`", " ; done <
        hosts=" "$hosts"localhost]"
        echo $hosts

The generated elasticsearch.yml file:
data-lakes
discovery.zen.minimum_master_nodes: 1
http.cors.allow-origin: /.*/
http.cors.enabled: true
http.port: 9200
/var/lib/elasticsearch/
path.logs: /var/log/elasticsearch/
cdh-master
["cdh-master", _local_]
[cdh-master, cdh-slave01, localhost]

Starting with 7.0, ES deprecated the original zen discovery settings and introduced the new cluster.initial_master_nodes setting.
In a development environment, multiple ES nodes started on the same host form a cluster automatically by default. In a production environment, the ES nodes are deployed on different hosts and auto-bootstrapping does not work, so you need to configure cluster.initial_master_nodes and discovery.seed_hosts to specify the master-eligible nodes so that each ES node can join the cluster correctly.
*1. Version change reference:
2. Configuration reference:
3. Chinese version:*
(Pay special attention to whether you write FQDNs or IP addresses; the entries in cluster.initial_master_nodes must match each node's node.name exactly.)

control.sh after modification:

        # Build the discovery.seed_hosts line from the host list
        echo "discovery.seed_hosts"
        seeds=""
        while IFS= read -r line; do seeds=$seeds`echo $line | awk -F':' '{print $1}'`", " ; done <
        hosts="discovery.seed_hosts: ["$seeds"localhost]"
        echo $hosts
        # Build the cluster.initial_master_nodes line in its own variable,
        # so the seed_hosts line above is not appended to
        echo "cluster.initial_master_nodes"
        masters=""
        while IFS= read -r line; do masters=$masters`echo $line | awk -F':' '{print $1}'`", " ; done <
        nodes="cluster.initial_master_nodes: ["$masters"localhost]"
        echo $nodes
        echo "Creating elasticsearch.yml"
        echo "" > elasticsearch.yml
        # Turn each key=value property into a "key: value" YAML line
        while IFS= read -r line; do echo ${line%=*}": "${line#*=} >> elasticsearch.yml ; done <
        echo $hosts >> elasticsearch.yml
        echo $nodes >> elasticsearch.yml
        cp -uf elasticsearch.yml $ES_HOME/config/

The generated elasticsearch.yml file:
data-lakes
discovery.zen.minimum_master_nodes: 1
http.cors.allow-origin: /.*/
http.cors.enabled: true
http.port: 9200
/var/lib/elasticsearch/
path.logs: /var/log/elasticsearch/
cdh-master
["cdh-master", _local_]
discovery.seed_hosts: [cdh-master, cdh-slave01, localhost]
cluster.initial_master_nodes: [cdh-master, cdh-slave01, localhost]
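
To make the list-building step easier to follow, here is a self-contained sketch of the same idea. It is not the script shipped with the parcel: host_list and discovery_lines are hypothetical names, the "host:port" input format is inferred from the awk -F':' call above, and the port 9300 in the demo data is just an example value.

```shell
#!/bin/sh
# host_list FILE: print "[h1, h2, localhost]" built from the host part of each "host:port" line
host_list() {
    list=""
    while IFS= read -r line; do
        [ -n "$line" ] || continue        # skip empty lines
        list="${list}${line%%:*}, "       # keep only the text before the first ':'
    done < "$1"
    printf '[%slocalhost]\n' "$list"
}

# discovery_lines FILE: print both ES 7 settings, with a space after each colon
discovery_lines() {
    hosts="$(host_list "$1")"
    printf 'discovery.seed_hosts: %s\n' "$hosts"
    printf 'cluster.initial_master_nodes: %s\n' "$hosts"
}

# Demo with a throwaway host list
demo="$(mktemp)"
printf 'cdh-master:9300\ncdh-slave01:9300\n' > "$demo"
discovery_lines "$demo"
rm -f "$demo"
```

Computing the list once and reusing it for both settings avoids the risk of one line leaking into the other, and the space after the colon keeps each line a valid YAML key/value pair.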

6. Check the Parcel package and CSD file

The directory at this point:

Installing and deploying the ES service on CDH6

1. Copy the CSD jar package

cp /root/github/cloudera/elasticsearch-parcel/build-csd/ELASTICSEARCH-1.0.jar /opt/cloudera/csd

2. Deploy to httpd service

mkdir -p /var/www/html/elasticsearch
cd /var/www/html/elasticsearch
cp /root/github/cloudera/elasticsearch-parcel/build-parcel/ELASTICSEARCH-0.0.5.elasticsearch.p0.5-el7.parcel ./
cp /root/github/cloudera/elasticsearch-parcel/build-parcel/manifest.json ./
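
Before opening the browser, it can help to confirm that the repository directory really holds both files, since the parcel and manifest.json need to sit side by side. A minimal sketch, where check_parcel_repo is a name I made up:

```shell
#!/bin/sh
# check_parcel_repo DIR: succeed only if DIR holds manifest.json and at least one .parcel file
check_parcel_repo() {
    dir="$1"
    [ -f "$dir/manifest.json" ] || { echo "missing $dir/manifest.json"; return 1; }
    for f in "$dir"/*.parcel; do
        # an unmatched glob stays literal, so -f filters it out
        [ -f "$f" ] && { echo "ok: $dir"; return 0; }
    done
    echo "no .parcel file in $dir"
    return 1
}

# Check the httpd directory used in this post; '|| true' keeps the script going if it is absent
check_parcel_repo /var/www/html/elasticsearch || true
```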

The file directory is as follows:

Open http://cdh-master/elasticsearch/ in a browser to verify it.

3. Grant ownership to the CM user

cd /opt/cloudera/csd
chown -R cloudera-scm:cloudera-scm ./*
cd /opt/cloudera/parcel-repo
chown -R cloudera-scm:cloudera-scm ./*

4. Restart the CM service

systemctl restart cloudera-scm-server

Pitfall 1: the first time, I used another method to build the parcel and CSD file. Everything seemed to go well, but the service kept failing on restart.
I later found that the CSD file was incompatible: once it was deleted, the service started normally, which showed that this CSD file was unusable.

5. Add remote Parcel repository URL

6. Download, distribute and activate

Click Download -> Distribute -> Activate

7. Restart CM and add the service

The following pitfalls appeared during startup:

Pitfall 1: UserException

Exception in thread "main" org.elasticsearch.bootstrap.BootstrapException: org.elasticsearch.cli.UserException: unable to create temporary keystore at [/opt/cloudera/parcels/ELASTICSEARCH/config/elasticsearch.keystore.tmp], write permissions required for [/opt/cloudera/parcels/ELASTICSEARCH/config] or run [elasticsearch-keystore upgrade]

Run the following on all nodes in the CDH cluster:

chmod 777 /opt/cloudera/parcels/ELASTICSEARCH/config/

Pitfall 2: startup exception: failed to obtain node locks

An elasticsearch process was already running in the background, so the lock could not be obtained.

# Check which process is using port 9200
netstat -alnp | grep 9200
# Terminate that process (24776 was the PID in my case)
kill -9 24776
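
If you would rather not read the PID off the screen, it can be pulled out of the netstat output. This is a sketch assuming the usual Linux net-tools column layout, where the last column is PID/program; pid_on_port is a hypothetical helper, and the sample line is modeled on the PID above rather than copied from a real run.

```shell
#!/bin/sh
# pid_on_port PORT: read `netstat -lntp`-style lines on stdin and print the PID
# of every listener whose local address ends in :PORT
pid_on_port() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, parts, "/")         # last column looks like PID/program
            print parts[1]
        }'
}

# Demo on one sample line
printf '%s\n' 'tcp6  0  0 :::9200  :::*  LISTEN  24776/java' | pid_on_port 9200   # prints 24776
```

On a real node this would be used as netstat -lntp | pid_on_port 9200, and the result passed to kill.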

Conclusion and cloud drive link

Finally, it started successfully.

I spent two days installing ES and hit a lot of pitfalls. The first reason was the offline environment (which I could do nothing about). The second was that I followed the wrong tutorial at the beginning. Installing custom services on CDH is really troublesome!

Extraction code: zig2

Tags: Big Data ElasticSearch cloudera

Posted by Unholy Prayer on Thu, 28 Apr 2022 10:20:45 +0300