Hadoop learning path: Hadoop cluster setup and simple applications

1. Concepts. Master-slave structure: in a cluster, some nodes act as master servers and the others act as slave servers; this architecture is called a master-slave structure. Types of master-slave structure: 1. one master, many slaves; 2. multiple masters, multiple slaves. HDFS and YARN in H ...

Posted by fullyscintilla on Mon, 04 Apr 2022 16:37:28 +0300

Flink state programming

Summary: the core of Flink's processing mechanism is "stateful stream computing". We have mentioned "state" many times in previous chapters: whether in simple aggregation, window aggregation, or the use of process functions, state is involved. Previously, we have briefly introduced stateful flow p ...

Posted by freakyG on Sun, 03 Apr 2022 13:47:02 +0300

Hadoop introduction and deployment document

Hadoop deployment document. Introduction to Hadoop. What is Hadoop? 1) Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. 2) It mainly addresses the storage of massive data and its analysis and computation. 3) In a broad sense, Hadoop usually refers to a broader concept: the Hadoop ecosystem. hadoop ...

Posted by php_joe on Fri, 01 Apr 2022 17:37:55 +0300

Connecting to MongoDB from Node.js, with notes on common operations

Preparation for the back-end API of a personal full-stack project. I think it is necessary to understand some concepts in advance. Install the MongoDB library: npm i mongodb. Connect to the database: // If the database address has not been changed, this is the default address const url = "mongodb://localhost:27017"; ...

Posted by goodrunb on Thu, 31 Mar 2022 11:06:18 +0300

[Flink knowledge summary] [Join topic]

Video address: https://www.bilibili.com/video/av92215954/ Document address: https://files.alicdn.com/tpsservice/e4356097e11364edadb5627a892ee53b.pdf Application scenarios of Join: associating exposures with clicks, which almost every company's app involves; joining dimensions across two data streams; building wide tables, etc. In the e-commerce scenario, t ...

Posted by Sharif on Wed, 30 Mar 2022 12:15:51 +0300

Day 05 DQL data query language - join queries - gaining proficiency

Because SQL keywords are not case sensitive, all commands in this article are written in lowercase for convenience. Previous posts: day 01 first look at MySQL and DDL data definition language; day 02 DML data manipulation language; day 03 DQL data query language - a first glimpse; day 04 DQL data query language - single table query - a slight succ ...

Posted by kampbell411 on Tue, 29 Mar 2022 07:33:25 +0300

The most complete guide ever: installing and configuring Kafka Manager with Kerberos authentication (Ambari HDP)

This article configures Kafka Manager against the Kafka that ships with Ambari; CDH and open-source distributions can follow the same steps. Kafka has Kerberos authentication enabled. Kafka Manager features: first, let's take a look at what Kafka Manager does: manage multiple clusters; easily check cluster status (topics, consumers, offsets, ag ...

Posted by jdbfitz on Mon, 28 Mar 2022 09:49:32 +0300

Java master Zhenjing application framework volume: PDF ultra clear version of Java Web core framework

Content introduction. Java master Sutra: Java Web core framework (application framework volume). Author: edited by Liu Zhongbing, Java Research Laboratory. [Book introduction] This book first analyzes the layered design of Java Web applications and the choice of application frameworks, and then explains various Java Web application framewo ...

Posted by DoddsAntS on Sun, 27 Mar 2022 21:14:43 +0300

Big Data Analysis - Matplotlib Introduction (not yet completed)

This tutorial is just a tour of the basic methods of Matplotlib; the next chapter will be an advanced tutorial. Open questions: 1. Why does plt.figure stop working after calling plt.gcf().set_facecolor(np.ones(3) * 240/255)? 2. Introduction to matplotlib.pyplot: Matplotlib is Python's plotting library. It works with NumPy, providing an ef ...

Posted by Muppet9010 on Sun, 27 Mar 2022 20:15:51 +0300

Hive Integrated Spark Tutorial (Hive on Spark)

Introduction to Hive engines. Hive engines include MR (the default), Tez, and Spark. The underlying default engine is MR (MapReduce), which needs no configuration; Hive runs on it as installed. Hive on Tez configuration: https://blog.csdn.net/weixin_45417821/article/details/115181000 Hive on Spark: Hive is responsible for both storing metadata and parsing and optim ...

Posted by r4ck4 on Sun, 27 Mar 2022 20:02:20 +0300