The most complete guide ever: Kafka Manager installation and configuration with Kerberos authentication (Ambari HDP)

This article configures Kafka Manager against the Kafka service managed by Ambari; CDH and open-source Kafka deployments can follow the same steps. Kerberos authentication is enabled on Kafka. What Kafka Manager does: first, let's look at the role of Kafka Manager: manage multiple clusters; easily check cluster status (topics, consumers, offsets, ag ...

Posted by jdbfitz on Mon, 28 Mar 2022 09:49:32 +0300

Java Master Sutra, application framework volume: ultra-clear PDF of Java Web Core Framework

Content introduction. Java Master Sutra: Java Web Core Framework (application framework volume). Author: edited by Liu Zhongbing, Java Research Laboratory. [Book introduction] This book first analyzes the layered design method of Java Web applications and the choice of application framework, and then explains various Java Web application framewo ...

Posted by DoddsAntS on Sun, 27 Mar 2022 21:14:43 +0300

Big Data Analysis - Matplotlib Introduction (not yet completed)

This tutorial is just a tour of the basic methods of Matplotlib; the next chapter is an advanced tutorial. Problem area: 1. Why does plt.figure fail after calling plt.gcf().set_facecolor(np.ones(3)*240/255)? 2. Introduction to matplotlib.pyplot. Matplotlib is Python's plotting library. It works with NumPy, providing an ef ...

Posted by Muppet9010 on Sun, 27 Mar 2022 20:15:51 +0300

Hive Integrated Spark Tutorial (Hive on Spark)

Introduction to Hive engines. Hive supports three execution engines: MR (the default), Tez, and Spark. The underlying default engine is MR (MapReduce), which needs no configuration; Hive runs on it out of the box. Hive on Tez configuration: https://blog.csdn.net/weixin_45417821/article/details/115181000 Hive on Spark: Hive is responsible for both storing metadata and parsing and optim ...

Posted by r4ck4 on Sun, 27 Mar 2022 20:02:20 +0300

Data warehouse | COUNT DISTINCT data skew optimization

What is data skew? Data skew is very common in the MapReduce programming model. It occurs when a large number of records with the same key are assigned to one partition, so that individual tasks run very slowly and drag down the execution efficiency of the whole job. The root cause of data skew is that the amount of data processed by a few workers ...

Posted by dazzclub on Sun, 27 Mar 2022 11:33:48 +0300
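To make the data skew described in the COUNT DISTINCT entry above concrete, here is a minimal Python sketch. It is not taken from the linked article: the partition count, the sample rows, and the helper names (`partition_of`, `load_by_key`, `load_by_key_and_value`, `count_distinct_two_stage`) are all hypothetical, and `zlib.crc32` merely stands in for whatever hash a real engine uses. It shows how one hot key overloads a single partition, and how one common mitigation, partitioning on (key, value) and deduplicating before counting, spreads the load.

```python
from collections import Counter, defaultdict
import zlib

NUM_PARTITIONS = 4  # hypothetical number of reduce tasks


def partition_of(text: str) -> int:
    """Deterministic hash partitioner (crc32 is a stand-in, not a real engine's hash)."""
    return zlib.crc32(text.encode("utf-8")) % NUM_PARTITIONS


# Made-up (key, value) rows: one hot key with 500 distinct values, plus 40 small keys.
rows = [("hot", f"user{i % 500}") for i in range(10_000)]
rows += [(f"k{i}", f"user{i}") for i in range(40)]


def load_by_key(rows):
    """Rows per partition when partitioning on the key alone (skewed)."""
    load = Counter(partition_of(key) for key, _ in rows)
    return [load.get(p, 0) for p in range(NUM_PARTITIONS)]


def load_by_key_and_value(rows):
    """Rows per partition when partitioning on the (key, value) pair."""
    load = Counter(partition_of(f"{key}\x00{value}") for key, value in rows)
    return [load.get(p, 0) for p in range(NUM_PARTITIONS)]


def count_distinct_two_stage(rows):
    """COUNT(DISTINCT value) per key in two stages."""
    # Stage 1: partition on (key, value) and deduplicate inside each partition.
    # Identical pairs always land in the same partition, so this dedupe is global.
    stage1 = defaultdict(set)
    for key, value in rows:
        stage1[partition_of(f"{key}\x00{value}")].add((key, value))
    # Stage 2: count the already-deduplicated pairs per key.
    result = Counter()
    for pairs in stage1.values():
        for key, _ in pairs:
            result[key] += 1
    return dict(result)


if __name__ == "__main__":
    print("rows per partition, by key:         ", load_by_key(rows))
    print("rows per partition, by (key, value):", load_by_key_and_value(rows))
    print("distinct values for 'hot':", count_distinct_two_stage(rows)["hot"])
```

With partitioning by key alone, every row of the hot key lands in the same partition; with the two-stage variant, the heavy deduplication work is spread across partitions and the second stage only sees the deduplicated pairs.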

Spark GraphX Programming Guide

Spark interview question series: Spark interview questions (I); Spark interview questions (II); Spark interview questions (III); Spark interview questions (IV); Spark interview questions (V): data skew tuning; Spark interview questions (VI): Spark resource optimization; Spark interview questions (VII): Spark program development and optimization; Spark inter ...

Posted by alecks on Sun, 27 Mar 2022 11:02:36 +0300