MySQL 8.0 window function ranking and topN problems

MySQL 8.0 window function ranking and topN problems

I've been using mysql5 7. Unexpectedly, a very useful function - window function is added to 8.0. Here, the window function, its usage and problems that can be solved are summarized as follows

 

what

Window function --- also known as OLAP function --- online analytical processing

The window function is moved from Oracle. It seems that Oracle's acquisition of mysql is also beneficial

It mainly solves the problems of ranking and topN in business

 

how

Basic syntax:

Window function name() over (partition by Group column names order by Sort column name) as ranking

(as ranking is a common naming method. Of course, you can use other names)

Classification:

  • Some of MySQL's original aggregate functions can also be used in the syntax of window functions as window functions, such as sum(),avg(),max(),min(),count(), etc

    This paper mainly discusses the serial number function combined with the problem

  • Q: "can you filter the results returned by the window function directly?"

    Answer: "because the occurrence position (order) of the window function is after where and group by, so it can't" (please note that what is said here is "direct". Of course, you can take the result of the window function as a temporary table and then select...where...)

 

Difference from group by (important)

Number of rows

For example, the table math has the following fields: class, student ID and score of each student in a certain subject. Let's not rank first, but count the number of people in each class first

If used

select class, count(stu_id)
from math
group by class
order by class

# So the result is
class	count(stu_id)
class1	3
class2	1
class3	2

In other words, using group by may change the number of rows in the table - one row shows one group

 

If you use the window function - the number of rows in the original table is the number of rows

select class, count(stu_id) over (partition by class order by class) as stu_num
from class

# The result is
class	stu_num
class1	3
class1	3
class1	3
class2	1
class3	2
class3	2

Note: Parton by grouping will not reduce the number of rows in the original table

 

Hierarchical aggregate function

sum, avg, count, etc. can also be used in the window function, which aggregates the data of each row and above, such as:

# Use sum as the result of aggregation function --- hierarchical sum of the total scores of students in each class
class	stu_id	Score
1		001		100
1		002		200
1		003		300
2		004		100
2		005		200

# If you use the traditional group by, there are only class and sum(score)

 

Ordinal function

For the ranking of grades and rankings, there will be different ranking methods in life, such as how to rank those with the same score. So there are different serial number functions: if you have the following results

table marks:
stu_id	marks
1		5
2		5
3		4
4		2
5		1

dense_rank() -- the same score and rank, not the next rank. This is often used in daily life

select stu_id, dense_rank() over (order by marks) as dense_ranking
from marks

# The result is
stu_id	marks	ranking
1		5		1
2		5		1
3		4		2
4		2		3
5		1		4

rank() -- the same score and rank, but it will "occupy the rank". Use the above data:

select stu_id, rank() over (order by marks) as ranking
from marks

# The result is
stu_id	marks	ranking
1		5		1
2		5		1
3		4		3
4		2		4
5		1		5

row_number() -- the same score, different ranking, also ranking, as the name suggests, just like the line number

select stu_id, row_number() over (order by marks) as row_number
from marks

# The result is
stu_id	marks	ranking
1		5		1
2		5		2
3		4		3
4		2		4
5		1		5

 

Ranking problem

https://leetcode-cn.com/problems/rank-scores/

(mysql 8)

select 
    a.Score as Score,
    (select count(distinct b.Score) from Scores as b where b.Score>=a.Score) as `Rank`
from Scores as a
order by Score desc;

(conventional solution)

select 
	a.Score as Score, 
	(select count(b.score) from Scores as b where b.Score>=a.Score) as `Rank`
from Scores as a
order by Score desc;

Idea: for each score, the number of scores greater than the score in the table is the ranking of scores

 

topN problem

Suppose there is a table Scores

Stu_id		subject		score 
1			math		99
2			chinese		100
1			english		66
2			math		80
2			english		80
1			chinese		98

We want to know that each student's best two courses

select *, row_number() over (partition by Stu_id order by score) as ranking
from Scores
where ranking<2;

Stu_id		subject		score	ranking 
1			math		99		1
1			chinese		98		2
1			english		66		3
2			chinese		100		1
2			math		80		2
2			english		80		3
# Filtering by ranking < 2

 

Tags: MySQL

Posted by fazlionline on Mon, 23 May 2022 10:10:26 +0300