New features of PostgreSQL 13: FETCH FIRST WITH TIES

Hello, everyone. I'm Mr. Tony, who only talks about technology and doesn't cut his hair.

PostgreSQL global development team announced PostgreSQL 13 on September 24, 2020 Official release . Next, I will analyze the function enhancements brought by this new version through a series of articles. This article first introduces a new feature related to SQL development: the FETCH FIRST clause is added With options , which can be used to return more data rows with the same ranking.

If you find the article useful, you are welcome to pay attention ❤️, comment 📝, give the thumbs-up 👍

test data

First, we create a simple test table t and insert some data:

CREATE TABLE t(name varchar(20), score int);
INSERT INTO t VALUES('Zhang San', 80), ('Li Si', 77), ('Wang Wu', 80), ('Zhao Liu', 90), ('Sun Qi', 80);

SELECT * FROM t;
name |score|
-----|-----|
Zhang San  |   80|
Li Si  |   77|
Wang Wu  |   80|
Zhao Liu  |   90|
Sun Qi  |   80|

The table contains only two fields of name and score, of which three students have the same score.

Implementation before PostgreSQL 12

Postgre SQL 12 and earlier versions implemented the FETCH Clause:

OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY

Where, OFFSET means to skip the specified number of start lines, and the default is 0; FETCH means to return the result of count ROWS at most, and the default value is 1; ROW and ROWS are synonyms, and FIRST and NEXT are synonyms; ONLY means no more data is returned.

For example, the following statement returns the two students with the highest scores:

SELECT *
FROM t
ORDER BY score DESC
FETCH FIRST 2 ROWS ONLY;

name |score|
-----|-----|
Zhao Liu  |   90|
Zhang San  |   80|

Although the two records with the highest scores are indeed returned, it can be seen from the data in the table that there are two students with a score of 80. If the ranking is calculated, they should also be the second.

In addition, it should be noted that if the ORDER BY clause is not specified, the query will return 2 random rows of records:

SELECT *
FROM t
FETCH FIRST 2 ROWS ONLY;

name |score|
-----|-----|
Zhang San  |   80|
Li Si  |   77|

📝 PostgreSQL also supports another clause that limits the number of query results: limit {count | all} offset start, which can also achieve the same function.

PostgreSQL 13 enhancements

In order to solve the problem of returning data with the same ranking, PostgreSQL 13 enhances the FETCH FIRST clause:

OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES }

Where, the WITH TIES option means to return additional data rows with the same ranking as the last row, and the ORDER BY clause must be specified at the same time. For example, the following statement returns all the students with the top 2 scores:

SELECT *
FROM t
ORDER BY score DESC
FETCH FIRST 2 ROWS WITH TIES;

name |score|
-----|-----|
Zhao Liu  |   90|
Zhang San  |   80|
Wang Wu  |   80|
Sun Qi  |   80|

If the ORDER BY clause is not specified when using the WITH TIES option, an error is returned:

SELECT *
FROM t
FETCH FIRST 2 ROWS WITH TIES;
RROR: WITH TIES cannot be specified without ORDER BY clause

matters needing attention

When using the WITH TIES option, it should be noted that the number of results returned by the FETCH FIRST statement is no longer a determined value. For example, in the above statement, although count is specified as 2, 4 rows of data are actually returned. If we need to realize this function based on the actual number of rows to be paged each time; Corresponding to the above example, the first page returns 4 rows of data, and the statement when querying the second page is as follows:

SELECT *
FROM t
ORDER BY score
OFFSET 4
FETCH FIRST 2 ROWS WITH TIES;

Where OFFSET 4 is the number of rows actually returned on the previous page.

In addition, other fields for sorting display cannot appear in the ORDER BY clause in this case. If we want to find out all the students with the top 2 scores, and sort the students with the same scores according to their names and Pinyin:

SELECT *
FROM t
ORDER BY score DESC, name COLLATE "zh_CN"
FETCH FIRST 2 ROWS WITH TIES;

name |score|
-----|-----|
Zhao Liu  |   90|
Sun Qi  |   80|

Since the name field appears in ORDER BY, other students WITH a score of 80 are no longer considered to be ranked the same data, unless their name is also called "Sun Qi". To solve this problem, you can add a subquery or use a general table expression (WITH clause):

WITH d AS (
  SELECT *
  FROM t
  ORDER BY score DESC
  FETCH FIRST 2 ROWS WITH TIES
)
SELECT *
FROM d
ORDER BY score DESC, name COLLATE "zh_CN";

name |score|
-----|-----|
Zhao Liu  |   90|
Sun Qi  |   80|
Wang Wu  |   80|
Zhang San  |   80|

Among them, the first ORDER BY is used for score ranking, and the second ORDER BY is used for final sorting display.

📝 According to the SQL standard, in addition to using count to limit the number of returned rows, N PERCENT can also be used to limit the number of returned rows by percentage. PostgreSQL does not implement this function at present.

summary

As the most advanced open source database in the world, the new version of PostgreSQL adds the support of the FETCH FIRST clause for WITH TIES, which provides more convenience for application development. For more information about the new features of PostgreSQL 13, please refer to This article.

Tags: PostgreSQL

Posted by hmemnon on Fri, 13 May 2022 15:23:45 +0300