In the front is a summary of common date functions, followed by a continuous date sql problem and its solution routine.
1. Current date and time
select current_timestamp -- 2020-12-05 19:16:29.284
2. Get the current date, which is 2020-12-05
SELECT current_date; ## OR SELECT current_date(); -- 2020-12-05
3. Obtain the timestamp under unix system
SELECT UNIX_TIMESTAMP(); -- 1524884881
4. The current is 2020-12-05
select substr(current_timestamp, 0, 10); -- current_timestamp
5. The current is 2020-12-05
select date_sub(current_date, 1); --2020-12-04
6. Yyyy MM DD HH: mm: SS interception date
select to_date("2017-10-22 10:10:10"); -- 2017-10-22 select date_format("2017-10-22" "yyyy-MM") -- 2017-10
7. Days difference between two dates
select datediff("2017-10-22", "2017-10-12"); -- 10 select datediff("2017-10-22 10:10:10", "2017-10-12 23:10:10"); -- 10 select datediff("2017-10-22 01:10:10", "2017-10-12 23:10:10"); -- 10
8. Time interception
select from_unixtime(cast(substr("1504684212155", 0,10) as int)) dt; -- 2017-09-06 15:50:12
9. Timestamp to date
Syntax: to_date(string timestamp)
select to_date(from_unixtime(UNIX_TIMESTAMP())); -- 2018-04-28 select FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd 10:30:00'); -- 2018-04-28 10:30:00 select concat(date_sub(current_date,1),' 20:30:00'); -- 2018-04-27 20:30:00 -- hive version 1.2.0 select date_format(date_sub(current_date,1),'yyyy-MM-dd 20:30:00');
10. Date increase
Note: only two original date formats are supported: yyyy MM DD yyyy MM DD HH: mm: SS
Otherwise, date is required_ Format to turn
date_add next_day
11. Additional questions
There is an active member table. The daily partition dimension is the member ID, which can be used with device_ ID instead, ask how to calculate the number of active members for three consecutive days in the last seven days. The structure of the table (dws.dws_member_start_day) is as follows (dt is the partition, the date format is yyyy MM DD, and each partition has a unique device_id):
device_id string dt string
Solution routine
1. First think about the date functions DateDiff and date that can be used_ sub/date_ add
2. A ranking function is used for continuous dates and continuous problems, but the value of the ranking function is a numerical value, which can be easily grouped only if it is mapped to the continuity of the date. For example, you can map the date to a continuous number or the number to a continuous date. The operations to achieve these two are through the previous datedff and date_ The principle of sub combination is that the continuous integer can be obtained by subtracting the date from the date, and the continuous date can be obtained by subtracting the integer from a date, where date_sub can be a continuous date obtained by reverse sorting.
3. This problem can be solved by subtracting successive sorting dates or sorting IDS, and then grouping
1. Add a column of sorting sequence number on the basis of the original table
SELECT device_id, dt, row_number() over(PARTITION BY device_id ORDER BY dt) ro FROM dws.dws_member_start_day
2. Convert the serial number to a continuous date, or convert the date to a continuous number, and then it becomes gid
-- 2.1 Convert serial number to continuous date SELECT device_id, dt, datediff(dt, date_add('2020-07-20', row_number() over(PARTITION BY device_id ORDER BY dt))) gid FROM dws.dws_member_start_day -- 2.2 Convert date to serial number SELECT device_id, dt, (datediff(dt, '2020-07-21') - row_number() over(PARTITION BY device_id ORDER BY dt)) gid FROM dws.dws_member_start_day
3. Group screening
SELECT device_id,count(1) FROM (SELECT device_id, dt, datediff(dt, date_add('2020-07-20', row_number() over(PARTITION BY device_id ORDER BY dt))) gid FROM dws.dws_member_start_day WHERE datediff(dt, CURRENT_DATE) BETWEEN -7 AND 7 ) tmp GROUP BY device_id, gid HAVING count(1) < 3
Wu Xie, the third master, is a rookie in the field of big data and artificial intelligence.
Please pay more attention