Select first row in each GROUP BY group?

On Oracle 8i+, SQL Server 2005+, PostgreSQL 8.4+, DB2, Firebird 3.0+, Teradata, Sybase, Vertica:

WITH summary AS (
    SELECT p.id, 
           p.customer, 
           p.total, 
           ROW_NUMBER() OVER(PARTITION BY p.customer 
                                 ORDER BY p.total DESC) AS rk
      FROM PURCHASES p)
SELECT s.*
  FROM summary s
 WHERE s.rk = 1

Supported by any database:

But you need to add logic to break ties:

  SELECT MIN(x.id),  -- change to MAX if you want the highest
         x.customer, 
         x.total
    FROM PURCHASES x
    JOIN (SELECT p.customer,
                 MAX(total) AS max_total
            FROM PURCHASES p
        GROUP BY p.customer) y ON y.customer = x.customer
                              AND y.max_total = x.total
GROUP BY x.customer, x.total
WITH consecutive_sessions AS ( -- First, get all study sessions with row numbers to identify consecutive sessions SELECT ss.student_id, ss.subject, ss.session_date, ss.hours_studied, ROW_NUMBER() OVER (PARTITION BY ss.student_id ORDER BY ss.session_date) as session_order FROM study_sessions ss ), session_groups AS ( -- Group consecutive sessions by checking date gaps SELECT cs1.*, -- Check if there's a gap > 2 days from previous session CASE WHEN LAG(session_date) OVER (PARTITION BY student_id ORDER BY session_date) IS NULL THEN 1 WHEN DATEDIFF(session_date, LAG(session_date) OVER (PARTITION BY student_id ORDER BY session_date)) <= 2 THEN 0 ELSE 1 END as is_new_group FROM consecutive_sessions cs1 ), grouped_sessions AS ( -- Create group numbers for consecutive session sequences SELECT *, SUM(is_new_group) OVER (PARTITION BY student_id ORDER BY session_date) as group_id FROM session_groups ), pattern_analysis AS ( -- For each group of consecutive sessions, analyze the pattern SELECT student_id, group_id, COUNT(DISTINCT subject) as unique_subjects, COUNT(*) as total_sessions, SUM(hours_studied) as total_hours, -- Create the subject sequence string to check for repeating patterns GROUP_CONCAT(subject ORDER BY session_date SEPARATOR '|') as subject_sequence FROM grouped_sessions GROUP BY student_id, group_id HAVING COUNT(*) >= 6 -- At least 6 sessions for 2 complete cycles of 3 subjects AND COUNT(DISTINCT subject) >= 3 -- At least 3 different subjects ), valid_patterns AS ( -- Check if the pattern actually repeats for at least 2 complete cycles SELECT student_id, group_id, unique_subjects as cycle_length, total_sessions, total_hours, subject_sequence, -- Check if we have at least 2 complete cycles CASE WHEN total_sessions >= (unique_subjects * 2) THEN 1 ELSE 0 END as has_complete_cycles FROM pattern_analysis ), cycle_validation AS ( -- Validate that the pattern actually repeats correctly SELECT gs.student_id, gs.group_id, vp.cycle_length, vp.total_hours, -- Check if the subject sequence repeats correctly COUNT(*) as session_count, -- Get the first cycle pattern GROUP_CONCAT( CASE WHEN gs.session_order <= vp.cycle_length THEN gs.subject ELSE NULL END ORDER BY gs.session_date SEPARATOR '|' ) as first_cycle, -- Check if subsequent sessions follow the same pattern COUNT(CASE WHEN gs.session_order > vp.cycle_length AND gs.subject = ( SELECT subject FROM grouped_sessions gs2 WHERE gs2.student_id = gs.student_id AND gs2.group_id = gs.group_id AND gs2.session_order = ((gs.session_order - 1) % vp.cycle_length) + 1 ) THEN 1 ELSE NULL END) as matching_pattern_count FROM grouped_sessions gs JOIN valid_patterns vp ON gs.student_id = vp.student_id AND gs.group_id = vp.group_id WHERE vp.has_complete_cycles = 1 GROUP BY gs.student_id, gs.group_id, vp.cycle_length, vp.total_hours HAVING session_count >= (cycle_length * 2) AND matching_pattern_count = (session_count - cycle_length) -- All sessions after first cycle match pattern ) -- Final result with student information SELECT s.student_id, s.student_name, s.major, cv.cycle_length, cv.total_hours as total_study_hours FROM cycle_validation cv JOIN students s ON cv.student_id = s.student_id ORDER BY cv.cycle_length DESC, cv.total_hours DESC
最新发布
11-21
This is a MySQL query that selects the student ID (s_id), assigns a sequential number to each row (i), and calculates the rank of each student based on their sum of scores (sum_score). The query uses a subquery to first group the scores by student ID and calculate the sum of scores for each student. This subquery is then joined with a variable initialization subquery that sets the initial values of @k, @i, and @score to 0. The variable @k is used to keep track of the current rank while iterating over the rows. The variable @i is used to assign a sequential number to each row. The variable @score is used to compare the sum_score of the current row with the sum_score of the previous row. The CASE statement is used to check if the sum_score of the current row is equal to the sum_score of the previous row. If they are equal, then the rank remains the same. If they are not equal, then the rank is updated to the current sequential number. Here is a breakdown of the query: 复制 SELECT a.s_id, -- Select the student ID @i:=@i+1 AS i, -- Assign a sequential number to each row @k:=(case when @score=a.sum_score then @k else @i end) as rank, -- Calculate the rank a.sum_score AS score -- Select the sum of scores for each student FROM (SELECT s_id,SUM(s_score) AS sum_score FROM score GROUP BY s_id ORDER BY sum_score DESC) a, -- Subquery to calculate sum of scores for each student (SELECT @k:=0,@i:=0,@score:=0) s -- Subquery to initialize variables Note that the use of variables in this query is not recommended, as it can lead to unexpected results if the variables are not reset properly. It is better to use a subquery or a window function to calculate the rank. 翻译
02-27
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值