Hive 刷題——惡意取消訂單使用者統計

晓枫的春天發表於2024-03-26

問題描述

給出了訂單表資料,找出惡意退單使用者(30 min 內兩次取消訂單使用者明細)

示例資料

order_id    user_id    order_status     operate_time
1101         a         已支付        2024-03-01 10:00:00
1102         a         已取消        2024-03-01 10:10:00
1103         a         待支付        2024-03-01 10:20:00
1104         b         已取消        2024-03-01 10:30:00
1105         a         待確認        2024-03-01 10:50:00
1106         a         已取消        2024-03-01 11:00:00
1107         b         已取消        2024-03-01 11:40:00
1108         b         已取消        2024-03-01 11:50:00
1109         b         已支付        2024-03-01 12:00:00
1110         b         已取消        2024-03-01 12:11:00
1111         c         已取消        2024-03-01 12:20:00
1112         c         已取消        2024-03-01 12:30:00
1113         c         已取消        2024-03-01 12:55:00
1114         c         已取消        2024-03-01 13:00:00

參考實現

with temp as (select 1101 as order_id, 'a' as user_id, "已支付" as order_status, "2023-01-01 10:00:00" as operate_time
              union all
              select 1102 as order_id, 'a' as user_id, "已取消" as order_status, "2023-01-01 10:10:00" as operate_time
              union all
              select 1103 as order_id, 'a' as user_id, "待支付" as order_status, "2023-01-01 10:20:00" as operate_time
              union all
              select 1104 as order_id, 'b' as user_id, "已取消" as order_status, "2023-01-01 10:30:00" as operate_time
              union all
              select 1105 as order_id, 'a' as user_id, "待確認" as order_status, "2023-01-01 10:50:00" as operate_time
              union all
              select 1106 as order_id, 'a' as user_id, "已取消" as order_status, "2023-01-01 11:00:00" as operate_time
              union all
              select 1107 as order_id, 'b' as user_id, "已取消" as order_status, "2023-01-01 11:40:00" as operate_time
              union all
              select 1108 as order_id, 'b' as user_id, "已取消" as order_status, "2023-01-01 11:50:00" as operate_time
              union all
              select 1109 as order_id, 'b' as user_id, "已支付" as order_status, "2023-01-01 12:00:00" as operate_time
              union all
              select 1110 as order_id, 'b' as user_id, "已取消" as order_status, "2023-01-01 12:11:00" as operate_time
              union all
              select 1111 as order_id, 'c' as user_id, "已取消" as order_status, "2023-01-01 12:20:00" as operate_time
              union all
              select 1112 as order_id, 'c' as user_id, "已取消" as order_status, "2023-01-01 12:30:00" as operate_time
              union all
              select 1113 as order_id, 'c' as user_id, "已取消" as order_status, "2023-01-01 12:55:00" as operate_time
              union all
              select 1114 as order_id, 'c' as user_id, "已取消" as order_status, "2023-01-01 13:00:00" as operate_time)
select user_id
from (select user_id,
             sum(1)
                 over (partition by user_id order by unix_timestamp(operate_time) range between 1800 preceding and current row ) cnt
      from temp
      where order_status = '已取消') t
where cnt >= 2
group by user_id;

相關文章