๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
SQL/์‹ค๋ฌด SQL

[PostgreSQL] 8. N์ผ ํ›„ ์ง€์†๋ฅ  ์‚ฐ์ถœํ•˜๊ธฐ(์ข…ํ•ฉ์ฟผ๋ฆฌ ์ฒจ๋ถ€)

by ISLA! 2024. 2. 1.
๐ŸŽฏ ๋“ฑ๋ก ์‹œ์ ์„ ๊ธฐ์ค€์œผ๋กœ ์ผ์ • ๊ธฐ๊ฐ„ ๋™์•ˆ ์‚ฌ์šฉ์ž๊ฐ€ ์ง€์†ํ•ด์„œ ์‚ฌ์šฉํ•˜๊ณ  ์žˆ๋Š”์ง€๋ฅผ ์กฐ์‚ฌํ•  ๋•Œ
์ง€์†๋ฅ ๊ณผ ์ •์ฐฉ๋ฅ ์„ ์‚ฌ์šฉํ•˜๋ฉด ๊ฒฝํ–ฅ์„ ์‰ฝ๊ฒŒ ํŒŒ์•…ํ•  ์ˆ˜ ์žˆ๋‹ค.

 

์ง€์†๋ฅ ๊ณผ ์ •์ฐฉ๋ฅ ์ด๋ž€?

  • ์ง€์†๋ฅ  : ๋“ฑ๋ก์ผ ๊ธฐ์ค€์œผ๋กœ ์ดํ›„ ์ง€์ •์ผ ๋™์•ˆ ์‚ฌ์šฉ์ž๊ฐ€ ์„œ๋น„์Šค๋ฅผ ์–ผ๋งˆ๋‚˜ ์ด์šฉํ–ˆ๋Š”์ง€ ๋‚˜ํƒ€๋‚ด๋Š” ์ง€ํ‘œ
    • ๋“ฑ๋ก์ผ ์ดํ›„ ๋งค์ผ ์„œ๋น„์Šค๋ฅผ ์‚ฌ์šฉํ•˜์ง€ ์•Š๋”๋ผ๋„, ํŒ์ • ๋‚ ์งœ ์•ˆ์— ์‚ฌ์šฉํ–ˆ๋‹ค๋ฉด ์ง€์†์ž๋กœ ์ทจ๊ธ‰ํ•จ
    • ์ง‘๊ณ„ ๋ฐฉ๋ฒ• : ๋“ฑ๋ก์ž์™€ ์‚ฌ์šฉ์ž ์ˆ˜๋ฅผ ์ง‘๊ณ„ํ•˜๊ณ , (์‚ฌ์šฉ์ž ์ˆ˜ / ๋“ฑ๋ก ์ˆ˜) ๋กœ ๊ตฌํ•œ๋‹ค.
  • ์ •์ฐฉ๋ฅ  : ๋“ฑ๋ก์ผ ๊ธฐ์ค€์œผ๋กœ ์ดํ›„ ์ง€์ •ํ•œ 7์ผ ๋™์•ˆ ์‚ฌ์šฉ์ž๊ฐ€ ์„œ๋น„์Šค๋ฅผ ์‚ฌ์šฉํ–ˆ๋Š”์ง€ ๋‚˜ํƒ€๋‚ด๋Š” ์ง€ํ‘œ
    • 7์ผ์ด๋ผ๋Š” ๊ธฐ๊ฐ„ ๋™์•ˆ ํ•œ ๋ฒˆ์ด๋ผ๋„ ์„œ๋น„์Šค๋ฅผ ์‚ฌ์šฉํ–ˆ๋‹ค๋ฉด ์ •์ฐฉ์ž๋กœ ๋‹ค๋ฃธ
    • 7์ผ ์ •์ฐฉ๋ฅ ์€ ๋“ฑ๋ก ํ›„ 1์ผ๋ถ€ํ„ฐ 7์ผ๊นŒ์ง€์˜ ์ •์ฐฉ๋ฅ ์„ ๊ธฐ์ค€์œผ๋กœ ์‚ฐ์ถœํ•จ
    • 14์ผ ์ •์ฐฉ๋ฅ ์€ ์ดํ›„ 7์ผ(๋“ฑ๋ก์ผ๋ถ€ํ„ฐ 8์ผ ํ›„๋ถ€ํ„ฐ 14์ผ๊นŒ์ง€์˜ ์ •์ฐฉ๋ฅ )
    • ์ง‘๊ณ„ ๋ฐฉ๋ฒ• : ์‚ฌ์šฉ์ž ์ˆ˜ / ๋“ฑ๋ก ์ˆ˜ 
  • ์‚ฌ์šฉ์ž์—๊ฒŒ ๊ธฐ๋Œ€ํ•˜๋Š” ์‚ฌ์šฉ ์‚ฌ์ดํด์€ ์„œ๋น„์Šค๋ณ„๋กœ ๋‹ค๋ฅด๋‹ค. 
    • ์†Œ์…œ ๊ฒŒ์ž„๊ณผ ๋‰ด์Šค ์‚ฌ์ดํŠธ, SNS๋Š” ์‚ฌ์šฉ์ž๊ฐ€ ํ‹ˆํ‹ˆ์ด ๊ณ„์† ์ด์šฉํ•˜๊ธฐ๋ฅผ ๊ธฐ๋Œ€ํ•  ๊ฒƒ์ด๋‹ค ๐Ÿ‘‰ ์ง€์†๋ฅ 
    • ์ด์ปค๋จธ์Šค ์‚ฌ์ดํŠธ๋Š” ๋งค์ผ ์‡ผํ•‘ํ•˜๋ผ ๊ฒƒ์„ ๊ธฐ๋Œ€ํ•  ๊ฒƒ์ด๋‹ค ๐Ÿ‘‰ ์ •์ฐฉ๋ฅ 
    • ๋ฆฌ๋ทฐ ์‚ฌ์ดํŠธ๋Š” ์‚ฌ์šฉ์ž๊ฐ€ ๋ฌด์–ธ๊ฐ€๋ฅผ ์ฒดํ—˜ํ•  ๋•Œ๋งˆ๋‹ค ์‚ฌ์šฉํ•˜๊ธฐ๋ฅผ ๊ธฐ๋Œ€ํ•  ๊ฒƒ์ด๋‹ค ๐Ÿ‘‰ ์ •์ฐฉ๋ฅ 

 

์‹ค์Šต ๋ฐ์ดํ„ฐ(mst_user)
action_log ๋ฐ์ดํ„ฐ


๋‚ ์งœ๋ณ„ N์ผ ์ง€์†๋ฅ  ์ถ”์ด

  • ์ง€์†๋ฅ ์„ ์˜ฌ๋ฆด ์ˆ˜ ์žˆ๋Š” ๋Œ€์ฑ…์˜ ํšจ๊ณผ๊ฐ€ ์˜๋„ํ•œ ๋Œ€๋กœ ์ˆ˜ํ–‰๋˜๋Š”์ง€๋ฅผ ํ™•์ธํ•˜๋ ค๋ฉด, ์ง€์†๋ฅ ์„ ๋‚ ์งœ์— ๋”ฐ๋ผ ์ง‘๊ณ„ํ•œ ๋ฆฌํฌํŠธ๋ฅผ ๋งŒ๋“ค์–ด์•ผ ํ•œ๋‹ค.
  • ์„œ๋น„์Šค๋ฅผ ๋” ํ™œ๋ฐœํ•˜๊ฒŒ ํ•˜๋ ค๋ฉด ๋“ฑ๋ก ๋‹ค์Œ ๋‚ ์˜ ์ง€์†๋ฅ ์„ ๋†’์ด๋Š” ๊ฒƒ์ด ์ค‘์š”ํ•˜๋‹ค.

โ–ถ๏ธŽ ๋‹ค์Œ๋‚ (1์ผ) ์ง€์†๋ฅ  ์ง‘๊ณ„

  • ๋‹ค์Œ๋‚  ์ง€์†๋ฅ ์€ '์ง€์ •ํ•œ ๋‚ ์งœ์— ๋“ฑ๋กํ•œ ์‚ฌ์šฉ์ž ์ค‘ ๋‹ค์Œ๋‚ ์—๋„ ์„œ๋น„์Šค๋ฅผ ์‚ฌ์šฉํ•œ ์‚ฌ๋žŒ์˜ ๋น„์œจ'์ด๋‹ค.
  • ์ง€์ •ํ•œ ๋‚ ์งœ ๋‹ค์Œ์— ์‚ฌ์šฉํ•œ ์‚ฌ์šฉ์ž์— 1, ์‚ฌ์šฉํ•˜์ง€ ์•Š์€ ์‚ฌ์šฉ์ž์— 0์ด๋ผ๋Š” ํ”Œ๋ž˜๊ทธ๋ฅผ ๋ถ™์ด๊ณ , ์ด๋Ÿฌํ•œ ๊ฐ’์— AVG ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•ด ํ‰๊ท ์„ ๊ตฌํ•˜๋Š” ๋ฐฉ๋ฒ•์ด ๊ฐ„๋‹จํ•˜๋‹ค.
  • ๋‹จ, ๋‹ค์Œ๋‚  ์ง€์†๋ฅ ์„ ์ง‘๊ณ„ํ•˜๋ ค๋ฉด ๋‹ค์Œ๋‚ ์˜ ๋กœ๊ทธ ๋ฐ์ดํ„ฐ๊ฐ€ ๋ชจ๋‘ ์Œ“์—ฌ ์žˆ์–ด์•ผ ํ•œ๋‹ค!!
    ๐Ÿ‘‰ ๋“ฑ๋ก์ผ ๋‹ค์Œ๋‚  ์ž‘์—…ํ•œ ์‚ฌ์šฉ์ž๊ฐ€ ์ •๋ง๋กœ 0์ธ ๊ฒฝ์šฐ์™€, ๋กœ๊ทธ๊ฐ€ ์•„์ง ์Œ“์ด์ง€ ์•Š์•„์„œ 0๋ช…์œผ๋กœ ๋‚˜์˜ค๋Š” ๊ฒฝ์šฐ๋ฅผ ๊ตฌ๋ถ„ํ•ด์•ผ ํ•˜๊ธฐ ๋•Œ๋ฌธ
    ๐Ÿ‘‰ ์—ฌ๊ธฐ์„œ๋Š” ๋กœ๊ทธ ์ง‘๊ณ„ ๊ธฐ๊ฐ„ ์ค‘ ๊ฐ€์žฅ ์ตœ์‹  ๋‚ ์งœ๋ฅผ ์ถ”์ถœํ•˜๊ณ , ์ด ์ผ์ž๋ฅผ ๋„˜๋Š” ๊ธฐ๊ฐ„์˜ ์ง€์†๋ฅ ์€ NULL๋กœ ์ถœ๋ ฅํ•˜๊ฒŒ ํ•œ๋‹ค.
  1. ๋กœ๊ทธ์˜ ์ตœ๊ทผ ๋‚ ์งœ ์ถ”์ถœํ•˜๊ธฐ
  2. ์‚ฌ์šฉ์ž๋ณ„ ๋“ฑ๋ก์ผ๊ณผ ๋‹ค์Œ๋‚ ์˜ ๋‚ ์งœ๋ฅผ ์ง‘๊ณ„ํ•˜๊ธฐ
    โ–ถ๏ธŽ ๋“ฑ๋ก ๋‹ค์Œ ๋‚ ์งœ : CAST(u.register_date::date + '1 day'::INTERVAL, date)
WITH 
action_log_with_mst_users AS (
	SELECT u.user_id
		, u.register_date
	 	, cast(a.stamp AS date) AS action_date --์•ก์…˜ ๋‚ ์งœ ์ž๋ฃŒํ˜• ๋ณ€ 
		, max(cast(a.stamp AS date)) OVER () AS last_date -- ์ „์ฒด ๋กœ๊ทธ์—์„œ ๊ฐ€์žฅ ์ตœ๊ทผ ์•ก์…˜ ๋‚ ์งœ 
		, cast(u.register_date::date + '1 day'::INTERVAL AS date) AS next_day_1
	FROM mst_users AS u
	LEFT JOIN action_log AS a 
	ON u.user_id = a.user_id
)
SELECT *
FROM action_log_with_mst_users
ORDER BY register_date
;

 

3. ์ง€์ •ํ•œ ๋‚ ์˜ ๋‹ค์Œ ๋‚ ์— ์•ก์…˜์„ ํ–ˆ๋Š”์ง€ 0๊ณผ 1๋กœ ํ‘œ์‹œํ•˜๊ธฐ
  → ์ง€์ •ํ•œ ๋‚ ์˜ ๋‹ค์Œ๋‚ ์ด ๋กœ๊ทธ์˜ ๊ฐ€์žฅ ์ตœ๊ทผ ๋‚ ์งœ๋ฅผ ๋„˜๋Š” ๊ฒฝ์šฐ์—๋Š” NULL๋กœ ์ฒ˜๋ฆฌ

  • ๋“ฑ๋ก์ผ ๋‹ค์Œ๋‚ ์ด ๋กœ๊ทธ์˜ ์ตœ์‹  ๋‚ ์งœ ์ด์ „์ธ์ง€ ๋จผ์ € ์กฐ๊ฑด์„ case when ์œผ๋กœ ๊ฑธ๊ณ , ๋“ฑ๋ก์ผ ๋‹ค์Œ ๋‚ ์— ์•ก์…˜์ด ์ผ์–ด๋‚ฌ๋Š”์ง€ ํ•œ ๋ฒˆ ๋” case when์œผ๋กœ ํ™•์ธํ•˜์—ฌ 1, 0์„ ๋ฐ˜ํ™˜ ๐Ÿ‘‰ SIGN ํ•จ์ˆ˜
WITH 
action_log_with_mst_users AS (
	SELECT u.user_id
		, u.register_date
	 	, cast(a.stamp AS date) AS action_date --์•ก์…˜ ๋‚ ์งœ ์ž๋ฃŒํ˜• ๋ณ€ 
		, max(cast(a.stamp AS date)) OVER () AS lastest_date -- ์ „์ฒด ๋กœ๊ทธ์—์„œ ๊ฐ€์žฅ ์ตœ๊ทผ ์•ก์…˜ ๋‚ ์งœ 
		, cast(u.register_date::date + '1 day'::INTERVAL AS date) AS next_day_1
	FROM mst_users AS u
	LEFT JOIN action_log AS a 
	ON u.user_id = a.user_id
)
-- ์‚ฌ์šฉ์ž์˜ ์•ก์…˜ ํ”Œ๋ž˜๊ทธ๋ฅผ ๊ณ„์‚ฐ --
,user_action_flag AS (
	SELECT user_id
		, register_date
		, SIGN(
			sum(CASE WHEN next_day_1 <= lastest_date THEN
				CASE WHEN next_day_1 = action_date THEN 1 ELSE 0 END
				END)
		) AS next_1_day_actioin 
	FROM action_log_with_mst_users
	GROUP BY user_id, register_date
)
SELECT *
FROM user_action_flag
ORDER BY register_date, user_id ;

 

 

  • ์—ฌ๊ธฐ์„œ ๋‚ ์งœ๋ณ„๋กœ, next_1_day_action์˜ ํ‰๊ท ์„ ๊ตฌํ•˜๊ณ  100์„ ๊ณฑํ•˜๋ฉด 1์ผ ์ดํ›„์˜ ์ง€์†๋ฅ ์„ ๊ตฌํ•  ์ˆ˜ ์žˆ๋‹ค.
-- user_action_flag ๊นŒ์ง€ ๋™์ผ --
SELECT register_date
	, avg(100.0 * next_1_day_actioin) AS repeat_rate_1_day
FROM user_action_flag
GROUP BY register_date
ORDER BY register_date
;

 


๐ŸŽฏ ์ข…ํ•ฉ ์ฟผ๋ฆฌ | ์ง€์†๋ฅ  ์ง€ํ‘œ๋ฅผ ๊ด€๋ฆฌํ•˜๋Š” ๋งˆ์Šคํ„ฐ ํ…Œ์ด๋ธ”์„ ์ž‘์„ฑํ•˜๋Š” ์ฟผ๋ฆฌ

 

1. ์ง€ํ‘œ ๋งˆ์Šคํ„ฐ ์ƒ์„ฑ

  • PostgreSQL์€ VALUES๋กœ ํ…Œ์ด๋ธ”์„ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ๋‹ค.
WITH 
repeat_interval(index_name, interval_day) AS (
	VALUES
		 ('01 day repeat', 1)
		,('02 day repeat', 2)
		,('03 day repeat', 3)
		,('04 day repeat', 4)
		,('05 day repeat', 5)
		,('06 day repeat', 6)
		,('07 day repeat', 7)
)
SELECT *
FROM repeat_interval
ORDER BY index_name
;

 

2. ๋‚ ์งœ๋ณ„ ์ง€์†๋ฅ  ๊ณ„์‚ฐ ์ฟผ๋ฆฌ

  • action_log_with_index_date ํ…Œ์ด๋ธ”
    • ์œ„์—์„œ ๋งŒ๋“  repeat_interval์„ CROSS JOIN ํ•œ๋‹ค.
    • ๊ฐ ๋กœ๊ทธ์˜ ๋‚ ์งœ๋ฅผ date ํ˜•์‹์œผ๋กœ ๋ฐ”๊ฟ”์ฃผ๊ณ 
    • ์œ ์ €๋ณ„ ๊ฐ€์žฅ ์ตœ์‹  ๋กœ๊ทธ๋„ ๊ตฌํ•ด์ค€๋‹ค (max)
    • ์ง€์†๋ฅ ์„ ๊ณ„์‚ฐํ•˜๊ณ  ์‹ถ์€ ๋‚ ์งœ๋ฅผ interval '1 day'์— ๊ณฑํ•ด์ฃผ๊ณ  ์ด๋ฅผ ๋“ฑ๋ก์ผ์— ๋”ํ•ด์„œ index_date๋ฅผ ๋งŒ๋“ ๋‹ค
  • user_action_flag ํ…Œ์ด๋ธ”
    • ๋“ฑ๋ก์ผ ๋‹ค์Œ ๋‚ ์งœ(๊ธฐ์ค€๋‚ ์งœ)์ธ index_date๊ฐ€ ๊ฐ€์žฅ ์ตœ์‹  ๋กœ๊ทธ ๋‚ ์งœ๋ณด๋‹ค ์ด์ „์ธ์ง€ ํ™•์ธํ•˜๊ณ , 
      index_date๊ฐ€ action_date์™€ ๋™์ผํ•˜๋ฉด 1, ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ 0์„ ๋ฐ˜ํ™˜
  • ๋งˆ์ง€๋ง‰ SELECT์ ˆ
    • ๋“ฑ๋ก์ผ๊ณผ index_name(ํ™•์ธํ•˜๊ณ  ์‹ถ์€ ์ง€์†๋ฅ  ๋‚ ์งœ ๊ธฐ์ค€)์„ ๊ธฐ์ค€์œผ๋กœ ๊ทธ๋ฃนํ™”ํ•˜์—ฌ ์ง€์†๋ฅ  ํ‰๊ท ์„ ๊ตฌํ•œ๋‹ค.
WITH 
repeat_interval(index_name, interval_date) AS (
	VALUES
		 ('01 day repeat', 1)
		,('02 day repeat', 2)
		,('03 day repeat', 3)
		,('04 day repeat', 4)
		,('05 day repeat', 5)
		,('06 day repeat', 6)
		,('07 day repeat', 7)
)
, action_log_with_index_date AS (
	SELECT u.user_id
		, u.register_date
		, cast(a.stamp AS date) AS action_date
		, max(CAST(a.stamp AS date)) OVER() AS lastest_date
		, r.index_name --๋“ฑ๋ก์ผ๋กœ๋ถ€ํ„ฐ n์ผ ํ›„์˜ ๋‚ ์งœ 
		, cast(cast(u.register_date AS date) + INTERVAL '1 day' * r.interval_date AS date) AS index_date
	FROM mst_users AS u 
	LEFT JOIN action_log AS a ON u.user_id = a.user_id
	CROSS JOIN repeat_interval AS r 
)
, user_action_flag AS (
	SELECT user_id
		, register_date
		, index_name
		, sign(
			sum(CASE WHEN index_date <= lastest_date THEN 
				CASE WHEN index_date = action_date THEN 1 ELSE 0 END 
				END 
				)
			) AS index_date_action
	FROM action_log_with_index_date
	GROUP BY user_id, register_date, index_name, index_date
)
SELECT register_date
	, index_name
	, avg(100.0 * index_date_action) AS repeat_rate
FROM user_action_flag
GROUP BY register_date, index_name
ORDER BY register_date, index_name
;
728x90