๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
Projects/๐Ÿช Convenience Store Location Analysis

[Mini Project] 6. ์ „์ฒด ์ƒ๊ถŒ ๋ฐ์ดํ„ฐ + ๋Œ€์ค‘๊ตํ†ต ์Šนํ•˜์ฐจ ๋ฐ์ดํ„ฐ ํ†ตํ•ฉ

by ISLA! 2023. 9. 11.

๐Ÿง ์ƒ๊ถŒ ๋ฐ์ดํ„ฐ์— ๋Œ€์ค‘๊ตํ†ต ์Šนํ•˜์ฐจ ๋ฐ์ดํ„ฐ ํ•ฉ์น˜๊ธฐ

์—ฐ๋„๋งˆ๋‹ค ๋ฒ„์Šค ์Šนํ•˜์ฐจ/์ง€ํ•˜์ฒ  ์Šนํ•˜์ฐจ ์ •๋ณด๊ฐ€ ๋‹ค๋ฅธ ๋ถ€๋ถ„์ด ์žˆ์–ด 2020~2022๋…„ 3๊ฐœ๋…„์„ ๊ฐ๊ฐ ๋”ฐ๋กœ ์ž‘์—…ํ•˜์˜€๋‹ค.

์˜ˆ์‹œ๋กœ, 2022๋…„ ๋ฐ์ดํ„ฐ๋ฅผ ์ „์ฒด ํ†ตํ•ฉํ•˜๋Š” ๊ณผ์ •์„ ํฌ์ŠคํŒ…ํ•œ๋‹ค.

 

 

๐ŸŒ ์ƒ๊ถŒ ์˜์—ญ ๋‚ด ๋ฒ„์Šค์ •๋ฅ˜์žฅ/์ง€ํ•˜์ฒ ์—ญ๋“ค ํ™•์ธ

  • ๋จผ์ €, ์ƒ๊ถŒ๋ณ„ ์˜์—ญ ๋‚ด์— ํฌํ•จ๋œ ๋ฒ„์Šค์ •๋ฅ˜์žฅ/์ง€ํ•˜์ฒ  ์—ญ ์ˆ˜์™€ ๊ฐ ์˜์—ญ์— ๊ตฌ์ฒด์ ์œผ๋กœ ์–ด๋–ค ๋ฒ„์Šค์ •๋ฅ˜์žฅ๊ณผ ์ง€ํ•˜์ฒ ์—ญ์ด ํฌํ•จ๋˜์–ด ์žˆ๋Š”์ง€ ํ™•์ธํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค. (ํŒ€์›๋ถ„์ด ์ž‘์—…ํ•ด์ฃผ์‹ฌ!)
  • ๋ฐ์ดํ„ฐ๋ช… : ALL_filter22

ALL_filter22

 

๐ŸŒ ์—ฐ๋„, ๋ถ„๊ธฐ, ์ƒ๊ถŒ ์ฝ”๋“œ ๊ธฐ์ค€์œผ๋กœ ๋ฒ„์Šค ์Šนํ•˜์ฐจ ์Šน๊ฐ์ˆ˜ & ์ง€ํ•˜์ฒ  ์Šนํ•˜์ฐจ ์Šน๊ฐ์ˆ˜๋ฅผ ์‹œ๊ฐ„๋Œ€๋ณ„๋กœ ํ•ฉํ•˜๊ธฐ

โžก ๊ด€๋ จ์žˆ๋Š” ๋ฐ์ดํ„ฐ ๋ชจ๋‘ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ

  • ์ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ์—ฌ๋Ÿฌ ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„์˜ ์ž๋ฃŒ๋ฅผ ๊ฐ€์ ธ์™€ ํ•ฉ์น˜๋Š” ๊ฒƒ์ด ํ•„์š”ํ–ˆ๋‹ค.
  • ์œ„์˜ ์ž๋ฃŒ์—์„œ ์ƒ๊ถŒ_์ฝ”๋“œ๋ณ„๋กœ ์–ด๋–ค ๋ฒ„์Šค์ •๋ฅ˜์žฅ๊ณผ ์ง€ํ•˜์ฒ ์—ญ์ด ํฌํ•จ๋˜์–ด ์žˆ๋Š”์ง€ ํ™•์ธํ–ˆ์œผ๋‹ˆ, 
    • ์—ฐ๋„, ๋ถ„๊ธฐ๋ณ„๋กœ ํ•ด๋‹น ์ƒ๊ถŒ ๋‚ด์— ๊ฐ ๋ฒ„์Šค์ •๋ฅ˜์žฅ๊ณผ ์ง€ํ•˜์ฒ ์—ญ์„ ์ด์šฉํ•œ ์Šนํ•˜์ฐจ ์Šน๊ฐ์ˆ˜๋ฅผ ์‹œ๊ฐ„๋Œ€๋ณ„๋กœ ๋‚˜ํƒ€๋‚ด์•ผํ–ˆ๋‹ค.
    • ์ด๋ฅผ ์œ„ํ•ด ๋‘ ๊ฐœ ์ž๋ฃŒ ๋ฒ„์Šค ์Šนํ•˜์ฐจ ๋ฐ์ดํ„ฐ / ์ง€ํ•˜์ฒ  ์Šนํ•˜์ฐจ ๋ฐ์ดํ„ฐ๋ฅผ ๋ถˆ๋Ÿฌ์™€ ๊ฐ€๊ณตํ–ˆ๋‹ค.
    • ๋ฒ„์Šค ์Šนํ•˜์ฐจ ๋ฐ์ดํ„ฐ์™€ ์ง€ํ•˜์ฒ  ์Šนํ•˜์ฐจ ๋ฐ์ดํ„ฐ๋„ ์—ฐ๋„๋ณ„๋กœ ๋‚˜๋ˆ„์–ด ์ง„ํ–‰
  • ๋ฒ„์Šค ์Šนํ•˜์ฐจ ๋ฐ์ดํ„ฐ
    • bus22

 

  • ์ง€ํ•˜์ฒ  ์Šนํ•˜์ฐจ ๋ฐ์ดํ„ฐ
    • sub22

  • ์ตœ์ข…์ ์œผ๋กœ ๋ณ‘ํ•ฉํ•ด์•ผํ•˜๋Š” ์ƒ๊ถŒ ๋ฐ์ดํ„ฐ

์ผ๋ถ€

 


 

โžก ์—ฐ๋„/๋ถ„๊ธฐ/์ƒ๊ถŒ์ฝ”๋“œ ๋ณ„ ํฌํ•จ๋œ ๋ฒ„์Šค์ •๋ฅ˜์žฅ ์žฌ์ •๋ ฌ

bus_temp = []

# ๋ฐ˜๋ณต๋ฌธ์„ ์‚ฌ์šฉํ•˜์—ฌ ๊ฐ ๋ฒ„์Šค ์ •๋ฅ˜์žฅ์˜ ์ •๋ณด ์ถ”์ถœ
for index, row in ALL_filter22.iterrows():
    ์—ฐ๋„ = row['์—ฐ๋„']
    ๋ถ„๊ธฐ = row['๋ถ„๊ธฐ']
    ์ƒ๊ถŒ์ฝ”๋“œ = row['์ƒ๊ถŒ_์ฝ”๋“œ']
    ๋ฒ„์Šค์ •๋ฅ˜์žฅ๋“ค = row['๋ฒ„์Šค์ •๋ฅ˜์žฅ๋“ค']
    
    # ๊ฐ ๋ฒ„์Šค ์ •๋ฅ˜์žฅ์— ๋Œ€ํ•œ ์ •๋ณด๋ฅผ ์ถ”์ถœํ•˜์—ฌ ์ƒˆ๋กœ์šด ํ–‰์„ ์ƒ์„ฑํ•˜๊ณ  ๋ฆฌ์ŠคํŠธ์— ์ถ”๊ฐ€
    for ์ •๋ฅ˜์žฅ in ๋ฒ„์Šค์ •๋ฅ˜์žฅ๋“ค:
        new_row = {'์—ฐ๋„': ์—ฐ๋„, '๋ถ„๊ธฐ': ๋ถ„๊ธฐ, '์ƒ๊ถŒ_์ฝ”๋“œ': ์ƒ๊ถŒ์ฝ”๋“œ, '๋ฒ„์Šค์ •๋ฅ˜์žฅ': ์ •๋ฅ˜์žฅ}
        bus_temp.append(new_row)

# ์ƒˆ๋กœ์šด ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„ ์ƒ์„ฑ
bus_temp = pd.DataFrame(bus_temp)

# ๊ฒฐ๊ณผ ํ™•์ธ
bus_temp = bus_temp.drop_duplicates()
expanded_bus22 = bus_temp
expanded_bus22

expanded_bus22

โžก ์—ฐ๋„/๋ถ„๊ธฐ/์ƒ๊ถŒ์ฝ”๋“œ ๋ณ„ ํฌํ•จ๋œ ์ง€ํ•˜์ฒ ์—ญ ์žฌ์ •๋ ฌ

new_data = []

for index, row in ALL_filter22.iterrows():
    ์—ฐ๋„ = row['์—ฐ๋„']
    ๋ถ„๊ธฐ = row['๋ถ„๊ธฐ']
    ์ƒ๊ถŒ์ฝ”๋“œ = row['์ƒ๊ถŒ_์ฝ”๋“œ']
    ์ง€ํ•˜์ฒ ์—ญ๋“ค = row['์ง€ํ•˜์ฒ ์—ญ๋“ค']
    
    # ๊ฐ ์ง€ํ•˜์ฒ  ์—ญ์— ๋Œ€ํ•œ ์ •๋ณด๋ฅผ ์ถ”์ถœํ•˜์—ฌ ์ƒˆ๋กœ์šด ํ–‰์„ ์ƒ์„ฑํ•˜๊ณ  ๋ฆฌ์ŠคํŠธ์— ์ถ”๊ฐ€
    for ์—ญ in ์ง€ํ•˜์ฒ ์—ญ๋“ค:
        new_row = {'์—ฐ๋„': ์—ฐ๋„, '๋ถ„๊ธฐ': ๋ถ„๊ธฐ, '์ƒ๊ถŒ_์ฝ”๋“œ': ์ƒ๊ถŒ์ฝ”๋“œ, '์ง€ํ•˜์ฒ ์—ญ': ์—ญ}
        new_data.append(new_row)
    
new_df = pd.DataFrame(new_data)
new_df = new_df.drop_duplicates()
expanded_sub22 = new_df
expanded_sub22

expanded_sub22


โžก expanded ๋ฐ์ดํ„ฐ์— ์‹œ๊ฐ„๋Œ€๋ณ„ ์Šนํ•˜์ฐจ ์Šน๊ฐ์ˆ˜ ๋ฐ์ดํ„ฐ ๋งคํ•‘

bus_merged22 = pd.merge(expanded_bus22, bus22, how='left', on=['์—ฐ๋„', '๋ถ„๊ธฐ', '๋ฒ„์Šค์ •๋ฅ˜์žฅ'])

bus_merged22 = bus_merged22.fillna(0)
bus_merged22[['00~06_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜', '06~11_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜','11~14_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜', '14~17_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜', '17~21_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜','21~24_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜']] \
    = bus_merged22[['00~06_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜', '06~11_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜','11~14_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜', '14~17_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜', '17~21_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜','21~24_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜']].astype(int)


bus_merged22 = bus_merged22.rename(columns = {
    '00~06_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜': '00~06์‹œ_๋ฒ„์Šค',
    '06~11_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜': '06~11์‹œ_๋ฒ„์Šค',
    '11~14_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜': '11~14์‹œ_๋ฒ„์Šค',
    '14~17_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜': '14~17์‹œ_๋ฒ„์Šค',
    '17~21_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜': '17~21์‹œ_๋ฒ„์Šค',
    '21~24_์Šนํ•˜์ฐจ_์Šน๊ฐ์ˆ˜': '21~24์‹œ_๋ฒ„์Šค',
})


bus_merged22

sub_merged22 = sub_merged22.groupby(['์—ฐ๋„', '๋ถ„๊ธฐ', '์ƒ๊ถŒ_์ฝ”๋“œ']).sum()
# sub_merged22 = sub_merged22.drop(columns = '์ง€ํ•˜์ฒ ์—ญ')
sub_merged22 = sub_merged22.reset_index()

sub_merged22 = sub_merged22.rename(columns = {
    '00-06์‹œ': '00~06์‹œ_์ง€ํ•˜์ฒ ',
    '06-11์‹œ': '06~11์‹œ_์ง€ํ•˜์ฒ ',
    '11-14์‹œ': '11~14์‹œ_์ง€ํ•˜์ฒ ',
    '14-17์‹œ': '14~17์‹œ_์ง€ํ•˜์ฒ ',
    '17-21์‹œ': '17~21์‹œ_์ง€ํ•˜์ฒ ',
    '21-24์‹œ': '21~24์‹œ_์ง€ํ•˜์ฒ '
})

sub_merged22

 

 

โžก 2022๋…„ ์—ฐ๋„/๋ถ„๊ธฐ/์ƒ๊ถŒ์ฝ”๋“œ/์‹œ๊ฐ„๋Œ€๋ณ„ ๋ฒ„์Šค ์Šนํ•˜์ฐจ ์Šน๊ฐ ์ˆ˜ + ์ง€ํ•˜์ฒ  ์Šนํ•˜์ฐจ ์Šน๊ฐ ์ˆ˜ ํ•ฉ์น˜๊ธฐ

  • ์—ฐ๋„, ๋ถ„๊ธฐ, ์ƒ๊ถŒ ์ฝ”๋“œ๋ฅผ ๊ธฐ์ค€์œผ๋กœ left join ์„ ํ•˜๊ณ , ๊ฒฐ์ธก์น˜๋Š” 0์œผ๋กœ ์ฒ˜๋ฆฌ

โžก ์ƒ๊ถŒ ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„์— ๋Œ€์ค‘๊ตํ†ต ๋ฐ์ดํ„ฐ ํ†ตํ•ฉ

  • ๋‹ค์Œ์œผ๋กœ, ์ตœ์ข… ์ƒ๊ถŒ๋ฐ์ดํ„ฐ ํ”„๋ ˆ์ž„(df22)์— ์ด๋ฅผ ํ†ตํ•ฉํ•ด์•ผํ•จ
  • ์ฐจ๋ก€๋กœ ๋ฒ„์Šค๋ฐ์ดํ„ฐ์™€ ์ง€ํ•˜์ฒ ๋ฐ์ดํ„ฐ๋ฅผ ํ•ฉ์น˜๊ธฐ
  • ๊ฒฐ์ธก์น˜๋Š” 0์œผ๋กœ ์ผ๊ด„ ์ฒ˜๋ฆฌ
merged22_bus= pd.merge(df22, bus_merged22, on=['์—ฐ๋„', '๋ถ„๊ธฐ', '์ƒ๊ถŒ_์ฝ”๋“œ'], how = 'left')
merged22_bus_sub = pd.merge(merged22_bus, sub_merged22, on=['์—ฐ๋„', '๋ถ„๊ธฐ', '์ƒ๊ถŒ_์ฝ”๋“œ'], how = 'left')
merged22_bus_sub = merged22_bus_sub.fillna(0)
merged22_bus_sub

 

๐Ÿš€ ์ƒ๊ถŒ ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„์— ๋Œ€์ค‘๊ตํ†ต ๋ฐ์ดํ„ฐ ํ†ตํ•ฉ

  • ALL_filter22์— ์žˆ๋˜ ์ƒ๊ถŒ๋ณ„ ์ง€ํ•˜์ฒ ์—ญ ์ˆ˜, ๋ฒ„์Šค์ •๋ฅ˜์žฅ ์ˆ˜ ๋ถˆ๋Ÿฌ์™€์„œ ํ•ฉ์น˜๊ธฐ
  • ๋ถˆํ•„์š”ํ•œ ์ปฌ๋Ÿผ์€ ์‚ญ์ œ
final_2022 = pd.merge(merged22_bus_sub, ALL_filter22, on=['์—ฐ๋„', '๋ถ„๊ธฐ', '์ƒ๊ถŒ_์ฝ”๋“œ'], how = 'left')
final_2022 = final_2022.drop(columns = ['๋ฒ„์Šค์ •๋ฅ˜์žฅ๋“ค', '์ง€ํ•˜์ฒ ์—ญ๋“ค', 'ํด๋ฆฌ๊ณค_์ขŒํ‘œ_x', 'ํด๋ฆฌ๊ณค_์ขŒํ‘œ_y'])

์ตœ์ข… ๋ฐ์ดํ„ฐ ํ”„๋ ˆ์ž„


๋™์ผํ•œ ๊ณผ์ •์„ 2020, 2021, 2022๋…„ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•ด 3๋ฒˆ ๋ฐ˜๋ณตํ•˜๊ณ , ์ด๋ฅผ ํ†ตํ•ฉํ•˜์—ฌ ์ตœ์ข… ๋Œ€์ค‘๊ตํ†ต+์ƒ๊ถŒ ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„์œผ๋กœ ๋ฐ˜ํ™˜

์ด๋กœ์จ ์ƒ๊ถŒ๋ฐ์ดํ„ฐํŒ€๊ณผ ์ž‘์—…ํ•  ์ตœ์ข…์ ์ธ ๋Œ€์ค‘๊ตํ†ต(๋ฒ„์Šค, ์ง€ํ•˜์ฒ ) ๋ฐ์ดํ„ฐ ํ†ตํ•ฉ์ด ์™„๋ฃŒ๋˜์—ˆ๋‹ค!

728x90