
[Mini Project] 4. Preprocessing Public Transit (Subway) Location Data

by ISLA! 2023. 9. 8.

🥑 Fetching the Subway Data

  • I applied for an API key to pull subway station location data from the link below.
  • The request URL differs slightly from the usual API calls on the public data portal (공공데이터), so the code below may be a useful reference.

https://t-data.seoul.go.kr/category/dataviewopenapi.do?data_id=1036 

 


import requests

# API key (enter your own)
apikey = '---enter-your-key---'

# API endpoint and query parameters
base_url = 'https://t-data.seoul.go.kr/apig/apiman-gateway/tapi/TaimsKsccDvSubwayStationGeom/1.0'
params = {
    'apiKey': apikey,
    'startRow': '1',
    'rowCnt': '100000'
}

# Send the GET request and fail early on an HTTP error status
req = requests.get(base_url, params=params, timeout=30)
req.raise_for_status()

# Parse the JSON response
json_df = req.json()

"""[{'outStnNum': '4128',
  'stnKrNm': '삼성중앙',
  'lineNm': '9호선(연장)',
  'convX': '127.053282',
  'convY': '37.513011'},
 {'outStnNum': '4124',
  'stnKrNm': '사평',
  'lineNm': '9호선',
  'convX': '127.015259',
  'convY': '37.504206'},"""
    
import pandas as pd
sub = pd.DataFrame(json_df)
sub
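Before building the DataFrame, it can help to confirm that the parsed response really is a list of station records. The sketch below uses a hypothetical helper, `records_to_frame`, and assumes that a failed request (for example, a wrong key) would come back as something other than a non-empty list with the expected fields. The two records are copied from the sample response above.

```python
import pandas as pd

# Two records copied from the sample response shown above.
sample_records = [
    {'outStnNum': '4128', 'stnKrNm': '삼성중앙', 'lineNm': '9호선(연장)',
     'convX': '127.053282', 'convY': '37.513011'},
    {'outStnNum': '4124', 'stnKrNm': '사평', 'lineNm': '9호선',
     'convX': '127.015259', 'convY': '37.504206'},
]

# Fields the later preprocessing steps rely on.
REQUIRED_KEYS = {'stnKrNm', 'lineNm', 'convX', 'convY'}


def records_to_frame(records):
    """Hypothetical helper: validate the parsed JSON, then build a DataFrame."""
    if not isinstance(records, list) or not records:
        raise ValueError(f'expected a non-empty list of records, got: {records!r}')
    missing = REQUIRED_KEYS - set(records[0])
    if missing:
        raise ValueError(f'records are missing keys: {sorted(missing)}')
    return pd.DataFrame(records)


sample_df = records_to_frame(sample_records)
print(sample_df.shape)
```

With the real response, `records_to_frame(json_df)` would take the place of `pd.DataFrame(json_df)`.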

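➡ Rename columns

# Drop the station number, which is not needed for the analysis
sub.drop('outStnNum', axis = 1, inplace = True)
# 지하철역 = station name, 호선명 = line name, 경도 = longitude, 위도 = latitude.
# convX holds values near 127 (longitude) and convY values near 37.5 (latitude).
sub.rename(columns = {'stnKrNm':'지하철역', 'convX':'경도', 'convY':'위도', 'lineNm':'호선명'}, inplace = True)
sub.head()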
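The field names `convX` and `convY` do not say which one is latitude, so a quick range check on the sample values can settle it. This is only a sketch: the bounding box is a rough one for South Korea that I picked for illustration, and the rows are the two sample records shown earlier.

```python
import pandas as pd

# Two rows from the sample response, kept in the API's original field names.
raw = pd.DataFrame({
    'stnKrNm': ['삼성중앙', '사평'],
    'convX': ['127.053282', '127.015259'],
    'convY': ['37.513011', '37.504206'],
})

coords = raw[['convX', 'convY']].astype(float)

# Rough bounding box for South Korea (an assumption, for illustration only).
x_is_longitude = bool(coords['convX'].between(124, 132).all())
y_is_latitude = bool(coords['convY'].between(33, 39).all())
print(x_is_longitude, y_is_latitude)
```

If both checks pass, `convX` is the longitude (경도) and `convY` is the latitude (위도).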

 

➡ Extract Gangnam-gu stations

  • Store the subway stations located in Gangnam-gu in a list.
  • Keep only the rows whose station name matches the list.
  • Some stations turned out to have different latitude/longitude values depending on the line, so I decided to look at those separately.
station_list = [
    "강남구청",
    "강남",
    "개포동",
    "구룡",
    "논현",
    "대모산입구",
    "대청",
    "대치",
    "도곡",
    "매봉",
    "봉은사",
    "삼성(무역센터)",
    "삼성중앙",
    "선릉",
    "선정릉",
    "수서",
    "신논현",
    "신사",
    "압구정로데오",
    "압구정",
    "양재",
    "언주",
    "역삼",
    "일원",
    "청담",
    "학동",
    "학여울",
    "한티"
]

gangnam_sub = sub[sub['지하철역'].isin(station_list)]
gangnam_sub.head()
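Because `isin` silently skips names that do not appear in the data, it may be worth listing which entries of the station list found no match. A minimal sketch with a toy frame standing in for `sub` (the `toy_` names are made up for this example):

```python
import pandas as pd

# Toy stand-in for `sub`; the real frame comes from the API response.
toy_sub = pd.DataFrame({'지하철역': ['강남', '선릉', '선릉', '사평']})
toy_station_list = ['강남', '선릉', '양재']

# Same filtering step as above.
matched = toy_sub[toy_sub['지하철역'].isin(toy_station_list)]

# Names in the list that never appear in the data.
unmatched = sorted(set(toy_station_list) - set(toy_sub['지하철역']))
print(unmatched)
```

Running the same set difference with `station_list` and `sub` would show any station whose spelling in the list differs from the API's, or which is missing from the data altogether.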

 

➡ Extract stations with two or more latitude/longitude pairs

  • Of the 28 stations in Gangnam-gu, 9 have slightly different latitude and longitude values depending on the line.
  • The differences for those 9 stations are small, so I decided to average them.
# Count how many rows (one per line) each station has
station_count = gangnam_sub['지하철역'].value_counts()

# Stations that appear under two or more lines
station_2pos = station_count[station_count >= 2].index.tolist()
station_2ps_df = gangnam_sub[gangnam_sub['지하철역'].isin(station_2pos)]
station_2ps_df.sort_values('지하철역')

# Drop the line name and assign the re-indexed frame so the reset actually takes effect
station2s = station_2ps_df.drop('호선명', axis = 1).reset_index(drop=True)
station2s

➡ Convert latitude/longitude from object to float, then take the mean

station2s[['위도', '경도']] = station2s[['위도', '경도']].astype(float)
# reset_index() keeps 지하철역 as a regular column, which the later steps rely on
double_positions = station2s.groupby('지하철역')[['위도', '경도']].mean().reset_index()
display(len(double_positions), double_positions)
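To back up the judgment that the per-line coordinates are close enough to average, the spread within each station can be expressed in metres. The sketch below uses made-up coordinates for a hypothetical station `A역` and a flat-earth approximation (about 111.32 km per degree of latitude, with longitude scaled by the cosine of the latitude), which should be adequate at this scale.

```python
import numpy as np
import pandas as pd

# Hypothetical coordinates for one station listed under two lines.
toy = pd.DataFrame({
    '지하철역': ['A역', 'A역'],
    '위도': [37.5000, 37.5010],    # latitude
    '경도': [127.0000, 127.0010],  # longitude
})

g = toy.groupby('지하철역')
dlat = g['위도'].max() - g['위도'].min()
dlon = g['경도'].max() - g['경도'].min()
mean_lat = np.radians(g['위도'].mean())

# Diagonal of each station's bounding box, in metres (approximate).
M_PER_DEG = 111_320
spread_m = np.hypot(dlat * M_PER_DEG, dlon * M_PER_DEG * np.cos(mean_lat))
print(spread_m.round(1))
```

A spread of a few hundred metres or less would suggest the entries describe different platforms of the same station, in which case averaging seems reasonable.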

 

➡ Preprocess the Gangnam stations that have a single latitude/longitude pair

  • As with the two-position stations, align the columns and convert the data types.
double_stations = double_positions['지하철역'].values.tolist()
single_stations = list(set(station_list) - set(double_stations))

# .copy() avoids modifying a slice of gangnam_sub (SettingWithCopyWarning)
single_positions = gangnam_sub[gangnam_sub['지하철역'].isin(single_stations)].copy()

single_positions.drop('호선명', axis = 1, inplace = True)
single_positions[['위도', '경도']] = single_positions[['위도', '경도']].astype(float)
display(single_positions.head(), len(single_positions))

 

➡ Combine all stations (single-position + two-position)

  • Only Yangjae (양재) station has no data.
  • Excluding it, the remaining 27 stations are merged (only part of the result is shown).
gangnam_sub_positions = pd.concat([single_positions, double_positions], axis = 0, ignore_index = True)
gangnam_sub_positions
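Splitting the stations into single- and multi-position groups and concatenating them again could probably be collapsed into a single `groupby`, because the mean of one row is just that row. A minimal sketch with hypothetical stations and coordinates:

```python
import pandas as pd

# Hypothetical rows: one station with two entries, one with a single entry.
toy = pd.DataFrame({
    '지하철역': ['A역', 'A역', 'B역'],
    '호선명': ['1호선', '2호선', '3호선'],
    '위도': ['37.500', '37.502', '37.510'],     # latitude, as strings like the API
    '경도': ['127.000', '127.002', '127.030'],  # longitude
})

# Convert the coordinate strings to floats, then average per station.
toy[['위도', '경도']] = toy[['위도', '경도']].astype(float)
positions = toy.groupby('지하철역', as_index=False)[['위도', '경도']].mean()
print(positions)
```

Applied to `gangnam_sub`, this should give one row per station without the intermediate frames, though the step-by-step version above makes it easier to inspect the stations that have duplicate entries.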

 
