Python data structure¶

스택과 큐
튜플과 집합
사전
collection 모듈

스택과 큐(stack & queue with list)¶

스택(Stack)¶

나중에 넣은 데이터를 먼저 반환하도록 설계된 메모리 구조
Last In First Out(LIFO)
Data의 입력을 Push, 출력을 Pop이라고 함

a=[1,2,3,4,5]
a.append(10)
a.append(20)
print(a.pop())  # 20출력
print(a.pop())  # 10출력

20
10

리스트를 사용하여 스택 구조를 구현 가능
push를 append(),pop을 pop()를 사용

word =input("Input a word : ")  # Input Word
word_list=list(word)     # String to List
for i in range(len(word_list)):
    print(word_list.pop())   # 하나씩 빼면서 출력

Input a word : naver
r
e
v
a
n

스택 구조를 활용, 입력된 글자를 역순으로 출력

큐(Queue)¶

먼저 넣은 데이터를 먼저 반환하도록 설계된 메모리 구조
First In First Out(FIFO)
Stack과 반대되는 개념

a =[1,2,3,4,5]
a.append(10)
a.append(20)
print(a)
print(a.pop(0)) # 1 출력
print(a)
print(a.pop(0))# 2 출력
print(a)

[1, 2, 3, 4, 5, 10, 20]
1
[2, 3, 4, 5, 10, 20]
2
[3, 4, 5, 10, 20]

파이썬은 리스트를 사용하여 큐 구조를 활용
put를 append(),get을 pop(0)를 사용

튜플과 집합(tuple & set)¶

튜플(tuple)¶

t =(1,2,3)
print(t +t ,t *2) # (1, 2, 3, 1, 2, 3) (1, 2, 3, 1, 2, 3)

(1, 2, 3, 1, 2, 3) (1, 2, 3, 1, 2, 3)

len(t)# 3
3

3

t[1]=5  # Error 발생

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-5505461864b2> in <module>
----> 1 t[1]=5  # Error 발생

TypeError: 'tuple' object does not support item assignment

값의 변경이 불가능한 리스트
선언시[]가 아닌 ()를 사용
리스트의 연산, 인덱싱, 슬라이싱 등을 동일하게 사용

왜 쓸까?¶

프로그램을 작동하는 동안 변경되지 않은 데이터의 저장
함수의 반환 값등 사용자의 실수에 의한 에러를 사전에 방지

t =(1)  # 일반정수로인식
print(t, type(t))
t =(1,)  # 값이하나인Tuple은반드시"," 를붙여야함
print(t,type(t))

1 <class 'int'>
(1,) <class 'tuple'>

집합(Set)¶

s =set([1,2,3,1,2,3]) # set 함수를사용1,2,3을집합객체생성, a = {1,2,3,4,5} 도가능
s

{1, 2, 3}

s.add(1)# 한원소1만추가,추가, 중복불허로추가되지않음
s

{1, 2, 3}

s.remove(1)# 1 삭제
s

{2, 3}

s.update([1,4,5,6,7]) # [1,4,5,6,7] 추가
s

{1, 2, 3, 4, 5, 6, 7}

s.discard(3)# 3 삭제
s

{1, 2, 4, 5, 6, 7}

s.clear()# 모든원소삭제
s

set()

값을 순서없이 저장,중복 불허 하는 자료형
set 객체 선언을 이용하여 객체 생성

s1 =set([1,2,3,4,5])
s2 =set([3,4,5,6,7])
print(s1.union(s2))  # s1 과s2의합집합
print(s1|s2)

{1, 2, 3, 4, 5, 6, 7}
{1, 2, 3, 4, 5, 6, 7}

print(s1.intersection(s2)) # s1 과s2의교집합
print(s1 & s2)

{3, 4, 5}
{3, 4, 5}

print(s1.difference(s2)) # s1 과s2의차집합
print(s1 - s2)

{1, 2}
{1, 2}

print(s1 ^ s2)  # s1 과 s2의 대칭차집합

{1, 2, 6, 7}

수학에서 활용하는 다양한 집합연산 가능

사전(dictionary)¶

dict¶

데이터를 저장할 때는 구분 지을 수 있는 값을 함께 저장
구분을 위한 데이터 고유 값을 Identifier 또는 Key라고함
Key값을 활용하여, 데이터 값(Value)를 관리함

key와 value를 매칭하여 key로 value를 검색
다른 언어에서는 Hash Table 이라는 용어를 사용
{Key1:Value1, Key2:Value2, Key3:Value3,... }형태

country_code={}  # Dict생성, country_code= dict() 도가능
country_code={'America':1,'Korea':82,'China':86,'Japan':81}
country_code

{'America': 1, 'Korea': 82, 'China': 86, 'Japan': 81}

country_code.items()  # Dict데이터출력

dict_items([('America', 1), ('Korea', 82), ('China', 86), ('Japan', 81)])

country_code.keys()# Dict키값만출력

dict_keys(['America', 'Korea', 'China', 'Japan'])

country_code["German"]=49 # Dict추가
country_code

{'America': 1, 'Korea': 82, 'China': 86, 'Japan': 81, 'German': 49}

country_code.values() # DictValue만출력

dict_values([1, 82, 86, 81, 49])

for k,v in country_code.items():
    print("Key : ",k)
    print("Value : ",v)

Key :  America
Value :  1
Key :  Korea
Value :  82
Key :  China
Value :  86
Key :  Japan
Value :  81
Key :  German
Value :  49

'Korea' in country_code.keys() # Key값에"Korea"가있는지확인

True

82 in country_code.values() # Value값에82가있는지확인

True

Command Analyzer¶

command: 사용자가 서버에 명령어를 입력한 명령어

import csv

def getKey(item):     # 정렬을 위한 함수
    return item[1]  

command_data=[]
with open("./data/command_data.csv","r",encoding='utf-8')as csvfile:
    spamreader=csv.reader(csvfile,delimiter=',',quotechar='"')
    for row in spamreader:
        command_data.append(row)
        
command_counter={}                             # dict생성, 아이디를key값, 입력줄수를value값
for data in command_data:                     # list 데이터를 dict로 변경
    if data[1]in command_counter.keys():      # 아이디가 이미 Key값으로 변경되었을 때
        command_counter[data[1]]+=1           # 기존 출현한 아이디
    else:
        command_counter[data[1]]=1            # 처음 나온 아이디

dictlist=[]                                   # dict를 list로 변경
for key,value in command_counter.items():
    temp =[key,value]
    dictlist.append(temp)
    
sorted_dict=sorted(dictlist,key=getKey,reverse=True)   # list를 입력 줄 수로 정렬
print(sorted_dict[:10])# List의 상위 10개값만 출력

[['bookworm', 8500], ['elsa', 7500], ['fillmore', 7394], ['francis', 5978], ['anton_ego', 5819], ['queen_grimhilde', 5000], ['kristoff', 4934], ['brent_mustangburger', 4838], ['emperor_zurg', 4470], ['tarzan', 4193]]

Collection 모듈¶

deque¶

from collections import deque
deque_list=deque()
for i in range(5):
    deque_list.append(i)
print(deque_list)
deque_list.appendleft(10)
print(deque_list)

deque([0, 1, 2, 3, 4])
deque([10, 0, 1, 2, 3, 4])

Stack과 Queue를 지원하는 모듈
List에 비해 효율적인 = 빠른 자료 저장 방식을 지원함

deque_list.rotate(2)
print(deque_list)
deque_list.rotate(2)
print(deque_list)
print(deque(reversed(deque_list)))
deque_list.extend([5, 6, 7])
print(deque_list)
deque_list.extendleft([5, 6, 7])
print(deque_list)

deque([3, 4, 10, 0, 1, 2])
deque([1, 2, 3, 4, 10, 0])
deque([0, 10, 4, 3, 2, 1])
deque([1, 2, 3, 4, 10, 0, 5, 6, 7])
deque([7, 6, 5, 1, 2, 3, 4, 10, 0, 5, 6, 7])

rotate, reverse등 Linked List의 특성을 지원함
기존 list형태의 함수를 모두 지원함

코드 시간 측정¶

# deque
from collections import deque
import time

start_time =time.process_time()
deque_list =deque()
for i in range(10000):
    for i in range(10000):
        deque_list.append(i)
        deque_list.pop()
print(time.process_time() -start_time, "seconds")

18.0 seconds

# general list
import time
start_time =time.process_time()
just_list =[]

for i in range(10000):
    for i in range(10000):
        just_list.append(i)
        just_list.pop()
print(time.process_time() -start_time, "seconds")

44.078125 seconds

from collections import deque

def deque_list():
    deque_list =deque()
    for i in range(10000):
        for i in range(10000):
            deque_list.append(i)
            deque_list.pop()
%timeit deque_list()

11.1 s ± 526 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

timeit으로 시간을 측정한다.

orderedDict¶

Dict와 달리, 데이터를 입력한 순서대로 dict를 반환함
그러나 dict도 python3.6 부터 입력한 순서를 보장하여 출력함 (이제 사라질 개념...)

defaultdict¶

from collections import defaultdict
d =defaultdict(lambda: 0)    # Default 값을0으로설정합
print(d["first"])

0

Dict type의 값에 기본 값을 지정, 신규값 생성시 사용하는 방법

하나의 지문에 각 단어들이 몇 개나 있는지 세고 싶을경우?¶

text ="""A press release is the quickest and easiest way to get free publicity. 
If well written, a press release can result in multiple published articles about your firm and its products.
And that can mean new prospects contacting you asking you to sell to them. ….""".lower().split()

word_count=defaultdict(lambda: 0)# Default 값을0으로설정합

for word in text:
    word_count[word] +=1
word_count

defaultdict(<function __main__.<lambda>()>,
            {'a': 2,
             'press': 2,
             'release': 2,
             'is': 1,
             'the': 1,
             'quickest': 1,
             'and': 3,
             'easiest': 1,
             'way': 1,
             'to': 3,
             'get': 1,
             'free': 1,
             'publicity.': 1,
             'if': 1,
             'well': 1,
             'written,': 1,
             'can': 2,
             'result': 1,
             'in': 1,
             'multiple': 1,
             'published': 1,
             'articles': 1,
             'about': 1,
             'your': 1,
             'firm': 1,
             'its': 1,
             'products.': 1,
             'that': 1,
             'mean': 1,
             'new': 1,
             'prospects': 1,
             'contacting': 1,
             'you': 2,
             'asking': 1,
             'sell': 1,
             'them.': 1,
             '….': 1})

from collections import OrderedDict
for i, v in OrderedDict(sorted(word_count.items(), key=lambda t:t[1],reverse=True)).items():
    print(i, v)

and 3
to 3
a 2
press 2
release 2
can 2
you 2
is 1
the 1
quickest 1
easiest 1
way 1
get 1
free 1
publicity. 1
if 1
well 1
written, 1
result 1
in 1
multiple 1
published 1
articles 1
about 1
your 1
firm 1
its 1
products. 1
that 1
mean 1
new 1
prospects 1
contacting 1
asking 1
sell 1
them. 1
…. 1

Counter¶

from collections import Counter
ball_or_strike=['B','S','S','B','B','B','S']
c=Counter(ball_or_strike)
print(c)

Counter({'B': 4, 'S': 3})

Sequence type의 data element들의 갯수를 dict형태로 반환

print(c)
print(list(c.elements()))

Counter({'B': 4, 'S': 3})
['B', 'B', 'B', 'B', 'S', 'S', 'S']

c= Counter(cats=4,dogs=8)
print(c)
print(list(c.elements()))

Counter({'dogs': 8, 'cats': 4})
['cats', 'cats', 'cats', 'cats', 'dogs', 'dogs', 'dogs', 'dogs', 'dogs', 'dogs', 'dogs', 'dogs']

Dict type, keyword parameter 등도 모두 처리 가능

c=Counter(a=4,b=2,c=0,d=-2)
d=Counter(a=1,b=2,c=3,d=4)
c.subtract(d)   # c - d
print(c)

Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})

c=Counter(a=4,b=2,c=0,d=-2)
d=Counter(a=1,b=2,c=3,d=4)
print(c+d)
print(c&d)
print(c|d)

Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2})
Counter({'b': 2, 'a': 1})
Counter({'a': 4, 'd': 4, 'c': 3, 'b': 2})

Set의 연산들을 지원함

namedtuple¶

from collections import namedtuple
Point =namedtuple('Point', ['x', 'y'])
p =Point(11, y=22)
print(p[0] +p[1])

33

x, y=p
print(x, y)
print(p.x+p.y)
print(Point(x=11, y=22))

11 22
33
Point(x=11, y=22)

Tuple 형태로 Data 구조체를 저장하는 방법
저장되는 data의 variable을 사전에 지정해서 저장함

Numpy part II (0)	2021.01.25
Numpy part I (0)	2021.01.25
Python data handling (0)	2021.01.22
File & Exception & Log Handling (0)	2021.01.22
Module and Project (0)	2021.01.21
Python Object Oriented Programming (0)	2021.01.21
pythonic code (0)	2021.01.20
Variable & List (0)	2021.01.19

개발자CuCu

python data structure

Python data structure¶

스택과 큐(stack & queue with list)¶

스택(Stack)¶

큐(Queue)¶

튜플과 집합(tuple & set)¶

튜플(tuple)¶

왜 쓸까?¶

집합(Set)¶

사전(dictionary)¶

dict¶

Command Analyzer¶

Collection 모듈¶

deque¶

코드 시간 측정¶

orderedDict¶

defaultdict¶

하나의 지문에 각 단어들이 몇 개나 있는지 세고 싶을경우?¶

Counter¶

namedtuple¶

'AI > 이론' 카테고리의 다른 글

+ Recent posts

티스토리툴바