= [백준, python] 6800번 - Huffman Encoding | 개발하는 쿼카

[백준, python] 6800번 - Huffman Encoding

백준 문제풀이 시작


문제

There is an ingenious text-compression algorithm called Huffman coding, designed by David Huffman in 1952.

The basic idea is that each character is associated with a binary sequence (i.e., a sequence of 0s and 1s). These binary sequences satisfy the prefix-free property: a binary sequence for one character is never a prefix of another character’s binary sequence.

It is worth noting that to construct a prefix-free binary sequence, simply put the characters as the leaves of a binary tree, and label the “left” edge as 0 and the ”right” edge as 1. The path from the root to a leaf node forms the code for the character at that leaf node. For example, the following binary tree constructs a prefix-free binary sequence for the characters {A, B, C, D, E}:

제목

That is, A is encoded as 00, B is encoded as 01, C is encoded as 10, D is encoded as 110 and E is encoded as 111.

The benefit of a set of codes having the prefix-free property is that any sequence of these codes can be uniquely decoded into the original characters.

Your task is to read a Huffman code (i.e., a set of characters and associated binary sequences) along with a binary sequence, and decode the binary sequence to its character representation.

입력

The first line of input will be an integer k (1 ≤ k ≤ 20), representing the number of characters and associated codes. The next k lines each contain a single character, followed by a space, followed by the binary sequence (of length at most 10) representing the associated code of that character. You may assume that the character is an alphabet character (i.e., ‘a’…‘z’ and ‘A’..‘Z’). You may assume that the sequence of binary codes has the prefix-free property. On the k + 2nd line is the binary sequence which is to be decoded. You may assume the binary sequence contains codes associated with the given characters, and that the k + 2nd line contains no more than 250 binary digits.

출력

On one line, output the characters that correspond to the given binary sequence.


풀이

풀이 확인하기

입력을 받은 뒤 binary sequence를 key로 가지는 dictionary를 만든다. 그리고 0과 1로 이뤄진 문자열을 받아 queue로 만들어 하나씩 pop을 하면서 해당 데이터가 dictionary에 없다면 다시 pop해서 더하고, 해당 데이터가 dictionary에 있으면 변환을 하여 정답 문자열에 더한 뒤 pop한 데이터들을 전부 초기화해주는 과정을 반복한다.

import sys
from collections import deque
input = sys.stdin.readline

num = int(input())

data = {}

for _ in range(num):
    a = input().strip().split()
    data[a[1]] = a[0]

encode = deque(input().strip())

temp = ""
ans = ""
while True:
    temp = temp + encode.popleft()
    if temp in data:
        ans = ans + data[temp]
        temp = ""
    if(len(encode) == 0):
        break

print(ans)