문법 체크
Stream of Tokens |
Abstract Syntax Tree |
IR | |||||
Source | → | Lexer | → | Parser | → | FE' | → |
Parsing
Understanding sentence structure
Parser가 필요한 이유: some program may have recuresive structures
Finite Automata는 기억을 못하기 때문에 recursive 구조를 인식할 수 없다.
=> Push-down Automata
=> FSM + stack(에 기록)
Context-Free Grammar
set of productions로 구성됨
symbol -> symbol symbol ... symbol
Symbol types:
- Terminal: token types
- Non-terminal: a symbol that appears on the left-side of some production
Left-Hand Side (LHS): non-terminal
Right-Hand Side (RHS): terminals or non-terminals
Start Symbol: A special non-termnal
regular expression과의 가장 큰 차이는 recursive 표현 쉽게 가능
End-of-File Marker: $
Derivation
1. Begin with start symbol
2. While non-terminal exist, replace any non-terminal with RHS of production
Left-most derivation - replace left-most non-terminal in each step
Right-most derivation
Parsing Tree
derivation의 Graphical한 표현
Each internal node is labveled with non-terminal
Each leaf node is labeled with a terminal
Abstract Syntax Tree
Parsing tree에서 불필요한 정보를 제거(statement의 type을 지정)하여 간략하게 표현
Prediction-based Approach (Recuresive Descent Parsing)
1. LL(k) Parsing
- Nullable
- First(r)
- Follow(X)
starting symbol은 follow가 없음
Left-Recursion Problem
E→E+T
E→T
=> First(E+T) = First(T)
=> left-recursive cannot be LL(1)
=> solution: rewrite grammar so that it is right-recursive
E→TE'
E'→e
E'→+TE'
Left-Factoring
S→IF E THEN A
S→IF E THEN A ELSE A
=> Two productions begin with the same symbol
=> First(IF E THEN A) = First(IF E THEN A ELSE A )
=> solution: Left-Factoring
S→IF E THEN A V
V→e
V→ELSE A
Bottom-up based approach (Shift-Reduce Parsing)
Parser has stack.
- Shift: push next input token to stack
- Reduce: choose production / pop off RHS(C,B,A) / push LHS(X)
$가 shift 되면 input stream은 정상적으로 parsing된것
1. LR(k) Parsing
- LR(0)
- LR(1)
Lookahead symbol(x)
2. SLR Parsing
follow를 보고 reduce 할지 안할지 판단
3. LALR(k) Parsing
LR(1) parser table이 너무 크기 때문에 나온 아이디어
look ahead만 다른 LR(1)을 합치자
=> LSR보다는 강력하지만 LR(1)보다는 약한 parser가 생성됨
=> 대신 메모리는 상당히 절약
'Compiler' 카테고리의 다른 글
Register Allocation (0) | 2021.11.04 |
---|---|
Alias Analysis (0) | 2021.11.04 |
Dataflow Analysis (0) | 2021.11.03 |
Lexical Analysis (0) | 2021.10.08 |
Intro (0) | 2021.10.08 |