Compiler

Syntax Analysis

Gauss1 2021. 10. 8. 02:18

문법 체크

      Stream of 
Tokens
  Abstract
Syntax Tree
  IR
Source Lexer Parser FE'

 


Parsing

 

Understanding sentence structure

Parser가 필요한 이유: some program may have recuresive structures

 

Finite Automata는 기억을 못하기 때문에 recursive 구조를 인식할 수 없다.

=> Push-down Automata

=> FSM + stack(에 기록)


Context-Free Grammar

 

set of productions로 구성됨

symbol -> symbol symbol ... symbol

 

Symbol types:

- Terminal: token types

- Non-terminal: a symbol that appears on the left-side of some production

 

Left-Hand Side (LHS): non-terminal

Right-Hand Side (RHS): terminals or non-terminals

 

Start Symbol: A special non-termnal

 

regular expression과의 가장 큰 차이는 recursive 표현 쉽게 가능

 

End-of-File Marker: $


Derivation

1. Begin with start symbol

2. While non-terminal exist, replace any non-terminal with RHS of production

 

Left-most derivation - replace left-most non-terminal in each step

Right-most derivation


Parsing Tree

 

derivation의 Graphical한 표현

Each internal node is labveled with non-terminal

Each leaf node is labeled with a terminal


Abstract Syntax Tree

 

Parsing tree에서 불필요한 정보를 제거(statement의 type을 지정)하여 간략하게 표현


Prediction-based Approach (Recuresive Descent Parsing)

1. LL(k) Parsing

 

- Nullable

- First(r)

- Follow(X)

starting symbol은 follow가 없음

 

 

Left-Recursion Problem

  E→E+T

  E→T

=> First(E+T) = First(T)

=> left-recursive cannot be LL(1)

=> solution: rewrite grammar so that it is right-recursive

  E→TE'

  E'→e

  E'→+TE'

 

Left-Factoring

  S→IF E THEN A

  S→IF E THEN A ELSE A 

=> Two productions begin with the same symbol

=> First(IF E THEN A) = First(IF E THEN A ELSE A )

=> solution: Left-Factoring

  S→IF E THEN A V

  V→e

  V→ELSE A


Bottom-up based approach (Shift-Reduce Parsing)

Parser has stack.

- Shift: push next input token to stack

- Reduce: choose production / pop off RHS(C,B,A) / push LHS(X)

$가 shift 되면 input stream은 정상적으로 parsing된것

 

1. LR(k) Parsing

- LR(0)

- LR(1)

Lookahead symbol(x)

 

2. SLR Parsing

follow를 보고 reduce 할지 안할지 판단

 

3. LALR(k) Parsing

LR(1) parser table이 너무 크기 때문에 나온 아이디어

look ahead만 다른 LR(1)을 합치자

=> LSR보다는 강력하지만 LR(1)보다는 약한 parser가 생성됨

=> 대신 메모리는 상당히 절약

'Compiler' 카테고리의 다른 글

Register Allocation  (0) 2021.11.04
Alias Analysis  (0) 2021.11.04
Dataflow Analysis  (0) 2021.11.03
Lexical Analysis  (0) 2021.10.08
Intro  (0) 2021.10.08