Paper Context Free Grammar (CFG) ( English )

Inspired by the natural human language, computer science scientists who helped develop the programming language and provides grammar (programming) formally. Grammar was created independently and called CFG-context (ContextFree Grammar). The resul, with this formal approach, a language compilerprogramming can be made easier and to avoid ambiguity when parsing the languagethem. Example CFG for parser design, for example: B -> BB | (B) | e to identify the  anguagewith only a parenthesis {'(',')'} as its terminal. Parsing process is a processreadings in language appropriate strand specific CFG, this process must abide by the rulesproduction in the CFG. Formally, a CFG is defined [2]: CFG G = (V, T, P, S), where V is a list ofproduction variable T, is a list of symbols or terminals that are used in CFG P, isCFG production rule S, is a variable start CFG production rules can be 'normalized'with its own pattern so unambiguous and simpler, although the normalizationCFG production rules sometimes makes a lot more than before. Techniquenormalization  used in this paper is to CNF (Chomsky Normal Form).


Figure 5.1. Preview parsing natural language (English)Sample design of the CNF CFG language, such as CFG:


S -> aA | bB (1)A-> Baa | baB -> BAA | abCFG (1) is equivalent to CFG below, where the terminal symbol has variable production of its own:S -> DA | EB (2)A -> BDD | EDB -> EAA | DED -> aE -> b


CNF generated from CFG (2) above is:
S -> DA | EB (3)A -> BF | EDB -> EH | DEF -> DDH -> AAD -> aE -> b Once formed CFG in CNF that has been normalized, in the implementationparsing, there are algorithms which are useful for determining whether a particular strand of 'valid',or can be created from the CFG rules that exist. One algorithm that canThe algorithm used is the Cocke-Younger-Kasami (CYK). The algorithm is completedproblem analysis returned a sub-strand should be the same as the analysis of sub-strandindependent of parsing the sub-strands are parsed afterwards. By ProgramDynamically, the desired independence can be achieved when parsing. CYK algorithmincluding in the field of Dynamic Program because these algorithms build status table twodimension when parsing where the determination of subsequent parsing derived or generatedfrom the previous parsing, until the end of the strand. In addition to knowing the validity of the strand in a CFG, a modified CYK algorithm can be used also to buildparse tree.




Figure 5.2 Tree parsing is formed from a natural language

Example 1:
Known grammar G1 = {I ® H ½ H ½ I IA, H ® a ½ b ½ c ½...½ z, A ® 0 ½ 1 ½ 2 ½...½ 9}with I is the initial symbol. Here are two ways to sentence syntax analysis x23b.



way 1 (derivation) means 2 (parsing)I IHIAHIAAHHAAHxAAHx2AHx23Hx23b
A sentence may have more than one tree.Example 2:
Unknown grammar G 2 = {S ® SOS ½ A, O ® * ½ +, A ® 0 ½ 1 ½ 2 ½...½ 9} Sentence: 2 * 3 +7 two trees have the following syntax: A sentence that has more than one syntax tree is called an ambiguous sentence (Ambiguous). Grammar that generates at least an ambiguous sentence is called ambiguous grammar.
5.2 Push Down Automata (PDA)PDA is the engine of TBBK automata are implemented with stackoperation so that there is only "push" and "pop" Stack (stack) is a structuredata that uses the principle of LIFO (Last In First Out). A stack always has the topof stack and stack elements that will go into the stack with a method"Push" and will come out of the stack with a method "pop".



  • Definition: The PDA is a pair 7-tuple M = (Q, S, G, q 0, Z0, d, A), where: Q: set up Stata, S: input alphabet, G: stack alphabet, q 0 Î Q: Stata beginning, Z0 Î G: initial stack symbol, A i Q: Stata receiver set, transition function d: Q × (S e {e}) × G ® 2 Q × G * (Subsets of Q × G *)
  • To Stata q Î Q, the input symbol a Î S, and the stack symbols Xi G, d (q, a, X) = (p, a) mean: PDA transition to Stata p and replacing X on the stack with the string a.
  • Configuring the PDA at any one time expressed as a triple (q, x, a), where: q Î Q: Stata appeared at that time, x Î S *: the input string that has not been read, and a Î G *: a string that states the contents of the stack with the leftmost character conveys the top of stack.
  • Let (p, ay, Xb) is a configuration, in which: a Î S, y Î S *, X Î G, and b Î G *. Let also d (p, a, X) = (q, g) for q Î Q and g Î G *. We may write that: (p, ay, Xb) (q, y, pl).

A PDA is given by:
Q = the set of state = Set of input symbols
T = symbol stack = Transition functionS = initial stateF = final stateZ = top of stack
PDA has two types of transitions, namely that merima input symbol, the symbol top ofstack, and the state. Each option consists of the next state and the symbols. Replacementopersi stack contents is done by push and pop. The second type of transition is a transition.Transitions are not doing the reading input, but only receive top of the stack symboland state. This transition allows the PDA to manipulate the contents of the stack and moveinter-state without reading input.Example 14 (Deterministic PDA):
PDA M = (Q, S, G, q 0, Z 0, d, A) identifier palindrome L = {xcx T ½ x Î (a ½ b) *}, wherex T is a mirror (x), has a tuple Q = {q 0, q 1, q 2}, A = {q 2}, S = {a, b, c},G = {a, b, Z0}, and the transition function d defined by the following table:For example, note that the transition function No. 1 can be expressed as: d (q 0, a,Z0) = (q 0, az 0). In the transition table shows that the PDA will Stata q 0do the PUSH if it gets input a or b and make the transition to a Stata Stata q 1 ifget input c. In Stata q1 PDA will do POP.


Here is the introduction of two strings by the PDA above:
1. abcba: (q 0, abcba, Z 0) (q 0, bcba, AZ 0) (1)

  • (Q 0, cba, Baz 0) (4)
  • (Q1, ba, Baz 0) (9)
  • (Q1, a, az 0) (11)
  • (Q1, e, Z 0) (10)
  • (Q 2, e, Z 0) (12) (accepted)


2. ACB: (q 0, ACB, Z 0) (q 0, cb, AZ 0) (1)(Q1, b, aZ0) (8), (crash ® is rejected)

3. ab: (q 0, ab, Z 0) (q 0, b, aZ0) (1)(Q 0, e, baZ0) (4) (crash ® is rejected)


Acceptance and rejection of the three strings above can be explained as follows:
1. abcba string acceptable for tracing to the Stata recipient (q 2) and the string "abcba"finished reading (string unread = e)
2. string ACB rejected because the final configuration (q 1, b, a Z 0) while the transition functiond (q a, b, a) is not definition
3. ab string is rejected because the final configuration (q 0, e, baZ0) while the transition function d (q 0, e, b) not definition


Illustration of graph transition PDA functions are shown through the following illustration:
• The notation (p, ay, Xb) (q, y, pl) can be expanded into: (p, x, a) * (q, y, b), whichmeans the configuration (q, y, b) is achieved through a number (0 or better) transition.
• There are two ways of receiving a sentence by the PDA, each of which can be seen from


final configuration, as the following explanation:
If M = (Q, S, G, q 0, Z 0, d, A) is a PDA and X is *, then x is received with Statafinal (accepted by final state) by the PDA M if: (q 0, x, Z0) * (q, e, a) for a Î G* And q Î A. x accepted by empty stack (accepted by empty stack) by the PDA Mif: (q 0, x, Z0) * (q, e, e) for q Î Q.


Example 15 (Non-Deterministic PDA):
PDA M = (Q, S, G, q 0, Z0, d, A) identifier palindrome L = {xx T ½ x Î (a ½ b) *} hastuple components follows: Q = {q 0, q1, q 2}, A = {q 2}, S = {a, b}, G = {a, b, Z0},and the transition function d defined by the following table:In the transition table can be seen that at q 0 PDA Stata will do the PUSH ifget input a or b and make the transition to a Stata Stata q 1 if it gets input e.In Stata q a PDA will do POP. Examples 14 and 15 shows that the PDAcan be expressed as the engine 


PUSH-POP.Here is the introduction of string "Baab" by the PDA above:
1. (Q 0, Baab, Z 0) (q 0, AAB, BZ 0) (2 left)(Q 0, ab, ABZ 0) (5 left)(Q1, ab, ABZ 0) (3rd right)(Q1, b, bZ0) (11)(Q1, e, Z 0) (10)(Q 2, e, Z 0) (12) (accepted)
2. (Q 0, Baab, Z 0) (q1, Baab, Z 0) (2 right) (crash ® is rejected)
3. (Q 0, Baab, Z 0) (q 0, AAB, BZ 0) (2 left)(Q 0, ab, ABZ 0) (5 left)(Q 0, b, aabZ 0) (3 left)(Q1, b, aabZ 0) (4 right) (crash ® is rejected)
4. (Q 0, Baab, Z 0) (q 0, AAB, BZ 0) (2 left)(Q 0, ab, ABZ 0) (5 left)(Q 0, b, aabZ 0) (3 left)(Q 0, e, baabZ 0) (4 left)(Q1, e, baabZ 0) (9) (crash ® is rejected)

No comments:

Post a Comment