Computing null nonterminals of a grammar in a functional way (preferably in Scala)

I'm new to functional programming and wondering how to solve the problem of computing a set of null nonterminals in a context-free grammar in a pure functional way without using variable assignments.

A null nonterminal is a nonterminal that directly yields a null, for example, A :: =, or has a body containing null nonterminals, for example A :: = BCD, where all BC and D are null.

I use the following definitions in Scala to define grammar:

case class Grammar(name:String, startSymbol:Nonterminal, rules:List[Rule])
case class Rule(head: Nonterminal, body:List[Symbol])
abstract class Symbol
case class Terminal(c:Char) extends Symbol
case class Nonterminal(name:String) extends Symbol

      

The basic algorithm is to collect all null nonterminals and place them in the set. Then, in each iteration, try to determine which production rules have all zero nonterminals on their body. These nonterminals will be added to the set until a new nonterminal set is added.

I followed this procedure in Scala as:

  def getNullableNonterminals(grammar:Grammar) = {

  var nieuw : Set[Nonterminal] = (for(Rule(head, Nil) <- grammar.rules) yield head) (collection.breakOut)
  var old = Set[Nonterminal]()

  while(old != nieuw) {
    old = nieuw
    for{
        Rule(head, symbols) <- grammar.rules
        if symbols.length > 0
        if symbols.forall( s => s.isInstanceOf[Nonterminal] && old.contains(s.asInstanceOf[Nonterminal]))
       } nieuw = nieuw + head
  }
  nieuw   
}

      

Although the code is much shorter than the equivalent Java version, it uses variables. Any suggestions to rewrite this piece of code in a functional style?

+3


source to share


3 answers


Here's a more idiomatic Scala solution:

object Algorithm {

  def getNullableNonterminals(grammar:Grammar) = {
    loop(grammar, Set())
  }

  @tailrec
  private def loop(grammar: Grammar, nullablesSoFar: Set[Nonterminal]): Set[Nonterminal] = {
    val newNullables = generateNew(grammar, nullablesSoFar)
    if (newNullables.isEmpty)
      nullablesSoFar //no new nullables found, so we just return the ones we have
    else
      loop(grammar, nullablesSoFar ++ newNullables) //add the newly found nullables to the solution set and we keep going
  }

  private def generateNew(grammar: Grammar, nullableSoFar: Set[Nonterminal]) = {
    for {
      Rule(head, body) <- grammar.rules
      if !nullableSoFar.contains(head)
      if body.forall(isNullable(_, nullableSoFar))
    } yield head
  }

  //checks if the symbol is nullable given the current set of nullables
  private def isNullable(symbol: Symbol, provenNullable: Set[Nonterminal]) = symbol match {
    case Terminal(_) => false
    case x@Nonterminal(_) => provenNullable.contains(x)
  }

}

      



The while statement is replaced by the recursive function - loop

. Also, avoid using isInstanceOf

- pattern matching is much better for this.

sealed

As a side note , create a Symbol class , as this can lead to a warning about missing cases in pattern matching.

+1


source


Here is another approach using memoisation ( link , another link ), which avoids the need for fixed point computation, both in your and MAD solution. Moreover, it is a general pattern that can be applied to many scenarios. Have a look at Implementing Scalaz .

def getNullableNonterminals(g: Grammar): Iterable[Nonterminal] = {
  /* Cache that is used by isNullable to memoise results. */
  var cache: Map[Nonterminal, Boolean] = Map()

  /* Assumption: For each nonterminal nt there exists only one rule r
   * such that r.head == nt.
   */
  var rules: Map[Nonterminal, List[Symbol]] = g.rules.map(r => (r.head, r.body)).toMap

  def isNullable(s: Symbol): Boolean = s match {
    case _: Terminal => false
    case nt: Nonterminal =>
      /* Either take the cached result, or compute it and store it in the cache. */
      cache.getOrElse(nt, {
        /* rules(nt) assumes that there is a rule for every nonterminal */
        val nullable = rules(nt) forall isNullable
        cache += ((nt, nullable))
        nullable
      })
  }

  rules.keys filter isNullable
}

      



Test case:

val ta = Terminal('a')
val tb = Terminal('b')

val ntX = Nonterminal("X")
val ntY = Nonterminal("Y")
val ntZ = Nonterminal("Z")
val ntP = Nonterminal("P")
val ntQ = Nonterminal("Q")
val ntR = Nonterminal("R")
val ntS = Nonterminal("S")

val rX = Rule(ntX, ntP :: ntQ :: Nil)
val rY = Rule(ntY, ntP :: ta :: ntQ :: Nil)
val rZ = Rule(ntZ, ntR :: Nil)
val rP = Rule(ntP, ntQ :: Nil)
val rQ = Rule(ntQ, Nil)
val rR = Rule(ntR, tb :: Nil)
val rS = Rule(ntS, ntX :: ntY :: ntZ :: Nil)

val g = Grammar("Test", ntS, List(rX, rY, rZ, rP, rQ, rR, rS))

getNullableNonterminals(g) foreach println
  // Nonterminal(Q), Nonterminal(X), Nonterminal(P)

      

+1


source


Finally, I took the time to write an example of how to do a grammar computation of ambiguity using circular attribute grammars. The code below uses our Kiama language processing library for Scala . You can find a complete source for example code and tests in Kiam. See SemanticAnalysis.scala for main attribution code, for example with a null value .

In short, the approach does the following:

  • presents grammar as an abstract structure of tree syntax,

  • parses names in a tree structure to resolve grammatical symbol references to the definitions of those symbols, and

  • computes nullability as the circle attribute in the resulting DAG structure.

The attribute definitions used are very similar to those used as examples in the article Grammars Circular Attributes by Magnusson and Hedin from LDTA 2003. They implement circular attributes in their JastAdd system and I highly recommend this document for anyone who wants to understand this topic. We use essentially the same algorithms in Kiama.

Here is the AST definition used in this example. Tree

is a Kiama type that provides some general behavior.

sealed abstract class GrammarTree extends Tree

case class Grammar (startRule : Rule, rules : Seq[Rule]) extends GrammarTree

case class Rule (lhs : NonTermDef, rhs : ProdList) extends GrammarTree

sealed abstract class ProdList extends GrammarTree
case class EmptyProdList () extends ProdList
case class NonEmptyProdList (head : Prod, tail : ProdList) extends ProdList

case class Prod (symbols : SymbolList) extends GrammarTree

sealed abstract class SymbolList extends GrammarTree
case class EmptySymbolList () extends SymbolList
case class NonEmptySymbolList (head : Symbol, tail : SymbolList) extends SymbolList

sealed abstract class Symbol extends GrammarTree
case class TermSym (name : String) extends Symbol
case class NonTermSym (nt : NonTermUse) extends Symbol

sealed abstract class NonTerm extends GrammarTree {
    def name : String
}
case class NonTermDef (name : String) extends NonTerm
case class NonTermUse (name : String) extends NonTerm

      

The code below shows the definition of an attribute nullable

. This is falsely triggered and then the loop point is fixed for computation until the value stabilizes. Cases show how to compute an attribute for different node types in an AST.

Kiama's Attribute Designer circular

includes all of the attribute implementation, including storage caching, fixed point detection, etc.

val nullable : GrammarTree => Boolean =
    circular (false) {

        // nullable of the start rule
        case Grammar (r, _) =>
            r->nullable

        // nullable of the right-hand side of the rule
        case Rule (_, rhs) =>
            rhs->nullable

        // nullable of the component productions
        case EmptyProdList () =>
            false
        case NonEmptyProdList (h, t) =>
            h->nullable || t->nullable

        // nullable of the component symbol lists
        case Prod (ss) =>
            ss->nullable
        case EmptySymbolList () =>
            true
        case NonEmptySymbolList (h, t) =>
            h->nullable && t->nullable

        // terminals are not nullable
        case TermSym (_) =>
            false

        // Non-terminal definitions are nullable if the rule in which they
        // are defined is nullable. Uses are nullable if their associated
        // declaration is nullable.
        case NonTermSym (n) =>
            n->nullable
        case n : NonTermDef =>
            (n.parent[Rule])->nullable
        case n : NonTermUse =>
            (n->decl).map (nullable).getOrElse (false)

    }

      

A link attribute decl

is one that connects non-terminal usage with its corresponding definition on the left side of the rule. parent

field is a link from node to its parent in AST.

Since the validity of one rule or character depends on zero others, what you get is a set of attribute events that participate in the dependency cycle. The result is a declarative version of evaluating nullability, which is very similar to the "tutorial" definition. (The example also defines attributes for FIRST and FOLLOW, which specify computations, which are defined in terms of nullability.) Circular attributes combine memoisation and fixed-point computation in a convenient package for these kinds of problems.

0


source







All Articles