Programming Assignment 2: the CKY algorithm and Debruijn Indices (Due Oct 13, 11:59 PM)

Problem 1
Write a program that takes an input string and checks to see if it belongs to a given grammar in CNF via the table method of the CKY algorithm. It should display the table. Here's a suggested outline:
/* a class of sets of strings: will store NONTERMINALS (rep. as strings)
 * that generate specific substrings of input string */ 


class sSet{
//build it or take it from the appropriate package
}

/* a square matrix of sets of terminals.
 * We will only fill upper-left to
 * lower right diagonal plus subdiagonal
 * See SAMPLE OUTPUT at bottom of page
*/ 
class Table{
    private int size;
    private sSet [][]  sar;


    //dim x dim matrix
    public Table(int dim){
    }
      // insert s into set at t(i,j)  
    public void add(String s, int i, int j)
        {
        }

    /* t(i,j):= set ss  */
    public void assign(sSet ss, int i, int j)
        {

        }

// and stuff to help you display the table

Here's all the code for a grammar production class


/* Chomsky Normal Form production (A-->BC or A-->a) data type */
class cnfProduction{
    private String head;
    private String tail1="";
    private String tail2="";
    private boolean isTerminal; //yes it's of the form A-->a

    //for A-->BC
    public cnfProduction(String h,String t1,String t2)
        {head = h; tail1=t1;tail2=t2;isTerminal=false;}
    
    //for A-->a
    public cnfProduction(String h,String t1)
        {head = h; tail1=t1;isTerminal=true;}

    // Grab first two (A,B) or three strings (A,B,C) in array if it
    // has length greater than 2 
    // and make A-->B or A-->B,C, accordingly.
    public cnfProduction(String [] ar)
        {
            
            if (ar.length<2){
                System.out.println("cannot make a CNF"+
                                   "production from less than "+
                                   "two strings.");
                System.exit(0);
            }
            else{
                if(ar.length==2){
                    head=ar[0];
                    tail1=ar[1];
                    isTerminal=true;
                }
                else{
                    head=ar[0];
                    tail1=ar[1];
                    tail2=ar[2];
                }
            }
        }
                

    public String h(){return head;}
    public String t1(){return tail1;}
    public String t2(){return tail2;}
    public String Tail(){return tail1+","+tail2;}//to print head
    public String Head(){return head;}//to print tail
    
    public boolean terminal(){return isTerminal;}

    //print production
    public String toString(){
        return Head()+" ---> "+Tail();
    }
    
}
    

ckyData: A Data Structure for holding the whole schmeer

/* the kitchen sink: input string, its length, the table, the whole
 * grammar as an array of CNF productions and the start symbol
 */
class ckyData{
    private String inString;
    private int size;
    private Table t;
    private cnfProduction [] p;
    private String start;

    public ckyData(String s, String S){
        inString = s;
        start=s;
        size=s.length();
        t = new Table(size);
        
    }
  public ckyData(String s, cnfProduction [] pr ,String S){
        inString = s;
        p =pr;
        start=S;
        size=s.length();
        t = new Table(size);
  }

/* so we can input the grammar as an array of arrays
 * {{head,Tail1,Tail2}, ....*/
    public ckyData(String s, String [] [] spr ,String S){
        inString = s;
        start=S;
        p=new cnfProduction[spr.length];
        size=s.length();
        t = new Table(size);
        
        for(int i =0;i<spr.length;i++)
            p[i] = new cnfProduction(spr[i]);
    }
            

and the method ckyAccept()


  public boolean ckyAccept(){

This method is the point of the whole exercise: it's just 20 or 30 lines. given the input string, array of productions and start symbol, all of which are held in the ckyData fields

  • inString: the input string
  • p: the production array
  • t: the empty table of the right size
  • it should fill the table in (and display it) and check to see if the start symbol is in the set in the lower left-hand corner (see SAMPLE OUTPUT, below)

    The main procedure that drives it all:

    
    public class cky{
        public static void main(String [] args) {
    
    //a CNF for our arithmetical language
            String [] [] cnf={{"S","0"},
                              {"S","1"},
                              {"S","L","E"},
                              {"E","S","P"},
                              {"E","S","T"},
                              {"P","A","F"},
                              {"T","M","F"},
                              {"F","S","R"},
                              {"A","+"},
                              {"M","*"},
                              {"L","("},
                              {"R",")"},};
    
            ckyData d = new ckyData(args[0],cnf,"S");
            d.display(); //shows all productions and the table
            if (d.ckyAccept())
                System.out.println(args[0]+" accepted.");
            else
                System.out.println(args[0]+" rejected.");
        }
                                    
    }
    
    

    Sample Output

    
    <localhost.localdomain> java cky "((0+1)*1)"
    ------------------------------------
    -Parse string ===> ((0+1)*1)   size=9
    Start symbol = S
    Grammar =
    S ---> 0,
    S ---> 1,
    S ---> L,E
    E ---> S,P
    E ---> S,T
    P ---> A,F
    T ---> M,F
    F ---> S,R
    A ---> +,
    M ---> *,
    L ---> (,
    R ---> ),
    ------------------------------------
    {L} .
    {}  {L} .
    {}  {}  {S} .
    {}  {}  {}  {A} .
    {}  {}  {}  {}  {S} .
    {}  {S} {E} {P} {F} {R} .
    {}  {}  {}  {}  {}  {}  {M} .
    {}  {}  {}  {}  {}  {}  {}  {S} .
    {S} {E} {}  {}  {}  {}  {T} {F} {R} .
    ----------------------
    ((0+1)*1) accepted.
    
    

    Problem 2 (to be woked on after we cover the elements of the Lambda Calculus).
    Define a data type db for the DeBruijn representation of lambda terms. The define a function deBruijn:lam-->db that translates lambda terms to their de Bruijn equivalents.

    Click here to see a nice description of de Bruijn indices ( or look at the description in the textbook).

    Also define functions printLam:lam-->string and printDeB:DB-->string that give nice printed representations of lambda terms and deBruijn terms.


    Last modified: Tue Sep 30 22:38:05 EDT 2008