diff --git a/WORKLOG.md b/WORKLOG.md index 371422e..ccb00f0 100644 --- a/WORKLOG.md +++ b/WORKLOG.md @@ -11,15 +11,18 @@ - The Second Pass does type checking for each statement and definition. It also recursively 'expends' every class and function definition and creates sub-scopes for them. When expending, we first need to process the underlying declarations and add them to the sub-SymbolTable of the corresponding scope. The statements inside these classes/functions or blocks are checked with their corresponding sub-SymbolTable. More specifically, each function, class, variable declaration are visited twice, the first time to create (sub-)SymbolTables and the second time to determine types. We reused the code in `DeclarationAnalyzer` declarations in such sub-scopes. To do this, in the `TypeChecker` class, we create an object of `DeclarationAnalyzer` and dispatch the nodes we need to analyze and update the current SymbolTable from to declaration analyzer. After analyzing a sub-scope, every function and class declaration is visited the second time to dispatch the underlying statements. This process is done recursively until we reached the deepest structure(function/class). Since the `dispatch` method is basically a function call and the traversing order tp the AST nodes follows the scoping hierarchy, we naturally make use of the stack frame to push and pop the sub-SymbolTables. Overall, each declaration is visited exactly twice, and each statement is visited once. - The compilation will stop when there're errors found during the first pass. We didn't use more passes because this 2-pass architecture is sufficient to complete type checking for ChocoPy and more paths will only add to the complexity of the algorithm. ## Recovery: - - Whenever an error is encountered that causes ambiguity, we chose a default action and continue the compilation process. For example, when a Type mismatch happens, the default action is that the lhs keeps its original types. 
- - The compilation process stops when errors are found in constructing global symbol table. Because declaration errors adds too much ambiguities and it will make less sense to continue compiling. + - Whenever an error is encountered that causes ambiguity, we choose a default action and continue the compilation process. For example, when a type mismatch happens, the default action is that the LHS keeps its original type. + - The compilation process stops when errors are found while constructing the global symbol table, because declaration errors add too much ambiguity for continued compilation to make sense. ## Challenges: - Nested structures were a challenge. A function inside a function/class needs us to build correct scoping as well as dealing with dependencies. - This is dealt by the declaration-statement-definition recursion we described in the second pass above. - Error reporting in PA2 is more complex than PA1, generally because there're more types of errors can happen in semantic analysis and there needs to be default actions when each type of error happens. And because of that, the correctness of error handling is very hard to check. - In order for us to easily determine the correctness, we intentionally matches the error messages to the reference implementation. - However, there're certainly discrepancies with the reference compiler because of implementation or architectural differences. We didn't matches those differences that doesn't seems to affect the overall correctness, we'll show these differences in diff.py. - - Assignment compatibilities. - - We dealt with this by finding the least common ancestor of the two classes. This is implemented as a static helper method in class `StudentAnalysis`. + - Assignment compatibility. + - We dealt with this by finding the least common ancestor of the two classes. Special cases such as empty lists are handled separately. This is implemented as a static helper method in class `StudentAnalysis`. 
+ - Testing various scenarios with similarly defined variables was time-consuming. Instead, we defined a fixed set of variables at the beginning of the student-contributed test programs, and then used the same variables throughout the programs to cover various bad and good scenarios. + - Another challenge was to come up with good test cases for broader coverage. Our approach was to study the type-checking rules and write code that violates them to see whether our analyzer makes the correct inferences. + ## Improvements: - - Added more tests to rigorously check program flow. And a test(diff.py) to show a case where our implementation showed better recoverability compared to the reference compiler. + - Added more tests to rigorously check program flow, and a test (diff.py) that shows a case where our implementation recovers better than the reference compiler. \ No newline at end of file diff --git a/src/test/data/pa2/student_contributed/bad_semantic.py b/src/test/data/pa2/student_contributed/bad_semantic.py index 490a475..c18142b 100644 --- a/src/test/data/pa2/student_contributed/bad_semantic.py +++ b/src/test/data/pa2/student_contributed/bad_semantic.py @@ -1,14 +1,93 @@ -x:int = 1 -x:int = 2 +# class defs +class A_CLASS(object): + a_class_i:int = 0 + def __init__(self:"A_CLASS", x:int): + self.x = x -x + # Bad, self param is missing + def add(y:int) -> int: + y = y+self.x + return y -def fun_1() -> bool: - if True: - if True: - return True - # All path should return, not just one +class B_CLASS(object): + b_class_i:int = 0 +class C_CLASS(B_CLASS): + pass -fun_1() +# Bad, duplicate class def +class A_CLASS(object): + pass + +# Bad, E_CLASS is not declared +class D_CLASS(E_CLASS): + pass + + +# var defs +a_s:str = "a_s" +b_s:str = "b_s" +c_s:str = "c_s" + +a_i:int = 0 +b_i:int = 0 +c_i:int = 0 + +a_b:bool = False +b_b:bool = False +c_b:bool = False + +a_list:[int] = None +b_list:[int] = None +c_list:[int] = None + +a_class:A_CLASS = 
None +b_class:B_CLASS = None +c_class:C_CLASS = None + + +# fun defs +def f_1() -> object: + def f_f_1() -> object: + # a_s:int = 0 Fails if we uncomment this line + global a_s # Bad, duplicate declaration + print(a_s) + pass + pass + +def f_2() -> object: + f_a_s:str = "s" + def f_f_2() -> object: + nonlocal f_a_s + print(f_a_s) + pass + pass + +def f_3(x:int) -> bool: + f_b_s:int = 3 + if (x + f_b_s > 3): + return True + elif (x + f_b_s == 3): + print("Equal") # Bad, this path should return + return False + +def f_4() -> object: + f_a_i:int = 2 + a_i = f_a_i + 1 # Bad, can't assign to a_i without declaring it as global or nonlocal + return f_a_i + +# NEGATIVE TEST CASES - SEMANTIC +# Bad, f_2 cannot be redefined in the same scope +def f_2() -> object: + pass + +# Bad, print cannot be redefined in the same scope +def print(val:object) -> object: + pass + +# Bad, a_i cannot be redefined in the same scope +a_i:int = 2 + +# Bad return +return a_i diff --git a/src/test/data/pa2/student_contributed/bad_types.py b/src/test/data/pa2/student_contributed/bad_types.py index 91ba352..262c0ea 100644 --- a/src/test/data/pa2/student_contributed/bad_types.py +++ b/src/test/data/pa2/student_contributed/bad_types.py @@ -1,5 +1,111 @@ -x:int = True -x + [1] +# class defs +class A_CLASS(object): + a_class_i:int = 0 + def __init__(self:"A_CLASS", x:int): + self.x = x + + def add(self:"A_CLASS", y:int) -> int: + y = y+self.x + return y + +class B_CLASS(object): + b_class_i:int = 0 + +class C_CLASS(B_CLASS): + pass + +# var defs +a_s:str = "a_s" +b_s:str = "b_s" +c_s:str = "c_s" + +a_i:int = 0 +b_i:int = 0 +c_i:int = 0 + +a_b:bool = False +b_b:bool = False +c_b:bool = False + +a_list:[int] = None +b_list:[int] = None +c_list:[int] = None + +a_class:A_CLASS = None +b_class:B_CLASS = None +c_class:C_CLASS = None + + +# fun defs +def f_1() -> object: + def f_f_1() -> object: + global a_s # Fails if we change it to z, which doesn't exist in global scope + pass + pass + +def f_2() -> object: + 
f_a_s:str = "s" + def f_f_2() -> object: + nonlocal f_a_s # Fails if we change this to a_s, which is in the global scope but not in an enclosing scope + pass + pass + +def f_3(x:int) -> str: + f_b_s:int = 3 + return x*f_b_s + + +# Declarations +a_list = [1, 2, 3] +b_list = [0, 0, 0] +c_list = [-1, -2, -3] + +a_class = A_CLASS(5) +b_class = B_CLASS() +c_class = C_CLASS() + + +# NEGATIVE TEST CASES - TYPES +c_i = True + +c_i + [1] # Bad, adding a list to an int + +a_i = a_b = z = "Error" # Bad, z is not defined and a_b is boolean + +a_s = a_s + 1 # Bad, adding integer to a string + +c_s = a_s[a_s] # Bad, indexing with a string variable + +b_class.b_class_i = 2 # Bad, object attribute is not assignable + +f_1 = 5 # Bad, function is not assignable + +f_2 = f_1 # Bad, function is not storable + +a_i = b_class.b_class_i = z = 5 # Bad, b_class.b_class_i is not assignable and z is not declared + +x_i = "ss" # Bad assignment + +a_s = a_i + b_i # Bad, assigning integer to a variable with string type + +a_s = a_i == b_i and True # Bad, assigning boolean to a variable with string type + +a_list = a_list + a_s # Bad, adding string and list + +a_s = a_list[a_s] # Bad, indexing with a string variable and assigning int to str + +a_list[1] = "a" # Bad, assigning str to int + +a_i = f_3(3) + 5 # Bad, f_3 has string return type but it actually returns an int + +f_1() + +f_2() + +a_i = a_class.add(a_s) # Bad, passing string where method expects int + + + + + -y:bool = False -x = y = z = "Error" diff --git a/src/test/data/pa2/student_contributed/good.py b/src/test/data/pa2/student_contributed/good.py index c16c124..825c9fb 100644 --- a/src/test/data/pa2/student_contributed/good.py +++ b/src/test/data/pa2/student_contributed/good.py @@ -1,83 +1,109 @@ -# Below this point we have all the same test cases from PA1 for validation purposes. 
-class Foo(object): - x:int = 0 - - def __init__(self:"Foo", x:int): +# class defs +class A_CLASS(object): + a_class_i:int = 0 + def __init__(self:"A_CLASS", x:int): self.x = x - def bar(y:int): - print("Hello World!",self.x+y) - y = 10 - -def get_stones(name:str)->str: - def map_name(nm:str)->str: - return stones[color.index(nm)] - color=["Red","Blue"] - stones=["Mind","Soul"] - return map_name(name) - -def funa(): - def funb(): - print("Hello") - funb() - -def fund(): - def fune(): - print("Hello") - c = 4 + 5 - -def funf(): - def fung(): - print("Hello") - c = 6 - c = 4 + 5 - - -if True: - if True: - if True: - print("Hello") -print("World") - -if True: - if True: - if True: - print("Hello") - print("World") - -if True: - if True: - if True: - print("Hello") - print("World") - -if True: - if True: - if True: - print("Hello") - else: - print("World") - -if True: - if True: - if True: - print("Hello") -else: - print("World") - - - -f = Foo(1) -print(f.x) -f.bar(4) - -a=[[[1],[2]],[[3],[4]]] -print(a[0][0][1]*a[1][1][0]) - -multiline_string="Hi World,\ -Here I am" - -expr_precedence = -a + b * (c + d) - -stone="Blue" -print(get_stones(stone)) + def add(self:"A_CLASS", y:int) -> int: + y = y+self.x + return y + +class B_CLASS(object): + b_class_i:int = 0 + +class C_CLASS(B_CLASS): + pass + +# var defs +a_s:str = "a_s" +b_s:str = "b_s" +c_s:str = "c_s" + +a_i:int = 0 +b_i:int = 0 +c_i:int = 0 + +a_b:bool = False +b_b:bool = False +c_b:bool = False + +a_list:[int] = None +b_list:[int] = None +c_list:[int] = None + +a_class:A_CLASS = None +b_class:B_CLASS = None +c_class:C_CLASS = None + + +# fun defs +def f_1() -> object: + def f_f_1() -> object: + global a_s + print(a_s) + pass + pass + +def f_2() -> object: + f_a_s:str = "s" + def f_f_2() -> object: + nonlocal f_a_s + print(f_a_s) + pass + pass + +def f_3(x:int) -> int: + f_b_s:int = 3 + return x*f_b_s + +# Declarations +a_list = [1, 2, 3] +b_list = [0, 0, 0] +c_list = [-1, -2, -3] + +a_class = A_CLASS(5) +b_class = 
B_CLASS() +c_class = C_CLASS() + + +# POSITIVE TEST CASES + +#------------------- +# String operations +# String addition and assignment operations +a_s = a_s + b_s +print(a_s) + +# Assigning to a string with string indexing operation +c_s = a_s[0] +print(c_s) + + +# -------------------- +# Boolean operations +a_b = a_i == b_i and not b_b +print(a_b) + + +# -------------------- +# List operations +a_list = a_list + b_list +c_i = a_list[0] +a_list[1] = 2 + + +# -------------------- +# function operations +a_i = f_3(3) + 5 +f_1() +f_2() + + +# -------------------- +# class operations +a_i = a_class.add(2) +print(a_i) + + +a_i = a_class.add(c_class.b_class_i) +print(a_i) diff --git a/test_py_file.sh b/test_py_file.sh index 4e47b37..8334e48 100755 --- a/test_py_file.sh +++ b/test_py_file.sh @@ -9,8 +9,10 @@ fi echo "Testing file ${FILENAME}" +echo "Generating .ast.typed file using student parser and reference analyzer" java -cp "chocopy-ref.jar:target/assignment.jar" chocopy.ChocoPy --pass=sr \ ${FILENAME} --out=${FILENAME}.ast.typed +echo "Comparing the previous output with student parser and student analyzer" java -cp "chocopy-ref.jar:target/assignment.jar" chocopy.ChocoPy \ --pass=ss --test ${FILENAME}