Lecture 15

Data type
- implicit context for operations
- limits the set of operations that may be performed in a semantically valid program

Type system
- a mechanism to define types and associate them with certain language constructs
- a set of rules for type equivalence, type compatibility, and type inference
  > type equivalence: determine whether the types of two values are the same
  > type compatibility: determine whether a value of a given type can be used in a given context
  > type inference: determine the type of an expression based on the types of its constituent parts or the surrounding context

Definition of types
- denotational: a type is simply a set (or domain) of values; a value has a given type if it belongs to the set
  > a data type is a set of values
- constructive: a type is either one of a small collection of built-in types, or a composite type created by applying a type constructor to simpler types
- abstraction-based: a type is an interface consisting of a set of operations with well-defined and mutually consistent semantics
  > a data type is a set of values, together with a set of operations on those values having certain properties

Classification of types
- built-in (or primitive) types: integers, characters, Booleans, and real numbers
- enumeration types
- subrange types
- composite types
  > records (Cartesian products)
  > variant records (unions)
  > arrays
  > sets
  > pointers
  > lists
  > files

Note: in declarations of types, there are two kinds:
  o named types, and
  o anonymous types (type constructors are applied without giving them names)
e.g., in the following C code, variable a has a data type with two names; variable b's type has one name, RecB; and variable c's type has no name:
  struct RecA { char x; int y; } a;
  typedef struct RecA RecA;
  typedef struct { char x; int y; } RecB;
  RecB b;
  struct { char x; int y; } c;

Type checking
- static: C, ML
- dynamic: Lisp, Scheme, Smalltalk
- strong: ML, Ada
- weak: C, Scheme

benefits of strong typing:
  o static type information allows compilers to allocate memory efficiently
  o saves compilers from generating code for run-time type checking
  o readability
  o reliability

- type equivalence:
  > structural equivalence: two types are the same if they consist of the same components, put together in the same way
    e.g., in Algol-68, C, and ML
    how to determine:
      step 1. expand each type definition recursively, replacing embedded type names with their respective definitions, until only built-in types remain
      step 2. compare the two expanded strings; if they are the same, then the types are equivalent
    e.g.,
      type foo = record
        a : integer
        b : integer
      end;
      type bar = record
        b : integer
        a : integer
      end;
    If structural equivalence is adopted, foo and bar are not the same type (the fields appear in a different order). Similarly, struct RecA { char x; int y; } and struct RecB { char a; int b; } are not considered equivalent (the field names differ).
    pros: straightforward and easy to implement
    cons:
      - implementation-oriented
      - cannot distinguish between types that programmers may think of as distinct, but which happen by coincidence to have the same internal structure
    e.g.,
      type student = record
        name, address : string
        age : integer
      end;
      type school = record
        name, address : string
        age : integer
      end;
      x : student;
      y : school;
      .....
      x := y;    (legal under structural equivalence, but almost certainly a mistake)
  > name equivalence: each type definition introduces a new type (this is stricter than structural equivalence)
    e.g., in Java, Pascal, and Ada
    e.g.,
      struct RecA { char x; int y; };
      typedef struct RecA RecA;
      struct RecA a;
      RecA b;
      struct RecA c;
      struct { char x; int y; } d;
    All of the variables a, b, c, and d are structurally equivalent. Under strict name equivalence, a and c are name equivalent, and they are not name equivalent to b and d.
    b and d are not name equivalent to any other variables.
    caveat: how should aliased types (e.g., those introduced by typedef) be treated?
      aliases distinct   --> strict name equivalence
      aliases equivalent --> loose name equivalence
  > type conversion and casts: use a value of one type where another type is expected
    e.g., x = (int) (2.3 + (double) (x / 2));

- Type compatibility and coercion
  S is compatible with T iff one of the following is true:
    1) S and T are equivalent,
    2) one is a subtype of the other, or
    3) both are arrays with the same numbers and types of elements in each dimension
  a type coercion is just an implicit conversion allowed by type compatibility
  e.g.,
    int x = 3;
    ...
    x = 2.3 + x / 2;
  variable x has the value 3 after the assignment, which involves two implicit conversions:
    int -> double (1 -> 1.0)    widening
    double -> int (3.3 -> 3)    narrowing

- Type inference
  can be checked using syntax trees
  e.g., (in C-like syntax) the expression a[i] + i

          +
         / \
       []   i
      /  \
     a    i

  Hindley-Milner type checking
  > assign type variables to all the names that do not yet have known types
  > instantiation: once a type variable is replaced by an actual type, all instances of that variable (subject to scope rules) must be updated to that same type
  > unification:
    e.g.,
      E1 : 'a * int
      E2 : string * 'b
      if x then E1 else E2
    By unifying what is known about E1 and E2, we arrive at the conclusion that both E1 and E2 are of type string * int. It is a kind of pattern matching.
  In Hindley-Milner type checking, types can remain as general as possible ==> polymorphism
  e.g., a[i] = b[i]
    type checking narrows the types of a and b to "array of 'a", but 'a is still generic
  Note:
    1) This is parametric polymorphism (versus overloaded functions).
    2) This is implicit parametric polymorphism (versus explicit polymorphism), e.g.,
         typedef struct StackNode {
           ?? data;    // how to make this polymorphic?
           struct StackNode * next;
         } * Stack;
       In ML this can be done as follows:
         datatype 'a Stack = EmptyStack
                           | Stack of 'a * ('a Stack);
    3) subtype polymorphism (will be discussed with OO programming in the next 3 lectures)