Perl Project

Perl Project
Building Code Analysis and Instrumentation Tools and Trace Analysis Tools

CISC 470/670 - Programming Languages
Fall 2000

DUE DATES:
	Part 1: Code Analysis and Instrumentation Tool (Originally Due 9/12/00 at start of class, now due 9/14/00)
	Part 2: Trace Generation and Analysis Tool (Due 9/19/00 at start of class)

NOTE: This is a group project.  You will be assigned to a group to work on
this project.  To promote everyone's participation in the group, 
the project will be graded,
and then each individual's grade will be computed by weighting the project
grade by a percentage that is determined to reflect your individual
effort on the project. 
The weight for your individual effort on the project will be 
determined through peer and
self reviews.
If your reviews indicate that you did not participate at all on the
project meetings, discussions, codings, etc, then your weight will be 0,
and your grade on the project will be 0.  If your reviews indicate that you
participated actively in the meetings and coding, etc, then your weight 
will
be 100, and your grade on the project will be the project grade itself.
Similarly, it is possible to get a percentage between 0 and 100, depending on
your individual effort.
Each member of the group can contribute in a variety of ways, but each should
be involved in some part of the coding in order to gain the Perl experience.
This should be kept in mind when the responsibilities are divided among
the group.

PURPOSE: This project has several different purposes.
  - to gain some experience writing programs in a scripting language, Perl 
  - to apply regular expressions to build a useful tool related to programming languages
  - to gain the experience of building code analysis and instrumentation
tools and execution trace analysis tools
  - to gather and examine data on real object-oriented programs to gain
    some insight into what programmers really do in this paradigm 

MAJOR TASKS:
The project is divided into two pieces of software.

	1. The Code Analysis and Instrumentation Tool
	
	2. The Program Trace Generation and Analyzer 

TARGET PROGRAMS FOR ANALYSIS:
You may either target Java or C++ programs for this analysis.  
We will supply some programs for you to analyze, in addition to ones
you may be able to dig up from the internet.

Code Analysis and Instrumentation Tool.
Your Code Analysis and Instrumentation Tool (let's call it CAN_IT for short)
should have a switch on the command line that indicates whether the user
wants to perform (1) static analysis, (2) code instrumentation for
dynamic analysis, or (3) both.
CAN_IT should then act accordingly.  CAN_IT should be able to work with
a set of files that make up the application, but need not analyze any
libraries called by the user program.  

The static analyzer component counts the number of occurrences of syntactic
constructions from the list below:

(1) lines of code (This is usually called LOC in the literature, and does
not contain comments and blank lines.)
(2) number of classes 
(3) number of occurrences of explicit exception handling coded by the user 
(4) number of loops (in total, and number of each of the different kinds
of loops in the language; e.g., number of for's, number of while's,...)
(5) number of conditionals
(6) number of method calls
(7) number of instances of print statements
(8) number of comment lines
(9) number of variables declared of primitive types (int, boolean,...)
(10) number of occurrences of the new construct 

All of these counts are called static because it is only an indication of
how many exist in the program, not how many actually get executed on a given
execution of the program.

Your task for static analysis:
Undergraduate groups should implement their CAN_IT to
count the items numbered 1-5 above, while the graduate 
groups should implement theirs to count the items numbered 1, 6-10 above.
A nicely formatted file containing a report of the final values for
each counter (nicely labeled) should be produced by the static analyzer.

Extra Credit:
Also count the following:


(1) number of calls to user-defined routines versus library routines


(2) number of potentially polymorphic calls


(3) number of classes that are extended from other classes (can look for
a:b in C++ or extended keywords in Java. 

The code instrumentation  part of CAN_IT is responsible 
for inserting counter
statements into the code to create an instrumented
program that will be compiled and run to obtain dynamic
statistics.  
Note that the code instrumentation statements should not be included
in the static analysis statistics, so the static analysis should occur
first on the original program.
While CAN_IT could insert counters for every instruction to find
out which instruction was executed most frequently,
we are not interested in quite that much detail.

The specific counters to be inserted include:

(1) a counter at the start of each method to count how many times  that method was
executed.
(2) a counter immediately before each method call to count how many times that method call was executed.
(3) counters at each branch of each conditional to count how many times the true branch
is taken versus the false branch. 

In order to create a readable and informative report with the final values
of the various counters, each counter that is created by your tool needs
to have an associated line number or name, indicating the method, method call,
or conditional which it is profiling.  
You should use your judgement in what to maintain for each counter and how
to produce a nicely formatted report of the profiling information.
CAN_IT should insert print statements to cause the final results of each
counter to be printed to a separate profile file, separate from
the program's normal output.
You might want to look at the output of gprof as an example of such a report.

Program Trace Generator and Analyzer.
There are some times when it is helpful to obtain a trace of part or all of
the execution of the program.  Different kinds of events can be traced.
The trace gives information about the ordering of the events of interest,
in addition to the frequency of the individual events.  

Your task is to enhance CAN_IT to have a switch for tracing.  The switch
indicates a single method to be traced each time it is called.  When tracing
is turn on, CAN_IT should do the following in addition to its other actions:

1. Insert print's into the program to trace all method calls and 
returns of the program in order to produce a trace file
that contains the dynamic call chain.  
2. Insert print's into the program to trace the execution of the method 
that was indicated by the switch, say method SUB.
The trace should output to a separate trace file a trace that
contains for each call to SUB, a line saying "enter SUB", 
the line number of every branch instruction with T or F to indicate
whether it was evaluated to true or false, and then, an 
"exit SUB" line.  A single trace
file for SUB and a trace file for the method call chain
should be produced in addition to the program's normal ouput
and other dynamic analysis results.

Extra Credit:
Create a trace analyzer that scans the call chain trace
and detects any recursion in the trace.  Recursion would cause
a call chain that contains the same name more than once before a return from
the repeating method,
e.g., A->B->C->D->return D->B->...  has recursion involving B.
A report that shows 
any recursive chains (starting from the repeating method) should be created.
If none exist, then the report should indicate no recursion executed. 

What to Hand in:
You may hand in your assignment by either setting up a password protected
web site for the TA to use (and email the password to the TA and instructor)
or emailing the TA with a set of attachments.  

For the first due date, you need to include the following:

(1) the perl program for CAN_IT
(2) a README file that explains
what your tool does, how to invoke it, any other nuances in order to
use it properly, and your group members' names and contributions listed.
(3) a table of your static analysis results 
and a table
of your dynamic analysis results for your set of programs, at
least 5 different programs run and analyzed.  
Your tables should show the results from your analyzer, with each program as
a separate line of the table.
(4) a scripted run of your tool demonstrating how it works for one program

For the second due date, you need to include the following:

(1) your updated CAN_IT 
(2) a README file that
explains your changes, how to invoke it, any other nuances
in order to use it properly, and your group members' names and contributions
listed.
(3) a scripted run demonstrating how it works for one program


Last Change: September 4, 2000 /
pollock@cis.udel.edu
Perl Project Building Code Analysis and Instrumentation Tools and Trace Analysis Tools CISC 470/670 - Programming Languages Fall 2000

Code Analysis and Instrumentation Tool.

Program Trace Generator and Analyzer.

What to Hand in:

Perl Project
Building Code Analysis and Instrumentation Tools and Trace Analysis Tools

CISC 470/670 - Programming Languages
Fall 2000