Course Overview
Course Learning Objectives
This course will focus on the problems, current techniques, and
remaining open research challenges in developing and applying text analysis for
constructing effective software engineering tools.
Text analysis extracts, analyzes and leverages
information available in the text found in software artifacts.
Text includes the words, phrases, and sentences in natural language
text as well as identifiers and comments that programmers use in
writing code.
At the end of the semester, a student completing the course should have:
- general knowledge of the kinds of information available in software artifacts for their intended purpose and potentially useful for other purposes
- general knowledge of the kinds of software engineering analyses and tools that might benefit from text analysis: their goals, state of the art, advantages and limitations
- capability to write scripts to extract information from software artifacts
- knowledge of the state of the art and first-hand experience performing:
  - text normalization, including tokenizing text, identifier splitting, word stemming, abbreviation expansion, and stop word list construction and use
  - information retrieval applied to software artifacts, including IR models, query formulation and reformulation, corpus preparation and filtering techniques, and evaluation methods
  - natural language processing applied to software artifacts, including part-of-speech tagging, lexicalization, ontology and dictionary creation, and text generation
- general knowledge of and experience applying text analysis to software engineering client tools, such as concern location, code search and navigation, code similarity detection, code classification, traceability between documents, documentation generation, code summarization, and refactoring.
- skills to read, summarize, and critique a research paper, with particular attention to gaining a clear understanding of motivation, goals, approaches, limitations, contributions, and experimental evaluation methods
- ability to coherently present research material to a group of informed researchers
- ability to listen carefully to and evaluate a research presentation
- comfort level and quality of contributions in actively participating in a discussion of a research problem and ideas for solving that problem
- skills to write a clear summary of a set of research papers, including a critique of the research that puts the research into perspective
- experience in designing, conducting, and presenting an independent project
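To make the text normalization objective above concrete, here is a minimal sketch of identifier splitting and stop word filtering. The function names, the regular expressions, and the tiny stop word list are illustrative assumptions, not course-provided code; stemming and abbreviation expansion are left out.

```python
import re

# Hypothetical, deliberately tiny stop word list; real lists are built per corpus.
STOP_WORDS = {"the", "a", "an", "of", "to", "is"}

def split_identifier(identifier):
    """Split an identifier at underscores, digits, and camelCase boundaries."""
    # First alternative: an acronym run such as "XML" in "parseXMLFile".
    # Second alternative: an optionally capitalized lowercase word.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", identifier)
    return [part.lower() for part in parts]

def normalize(text):
    """Tokenize text, split each token like an identifier, drop stop words."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text)
    words = []
    for token in tokens:
        words.extend(split_identifier(token))
    return [word for word in words if word not in STOP_WORDS]

print(split_identifier("parseXMLFile"))                # ['parse', 'xml', 'file']
print(normalize("Returns the size of the inputFile"))  # ['returns', 'size', 'input', 'file']
```

The same pipeline handles comments and code alike, which is one simple way to treat natural language text and programmer-chosen identifiers uniformly.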
Course Format
CISC 879 is an advanced graduate-level course that follows a seminar-discussion
format.
In particular, we will be reading and discussing papers from the most recent
relevant conferences and journals, as well as foundational papers.
Meeting Times: TTh 11:00 AM-12:15 PM (3 hours)
Meeting Place: 102A Smith Hall
Prerequisite:
Background in algorithms and programming languages would be helpful.
Restrictions: Undergraduates must obtain instructor's permission.
Instructor:
Lori Pollock
pollock cis udel edu,
101D Smith Hall (831-1953)
Office Hours:
Thursdays, 9:30-10:30 AM and 2:00-3:00 PM,
and by appointment.
Course Requirements
Readings
There is no required textbook. We will be reading research papers that can be downloaded for free from digital libraries or otherwise provided online.
Assignments and Grading
Your grade will be based on your performance in the various activities in the course. Some of the activities will be done in groups, and some will be done individually. The relative weights of the components of the grade will be approximately:
TENTATIVE UNTIL FIRST DAY OF CLASSES
- (10%) Technical Paper Summaries/Critiques - See the PDP form for more information. Once you have read a paper, the summary should take no more than 20-30 minutes to complete.
- (15%) Class Participation - Most class times will focus on active discussion of research papers that we have all read before class time. We will use several different models for discussion, all with the expectation that everyone participates actively. To earn full class participation points each class time, you need to make yourself visible and known by contributing insightful questions and comments to the discussion and group activities.
- (15%) Homework Assignments for gaining first-hand experience with different text analysis techniques and client tools. These will be mostly in the first half of the semester.
- (40%) Research Project - Individual or with a partner
- (10%) Knowledge Exam - There will be one in-class, written exam to assess your knowledge of the concepts, problems, techniques, and limitations covered in the papers discussed in class and practiced in homework assignments. This exam will take place before the final project presentations begin, about three quarters of the way through the semester.
- (10%) Class Teaching/Discussion Leadership - Each student will work with a partner or individually to develop and deliver one 30-minute presentation that gives the class an overview of a small set of research papers.
How to Increase your Learning and Success in CISC 879
- Participate actively in class. This course is not meant to be a passive learning course, as it has been shown that the best learning occurs when the learner is an active participant, not a passive listener. Besides, classes are much more enjoyable when the audience actively participates!
- Read the assigned research papers actively, but efficiently. The readings in this course are critical to your active participation in class meetings. Part of your grade is based on class participation. Besides, you should not expect to gain all understanding of the concepts from passively listening in the class periods alone. However, follow the guidelines discussed in class for efficient ways to read technical papers based on the goals of the reading.
- Work individually and conscientiously on writing paper summaries/critiques.
- Take an active role in your group projects. Take responsibility for your parts of the project or activity. Ask your group members to explain concepts you do not clearly understand, and share ideas among the group members. Meet regularly with your group for out-of-class projects.
- Form an informal study group outside of class. Compare your notes from class, work together on group projects, and discuss concepts you find unclear.
- Seek help as soon as you start to feel lost. Take advantage of instructor office hours.
Course Policies
Due Dates and Lateness:
The due dates are to be taken seriously, and you should not expect them to be extended. The pace of work is implicit in the due dates and is necessary if you expect to finish by the end of the semester. NO late programs or homeworks will be accepted FOR FULL CREDIT without discussion with me prior to the due date. If you cannot reach me, leave a message on my voicemail or send email. All other assignments not delivered by the due date are considered late.
Unless otherwise stated, late assignments will be penalized 5% of the total possible points if turned in within the first 24-hour period after the specified due date and time, and an additional 5% for each 24-hour period (or fraction of a day), including weekends, after that, up to one week after the due date. Assignments submitted more than one week after the due date without an approved excuse will not be accepted. It is up to you to determine the version of your assignment to be graded. You must weigh the late penalty against the completeness of your assignment.
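As a worked illustration of the late penalty arithmetic, the sketch below encodes one reading of the policy above: 5% per started 24-hour period, with nothing accepted after seven days. The function name and the choice to return `None` for unaccepted work are assumptions made for this example, and the case of an approved excuse is not modeled.

```python
import math

PENALTY_PERCENT_PER_DAY = 5  # 5% of the total possible points per 24-hour period
MAX_LATE_DAYS = 7            # assumption: nothing is accepted after one week

def late_penalty_percent(hours_late):
    """Return the percentage deducted, or None if the work is no longer accepted."""
    if hours_late <= 0:
        return 0
    # A fraction of a day counts as a full 24-hour period; weekends count too.
    days_late = math.ceil(hours_late / 24)
    if days_late > MAX_LATE_DAYS:
        return None
    return PENALTY_PERCENT_PER_DAY * days_late

print(late_penalty_percent(3))    # 5
print(late_penalty_percent(30))   # 10
print(late_penalty_percent(168))  # 35
print(late_penalty_percent(169))  # None
```

Under this reading, an assignment that is 30 hours late loses 10% of the possible points, so submitting a more complete version is only worthwhile if it gains more than that.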
Regrading Policy:
If you are dissatisfied with a grade on a homework, programming assignment, or exam, you should consult the instructor directly within one week of the day the graded assignment was returned to you. No regrade requests will be considered after this one-week period.
Policy on Academic Dishonesty
You will be told specifically which assignments are to be done collaboratively in groups, and which ones should be done individually without collaboration. For individual assignments, you should direct your questions to the instructor, not to other students, unless the question is a clarification question. Any evidence of collaboration other than this kind will be handled as stated in the Official Student Handbook of the University of Delaware. You should not use or examine any program code from projects in prior offerings of this course. If you are in doubt regarding the requirements, please consult with me before you complete any requirement of this course.
Collaboration Policy:
Each assignment specification will clearly state whether you work in groups or individually. For the purposes of the collaboration policy, students choosing to work with a partner are effectively considered one entity and are freely allowed to exchange, help, design, and code with one another, but the guidelines below apply outside the partnership (neither of you should be debugging, sharing code, etc. with other people or teams). There are also some specific rules that apply within the partnership.
Things that are always allowed
These things are encouraged and allowed at all times for all students.
- Discussing material covered in lecture or handouts.
- Discussing the requirements of an assignment.
- Discussing features of any programming language.
- Discussing how to use the tools or development environments.
- Discussing general techniques of coding or debugging.
- Any discussion between the student and a TA, or me. You are welcome to discuss any and all ideas, design, code, debugging, and details with the course staff (TA and instructor). They are the best folks to talk to because they are knowledgeable about all the material and know how to help you without giving away the farm. They also have the authority to give you definitive answers to your questions.
Collaboration that is allowed if documented
Two students engaging in a more detailed discussion of the project specifics can cross into the area of collaboration that is acceptable only if documented. I require that you include the names of those from whom you received specific assistance and properly credit their contribution, as you would cite a reference in a research paper. Some examples:
- Discussing the design of the project. Design is a crucial part of the programming process, and discussion can be valuable, but you should take care to document any design input you got from others.
- Getting help from another student in order to debug your code. You should credit their assistance.
- Sharing advice about testing. For example, if the team next to you tells you about some lesson learned ("our program didn't handle the case where the input file didn't end with a newline") that you then use to improve your program's robustness, you should credit them for providing you with that insight.
Collaboration that is NOT allowed
Basically, the rule is that you should be handing in code or other work which represents your original, independent work. It should not be based on, influenced by, or copied from anyone else's.
- Copying code or other work. This is the most blatant violation. You should not be copying anyone else's work. You should also not allow anyone else to copy yours. You should keep your work secure (restrict access on the filesystem, don't leave printouts lying around, etc.).
- Using work from past semesters. Using someone's work or solutions from a previous semester is an obvious violation.
- Looking for this project on the Internet and copying the code or other work.
- Studying someone else's code. You should not be reading anyone else's code whether it is on the screen or written out by hand.
- Debugging someone else's code. Debugging along with someone makes it too easy to look over their code and allow (sometimes unintended) code-copying. Describing to someone a problem and asking for advice on how to track it down is okay, but you should do the actual debugging yourself.
Resource Usage Policy:
Under no circumstances should you be copying work written by others found on the internet or provided by others in other ways. There is no learning taking place in such actions.
Closing thoughts
Above all you should use your common sense. If you suspect that what you are about to do is a violation, play it safe and ask a staff member first rather than take risks with your academic career.
Cheating is taken very seriously in this course. Please do your part in maintaining a community where academic work is done with a high standard of integrity!
Some parts of this document are based on a similar collaboration policy for CS courses at Brown, Drexel, and Stanford.