Sara's Guide to Writing a Good Lab Report

This document is a work in progress. Let me know if you have specific questions about something I did not address or if I should clarify anything.

I think I scared the class off with all of my feedback on the first lab. I give you feedback so that you'll learn from it, understand our expectations, and, of course, improve your grades.

Start Early and Ask Questions

Many of you seem to hit the same problems and points of confusion on the lab, yet few people ask questions until the lab is almost due. We will post common questions and answers on the blog, but we don't know what you're confused about until you ask.

Break each assignment into "doable" chunks. You should be very adept at problem solving. You can attack one chunk and make sure that it works (say, distribution), then attack another part (collection), and then put the two pieces together and make sure that they work. Use "dummy" data--placeholders for the data so that it will be clear if you made a mistake in one of the pieces. (For example, use a matrix filled with the numbers 1 to N rather than a matrix filled with 0s: if every element is 0, misplaced or missing data looks the same as correct data.)
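As a minimal sketch of testing one chunk at a time with dummy data, the C code below fills an array with 1 to N, copies it out in P blocks (standing in for distribution), copies the blocks back (standing in for collection), and checks that the round trip reproduces the input. The function names, the MAX_N limit, and the use of plain array copies instead of real message passing are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_N 64  /* illustrative upper bound on the data size */

/* Fill data[0..n-1] with 1..n so that every element is distinguishable. */
void fillDummy(int *data, size_t n) {
    for (size_t i = 0; i < n; i++) {
        data[i] = (int)(i + 1);
    }
}

/* Stand-in for distribution: copy block `rank` (n/p elements) out of src.
 * Assumes p divides n evenly. */
void distributeBlock(const int *src, size_t n, size_t p, size_t rank, int *local) {
    assert(p > 0 && n % p == 0 && rank < p);
    size_t block = n / p;
    memcpy(local, src + rank * block, block * sizeof(int));
}

/* Stand-in for collection: copy a local block back into its slot in dst. */
void collectBlock(int *dst, size_t n, size_t p, size_t rank, const int *local) {
    assert(p > 0 && n % p == 0 && rank < p);
    size_t block = n / p;
    memcpy(dst + rank * block, local, block * sizeof(int));
}

/* Returns 1 if distributing and then collecting reproduces the dummy data. */
int roundTripMatches(size_t n, size_t p) {
    int original[MAX_N], collected[MAX_N], local[MAX_N];
    assert(n <= MAX_N && p > 0 && n % p == 0);
    fillDummy(original, n);
    memset(collected, 0, sizeof collected);
    for (size_t rank = 0; rank < p; rank++) {
        distributeBlock(original, n, p, rank, local);
        collectBlock(collected, n, p, rank, local);
    }
    return memcmp(original, collected, n * sizeof(int)) == 0;
}
```

Because every element is distinct, a block copied to the wrong offset makes roundTripMatches return 0; with an all-zero array the same bug would go unnoticed.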

• The Art of Asking a Question

Too often I get email like, "I have a bug. Can you look at my code?" I can, but that is a waste of my time because I don't know a) what the bug is or b) how the bug manifested itself (what type of problem you're having). You should describe the bug in detail and show that you have plenty of print statements or whatever is necessary for me to understand the problem. You should also send me an example execution that shows the bug. When possible, show the bug in the smallest case, e.g., with only 1 or 2 processes and with a small data set. You should also get rid of any irrelevant code so that it's easier to look through. (You should do most of these steps on your own to improve your debugging process.)

Write an Elegant, Efficient Program

Some of you assume that a correct program is a "good" program. A correct program is not necessarily a good program. Correctness only gets at the program's functionality--does the program generate the appropriate output? Of course we want correct programs, but programs must also be efficient and readable. Efficiency and readability may not be your first goal (get a working program first), but once the program works, attack the bottlenecks (which are usually obvious in your timings) and make sure that someone else can read and understand your implementation.

• Efficiency

In this course, we usually think of efficiency in terms of memory, computation, and communication. (Clearly, the three are not mutually exclusive.) You should know a lot about memory and computation efficiency from earlier classes. The new part is communication efficiency. You want to send few messages with only the necessary data. Avoid having a central bottleneck. Try to find a process that has the data and will not be overwhelmed by other requests.

• Readability

I hate to play the "real world" card, but ... In the real world, managers, technical leads, and peers may read your code to review, debug, or maintain the code. They must be able to read and understand your code quickly. Using comments that describe the high-level ideas, such as what a class, a function, or a block of code does, will make the process much easier. Think about what you'll look for when you have to maintain or debug someone else's code. If that doesn't motivate you, think about when you need to return to these assignments months or years later to help you solve your current problem. You won't remember what you were doing ("Lab 3 part 2? What was that?"), but the comments will jog your memory ("Lab 3: Parallel solution to Life Game, using row-block data distribution of matrix. After initial distribution of matrix, requires communication between neighbors... The rules of life are ..."). And, finally, I need the comments at the top of the programs so that I can easily identify who wrote each program; the programs often have similar names and I don't know whose program is whose without the comments.

The style rules from introductory programming classes still apply.

If you attacked the problem well, your code should be easy to follow. Your solution should break into easily defined pieces that can be identified by good comments (/* distributing data in blocks of size N/P, where N is the size of the data and P is the number of processors */) and/or be put into functions ( distributeData() ). Confusing and/or inaccurate comments may be worse than no comments because the reader may misinterpret what your code does.

Your code should start with your name and a description of the program. Take proper credit! Someone maintaining your code may need to ask you for help understanding your code. (Of course, with your superior descriptions in comments, they might not need to ask for your help.)
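A minimal sketch of these commenting conventions in C follows. The header fields and the helper names blockCount and blockStart are placeholders chosen for illustration, and the handling of a remainder when P does not divide N is an assumption that goes beyond the N/P case mentioned above.

```c
/*
 * Author:  (your name here)
 * Program: (lab number and title)
 * Purpose: Split N data elements into P contiguous blocks, one per
 *          process, so that block sizes differ by at most one element.
 */
#include <assert.h>

/* Number of elements owned by process `rank`. Every process gets N/P
 * elements; the first N % P processes get one extra. */
int blockCount(int n, int p, int rank) {
    assert(n >= 0 && p > 0 && rank >= 0 && rank < p);
    return n / p + (rank < n % p ? 1 : 0);
}

/* Index of the first element owned by process `rank`. */
int blockStart(int n, int p, int rank) {
    assert(n >= 0 && p > 0 && rank >= 0 && rank < p);
    int base = n / p;    /* elements that every process receives */
    int extra = n % p;   /* number of processes that receive one more */
    return rank * base + (rank < extra ? rank : extra);
}
```

With N = 10 and P = 3, the blocks start at indices 0, 4, and 7 and hold 4, 3, and 3 elements.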

Always comment magic numbers by describing what they're used to accomplish. ( boardSize[i]*2 + 2*numRows -1; //allocate extra space for X, Y, Z ... )
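One way to go a step further is to give the magic number a name, so the comment and the code cannot drift apart. The sketch below assumes a hypothetical Life board layout in which each process stores one extra "ghost" row above and one below its own rows; the names GHOST_ROWS and localBoardCells are invented for this example.

```c
#include <assert.h>

/* One ghost row above and one below the local rows hold copies of the
 * neighbors' boundary rows (hypothetical layout for illustration). */
enum { GHOST_ROWS = 2 };

/* Number of cells to allocate for a process's part of the board,
 * including the ghost rows. */
int localBoardCells(int localRows, int numCols) {
    assert(localRows >= 0 && numCols >= 0);
    return (localRows + GHOST_ROWS) * numCols;
}
```

A reader who sees GHOST_ROWS does not have to guess why the allocation is larger than localRows * numCols.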

Improper indentation and alignment are inexcusable because a text editor can fix both automatically just before you submit. For example, in Emacs, you can select the entire program and choose "Indent Line or Region" and your code is all lined up neatly.

Write a Thorough Analysis

Unlike the writing in many English classes, technical writing is not meant to be creative. You are trying to make another person understand your results and what to conclude from those results. You want them to know that you discovered an inverse relationship between a diet rich in fruit and vegetables (independent variable) and risk of cancer (dependent variable); therefore, people should eat more fruits and vegetables.

Don't leave analysis to the last minute. Think about the goals of the lab and what you need to show. Sit with your timing results and think about them in all different "directions". Think about how the results changed as you varied each independent variable (e.g., data distribution, number of processors). In general, dependent variables are the things that you're measuring, while the independent variables are the "givens". You want to see how the dependent variable (e.g., execution time) changed as you changed an independent variable (e.g., number of processors).

Think about the different dependent variables. If the dependent variables are related (e.g., pieces of a larger metric), analyze the pieces individually and in aggregate. For example, consider whether the total time is dominated by one specific piece of the program (e.g., data distribution). If it is, explain why that piece dominates the total time; that is, explain the source of the bottleneck.
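Checking whether one piece dominates can be sketched in a few lines of C. The helper names dominantPhase and phaseFraction are hypothetical, and the array is assumed to hold one measured time in seconds per piece of the program (for example distribution, computation, collection).

```c
#include <assert.h>
#include <stddef.h>

/* Index of the piece with the largest measured time. */
size_t dominantPhase(const double *seconds, size_t count) {
    assert(count > 0);
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        if (seconds[i] > seconds[best]) {
            best = i;
        }
    }
    return best;
}

/* Fraction of the total time spent in piece `index` (0 if total is 0). */
double phaseFraction(const double *seconds, size_t count, size_t index) {
    assert(index < count);
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += seconds[i];
    }
    return total > 0.0 ? seconds[index] / total : 0.0;
}
```

The fraction only tells you which piece to look at; the analysis still has to explain why that piece takes so long.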

Sketch out all of the ideas on paper before you write your sentences. Figure out what the main ideas of your analysis are. Those ideas are likely to become a topic sentence for each paragraph of your analysis.

Also, I'm not looking for quantity, just quality. If you talk about the effect of each independent variable on your various dependent variables succinctly and completely, you'll have a good analysis.

• Pattern for writing analysis

When you write the analysis of a graph, follow this process: summarize the general trend, give an example of the trend, and give a counterexample of the trend, if appropriate. Make sure you state the trend explicitly and precisely ("larger" and "smaller", rather than "higher" or "lower") and don't only talk about individual points. Clearly state how the dependent variable varies with the independent variable (increases? decreases? remains constant?). Saying "as the independent variable changed" is not precise because the reader does not know how the independent variable is changing (increasing, decreasing, or otherwise).

Reread your document. Make sure the writing makes sense to someone who does not know your implementation details.

• Include your methodology

If the spec leaves something open, such as timing the different sections of your parallel algorithm, describe how you implemented that requirement. For example, explain what is included in the "computation" portion of the timing and why (one iteration of the loop? communication with neighbors?). Your methodology may affect your results, so it is important to describe your methodology. Make sure that your description makes sense without intimate knowledge of your implementation (e.g., not "I start the timing before the first Send and end after the third Recv"). Finally, be sure that you justify your decisions. If you choose not to time a chunk of code, explain why your decision is appropriate (not relevant, makes times more fair, ...).
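To make the idea of a documented timed region concrete, here is a minimal C sketch in which comments mark exactly what the "computation" time includes. It uses the standard clock() function, which measures processor time; if the lab uses MPI, MPI_Wtime() would usually take its place because it reports wall-clock time. The workload sumTo and the other function names are placeholders invented for this example.

```c
#include <assert.h>
#include <time.h>

/* Seconds of processor time between two clock() readings. */
double elapsedSeconds(clock_t start, clock_t end) {
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Placeholder for the real computation: sum of 1..n. */
long sumTo(long n) {
    long total = 0;
    for (long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Times only the computation. Setup before the call and any output
 * after it are deliberately outside the timed region. */
double timeComputation(long n, long *result) {
    assert(result != NULL);
    clock_t start = clock();   /* timed region begins */
    *result = sumTo(n);
    clock_t end = clock();     /* timed region ends */
    return elapsedSeconds(start, end);
}
```

In the report, the matching methodology sentence would describe the region in terms of the algorithm ("the computation time covers the summation only, not setup or output") rather than in terms of line numbers or individual calls.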

Correct English Grammar and Spelling

Writing clearly and concisely is an important skill. You will need to communicate with co-workers over email, and you want them to understand what you're saying easily.

I expect good English grammar (that means complete sentences) and, of course, you should spell-check your summary, hypothesis, analysis, etc. before submitting your lab.

Write precisely. Try not to use "this" and "that" because the reader may not know to what you are referring. Even using "this process" can be ambiguous. Use nouns as appropriate. For example, "each received some" should be written clearly, such as "each processor received some data".

Bonus: Write in active voice.

Present Your Results Clearly

Label your graph with the dependent variables/metrics (e.g., Execution Time, Size of Data), independent variables (e.g., Number of Processors), and the units (e.g., seconds, kilobytes).

If you're presenting a graph, also include the raw data used to generate the graph.
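One simple way to keep the raw data with the graph is to have the program write it as a small table whose header names each variable and its unit. The sketch below is an assumption about how that could look in C; the comma-separated layout, the column names, and the function names formatHeader and formatRow are illustrative only.

```c
#include <assert.h>
#include <stdio.h>

/* Writes a header line naming each column and its unit.
 * Returns the number of characters written (as snprintf does). */
int formatHeader(char *buf, size_t size) {
    return snprintf(buf, size, "Number of Processors,Execution Time (seconds)");
}

/* Writes one row of raw data: the independent variable first,
 * then the measured dependent variable. */
int formatRow(char *buf, size_t size, int processors, double seconds) {
    assert(processors > 0 && seconds >= 0.0);
    return snprintf(buf, size, "%d,%.3f", processors, seconds);
}
```

Printing the header once and then one row per run gives a table that can be pasted into the report next to the graph.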

Follow the Spec

This point seems like an easy/obvious one, and it probably is. Reread the directions frequently before submitting to make sure that you're doing all that is required. The spec contains details that make it easier for us to review and grade your homework. If you did something differently from what the spec says, check that your alternative is acceptable well before the lab is due.