Homework 4: HTML indentation.

Due Thursday, May 6.

The HTML markup language consists of text interspersed with "tags", which provide information about how the text is to be displayed. Each tag is enclosed between `<' and `>'. For instance, `<p>' indicates the start of a new paragraph and `</p>' indicates the end of a paragraph. `<ol>' starts a numbered list, `<li>' starts an item of the list, and `</ol>' indicates the end of the list.

When editing HTML source text containing all these tags, it is very helpful if the source is indented in a systematic way, much like the indentation we use with C++ programs. The rules of the indentation we want are very simple. Divide the text into segments called "tokens". A token is either an HTML tag or a portion of text that contains no tags and runs from the end of the previous token (or the start of the line) to the start of the next tag or the end of the line, whichever comes first.
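
The token definition above can be sketched in C++ roughly as follows. This is only an illustration, not the required design: the names tokenize and trim are hypothetical, and the sketch assumes that surrounding blanks are stripped from each token, that whitespace-only text is not a token, and that a `<' with no matching `>' on the same line runs to the end of the line. None of these choices is specified by the assignment.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: strip leading and trailing blanks from a piece of text.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r";
    std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Split one line of HTML source into tokens. A token is either a tag
// ('<' through the matching '>') or a run of tag-free text that ends at
// the next '<' or at the end of the line.
std::vector<std::string> tokenize(const std::string& line) {
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    while (pos < line.size()) {
        std::size_t end;
        if (line[pos] == '<') {
            // Tag token: include the closing '>'.
            // Assumption: an unclosed tag extends to the end of the line.
            end = line.find('>', pos);
            end = (end == std::string::npos) ? line.size() : end + 1;
        } else {
            // Text token: stop just before the next tag, or at end of line.
            end = line.find('<', pos);
            if (end == std::string::npos) end = line.size();
        }
        // Assumption: blank-only text between tags is dropped.
        std::string tok = trim(line.substr(pos, end - pos));
        if (!tok.empty()) tokens.push_back(tok);
        pos = end;
    }
    return tokens;
}
```

Under these assumptions, calling tokenize on a line such as "<p> hello </p>" yields the three tokens <p>, hello, and </p>. A real solution would read the input line by line (for example with std::getline) and then decide how far to indent each token.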

For example, non-indented:
<ol> <li> first item <li> second item </ol>
and properly indented:
<ol>
  <li> 
    first item 
  <li> 
    second item
</ol>
Incidentally, either would be displayed by a browser like this:
  1. first item
  2. second item

The project directory is ~saunders/220/html/. It contains a Makefile, tools.h, and test*.in files. The Makefile and tools.h are for your convenience and can be used as you wish. The Makefile assumes you will create an indent.cc file containing the program (probably #including tools.h). The test*.in files are small test cases. You will be asked to run your program on a larger test case near the due date.