I developed a Text Simplification system with Dr. Vijay Shanker and Oana Tudor. The system detects clause boundaries and simplifies natural-language text. The goal is to significantly reduce the complexity of sentences while preserving their meaning.
Here's an example that shows some of the difficulty introduced by conjunctions. The conjunction boundaries are highlighted. Notice that the basic ideas in the sentence would be hidden from most automatic systems by its length and variety. Using shallow processing and local clues, we hope to reduce this complexity so that simpler, more reliable patterns and systems can be effective on sentences such as this one.
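To make the idea of "shallow processing and local clues" a bit more concrete, here is a minimal Python sketch. It is not the system described above: the word list `COORDINATORS`, the function names, and the single clue it uses (a coordinating conjunction directly preceded by a comma) are all assumptions chosen for illustration. A real system would need many more clues to tell clause-level coordination apart from, say, coordinated nouns.

```python
import re

# Hypothetical, deliberately tiny list of coordinating conjunctions.
COORDINATORS = {"and", "or", "but"}

def candidate_boundaries(sentence):
    """Tokenize the sentence and return (tokens, indices), where each index
    points at a coordinating conjunction directly preceded by a comma."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    hits = [
        i for i, tok in enumerate(tokens)
        if tok.lower() in COORDINATORS and i > 0 and tokens[i - 1] == ","
    ]
    return tokens, hits

def split_at_boundaries(sentence):
    """Split the sentence into segments at each candidate boundary,
    dropping the comma and the conjunction themselves."""
    tokens, hits = candidate_boundaries(sentence)
    segments, start = [], 0
    for i in hits:
        segments.append(" ".join(tokens[start:i - 1]))  # stop before the comma
        start = i + 1                                   # resume after the conjunction
    segments.append(" ".join(tokens[start:]))
    return [s for s in segments if s]
```

Calling `split_at_boundaries` on a sentence with a comma before "and" yields two shorter segments, while a phrase such as "cats and dogs sleep" is left whole because the comma clue is absent. The point of the sketch is only that a cheap, local test can expose candidate boundaries without a full parse.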
Everything here should be considered "archives." We're currently annotating clause boundaries from simplification output on the GENIA corpus in order to: