Keith Trnka
PhD
Research

Various presentations

Most of these are word prediction talks without an associated publication. The presentations for my publications are available on the publication page next to each citation.

Below are descriptions of most of the projects I've worked on, starting with the most recent. Refer to the coursework section for any presentations I gave for a class.

Word Prediction (thesis work)

People who are unable to speak use devices that speak for them. They type words and the device speaks those words. But it's much slower than normal speech. The goal of word prediction is to help people type faster for these devices, especially people affected by motor impairments. It works much like the AutoComplete feature of Microsoft Word, by allowing the user to complete the rest of the word with the press of a single key. The one major difference from AutoComplete is that word prediction systems typically present the user with 5-7 candidates amongst which to choose. Similar features are built into many cell phones for text messaging.

In this work, I'm looking to improve the guesses of the word prediction software so that the correct word (the one the user wants) is presented earlier and at the top of the list of candidates. Currently, the most common method is the ngram model, which generates predictions that are appropriate for the previous few words. In general, these predictions are grammatically appropriate, but my work seeks to additionally make the predictions appropriate for the topic and style of the overall document or conversation.
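As a rough illustration of the ngram idea described above (not the thesis system itself), here is a minimal bigram sketch in Python. It ranks candidate words by how often they followed the previous word in a training text, and filters them by the letters typed so far. The function names `train_bigrams` and `predict`, the `<s>` start marker, and the tiny corpus are all assumptions made for this example.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with its predecessor; "<s>" marks the sentence start.
        for prev, word in zip(["<s>"] + words, words):
            counts[prev][word] += 1
    return counts

def predict(counts, prev_word, prefix="", n=5):
    """Return up to n words that followed prev_word and start with prefix."""
    followers = counts.get(prev_word.lower(), Counter())
    matches = [(w, c) for w, c in followers.items() if w.startswith(prefix)]
    # Most frequent first; ties broken alphabetically for a stable order.
    matches.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in matches[:n]]

# Toy training text, invented for illustration only.
corpus = [
    "i want to go to the store",
    "i want to go home",
    "i want the red one",
]
model = train_bigrams(corpus)

print(predict(model, "to"))         # candidates after "to", nothing typed yet
print(predict(model, "want", "t"))  # candidates after "want" once "t" is typed
```

A real system would use longer contexts and smoothing for unseen word pairs. The sketch only shows why such predictions fit the preceding words but know nothing about the topic of the document.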

Vehicle to Grid (V2G)

There are many problems with our power infrastructure - our reliance on both fossil fuels (e.g., coal) and foreign oil causes problems for our country. Renewable resources such as wind and solar power address these problems, but they generate power intermittently, so by themselves they are unable to fulfill our power needs. However, if coupled with various kinds of energy storage, renewable resources may be able to (partially) replace fossil fuels.

Additionally, in the existing infrastructure there is an economic incentive to connect batteries to the power grid - you can "sell" the ability of your batteries to stabilize the grid. There are several reasons why power companies need this. Generators take a while to respond to consumer demand, leaving a gap that needs to be temporarily filled by something that responds more quickly. Similarly, generators sometimes fail and a backup must kick in quickly to prevent things like blackouts.

The goal of this project is to develop a way to meet these needs in a future world where pure electric vehicles are common. While a pure electric vehicle is plugged in overnight or at work, not only can the battery be recharged, but it can also be used to sell grid stability. In the short term, this means electric vehicle owners would make money by leaving their cars plugged in more, enough to cover the costs of charging and more.

I designed and developed the original coalition software, which treats a collection of vehicles as a distributed battery. It takes quite a lot of work to use the batteries whenever possible without causing the cars' owners any trouble. My software runs both on a server that interacts with the power company and on the electric vehicles themselves, where it tells them to charge or discharge.
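To make the "distributed battery" idea concrete, here is a small hypothetical sketch in Python. It is not the coalition software described above. It only shows one way a server could split a discharge request across plugged-in vehicles without touching the energy each owner needs for driving. The `Vehicle` fields, the `dispatch` function, and the largest-spare-first ordering are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Vehicle:
    name: str
    charge_kwh: float    # energy currently in the battery
    reserve_kwh: float   # energy the owner needs kept for driving (assumed input)
    max_power_kw: float  # charger/inverter power limit

def available_kw(vehicle, hours):
    """Power the vehicle can supply for `hours` without using its reserve."""
    spare_kwh = max(0.0, vehicle.charge_kwh - vehicle.reserve_kwh)
    return min(vehicle.max_power_kw, spare_kwh / hours)

def dispatch(vehicles, request_kw, hours=1.0):
    """Split a discharge request across vehicles, largest spare capacity first.

    Returns (plan, unmet_kw), where plan maps vehicle name to assigned power.
    """
    plan = {}
    remaining = request_kw
    ranked = sorted(vehicles, key=lambda v: available_kw(v, hours), reverse=True)
    for vehicle in ranked:
        share = min(remaining, available_kw(vehicle, hours))
        if share > 0:
            plan[vehicle.name] = share
            remaining -= share
    return plan, remaining

# Invented fleet for illustration only.
fleet = [
    Vehicle("a", charge_kwh=30, reserve_kwh=10, max_power_kw=10),
    Vehicle("b", charge_kwh=12, reserve_kwh=10, max_power_kw=10),
    Vehicle("c", charge_kwh=8, reserve_kwh=10, max_power_kw=10),
]
plan, unmet = dispatch(fleet, request_kw=11, hours=1.0)
print(plan, unmet)
```

In this toy fleet, vehicle "c" is below its reserve and is never asked to discharge, which is the kind of owner protection the paragraph above refers to.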

Natural Language Generation in Summarization

Summarization has become an important application for language processing. Most research has focused on extracting the most important information from the original document and presenting it within some length restriction. However, the coherence of the resulting text is usually ignored in current research. I looked at applying Natural Language Generation (NLG) techniques to make the summary easier to read. I did a little work towards this goal in a class. The three problems I've identified are discourse marker generation, sentence generation, and referring expression generation.

I implemented discourse marker generation using an RST-based summarization algorithm and the RST Discourse Treebank, along with statistical methods to generate the discourse markers. Unfortunately, with a reasonably high probability threshold, I only generated a few discourse markers, at least one of which was not applicable because both text spans involved were quotations.

Parse Tree Application

Working with a system that requires a parser can be difficult. It's even more difficult to debug a grammar. To help accelerate research involving parsers, I've developed a tool for visualizing and comparing parse trees. This program, dubbed Parse Tree Application (PTA), lets a user enter a parse tree by hand or load one from a file, and then draws the tree. Two parses of the same sentence/string may be compared visually using PTA. Click on the link above to go to this application's website.

PTA has many useful features, such as the ability to output rendered trees in EPS or SVG, rendering of feature structures associated with constituents, and drawing of some partial parses (e.g., (S (NP I) (VP went (PP to the store)))).
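For readers unfamiliar with the bracketed notation in the example above, here is a minimal Python sketch of how such a string can be read into a nested structure before drawing. This is not PTA's code. The function names `parse_brackets` and `leaves` and the tuple representation are assumptions made for this example, and the sketch expects well-formed, balanced input.

```python
import re

def parse_brackets(text):
    """Read a bracketed parse such as "(S (NP I) (VP went))".

    Each node becomes (label, children); children are nodes or word strings.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = tokens[pos]  # the first token after "(" is the constituent label
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected text after the end of the tree")
    return tree

def leaves(tree):
    """Collect the words of the tree from left to right."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words

tree = parse_brackets("(S (NP I) (VP went (PP to the store)))")
print(tree)
print(" ".join(leaves(tree)))
```

Comparing two parses of the same string then amounts to checking that their leaves match and walking both trees side by side.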

Predicting the gender of first names

I had the idea that first names have associated morphology for gender. To evaluate this hypothesis, I collected lists of first names, separated by gender, for a few different countries (America, Ireland, India, Greece, France). I found that I could predict gender with about 80% accuracy for each using a simple rule learning algorithm, even simpler than sequential coverage.

Even though I allowed the system to learn multiple letters, the rules that were learned used only the last letter of the name.
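A last-letter rule of this kind can be sketched in a few lines of Python. The sketch below is a simple majority vote per final letter, which is an assumption for illustration and not necessarily the rule learner used in the project. The names `learn_last_letter_rules` and `predict_gender` and the toy name list are likewise invented for this example.

```python
from collections import Counter, defaultdict

def learn_last_letter_rules(labeled_names):
    """For each final letter, keep the gender seen most often in training."""
    votes = defaultdict(Counter)
    for name, gender in labeled_names:
        votes[name[-1].lower()][gender] += 1
    return {letter: c.most_common(1)[0][0] for letter, c in votes.items()}

def predict_gender(rules, name, default="unknown"):
    """Apply the learned rule for the name's final letter, if there is one."""
    return rules.get(name[-1].lower(), default)

# Toy training list, invented for illustration only.
toy_names = [
    ("Maria", "F"), ("Anna", "F"), ("Julia", "F"), ("Joshua", "M"),
    ("John", "M"), ("Brian", "M"), ("Kevin", "M"),
]
rules = learn_last_letter_rules(toy_names)

print(predict_gender(rules, "Sophia"))  # final letter "a"
print(predict_gender(rules, "Ethan"))   # final letter "n"
print(predict_gender(rules, "Zoe"))     # final letter not seen in training
```

Names like "Joshua" show why such rules top out well below perfect accuracy: the majority label for a final letter is wrong for every exception.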

Summarizing Information Graphics

I worked for about a year with Sandee Carberry and Stephanie Elzer on understanding information graphics. The basic idea is that graphs, such as bar charts, are intentionally designed to be understood a certain way. This intentional graphic design is a form of communication, and language processing methods can be applied to the design. For instance, highlighting a particular bar or providing a value on top of a particular bar might allow the reader to reach certain conclusions more easily.

My part in this work was twofold: to build a collection of information graphics and to investigate the utility of captions. In particular, I looked at captions that served as titles for the graphic.