N.B. This page has not been updated in some time. Click HERE for my NSF Nugget on U-Shaped Learning. Versions of corresponding papers (as well as other papers) coming soon.
ALSO, HERE is a link to a complete, up-to-date (as of March 2009) list of my publications, including those on COLT.
Consider the problem of finding a rule for generating a sequence of numbers such as 9, 61, 52, 63, 94, 46, 18, 1, 121, 441, ... . Here is a rule for this sequence. First compute the squares of successive integers beginning with 3; then, to generate the sequence, replace each square with the number obtained by writing its decimal digits in reverse order (ignoring any leading zeros). N.B. This rule can be written as a formal algorithm (or computer program). The problem of finding such rules gets harder as the sequences to generate get more complicated than the one above. Can the rule finding itself be done by some computer program? Interestingly, it is mathematically proven that there can be no computer program which can eventually find (synonym: learn) these (algorithmic) rules for all sequences which have such rules!
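The rule just described is short enough to write down as a program. Here is a minimal Python sketch of it (the function name is my own choice, not from the page):

```python
def reversed_square(n):
    """Square n, then write the decimal digits of the square in
    reverse order, ignoring any leading zeros (e.g. 100 -> 001 -> 1)."""
    return int(str(n * n)[::-1])

# Generate the ten terms shown above, starting with the integer 3.
sequence = [reversed_square(n) for n in range(3, 13)]
print(sequence)  # [9, 61, 52, 63, 94, 46, 18, 1, 121, 441]
```

Note that converting back with int() is what discards the leading zeros: the square of 10 is 100, whose reversal "001" becomes the term 1.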
Here is a different problem. Given pictures of animals, each (correctly) labeled as being or not being a picture of a bird, eventually find/learn an (algorithmic) rule for deciding, of each of those (past, present, and future) pictures, whether or not it depicts a bird. People seemingly solve this problem in childhood, as well as many other problems of learning (subconscious) rules to predict membership in concepts (such as the concept of bird). Many cognitive scientists seek to model all of cognition (including concept learning) by computer program. It is, then, an interesting, very hard problem to write a computer program which can learn (algorithmic) rules to predict membership in all the concepts that people can learn. It's so hard we don't know how to do it yet, and we don't know how to prove it can't be done.
Computational Learning Theory (COLT) is a branch of theoretical computer science which mathematically studies the power of computer programs to learn (algorithmic) rules for predicting things such as membership in a concept or, as in the first example above, rules for how to generate a sequence.
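One classical COLT setting is "identification in the limit": a learner sees ever-longer prefixes of the data and conjectures a rule, and learning succeeds if the conjectures eventually stabilize on a correct rule. The toy sketch below (my own illustration, not from this page; the enumeration of three named rules is an assumption) shows a learner-by-enumeration that outputs the first hypothesis consistent with the data seen so far:

```python
# An enumeration of candidate rules for generating a number sequence.
# The learner conjectures the FIRST rule consistent with the data so far;
# on new data that refutes it, the learner changes its mind.
HYPOTHESES = [
    ("constant 1", lambda n: 1),
    ("squares",    lambda n: n * n),
    ("cubes",      lambda n: n ** 3),
]

def conjecture(prefix):
    """Return the name of the first enumerated rule matching the prefix."""
    for name, rule in HYPOTHESES:
        if all(rule(i) == x for i, x in enumerate(prefix, start=1)):
            return name
    return None

target = [n * n for n in range(1, 6)]          # data from the "squares" rule
guesses = [conjecture(target[:k]) for k in range(1, 6)]
print(guesses)  # ['constant 1', 'squares', 'squares', 'squares', 'squares']
```

After one mind change the learner converges to "squares" and never changes again; that eventual stabilization, not correctness at every step, is what identification in the limit demands. The negative result mentioned above says no such learner can succeed on all algorithmically generable sequences, no matter how clever the enumeration.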
Besides the intrinsic scientific and philosophical interest, the expected primary applications of COLT are to the construction of intelligent technology, especially technology which learns, and to cognitive psychology, including understanding human language acquisition (brief postscript bibliography available) and scientific inductive inference (brief postscript bibliography available). COLT seeks to provide a conceptual, mathematical infrastructure for these applied and basic areas.
I am especially interested in learning models which can be explored with tools from mathematical logic, such as recursive function theory.
See also the empirical machine learning work of C. Sammut with others on teaching an autopilot to fly a flight simulator, and the seminal theory paper on learning process-control.
Re Machine Learning, see also the two papers below on Concept Drift and Learning from Context.
For my other work on feasible computation and program size see:
I am also interested in characterizations of computational learning theoretic scenarios with an eye to providing useful insights and aiding machine learning practice. For example, I'm interested in characterizations regarding hypothesis spaces employed (click here for a paper reference below dealing with characterizations regarding the control structures employed in hypothesis spaces --- click here for more on control structures in general). Referenced below are also papers featuring characterizations of language learning in noise-free, noisy, and computable noisy data cases.
I've recently been working directly on empirical machine learning problems (applied to bioinformatics).