Research

Some items in this section are password protected and, at the moment, accessible only to me and my supervisors. Please contact me for the password if you would like to download or access the files yourself.

The motivation for my PhD research began when I realised that pattern recognition studies seem over-reliant on classification accuracy as the criterion for assessing a learner's performance. However, whilst collaborating with the medical staff at St. Bartholomew's Hospital in London, I noticed that end-users often want more than just a "black box" technique for diagnosis, and often prefer some probabilistic interpretation of what the patient is likely to suffer from. Taking inspiration from Philip Dawid's early 1980s work on assessing the quality of probability forecasting (using the separate reliability and resolution terms), I found that many commonly used learning methods output very unreliable probability forecasts despite having good classification accuracy. Put simply, the estimates output by many of these traditional algorithms should not be trusted by the end user. As shown in the ERC plot for Naive Bayes (below), when a prediction is made with a forecast of 0.8, it actually has only a 0.5 chance of being correct! Reliable probability forecasting is a very desirable goal, especially in cost-sensitive decision-making domains such as medical and financial applications.
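To make the idea of an unreliable forecast concrete, here is a minimal Python sketch of a binned reliability check. It is not the ERC itself, whose construction is not described on this page; it is the common equal-width binning approach, and the function name `reliability_table` and the choice of ten bins are assumptions made for illustration.

```python
from collections import defaultdict


def reliability_table(forecasts, outcomes, n_bins=10):
    """Compare forecast probabilities with observed frequencies.

    forecasts: probabilities in [0, 1] attached to each prediction.
    outcomes:  1 if the prediction turned out correct, 0 otherwise.
    Returns a list of (mean forecast, observed frequency, count) per
    non-empty bin, ordered by bin. (Hypothetical helper, not the ERC.)
    """
    bins = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        # Clamp so that a forecast of exactly 1.0 lands in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    table = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        observed = sum(y for _, y in pairs) / len(pairs)
        table.append((mean_forecast, observed, len(pairs)))
    return table


if __name__ == "__main__":
    # Four predictions made with forecast 0.8, only two of them correct.
    for row in reliability_table([0.8, 0.8, 0.8, 0.8], [1, 0, 1, 0]):
        print(row)
```

A reliable learner would give rows in which the mean forecast and the observed frequency are close; a large gap, as in the toy data above, is the kind of mismatch described in the paragraph.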

The focus of my research thus started with my development of the Empirical Reliability Curve (ERC) (shown below) for visualising the reliability of probability forecasts output by learning algorithms (as presented at last year's ICDM conference). I have also carried out large numbers of experiments testing the quality of the forecasts of various machine learning methods, including Boosting, Bagging, Binning, ERC re-calibration, Find Best Weights, Decision Trees, Neural Networks, Support Vector Machines, Naive Bayes, Bayesian Belief Networks and K-Nearest Neighbours. I have developed adaptations to the Venn Probability Machine (VPM) and the recently introduced Defensive Forecasting frameworks. I have just finished conducting my final experiments on challenging time-series and sports/gambling event data and have submitted various papers to internationally leading conferences in machine learning, data mining and artificial intelligence. I am currently writing up my PhD thesis and aim to complete my studies in September 2005. Potential employers, please view my current CV.

ERC Plots for Various Naive Bayes Learners on the Abdominal Pain Dataset
Naive Bayes
Binned Naive Bayes
VPM Naive Bayes

Probability Forecasting Results

As mentioned earlier, I have created many tables of results analysing the quality of probability forecasts output by machine learning algorithms across many standard datasets. I hope that these results will serve as a useful benchmark for comparison in this relatively under-studied area of machine learning. I am maintaining the results in a Microsoft Access database and would welcome anybody else's results on the particular datasets I have tested so that I can add them to this database.

Programs

During my PhD research I have developed a large number of programs implemented in Matlab, Java and Perl. The bulk of the programs are written as extensions to the Java-based WEKA data mining system. The highlights of my programs are:

  • Framework for creating Confidence and Probability Machines easily by extending existing WEKA classes.
  • Implementation of the K29 Defensive Forecasting algorithm in WEKA.
  • Hidden Markov Models in WEKA.
  • Implementation of the Find Best Weights (FBW) algorithm for re-calibrating an underlying learner's probability forecasts using ROC curves.
  • Methods for testing learning algorithms in the offline and online learning settings.
  • Methods for assessing the quality of probability forecasts output by learners such as:
    • ROC curves.
    • ERC plots (as shown above).
    • Loss functions.
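
As an illustration of the loss functions item in the list above, the following Python sketch computes two widely used losses for binary probability forecasts, the Brier score and the log loss, together with the standard Murphy decomposition of the Brier score into reliability, resolution and uncertainty terms. This is an assumption about which losses are meant, and the decomposition shown may not be the exact formulation used in the research described here. The actual programs are WEKA extensions in Java; Python is used only for brevity, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)


def log_loss(forecasts, outcomes, eps=1e-15):
    """Mean negative log-probability assigned to the observed outcome."""
    total = 0.0
    for p, y in zip(forecasts, outcomes):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total -= math.log(p) if y == 1 else math.log(1 - p)
    return total / len(forecasts)


def murphy_decomposition(forecasts, outcomes):
    """Return (reliability, resolution, uncertainty).

    Forecasts are grouped by their distinct values, so that
    brier_score == reliability - resolution + uncertainty.
    """
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    groups = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        groups[p].append(y)

    reliability = 0.0
    resolution = 0.0
    for p, ys in groups.items():
        freq = sum(ys) / len(ys)  # observed frequency for this forecast value
        reliability += len(ys) * (p - freq) ** 2
        resolution += len(ys) * (freq - base_rate) ** 2
    uncertainty = base_rate * (1 - base_rate)
    return reliability / n, resolution / n, uncertainty


if __name__ == "__main__":
    f = [0.8, 0.8, 0.8, 0.8, 0.2, 0.2]
    y = [1, 0, 1, 0, 0, 0]
    rel, res, unc = murphy_decomposition(f, y)
    print(brier_score(f, y), rel - res + unc)
```

A low reliability term means forecasts match observed frequencies; a high resolution term means the forecasts separate cases with different outcome rates.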

Autonomous Robotics

Recently I have become very interested in robotics. It all started when I first read an excellent book by Brian Bagnall on building robots using leJOS (an open-source Java programming platform for the LEGO RCX brick). Sian then bought me a LEGO Mindstorms kit from eBay, and I got started building my first robot, "Ronny", as seen below:

The design of Ronny was taken from Brian Bagnall's book, and currently I have given him a simple behaviour-based control system. He has two touch sensors attached to the red bumpers, and a light sensor mounted on top. He uses independent tank treads for steering. At present he just bumps around singing little tunes, but I am currently trying to scale down my Neural Network designs to fit the RCX brick's tiny 32KB memory! My friend Roy is quite fond of Ronny and likes to tickle his belly.
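
For readers unfamiliar with behaviour-based control, the idea can be sketched as a fixed-priority list of small behaviours, where the first one that wants control decides the action. The sketch below is in Python for brevity rather than leJOS Java, does not use any leJOS API, and is not Ronny's actual program; the sensor keys, action names and behaviour ordering are all assumptions, and the light sensor is left unused.

```python
def bump_left(sensors):
    # Left bumper pressed: back away and turn right (hypothetical action name).
    return "reverse_turn_right" if sensors["left_touch"] else None


def bump_right(sensors):
    # Right bumper pressed: back away and turn left.
    return "reverse_turn_left" if sensors["right_touch"] else None


def cruise(sensors):
    # Default behaviour: always willing to drive forward.
    return "forward"


# Highest priority first; later behaviours run only if earlier ones decline.
BEHAVIOURS = [bump_left, bump_right, cruise]


def arbitrate(sensors):
    """Return the action of the highest-priority behaviour that fires."""
    for behaviour in BEHAVIOURS:
        action = behaviour(sensors)
        if action is not None:
            return action
    return "stop"  # unreachable while cruise is in the list


if __name__ == "__main__":
    print(arbitrate({"left_touch": False, "right_touch": True}))
```

Keeping each behaviour tiny and stateless is one way to stay within a very small memory budget, since the arbiter is just a loop over function references.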

After building these robots I started to become more interested in the application of machine learning to robotics. In particular I am interested in developing robust autonomous systems for solving real-world problems. I then became aware of the excellent RoboCup conference (any donations to sponsor me to attend next year's conference would be greatly appreciated!). The stated aim of this conference is: "By the year 2050, develop a team of fully autonomous humanoid robots that can win against the human world soccer champion team". I started to read many of the publications relevant to this event and have been hooked ever since! It combines two passions of mine beautifully: football and machine learning!

At present I am working on a smaller sub-problem presented by the SoccerBots distribution, which simulates the dynamics and dimensions of a regulation RoboCup small-size robot league game. Two teams of five robots compete on a ping-pong table by pushing and kicking an orange golf ball into the opponent's goal. This program is excellent as it has enabled me to concentrate on developing my new Neural Network inspired ideas for controlling each robot agent. The initial results from this research are promising, as detailed in the PowerPoint presentation that I will give at my upcoming cake talk. I would love to continue this research professionally at the end of my PhD (subtle hint to potential employers).

Presentations

Here is a short list of presentations that I have given, in PowerPoint format:

Last modified: 7 April, 2005 12:41 PM By: DL
www.david-lindsay.co.uk