Research
Some items in this section are password
protected and only accessible to me and my supervisors
(at the moment). Please contact
me to get the password to download/access the files for yourself.
The motivation for my PhD research began when I realised that pattern
recognition studies seem over-reliant on classification accuracy as the
criterion for assessing a learner's performance. Whilst collaborating
with the medical staff at St. Bartholomew's Hospital in London, however,
I noticed that end-users often want more than just a "black
box" technique for diagnosis, and often prefer some probabilistic
interpretation of what the patient is likely to suffer from. Taking
inspiration from Philip Dawid's early 1980s work on assessing the
quality of probability forecasting (using the separate reliability
and resolution terms), I found that many commonly used learning methods
output very unreliable probability forecasts despite having good
classification accuracy. Put simply, the estimates output by many
of these traditional algorithms should not be trusted
by the end user. As the ERC plot for Naive Bayes (below) shows,
when a prediction is made with a forecast of 0.8, it actually
has only a 0.5 chance of being correct! Reliable probability forecasting
is a very desirable goal, especially in cost-sensitive decision-making
domains such as medical and financial applications.
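To make the notion of reliability concrete, here is a minimal sketch of how a forecast can be checked against observed outcomes. It is a generic binned estimate, not necessarily the exact construction used for the ERC; the function name `reliability_curve`, the number of bins and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def reliability_curve(forecasts, correct, n_bins=10):
    """Group forecasts into equal-width bins and compare the mean forecast
    in each bin with the observed fraction of correct predictions."""
    bins = defaultdict(list)
    for p, c in zip(forecasts, correct):
        # Clamp the index so that p == 1.0 falls into the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, c))

    curve = []
    for index in sorted(bins):
        pairs = bins[index]
        mean_forecast = sum(p for p, _ in pairs) / len(pairs)
        observed = sum(c for _, c in pairs) / len(pairs)
        curve.append((mean_forecast, observed, len(pairs)))
    return curve


# Hypothetical data: ten predictions all made with a forecast of 0.8,
# of which only five turned out to be correct.
forecasts = [0.8] * 10
correct = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
curve = reliability_curve(forecasts, correct)
for mean_forecast, observed, count in curve:
    print(f"forecast {mean_forecast:.2f} -> observed {observed:.2f} (n={count})")
```

A reliable learner would give points close to the diagonal, where the mean forecast equals the observed frequency; the toy data above mirrors the unreliable 0.8 versus 0.5 case described in the text.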
The focus of my research thus started with my development of the
Empirical Reliability Curve (ERC) (shown below) for visualising
the reliability of probability forecasts output by learning algorithms
(as presented at last year's ICDM
conference). I have also carried out large numbers of experiments
testing the quality of the forecasts of various machine learning
methods such as: Boosting, Bagging, Binning, ERC re-calibration,
Find Best Weights, Decision Trees, Neural Networks, Support Vector
Machines, Naive Bayes, Bayesian Belief Networks and K-Nearest Neighbours.
I have developed adaptations to the Venn Probability Machine (VPM)
and the recently introduced Defensive Forecasting frameworks. I
have just finished conducting my final experiments on challenging
time-series and sports/gambling event data and have submitted various
papers to internationally leading conferences in machine learning,
data mining and artificial intelligence. I am currently writing
up my PhD thesis and aim to complete my studies in September 2005.
Potential employers: please view my current
CV.
Figure: ERC Plots for Various Naive Bayes Learners on the Abdominal Pain Dataset (panels: Naive Bayes, Binned Naive Bayes, VPM Naive Bayes).
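As a rough illustration of what a binned learner might do, the sketch below assumes that "binning" means the common histogram-binning style of recalibration: each raw forecast is replaced by the observed frequency of the positive class among calibration examples that fell into the same bin. The function names, the equal-width bins and the midpoint fallback for empty bins are assumptions, not a description of the implementation behind the plots.

```python
def fit_bins(forecasts, outcomes, n_bins=10):
    """Learn one recalibrated value per equal-width bin from calibration
    data: the observed frequency of the positive class in that bin."""
    totals = [0] * n_bins
    positives = [0] * n_bins
    for p, y in zip(forecasts, outcomes):
        index = min(int(p * n_bins), n_bins - 1)
        totals[index] += 1
        positives[index] += y
    # Empty bins fall back to the bin midpoint (an arbitrary choice).
    return [
        positives[i] / totals[i] if totals[i] else (i + 0.5) / n_bins
        for i in range(n_bins)
    ]


def recalibrate(p, bin_values):
    """Map a raw forecast to the value learned for its bin."""
    n_bins = len(bin_values)
    return bin_values[min(int(p * n_bins), n_bins - 1)]


# Hypothetical calibration set: forecasts near 0.85 were right half the time.
raw = [0.85, 0.85, 0.85, 0.85, 0.15, 0.15]
outcomes = [1, 0, 1, 0, 0, 0]
bin_values = fit_bins(raw, outcomes)
adjusted = recalibrate(0.82, bin_values)
```

With this toy data a raw forecast in the 0.8 to 0.9 bin is pulled down to the frequency actually observed there.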
Probability Forecasting Results
As mentioned earlier, I have created many tables
of results analysing the quality of probability forecasts output
by machine learning algorithms across many standard datasets. I
hope that these results will serve as a useful standard for comparison
in this relatively under-studied area of machine learning.
I am maintaining the results in a Microsoft Access database and
would welcome anybody else's results on the particular datasets
I have tested so that I can add them to this database.
Programs
During my research for my PhD I have developed a large number of
programs implemented in Matlab, Java
and Perl. The bulk of the programs are written as extensions to
the Java-based WEKA data mining system. The highlights of my programs
are:
- Framework for creating Confidence and Probability Machines easily
by extending existing WEKA classes.
- Implementation of the K29 Defensive Forecasting algorithm in
WEKA.
- Hidden Markov Models in WEKA.
- Implementation of the Find Best Weights (FBW) algorithm for
re-calibrating an underlying learner's probability forecasts
using ROC curves.
- Methods for testing learning algorithms in the offline and online
learning settings.
- Methods for assessing the quality of probability forecasts output
by learners, such as:
  - ROC curves.
  - ERC plots (as shown above).
  - Loss functions.
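The online setting and the loss functions mentioned in the list can be sketched together. The example below is illustrative only and is written in Python rather than as a WEKA extension. It assumes that "online" means each example is predicted before its label is revealed and then used for learning, and that the losses are the usual logarithmic and square (Brier) losses. `FrequencyLearner` is a hypothetical stand-in for a real learner.

```python
import math


def log_loss(p_true):
    """Logarithmic loss, given the probability assigned to the true label."""
    return -math.log(p_true)


def square_loss(probs, true_label):
    """Square (Brier) loss summed over all labels."""
    return sum((p - (1.0 if label == true_label else 0.0)) ** 2
               for label, p in probs.items())


class FrequencyLearner:
    """Hypothetical stand-in learner: Laplace-smoothed label frequencies."""

    def __init__(self, labels):
        self.counts = {label: 1 for label in labels}

    def predict(self, x):
        total = sum(self.counts.values())
        return {label: n / total for label, n in self.counts.items()}

    def update(self, x, y):
        self.counts[y] += 1


def online_evaluation(learner, stream):
    """Predict each example before seeing its label, then learn from it.
    Assumes a non-empty stream of (features, label) pairs."""
    total_log = total_square = 0.0
    n = 0
    for x, y in stream:
        probs = learner.predict(x)
        total_log += log_loss(probs[y])
        total_square += square_loss(probs, y)
        learner.update(x, y)
        n += 1
    return total_log / n, total_square / n


# Toy stream; the features are unused by the stand-in learner.
stream = [(None, "a"), (None, "a"), (None, "b"), (None, "a")]
mean_log, mean_square = online_evaluation(FrequencyLearner(["a", "b"]), stream)
```

The offline setting differs only in that the learner is trained once on a fixed training set and the losses are averaged over a separate test set.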
Autonomous Robotics
Recently I have become very interested in robotics. It all started
when I first read an excellent book by Brian Bagnall on building robots
using leJOS (an open-source
Java programming platform for the LEGO RCX brick). Sian then bought
me a LEGO Mindstorms
kit from eBay, and I got started building my first robot "Ronny",
as seen below:
The design of Ronny was taken from Brian
Bagnall's book, and I have given him a simple behaviour-based
control system. He currently has two touch sensors attached
to the red bumpers, and a light sensor mounted on top. He uses independent
tank treads for steering. At present he just bumps around singing
little tunes, but I am currently trying to scale down my Neural
Network designs to fit the RCX brick's tiny 32KB memory! My friend
Roy is quite fond of Ronny and likes to tickle his belly.
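For readers unfamiliar with behaviour-based control, the idea can be sketched as a priority-ordered list of behaviours, where the highest-priority behaviour whose trigger fires decides the action. The snippet below is a language-neutral simulation in Python, not leJOS code; the action names and the priority order are assumptions, and the light sensor is left out.

```python
def choose_action(left_bumper, right_bumper):
    """Return the action of the highest-priority behaviour that fires."""
    behaviours = [
        # (trigger, action), listed from highest to lowest priority
        (left_bumper and right_bumper, "reverse"),
        (left_bumper, "turn_right"),
        (right_bumper, "turn_left"),
        (True, "forward"),  # default behaviour: keep wandering
    ]
    for triggered, action in behaviours:
        if triggered:
            return action


# Hypothetical sensor readings: only the left bumper is pressed.
action = choose_action(left_bumper=True, right_bumper=False)
```

Adding a new behaviour then only means inserting another entry at the appropriate priority, which is one reason this style suits a brick with so little memory.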
After building these robots I started to
become more interested in the application of machine learning
to robotics. In particular I am interested in developing robust
autonomous systems for solving real-world problems. I then became
aware of the excellent RoboCup
conference (any donations to sponsor me to attend next year's
conference would be greatly appreciated!). Its stated aim
is: "By the year 2050, develop a team of fully autonomous
humanoid robots that can win against the human world soccer
champion team". I started to read many of the publications
relevant to this event and have been hooked ever since! It
combines two passions of mine beautifully: football and machine
learning!
At present I am working on a smaller sub-problem
presented by the SoccerBots
distribution, which simulates the dynamics and dimensions of
a regulation RoboCup small-size robot league game. Two teams
of five robots compete on a ping-pong table by pushing and kicking
an orange golf ball into the opponent's goal. This program is
excellent as it has enabled me to concentrate on developing
my new Neural Network inspired ideas for controlling each robot
agent. The initial results from this research are promising,
as detailed in the PowerPoint presentation that I will give
at my upcoming cake talk. I would love to continue this research
professionally at the end of my PhD (subtle hint to potential
employers).
Presentations
Here is a short list of presentations that I have given in PowerPoint
format: