My student Parasaran Raman continues his conference report from ECML-PKDD in Athens. For his first post, see here.
Christopher Bishop gave an excellent plenary talk on the first day
about the need to embrace uncertainty. He talked about the Kinect
technology, which uses an infrared camera and sensors to measure the
distance of a person from the screen. Instead of modeling the problem as
motion tracking, the folks at MSR approached the challenge as a pattern
recognition problem: each frame is treated from a cold start and
understood as a still picture, rather than tracking motion across video frames.
The goal now is to classify sets of pixels enclosed inside an
axis-aligned rectangle. The training data is collected as a series of
motion-capture videos, recorded by having a test subject wear sensors on
the body, Hollywood-animation style, and is combined with some synthetic
data, corrupted artificially since it is “too pure”. The training set
consists of a whopping one million examples of body positions.
They then use a decision tree to classify the pixels into one of 31
predefined body parts by maximizing an information gain function.
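The split criterion can be sketched in a few lines. This is a generic information-gain computation, not Microsoft's actual depth-feature code; `feature` here stands in for whatever depth-difference feature the tree evaluates at each pixel:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a collection of body-part labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(samples, labels, feature, threshold):
    """Gain from splitting samples on feature(s) < threshold."""
    left = [l for s, l in zip(samples, labels) if feature(s) < threshold]
    right = [l for s, l in zip(samples, labels) if feature(s) >= threshold]
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder
```

At each node, the tree builder would try many candidate features and thresholds and keep the split with the highest gain.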
Another sub-goal that Bishop talked about was identifying human joints.
For this, they use a kind of mean-shift technique with a simple Gaussian
smoothing kernel to localize each joint.
The second half of the talk revolved around model-based ML, where Bishop
recommends constructing model equivalents for algorithms. He says, and I
quote, “Don't look at the algorithms; look at the models”. He talked
about an application where the goal is to rank players who play each
other in a common environment. They have a new rating system, TrueSkill,
that extends the popular Elo rating system [which typically only works
in two-player games like chess] to multi-player team games. TrueSkill
performs a kind of Bayesian ranking inference on the graphs by local
message passing.
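For intuition, here is the closed-form special case of that inference for a two-player, no-draw game; team games need the full message-passing machinery. `beta` is the performance-noise parameter, and (25, 25/3) is TrueSkill's usual prior:

```python
import math

def pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def trueskill_update(winner, loser, beta=25 / 6):
    """Two-player, no-draw skill update. Each player is a (mu, sigma)
    Gaussian belief; beta models per-game performance noise."""
    (mu_w, s_w), (mu_l, s_l) = winner, loser
    c = math.sqrt(s_w ** 2 + s_l ** 2 + 2 * beta ** 2)
    t = (mu_w - mu_l) / c
    v = pdf(t) / cdf(t)   # additive correction to the means
    w = v * (v + t)       # multiplicative shrinkage of the variances
    new_winner = (mu_w + (s_w ** 2 / c) * v,
                  s_w * math.sqrt(1 - (s_w ** 2 / c ** 2) * w))
    new_loser = (mu_l - (s_l ** 2 / c) * v,
                 s_l * math.sqrt(1 - (s_l ** 2 / c ** 2) * w))
    return new_winner, new_loser
```

The winner's mean goes up, the loser's goes down, and both uncertainties shrink; an upset (low-rated player beats high-rated) produces a larger correction than an expected result.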
He also introduced Infer.NET, a framework for running Bayesian inference
in graphical models.
This looks like a cool tool to play with, especially since it provides a
single platform for classification and clustering problems.
The plenary talk on day two was by Rakesh Agrawal. In a talk that
focused on a very interesting application, Rakesh introduced the idea of
using data mining to enrich education. The key idea is to take
concepts from textbooks, map each one to a relevant Wikipedia article,
and then induce relationships between the concepts.
The goal is to enrich sections in the textbooks by providing
recommendations for illustrative pictures to go with the text.
Dispersion [which is a measure of how unrelated the concepts are] and
syntactic complexity play a big role in deciding “which picture should
go where”. Since search engines fail when given long queries, one
challenge is to find the key concepts in a section that one can then
match up to the most similar article in Wikipedia. The underlying
assumption is that most high school textbook sections will have an
associated Wikipedia article.
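That matching step can be sketched as a plain bag-of-words cosine similarity. This is a toy stand-in, not the actual system, which presumably uses much richer features than raw word counts:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count dictionaries."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_article(section_text, articles):
    """Return the title of the article most similar to the section.
    `articles` maps Wikipedia titles to their text."""
    sec = Counter(section_text.lower().split())
    return max(articles,
               key=lambda title: cosine(sec, Counter(articles[title].lower().split())))
```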
Rakesh showed results from applying this to a huge word corpus from NCERT,
which drafts the high school textbooks used by most schools in India. He
also talked about computing an immaturity score to gauge whether a
section is well written; not surprisingly, subjects like physics and
chemistry scored over commerce and economics!
To summarize, two things go into solving the problem of augmenting
textbooks with images: “Comity” [where they identify lots of short
key queries by concatenating two or three important concepts at a time]
and “Affinity”. Comity ranks images by computing a relevance score for
each image in the context of the concepts picked out from a section, while
Affinity identifies relevant webpages and computes an importance score.
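The query-generation part of Comity, as described, amounts to enumerating small concept combinations. A hypothetical sketch (the function name and the fixed choice of pairs and triples are my own assumptions):

```python
from itertools import combinations

def comity_queries(concepts):
    """Build short image-search queries by concatenating two or three
    of a section's key concepts at a time."""
    return [" ".join(combo)
            for r in (2, 3)
            for combo in combinations(concepts, r)]
```

Each resulting short query can then be sent to an image search engine, sidestepping the long-query failure mode mentioned above.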
Another challenge is that these techniques cannot operate on one section
at a time, since the textbook authors would probably not like the images
to repeat!