Sunday, December 21, 2008

More experiments in algo-teaching

(ed. note: think of this as a 'theory'-chaser to take the bad taste of cricket-blogging out of your mouth)

Another experiment that I continued this time was a set of lectures on "fun topics". The idea is to convey some ideas about a fun topic in theoryCS without too much jargon, but with enough meat to indicate that there's something deeper going on beneath the fun stuff.

I run these lectures at the end of the semester (when everyone's worn out already :)), and throw them open to the general public: both times I did this, I got substantial attendance from people not in my class, and at least in one case, someone who attended the lecture last year actually decided to take my class this year.

Not all topics are amenable to such a treatment, but my hope is to build up a suite of such lectures: the nice thing is that they can then be reused for other venues: for example, I've given variants of these talks to freshmen and high school students, and will do an undergraduate colloquium in the math department next semester.

For all of these lectures, I've pilfered shamelessly from the original works, as well as great websites developed by the authors. This post can be viewed as a shout-out and a thank you.

1. Pancake flipping:
Brian Hayes wrote a great article on pancake flipping for the American Scientist. In it, he links it to the problem of genomic rearrangement, and in doing so, ends up with a beautiful tale of the interaction between theoretical problems and practical constraints, all told in context of a topic that everyone can relate to.

I made two-color pancakes in different sizes, and distributed them to the students at the start of class: I briefly described the problem, and let them go at it. It was quite entertaining, and put the more formal discussion later on in context.

2. Zero knowledge proofs
ZK proofs are great for this kind of setting: they are completely counter-intuitive, (and so have the 'bizarre' factor), and are easily explained using popular metaphors. I used Moni Naor's Sudoku demo page, as well as the version of the proof that involves slicing and dicing the puzzle. This was done with class participation (I was the prover, and there were multiple verifiers).

The second demo I ran was the Yao protocol for the Millionaire's problem (how do two millionaires determine which is richer, without either knowing the worth of the other). Again, I did this interactively with class volunteers and an RSA applet to speed things along. More details here.

3. Quantum Computing
No demos for this one, but I gave a crude high level view of what quantum computing is about (about one-step up from the "do everything in parallel" explanation). The main goal here was to convey the basic ideas of what a qubit is, what a quantum circuit looks like, and how the Bell inequalities show that quantum computing is much more bizarre than classical randomness. Here, Dave Bacon/Umesh Vazirani lecture notes proved indispensable, coupled with the Nielsen-Chuang book, and a recent blog post by Michael Nielsen.

4. Computational Origami
I haven't quite worked the kinks out of this one, but the basic idea is to demonstrate the principles behind computational origami (and where the 'computation' comes in) by looking at the question: What shapes can you make by folding, followed by a single cut.

There's a cool backstory to this: essentially the first example of such an algorithm was how Betsy Ross designed the 5 point star for the American Flag, and it lends itself to a nice demo in class.

Secondly, it's a classic example of resource-bounded computation: limiting yourself to one cut. Thus, it makes for a good illustration of how computation appears in all kinds of problems.

Thirdly, computational origami actually shows up in many real-world problems, most notable one involving folding mirrors for a telescope to be launched into space. If we're trying to convey a sense of computational thinking, this is a great example.

The actual problem itself was solved in this paper by Demaine, Demaine and Lubiw. Unfortunately, I have yet to find a way of explaining it that will make sense to non-expert geometers, and that's one weakness with this particular story: Erik's page has some nice examples to demo, but it's hard to convey the deeper algorithmics behind the problem.

Lessons learned:
  • As always, know your audience. What works for a graduate class doesn't work for sophomores, and certainly doesn't work for eigth-graders :)
  • Interactivity is key: if you allow people to participate, they're more involved, and will probably take away something positive
  • Keep it light: resist the urge to get too mathematical. There are many beautiful topics in theoryCS that can be explained without jargon, and many others for which with some effort, jargon can be removed.
  • A 'bizarro' factor helps: In general lectures that I've given, I find that presenting something completely counter to people's expectations catches their attention immediately, and leaves a lasting impression. ZK proofs have that property. Bell's inequality sort of does, but the impact would be more immediate if I could actually run a quantum experiment to demonstrate violation of Bell's inequality ;)
  • Relate it to the real world: again, it's not enough to cite applications: one should try to show them as far as possible. In this regard, the Betsy Ross story is great, because it relates to something everyone (in this country) knows.
My goal is to add one or two such lectures to my arsenal each time I teach the class. For next time, I'm seriously considering running a live auction of some kind to demonstrate some concept in algorithmic game theory (suggestions on how best to do this are always welcome). Are there other such topics that might lend themselves to an entertaining, edifiying and educational experience ?

On run-chases

(ed. note: this is about cricket, not algorithms, or geometry, or computer science. you have been warned)

South Africa, the Netherlands of cricket, finally won a big game, beating Australia in Perth after a historic run-chase of 414. This of course follows India's famous run-chase, beating England in Chennai by chasing down 387. As Cricinfo points out, 9 of the top 25 run chases have come in the last 8 years (in a tally that dates back to 1902).

There's a detailed statistical analysis of these chases, but no speculation as to why they're becoming more frequent. The answer seems obvious to me though: the increasing scores in one-day cricket. A quick search of Statsguru indicates that of the 258 overall scores above 300 (first or second team) one 1-day games, 179 of these happened after 2000. Even to a casual observer, it's clear that the scores in 1-day games have gone up (and don't get me started on Powerplays).

Frankly, when all this fuss was being made about India's run chase, I couldn't quite understand why, because if you think of this as a one-day game, it's not too hard (and in fact India's coaches thought the same way!).

All in all, two exciting Test matches (and how often have we been able to say that)

Wednesday, December 17, 2008

Videotaping talks

Videolectures.net is a company that has taken on the job of videotaping and packaging conference talks. They're based out of Slovenia, and offer a good service: they take your talk and your slides, and sync up the talk video and slides so someone watching later on can follow along.

For examples, you can see my talk at ETVC, and here are the other talks. I first heard about this company when I was googling a paper and discovered that the ICML talk was online.

I originally thought that they only handle events in Europe, but they appear to have covered this year's KDD as well (although the coverage appears strangely limited).

Saturday, December 13, 2008

Practical applications of 1-medians

From Optimal Home Location:
Have you been looking across many different neighborhoods for a place to buy or rent? Have you been contemplating whether to buy closer to your job, your spouse's job or your kids' school? Do not worry - this is not a simple geometric triangulation but a centuries old mathematical problem. Optimal Home Location tool is synthesizing math algorithms and Google Maps together in order to pinpoint optimal home location for any commute scenario. It is easy to use and fun to play with. All you need to do it [sic] enter all the addresses your family routinely visits throughout the day and then click on the map icons to define commute route for each member of your family. The tool automatically computes optimal home location that will minimize total combined commute for all members of your family

Monday, December 08, 2008

NSF bleg

This appears in the supplemental documents section of the upcoming CISE call:
In the Supplementary Documents Section, include a list of all PIs, Co-PIs, Senior Personnel, paid Consultants, Collaborators and Postdocs to be involved in the project. This list should be numbered and include (in this order) Full name, Organization(s), and Role in the project, with each item separated by a semi-colon. Each person listed should start a new numbered line.

Does anyone know what a "Collaborator" is ? is it merely your other-institution co-PI on a collaborative proposal ?

Saturday, December 06, 2008

I had a number of responses to people from my programming project post: I thought I'd post the responses here, rather than in comment fields.

* On rolling your own: I'm intrigued by the many suggestions to use Sphere Online. I've heard of topcoder and Project Euler before, and decided not to use topcoder for ease of pedagogy (I wanted problems where the key algorithmic idea was "preidentified", so I could give a DP problem in the homework on dynamic programming). I was also not convinced that topcoder focused on the algorithmic aspects of the problem as opposed to raw speed: the fact that it was set up as a time limited competition by default was also a pain in the neck.

* On copying: this is an unsolvable problem IMO. Since I was choosing problems from the pre-canned list at the ACM server, I was at the mercy of the online solution providers. Judging by the results, my students are either very honest, or don't know how to find these sites :). I've spied on the related forums, and they tend to be somewhat militaristic about not letting people post code directly, although hints are always supplied. As an aside, for theory problems this is a royal pain: I've had to mask things in various ways to prevent a google search, and I have my own way of creating problems that I'm happy to reveal to someone who asks me directly :).

* on what to do for geometry: the problem is not the lack of a good code base. In fact it's the reverse problem. CGAL for example has many quality solutions already pre-coded, so I can't even ask students to use CGAL as a base framework. Given the limited time I have before semester starts, its debatable how much coding and test generation I can do on my own, so stay tuned...

I am less than smart :)

I posted my note on programming assignments, and then wondered why there were no comments. It turns out that I forgot to monitor my moderation list, and when I checked, there were tons of comments ! apologies to all the commenters: your comments are available now, and I'll start replying shortly.

Tuesday, December 02, 2008

programming assignments

Inspired in part by Michael Mitzenmacher's exhortation:
Here's my claim: theory does untold damage to itself every year by not having programming assignments in the introductory classes on algorithms and data structures.
I tried an experiment in my algorithms class this year. Using the ACM programming competition online judge as an evaluation tool, I assigned one programming assignment in each of the first three assignments in my class (later assignments went into approximations and randomization, which the site didn't really cater to). Coupled with this, I used the ACM Programming challenges book by Skiena and Revilla to guide my choice (they break the problems down nicely by dominant technique).

The way the system works is this: you're given a problem definition and an input/output specification. You write your code in one of three languages, using a specific compiler flag sequence, and then upload your code. The system runs your code through a battery of tests, and then reports whether you had
  • compile-time errors
  • failure to terminate in the prespecified time limit
  • incorrect answers
  • run-time errors
  • all correct answers
What worked:
  • Students actually liked the idea of programming assignments: I received numerous variations of the comment, "I didn't understand the algorithm, but once I coded it...". They were less enthused by the server, but more on that later.
  • People started getting quite competitive: the site maintains stats on the best implementation thus far, and even after students satisfied the requirements of the assignment, by matching the desired time limit, they tried to optimize their code further.
  • The questions are well-designed, and usually need algorithmic tricks rather than hacks. For example, one question is to compute the closest pair, and any optimized brute force algorithm will fail, but the standard deterministic divide and conquer will work fine.
What didn't:
  • The server would occasionally flake, especially a few hours before assignment deadline time :)
  • I/O specifications were often tricky: the problem specifications would leave out details about what one could assume about the input, so simple things like "don't assume that when a range [x,y] is given in the input, that x <=y" needed to be discovered.
  • The error messages are cryptic to a fault. Compile time errors are linked to the point in the code where this happens, but for any other kind of error, you are merely told that an error occurred. This caused major frustration.
  • With Java especially, getting a correct implementation within the time bound seemed harder than with C/C++. Since Java is often the first language students learn, this is annoying, especially since the whole point of such exercises is to abstract away as far as possible the particular idiosyncracies of a language.
  • From a grading point of view, it's very painful to evaluate each person's submission. There's no easy way except to do it manually.
Overall, my experience was mixed-to-slightly-positive: I'll probably do this again, but I might also include examples where I design the test inputs and do local evaluation. For some problems (like with randomization) I might have to design the problem from scratch.

Now what I'd like is something similar for my computational geometry class :)

Disqus for The Geomblog