The Geomblog: On outreach to applied communities

Wednesday, October 20, 2010

On outreach to applied communities

There's an argument playing out on cstheory (yes, we need a better name) on whether we should encourage or discourage questions like "how can I model X phenomenon theoretically", or "Are applied questions off-topic"

While Scott Aaronson posted an eloquent defense of why we should always entertain such questions, the general tone of the responses has been negative. The main fear has been a 'slippery slope' - that if we allow one modelling question, then the site would be overrun by modelling questions.

I think this sentiment is particularly interesting in the context of David Johnson's post requesting examples where approximation algorithms were deployed usefully in practice (and that he couldn't come up with many good examples)

The truth is that outreach to more applied communities is not about taking your algorithms hammer and beating their heuristic over the head with it. As I've said before, you have to have a DRAMATIC improvement in performance before people will change their standard practices.

It's firstly about communicating a style of asking and answering questions, even if the actual answers are "trivial". A collaboration where you persuade someone to use max flows might not be theoretically very splashy, but it has a huge impact in how people think about algorithms, theory and problem solving.

Doing this first prepares the ground for future collaborations. If you want someone to use your new and snazzy approximation for vertex cover, you have to first convince them that vertex cover makes sense, and before that you have to convince them that even modelling the problem as a graph makes sense. Believe me when I say that the very notion of defining a problem formally before solving it is not how work gets done in many applied areas.

Michael Mitzenmacher makes a similar point in a comment to the above post, where he says:

I do think, however, that sometimes the focus of theorists -- to get the best constant-approximation or to shave log or log log factors off -- obscures rather than highlights a good part of what I see as the real "practical value" of this line of research. Focusing on what algorithmic ideas underlie the approximation, and more implementation and experimentation, would (I think) give more real-world power to this theoretical work.

1 comment:

Michael Mitzenmacher10/20/2010 06:20:00 PM
Hi Suresh. First, thanks for thinking my previous comment was worth posting. Second, thanks for linking to Scott's eloquent defense, which I quite liked.

This latest post sparked one thought that I'd like to share. You state that "you have to have a DRAMATIC improvement in performance before people will change their standard practices." I agree with this statement, but I would like to point out another way to get people to adopt your algorithmic solution: keep it simple.

My standard example here is the Bloom filter, which has become more successful than I would have ever dreamed when I (and others) started "preaching" about them several years ago. I attribute part of their success to the fact they often give a non-trivial performance improvement, but in my mind at least part of their success is due to the fact that they are so amazingly simple I can explain them in one power-point slide. Correspondingly, they usually take minutes to implement for any reasonable coder. Yet you still can't find them in any standard first-year undergraduate algorithms textbook (that I know of).

Another area where theory tends to "fail" is that many view complexity, rather than simplicity, as a virtue. Let me go back to your statement on "performance". We (theorists) generally measure an algorithm's performance by running time and/or space. As I make clear to the undergraduates in my course, there's another standard performance metric you won't hear about in most algorithms classes, although in many cases it's the most important: programmer time. Figure out a way to solve an interesting already-solved problem with an algorithm that has the same running time and space but is ten times simpler and you'll do a lot of good for the perception of theory outside of theory. (Of course, you're unlikely to get the paper published in a theory conference, but I'd argue you'll find many other benefits.)
ReplyDelete
Replies

The Geomblog

Pages

Wednesday, October 20, 2010

On outreach to applied communities

1 comment:

Disqus for The Geomblog