## Saturday, July 31, 2004

### That wonderful feeling of nothing...

A friend of mine just defended his Ph.D successfully (congratulations, Nabil !), and sent me this excerpt (extracted from here) in response to my email congratulating him:

In February 1995, on a beautiful sunny day with clear Carolina blue skies, I turned in the final,signed copy of my dissertation. The graduate school staff member did some last-minute checks on the document and pronounced it acceptable. After six and a half years of toil and sweat, I was finally done! While walking back to the C.S. department building, I was sorely disappointed that the heavens didn't part, with trumpet-playing angels descending to announce this monumental occasion. Upon hearing this observation, Dr. Fred Brooks (one of my committee members) commented, 'And the sad fact is, you're no smarter today than you were yesterday.'"

Another amusing take on graduate student life, from the same article:

Being a graduate student is like becoming all of the Seven Dwarves. In the beginning you're Dopey and Bashful. In the middle, you are usually sick (Sneezy), tired (Sleepy), and irritable (Grumpy). But at the end, they call you Doc, and then you're Happy.

## Friday, July 30, 2004

### Path compression...

The main takeaway from the Seidel-Sharir paper refining the analysis for union-find, as explained to me by Adam Buchsbaum (who is still walking the corridors in amazement):

If path compression takes linear time, then it takes α(n) (amortized) time.

Truly amazing. Worth reading just for that...

## Thursday, July 29, 2004

### Topology rules the playground !

This is the last place you'd expect to see a lucid discussion of topological concepts:

Boing Boing: Double twist strip playground equipment

## Wednesday, July 28, 2004

### Why do algorithm engineering ?

I was listening to a talk today where the speaker made the following statement: 'We know that this works well in theory, and now we'd like to see if it works well in practice'. Now this is a statement I have probably heard hundreds of times over the past many years, and it is in no way noteworthy or unusual.

However, it did get me thinking. It is by now a truism that no one will seriously dispute that there is often a gap between the theoretical predictions offered by an algorithm analysis and the behaviour of this algorithm in practice. Contrary to what many people would conclude, this is of course not a black mark on the value of theoretical work. The power of algorithm analysis, poly-time, and even the much maligned O() notation, lies in the invariance of (classical) machine definitions under poly-time transformations, and the powerful mathematical structure that complexity theory acquires as a result.

This structure does have a tradeoff; we get mathematical elegance, but lose a certain modicum of "predictiveness" when it comes to analysing the behaviour of an algorithm "in practice". And I think herein lies the true value of algorithm engineering. If the algorithm that has better theoretical guarantees does better in practice (and this happens more often than one thinks), one's task is complete; if not though, all is not lost. Often, peculiarities in the data cause certain algorithms to behave better than others, even though from an asymptotic perspective a different outcome would be expected. Sometimes, analysing the running time is not the right measure: output-sensitivity in geometric algorithms is an example of a measure that captures certain desirable properties that can be missed with a blunt-edge worst-case running time.

But ultimately, the goal of algorithm engineering, or of good experimental algorithmic work, is to tease out the tradeoffs, parameters and special cases that govern which algorithm is the right one for a specific setting. If I am a practitioner, I am less interested in knowing the algorithm with the best asymptotic efficiency than I am in knowing which algorithm can provide the best performance for my situation. This does not mean that we should throw proofs out of the window: it means that often a much more delicate analysis of the behaviour of an algorithm is required in order to determine what works better.

I think I just wrote an ALENEX manifesto. Oh well :)

### A Pentium III ?

From Forbes, via the Computing Research Policy blog (emphasis added):
Meanwhile, another even more serious threat to the U.S.' competitive position in supercomputing is lurking in the shadows. Tucked away into the House version of the National Defense Authorization Act, which passed in May, is Section 1404, which would classify any computer using a Pentium 3 processor or stronger as a 'militarily critical' machine to the Department of Defense--or, a weapon--and make it subject to tightened export regulations. While high-performance computers are already regulated separately as part of a 1991 agreement with Japan, called the Supercomputer Control Regime, the new restrictions would arguably affect certain systems components developers need for supercomputers.

If a Pentium III can be classified as military weaponry, suddenly I don't feel so safe anymore.

## Tuesday, July 27, 2004

### GR conferences are more interesting than one might imagine

This is from John Baez's latest column describing the frenzy at a general relativity conference in England where Stephen Hawking conceded the bet he and Kip Thorne made with John Preskill:

A fellow with long curly grey locks and round horn-rimmed glasses sat down beside me. I'd seen him around the conference, so I said hello. He asked me if I'd like to hear about his theory; at this point my internal alarm bells started ringing. I told him I was busy, but said I'd take a look at his manuscript later.

It turned out to describe an idea I'd never even dreamt of before: a heliocentric cosmology in which the planets move along circular orbits with epicycles a la Ptolemy! And his evidence comes from a neolithic British tomb called Newgrange. This tomb may have been aligned to let in the sun on the winter solstice, but some people doubt this, because it seems the alignment would have been slightly off back in 3200 BC when Newgrange was built. However, it's slightly off only if you work out the precession of the equinox using standard astronomy. If you use his theory, it lines up perfectly! Pretty cute. The only problem is that his paper contains no evidence for this claim. Instead, it's only a short note sketching the idea, followed by lengthy attachments containing his correspondence with the Dublin police. In these, he complained that people were trying to block his patent on a refrigerator that produces no waste heat. They were constantly flying airplanes over his house, and playing pranks like boiling water in his teakettle when he was away, trying to drive him insane.

## Sunday, July 25, 2004

### Things to do with an iPod...

courtesy BoingBoing:

1. Have drink
2. Eat pizza
3. Find loo...

Entertaining audio reviews and even accompanying sound tracks such as Handel’s ‘Water Music’ and ‘Cosmic Winds’ will help users to locate their nearest (and loveliest!) loos.

For people challenged in the Queen's English, loo = restroom.

## Wednesday, July 21, 2004

### Black holes mangle matter and energy, but cricket IS better !

I had recently mentioned Hawking's retraction of his claim that a black hole destroys everything that enters it. He has now presented his work at a conference on GTR.

A footnote to this story is the famous bet that Hawking and Kip Thorne (of CalTech) had with John Preskill (also at CalTech). The terms of the bet were:

"information swallowed by a black hole is forever hidden and can never be revealed."
Preskill bet against that theory.

The forfeit is an encyclopedia, from which Preskill can recover information at will.

Since Hawking lost the bet, he owed Preskill an encyclopedia:

He presented Preskill a favored reference work "Total Baseball, The Ultimate Baseball Encyclopedia" after having it specially flown over from the United States.

"I had great difficulty in finding one over here, so I offered him an encyclopedia of cricket as an alternative," Hawking said, "but John wouldn't be persuaded of the superiority of cricket."

However, the matter does not appear to be resolved:

But Preskill says that Hawking's new take on quantum gravity rests on shaky mathematical foundations, and is unlikely to be embraced by the physics community. "I am sceptical about whether he has found a fully satisfactory resolution to the problem," he says.

## Tuesday, July 20, 2004

### What's eccentric about that ?

From the Fields Medal citation for Tim Gowers:

Banach was an eccentric, preferring to spend his time in the café rather than in his office in the University of Lvov

sounds reasonable to me...

### The Two Cultures...

Michael Nielsen, in his ongoing series on the principles of effective research, talks about problem solvers and problem-creators as two models of research:

The problem-solver: This is the person who works intensively on well-posed technical problems, often problems known (and sometimes well-known) to the entire research community in which they work. The best problem-solvers are often extremely technically proficient and hard-working. Problem-solvers often attach great social cache to the level of difficulty of the problem they solve, without necessarily worrying so much about other indicators of the importance of the problem.

The problem-creator: This is a rarer working style. Problem-creators may often write papers that are technically rather simple, but ask an interesting new question, or pose an old problem in a new way, or demonstrate a simple but fruitful connection that no-one previously realized existed.

In an interesting coincidence, Chandra Chekuri just pointed me to Tim Gowers' (the 1998 Fields Medallist) website. There, Prof. Gowers has an article titled 'The Two Cultures', where he talks essentially about the same schism in mathematics, between problem solvers, and theory builders.

It is interesting to compare the two reflections. Nielsen feels that the 'problem-creator' style is rarer, and "less popular" than the problem-solver mode. In Gowers' view, theory builders look down on problem solvers (for example, people who do combinatorics) because of a perceived "lack of depth" and other such reasons.

The two polarities are not quite the same; but read together they provide a thoughtful perspective on the kind of schisms that we all face in our research work. I can definitely attest to feeling the kinds of conflicts they talk about, between 'solving problems' and 'building theories'.

Somehow writing a blog itself seems to fly in the face of Hardy's dictum:

There is no scorn more profound, or on the whole more justifiable, than that of the men who make for the men who explain. Exposition, criticism, appreciation, is work for second-rate minds..

Browsing the amazing list of expository articles at Gowers' website, it is comforting to see this shibboleth too go by the wayside.

### Reading Harry Potter makes you very smart...

Today I got an email from Amazon.com (emphasis added)

Dear Amazon.com Customer,

As someone who has purchased books by J. K. Rowling, you might like to know that "Harry Potter and the Philosopher's Stone (Ancient Greek Edition)" will be released soon. You can pre-order your copy at a savings of 30% by following the link below

I knew that reading is good for you, but I didn't know that reading Harry Potter was that good for me...

## Sunday, July 18, 2004

### Pacing...

The virtues of pacing are highly underrated. When you have over 60 submissions to review, the immenseness of the task can completely drown out all rational thought. The number of papers to be reviewed on a per-day basis is not that bad, but you have to stick to a routine !

Taking breaks is also a good thing: I finished my quota today, and I shall now soothe my fevered brow with a good helping of topological spaces via Claude Chevalley. Aaaahh.....

## Friday, July 16, 2004

Probably my first exposure to science fiction was via Isaac Asimov, and his robot books. I am shuddering at the thought of what the movie version of I, Robot did to his vision, and the review in Slate is not encouraging.

This little nugget of information is fascinating, and should be a Jeopardy question:

"By the end, Asimov achieved the Grand Slam of book writing, turning out at least one volume for each of the 10 classifications in the Dewey Decimal System."

## Thursday, July 15, 2004

### Conferences and the e-print server at LANL

Lance has a nice post on the reason why conferences are so much more important in CS than in other (older) fields:

The answer is technological, namely airplanes. Before air travel conferences were much more difficult to attend and drew from a much more regional audience. Those who made the great effort and time to attend a conference were allowed to present. But presenting your paper at such a conference would not reach the majority of your colleagues. Journals were the most efficient way to broadly publicize your research and took on the more important role and have kept that role for historical reasons.

Computer science started as a field during the jet age. Many more people from a wider geographical base could attend a conference. One could now widely disseminate their research through conferences well before a paper appeared in a journal. Journals still played an important role for refereeing, editing and archiving but never held the importance in computer science as conferences do

Lance goes on to speculate as to what changes the internet will bring in our models of publication. It seems to me that the LANL e-print archive is one example of a new paradigm. Interestingly this has been adopted much more readily in the "old" field of physics than in computer science (as an aside, why is it that physicists always come up with cool technological developments ?), and this effect even shows up in areas like quantum computing that are on the border of the two fields.

In a sense, this is the logical evolution of the conference. If you want a place to disseminate results quickly to a wide audience, what better way to do it than via the e-print server, especially given how active it has become. Physicists still publish in journals as the primary source, but e-prints have helped replace some of the functionality of the conference.

Conferences will really never go away, since there is no replacement for direct one-on-one contact in my opinion, but e-prints are definitely a new Internet-only model...

### Networking on the Network

Phil Agre has an article on networking on the internet for Ph.D Students. There are many general principles there that are quite interesting, and far too many to summarize in a brief post.

One nice excerpt:

Most people get socialized into institutions such as the research world without anyone ever explaining how the institutions work. For example, few PhD students ever get explicit lessons on the sorts of career strategies that I have been explaining in this article. What is more, the social world is filled with unspoken rules that keep these things hidden, for example the taboo against boasting or the imperative of explaining one's motives in terms of the general good rather than in terms of self-interest. These unspoken rules help people to get along, but they also make it much harder for average PhD students in complex professional interactions to figure out what is really going on. Most students do acquire up some insight from watching the experts, but they usually do not develop a complex theory like the one I have been explaining here. As a result, they often perceive their social environment in a relatively superficial way.

Pointer chase: Michael Nielsen -> Daniel Lemire -> Phil Agre.

All the above links spin off some interesting reading. In particular, Nielsen's ongoing series on the principles of effective research has some insightful thoughts.

## Wednesday, July 14, 2004

### Putnam Problem of the day...

Harvard has a Putnam Exam page that contains (apart from all the Harvard rah-rahing), a "problem of the day". Today's problem:

In Determinant Tic-Tac-Toe, Player 1 enters a 1 in an empty 3x3 matrix. Player 0 counters with a 0 in a vacant position and play continues in turn until the 3x3 matrix is completed with five 1s and four 0s. Player i wins if the determinant is iPlayer 0 wins if the determinant is 0 and Player 1 wins otherwise (thanks, Amit). Assuming optimal strategies from both players, who wins and how ?

What they need is an RSS feed. hmm....

### News of the day...

Some trivial...
On the origins of 'blog', from EvHead... (via BoingBoing)

....Blog was originally devised by British fans in the 1950s. There were two versions. A Liverpool fan named Peter Hamilton came up with the recipe for Blog Mark I, which consisted of "a brandy and egg flip base, to which was added black currant puree, Alka Seltzer, and Beechan's Powder. It effervesced." A second, simplified version (Blog Mark II) was produced by hotel barmen at the first Kettering Eastercon (1955) and consisted of "a half-pint of cider and a measure of rum."
...
You'll roll down stairs, fall off your chair, and hump your neighbors dog
It goes with a snack, you'll lie on your back, it's blog, blog, BLOG!

It's blog, it's blog
It's pink, it's yummy, it's punch
It's blog, it's blog
It'll make you lose your lunch

Come on and get your blog
Everyone loves their blog

And some not so trivial:
From the New Scientist:

After nearly 30 years of arguing that a black hole destroys everything that falls into it, Stephen Hawking is saying he was wrong. It seems that black holes may after all allow information within them to escape. Hawking will present his latest finding at a conference in Ireland next week.

## Monday, July 12, 2004

### Crisis in Science ?

<Rant>
In what twisted universe does 11pt and 12 pages actually mean 10pt, illegible margins, and 20 pages ? I tell you, this is the real crisis in science.
</Rant>
Phew: now I feel better...

So this article from the Chronicle of Higher Education talks about the problems facing universities trying to admit students for graduate studies. However, even after I read it a few times, I couldn't quite figure out what its central thesis was. The subtitle says:

Leaders warn of a labor shortage in the U.S., but indicators point to an oversupply

However, the following data points are presented:

* graduate enrollment from Taiwan has dropped by 25 percent
* NSF announced that foreign enrollment reached a new peak by 2002.
* A survey of 113 schools showed a 32% drop in foreign enrollment in those schools (especially from China)
* Purdue, UCLA and UT Austin are cited as examples of foreign enrollment either going up, or number of foreign applications going up.

If you are confused at this point, join the club.

Regarding overall jobs in the market,
* BLS predicted in 2001 that there would be a 47% increase in jobs in science and engineering by 2010
* Unemployment rates in computer science and systems analysis jumped to 6.7%
* unemployment among chemists is at "an all time high", as measured in terms of the number of postdocs.

Now I am getting rather dizzy.

The article deals with a complex matter, and I can respect the need to factor in a number of different statistics, but it is not clear to me that there is any story here at all. Enrollment is going up and down, jobs are going up and down, foreigners are applying more or less: what is the conclusion ?

If we look at the summary of the article:

...although no one can predict how many scientists and engineers the nation will need in 20 years, everyone agrees that the faces of those technical leaders will be far more diverse than those of generations past, and that American universities will scour the world for the best minds.

I don't know: it seems a little wishy-washy to me...

Some detailed methodological problems:

1. There is a conflation of science vs engineering, Ph.D vs M.S vs bachelors degrees, research jobs vs non-research jobs etc. These have different dynamics, and it is dangerous to draw broad conclusions from the aggregate of so many different sectors. For one thing, unemployment in computer science may not necessarily be related to employment prospects for Ph.Ds, because the IT industry has such a huge effect on such numbers....

2. They use number of US-authored publications to indicate a "flat" trend for research in the US. Without more details, it is hard to understand this statistic. As we know from SODA/STOC/FOCS, classifying papers by country is almost meaningless in times of multi-authored papers, and in experimental fields with lots of authors/paper, even more so.

3. There is a lot of confusion over US-born student enrollment vs foreign-born student enrollment; some stats are quoted for one group, some are quoted for the total

4. Using the rate of postdoc employment to infer general trends may or may not work in general. Do people in the natural sciences go for postdocs because they can't get jobs (as was the case in theory in the early 90s), or because they NEED a postdoc to even cross the interview threshold (as appears to be true in biology at least, and maybe physics). It is possible that mathematics is an example of the first case, but I don't know if math was part of this study at all.

A point that is not made often enough:

Mr. Freeman, like other economists, looks to dollars to make sense of the trends among graduate students. "They're not studying science," he says, "because they look and say, 'Do I want to be a postdoc paid $35,000 or$40,000 at age 35, with extreme uncertainty working in somebody else's lab, and maybe getting credit for my work and maybe not getting full credit? Or would I rather be an M.B.A. and making $150,000 and hiring Ph.D.'s?'" p.s So I just spent all this time writing a review of an article in the CHE instead of writing a review of a SODA submission. oh well.... ## Saturday, July 10, 2004 ### The 'e' movie No, this is not an electronic movie, it is literally, a movie about e (or a fantasy on what one would look like). In a whimsical and interesting article on ABC News (found via topix.net), John Allen Paulos speculates on what a movie about e might look like. He was prompted by the fame achieved by the golden ratio via The Da Vinci Code, and pi via the eponymous movie. He gives four examples of the 'natural occurrence' of e: for my amusement, I decided to subtitle the examples with the actual mathematical principle. All of these are old friends of of the theoretical computer scientist: 1. Divide up a square portion of the night sky into a very large number, N, of equal smaller squares. That is, imagine a celestial checkerboard. Then search for the N brightest stars in this portion of the sky and count how many of the N smaller squares contain none of these N brightest stars. Call this number U. (We're assuming the stars are distributed randomly so by chance some of the smaller squares will contain one or more of the brightest stars, others none.) If one knows some probability theory, it's not hard to prove that the ratio of N to U (N divided by U, that is) is very close to e and approaches it more and more closely as N gets large. Indeed. 1/e is the limit of the expression (1-1/n)n as n grows unboundedly large. 2. A somewhat unusual appearance of the number involves two decks of cards. Shuffle each deck thoroughly, turn over a card from each, and note if it's the same card (both 7s of diamonds, for example, or both jacks of clubs). Then turn over another card from each deck, and note if it's the same card. Continue doing this until all 52 cards in the decks are turned over. It can be shown that the probability of no matches at all between the two decks during this sequence of turnovers is extremely close to one chance in e; that is, the probability is 1/e or about 37 percent. This is the classical derangement problem. For many of us, the first introduction to the Principle of Inclusion and Exclusion. 3. ...imagine this year's high school graduates running a quarter-mile race. Runners are randomly selected and sequentially over a period of months each of them runs a quarter-mile and we keep track of the number of record times that they establish. The first runner would surely establish a record time and perhaps the fourth runner would be faster than the first three and establish the second record time. If the Nth runner sets the Rth record, it can be proved that the Rth root of N will be an approximation to e, and this approximation approaches e more and more closely as N increases without bound. Another way of saying this is that if we permute a set of numbers randomly and compute the min by the standard procedure, then its value will change roughly ln n times. Timothy Chan made use of this fact in an elegant way to replace parametric search by a cheaper randomized test for many geometric problems. The fourth example is a variant of the first: however, he alludes (possibly) to another famous homework problem, the secretary problem, in his conclusions. Here, the probability of choosing the best secretary for the job is 1/e, and this is in fact optimal (over all online algorithms that make irrevocable decisions). Considering there is already a movie called 'The Secretary', you'd think this would be a good candidate :) As an aside, the derangement problem has also been referred to as The Drunken Secretary Problem. ## Friday, July 09, 2004 ### If only all mathematical questions were "this complex" Seen on CNET: A billboard placed this week in the heart of Silicon Valley posed a complex mathematical question that most commuters on Highway 101 would need Google to crack. So that explains why we haven't solved the burning mathematical questions of today; we didn't use Google !! Click below to earn your$1000,000 prizes:
1. P vs NP
2. The Hodge Conjecture

For the curious, this is the "complex mathematical question":

In a kind of geek jeopardy, the billboard read:"{first 10-digit prime found in consecutive digits e}.com." The answer, 7427466391.com, would lead a puzzle-sleuth to a Web page with yet another equation to solve, with still no sign the game was hosted by Google.

Mastering that equation would lead someone to a page on Google Labs, the company's research and development department, which reads: "One thing we learned while building Google is that it's easier to find what you're looking for if it comes looking for you. What we're looking for are the best engineers in the world. And here you are.

Showing a stunning mastery of stereotypes, CNET manages to conflate eggheads, mathematicians, geeks, engineers in one horrible mish-mash.

### The Importance of...: Introducing Hatch's Hit List

I try to stay out of politics, but this is just ridiculous:

The Senate is trying to pass a bill called the INDUCE Act (Inducement of Infringement of Copyright) that is as ridiculous as it sounds: one example of "inducing" infringement could be an iPod, because it encourages the use of P2P networks.

Ernest Miller has a very funny fisking of the act at corante.com. He is also maintaining a
list of devices that could become illegal under the act.

## Thursday, July 08, 2004

### Today's burning question

From the latest issue of the Communications of the ACM:

HAS THE INTERNET BECOME INDISPENSABLE ?

In future issues:
1. DOS vs SYSTEM V: Pros and Cons.
2. Gopher: a new method for data transfer...
3. Special IBM Issue: Does the world need more than 5 computers ?

Sigh...

## Tuesday, July 06, 2004

### A dog and pony show...

Freeman Dyson has a review of Brian Greene's new book 'The Fabric of the Cosmos' in the New York Review of Books. The review provides a contrarian view on the viability of string theory, and an interesting history of the evolution of quantum physics from a revolutionary vs conservative perspective.

I found the following passage rather astounding:

Three years ago, in January 2001, I was invited to the World Economic Forum in Davos, Switzerland. Brian Greene was also invited, and we were asked to hold a public debate on the question "When will we know it all?" In other words, when will the last big problems of science be solved? The audience consisted mainly of industrial and political tycoons. Our debate was intended to entertain the tycoons, not to give them a serious scientific education. To make it more amusing, Greene was asked to take an extreme position saying "Soon," and I was asked to take an extreme position saying "Never."

Doesn't it seem sad that eminent scientists are brought out for a dog-and-pony-show in front of industrialists ? I know that "voices of reason" will argue that this is a good way to spread the Word among movers and shakers, but like this ?

A related article in Slate (where I found this review) also talks about the role of beauty and aesthetics in guiding (or misguiding) scientists.

### SODA update

490 papers in: roughly 10% over last year (450), but then the committee is 10% larger as well :) no breakdown as yet of short vs long.

it's going to be a long summer...

## Monday, July 05, 2004

### SODA 2005

well the SODA submission deadline has passed (congratulations Jeff), and we have a record number of submissions this year (again). I will now go underground, to see daylight in a few months...

## Friday, July 02, 2004

### Apple, Wired and FOCS 2004.

Jeff Erickson snarks about Apple's discovery of the wonders of searching as-opposed-to sorting. Just to demonstrate that our community is on the cutting edge of technology trends, I present to you, hot off the presses, from the FOCS 2004 accepted papers list:

No Sorting? Better Searching!
Gianni Franceschini and Roberto Grossi