Monday, October 31, 2005

On this most horrifying of days

Kurt Van Etten explores the most horrifying thing of all: the flawed P vs NP proof.

Friday, October 28, 2005

Darwin's and Einstein's (e)mail correspondence rates, or a rumination on power laws.

Over the last few days, there has been much discussion of a paper in Nature by Oliveira and Barabási on the correspondence patterns of Darwin and Einstein. One of the main conclusions of the paper was that at some point, their response patterns started following a power-law distribution, with coefficients such that a finite fraction of mail went unanswered.

Soon after this, there was a response suggesting that the analysis was flawed, and in fact the data followed a log-normal pattern, rather than a power-law. Cosma Shalizi and Aaron Clauset weighed in on this as well, on the side of the "log-normal folks".

As Geomblog denizens might know, I had a while ago mentioned a fascinating article by Michael Mitzenmacher on the difference (and similarity) between log-normal and power-law distributions, and the natural processes that generate them. I asked Michael if he'd like to weigh in on this controversy, and what follows below is his note. He comments not only on the technical issues involved, but on the whole issue of PR in science, and the contributions that Barabási has made to the field, and his perspective is very interesting indeed.

For those of you who might wonder why computer scientists (and theoretical computer scientists at that) should concern themselves about such matters, I'll refer you to the reading list for Jon Kleinberg's course on Information Networks, where you'll see how many aspects of link structure analysis on the web (which in turn is used for search engines like Google) relate to understanding power law distributions.

And now for the article...
Suresh asked me to comment on a current controversy: a paper by Barabasi et al claims that many human interactions, including the example of Darwin's and Einstein's letter response frequency, are governed by a bursty process that leads to power law tails. A rebuttal by Stouffer et al claims that the distribution is really lognormal.

I believe Suresh asked me to comment because (here comes the blatant self-promoting plug) I had written a survey on lognormal and power law distributions (and a slightly more fun, less edited earlier version). I'm embarrassed to say I had not heard of the controversy, but it looks like a lot of fun for those of us who enjoy this sort of thing. Rather than focus on the specific technical details, let me make two high-level comments.

First, as you'll know in more detail if you read the survey, this controversy does not surprise me. Lognormal and power law distributions are so closely related that if somebody wrote a paper claiming some distribution is a power law, you can count on somebody else writing a follow-up paper claiming it is actually lognormal (and vice versa). The only thing that is surprising is how fast the turnaround time is these days. The Web is truly amazing that way.

A more important second point, however, relates to how good Barabasi is at generating PR, and the recent panel discussion at FOCS about popularizing computer science. I think the reason this controversy erupted so quickly was that Barabasi sent a summary note on the work to Nature. It has been argued that we in TCS need a prophet, and Barabasi is indeed a modern prophet in the area of networks and power laws. Say what you like about him (and people say some bad things -- the Stouffer rebuttal does not use the words "technically incompetent", but I think the implication is in there; I and many others have had issues in the past with Barabasi's grandiose claims and insufficient citation of prior work), but he gets his message out there, and people actually know about it. His book Linked was very popular and had a big impact. While I think all his talk about networks is good for computer science in general, I'd rather see a computer scientist out in front of the world enlightening the masses as to why all this stuff is important.

For those who care about the technical issue, I will express some mild opinions. While the rebuttal suggests the data is a better fit for the lognormal distribution, I am not a big believer in the fit-the-data approach to distinguishing these distributions. The Barabasi paper actually suggested a model, which is nice, although the problem of how to verify such a model is a challenge, as I talk about in this MSRI video. This seems to be the real problem. Trust me, anyone can come up with a power law model. The challenge is figuring out how to show your model is actually right.

Michael Mitzenmacher

Computer Science in the NYT

but maybe not in the way we wanted.

Nearly a year ago, I had mentioned that Andy Yao had returned to China to teach there. Today, he leads off a NYT article on Chinese universities:
When Andrew Chi-chih Yao, a Princeton professor who is recognized as one of the United States' top computer scientists, was approached by Qinghua University in Beijing last year to lead an advanced computer studies program, he did not hesitate.

It did not matter that he would be leaving one of America's top universities for one little known outside China. Or that after his birth in Shanghai, he was raised in Taiwan and spent his entire academic career in the United States. He felt he could contribute to his fast-rising homeland.

"Patriotism does have something to do with it, because I just cannot imagine going anywhere else, even if the conditions were equal," said Dr. Yao, who is 58.

The rest of the article talks about the challenges of building up world-class universities in China. It did have this interesting line from Prof. Yao:
Dr. Yao said he had expected to concentrate on creating a world-class Ph.D. program but had found surprising weaknesses in undergraduate training and had decided to teach at that level. "You can't just say I'll only do the cutting-edge stuff," he said. "You've got to teach the basics really well first."
Sounds like something you could say about the U.S. system as well.

Wednesday, October 26, 2005

a puzzle

that someone asked me, and I don't know the answer to:

Is there a closed-form expression of some kind corresponding to the formal power series

∑_{k=0}^{∞} x^(2^k)


The ranks grow..

D. Sivakumar, formerly of IBM Almaden, and now at Google, has started a 'Glob'.

Tuesday, October 25, 2005

Maybe there's hope for theory after all...

Conversation had by your humble blogger in a taxicab:

Driver: So where are you from ?
Humble blogger: From India
D: Ah, so are you a student (this was in Philadelphia) ?
HB: No, I work here, at AT&T
D: Oh, so what do you do ?
HB: I'm a computer scientist...
D: So do you do programming ?
HB: Not exactly, I do research. It's slightly more mathematically oriented.
D: Oh, so you mean algorithms ?
HB (slightly surprised): Yes, actually. That's right.
D: All this P vs NP stuff ?
HB (even more amazed): Yes, that kind of stuff.
D: Yeah, you know some years ago, three Indians showed how to find prime numbers efficiently.
HB (at this point on the edge of his seat): Yup ! So do you have a computer science background ?
D: Yes, I took a class at Brooklyn College

.... some time later....

D: I liked the algorithms class. The instructor was tough, but I passed. I really liked converting NFAs to DFAs; I remember standing in front of a blackboard at 2 in the morning.
HB: (at this point I have nothing more to say, except to congratulate him on his knowledge).

Monday, October 24, 2005

FOCS Panel Discussion: The Soundbite edition

Shorter Aaronson: Theory needs prophets
Shorter Quantum Pontiff: Theory needs physics
Shorter Vanity of Vanities: Theory needs a merge dance.
Shorter Geomblog: Theory needs metaphors.

I liked Rocco's characterization of Bernard's comments at the panel discussion:
If the broader scientific community comes to use algorithms as a conceptual framework for the problems they are dealing with, we will have done well.
This is an important point: PR is a good thing (whether you publish in Nature or not), and "spreading the word" is also great. But more fundamentally, it is important to spread the idea that when you think about any kind of computational process, you are thinking about an algorithm, and therefore trying to define what it is you are computing will help you tap into the best way of solving the problem. The problem with a lot of the computational conceptualization (phew, there's a tongue twister) that goes on outside of theory is that it is akin to looking at a physical process and trying to come up with your own physics to understand it, rather than invoking (at the very least) Newton's laws of physics.

A legitimate retort to the above comment is that often, theoretical methods fail to address the phenomenon under study. To a degree this is true: the theory of computation deals in "laws of the limit"; worst-case and asymptotic analyses often fail to detect phenomena that occur when n is not heading off towards infinity. But it is only true to a degree, because there is a lot of theory (maybe not the sexiest kind) that does push back towards non-asymptotic behaviour, for example in the various explanations of why quicksort works so well in spite of its worst-case O(n^2) complexity.

The Quantum Pontiff makes another good point:
they need to attempt to convey their results in a way which, while not totally faithful to the science, gives the reader a reason to believe that they understand what is going on. This, of course, is much harder to do as a computer scientist than as a theoretical physicist because in the former rigor is held in higher esteem than in physics, where hand-wavy arguments hold a much greater standing.
It's true; viewing theoretical computer science as a form of mathematics (Avi Wigderson's clarification notwithstanding) gives us the same public persona (at best) as mathematicians. Most mathematicians are viewed by the general public as glorified accountants (and mathematics as "counting"). The effect of this on our ability to proselytize and disseminate is even more noxious. After all, mathematics remains under the thrall of G. H. Hardy's apology, in which
There is no scorn more profound, or on the whole more justifiable, than that of the men who make for the men who explain. Exposition, criticism, appreciation, is work for second-rate minds.
How on earth are you supposed to come up with the next Feynman/Hawking/Sagan under such a withering blast of derision ?

Returning to the Quantum Pontiff, he makes another point close to my heart:
CS theory hasn't, in my opinion, exploited the fact that it is studying a fundamental question: the fundamental limits of computation. This fundamental research direction, to me, is as deep as understanding what the laws of nature are (and if you catch me on some days I might even say that one is deeper than the other).
As Dave Winer might say, 'Bing!'. This is exactly right, and is a point that unfortunately many theoretical computer scientists don't agree with. The evidence in favor of this view of theory would take another post, and ultimately does have a faith-based element to it, but it is hard not to look at quantum computing, information theory, and black hole entropy, and feel that there is something more to computing than computers.

Sunday, October 23, 2005

Algorithms and Technologies, and STOC 2006

Of course, effective demonstrations of the power of algorithms require actual algorithms ! A much closer deadline is STOC 2006, to be held in Seattle from May 21-23, 2006. The deadline is November 3.

Algorithms and Technologies, and ESA 2006

I recently attended a talk on convex programming by Stephen Boyd. He made the interesting point that least-squares minimization, linear programming, and even semidefinite programming are now "technologies", in that they have a well-defined theory and polynomial-time solutions, efficient black-box code that works for very large inputs, and overall, a well worked out understanding of which variants to use for which kinds of problem instances.

What this means is that tomorrow if I have to suggest to someone that their problem can be formulated as a convex program, this not only gives them a solution "in theory", but gives them a solution "in practice", that they can actually implement. Such happenings are surprisingly rare when applying theoretical methods to real-world problems; it is not often that one can invoke a sophisticated algorithm that has known worst-case bounds, and find an implementation for it.
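To make the "technology" point concrete, here is a minimal sketch (mine, not from Boyd's talk) of treating least-squares and linear programming as black boxes, assuming numpy and scipy are available; the problem data is invented purely for illustration.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Least squares as a black box: minimize ||Ax - b||_2 with one library call.
A = rng.standard_normal((100, 5))
b = rng.standard_normal(100)
x_ls, residual, rank, _ = np.linalg.lstsq(A, b, rcond=None)

# Linear programming as a black box: minimize c^T x subject to x_1 + x_2 >= 1, x >= 0.
c = np.array([1.0, 2.0])
A_ub = np.array([[-1.0, -1.0]])   # -x_1 - x_2 <= -1 encodes x_1 + x_2 >= 1
b_ub = np.array([-1.0])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])

print("least-squares solution:", x_ls)
print("LP optimum:", res.x, "with objective value", res.fun)
```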

Today is the FOCS panel discussion on 'Exciting the public about (theoretical) computer science'. One way (though certainly not the only way) of exciting people about theory is to demonstrate the power of mathematical rigor and worst-case analysis by showing how algorithms that have good theoretical guarantees actually work well in practice: in other words, showing that algorithms can become "technologies".

This is a really long and roundabout way of mentioning a call for papers: ESA 2006 will be held in Zurich from Sep 11-15, 2006. ESA has both a theoretical and an applied track (disclaimer: I am on the PC for the applied track), and this year the applied track is trying a new approach to submissions:
Authors of papers that include an experimental aspect can make supporting source code or data sets available on a webpage (referenced in the submitted paper) or send them by e-mail to esa06code@mcs.le.ac.uk; such supporting files will be considered by the program committee members at their discretion.
So get your source code ready ! And remember this is not about theory making itself "useful", but about demonstrating that the formal approach to algorithm design is far more effective than any other approach.

Saturday, October 22, 2005

Alternate logics

A Neighbourhood of Infinity writes about alternate logics (apart from "standard" two-valued modus ponens based inference). Among the many alternate logics discussed is intuitionistic logic, which objects to the law of the excluded middle (either X or not X is true) and its connection to constructivist mathematics (where "there exists" is equivalent to "we can construct"). Sigfpe makes the following comment regarding the two:
...intuitionist logic is closely tied up with Constructive Mathematics (not to be confused with a similarly named educational fad) which insists that you always construct things you claim exist. This means that it goes hand in hand with computer science where people's programs are usually expected to produce a result - not just say a result exists.
Which is funny, because "there exists" statements abound in (theoretical) computer science, especially when dealing with combinatorial constructions. Admittedly, the gap between "there exists" and "here it is" is often merely exponential rather than infeasible (the probabilistic method, for example).

He goes on to discuss higher-order logics, and mentions an interesting fact:
Quine believed that Second Order Logic isn't even Logic.
and goes on to say:
And even though Second Order Logic appears to have greater expressive power mathematicians actually get along fine with First Order Logic. Every time they make a statement about properties there's always a workaround (sometimes quite inelegant) for expressing something that's just as useful in First Order Logic.
Aha, but that's not true for NP, for example. After all, NP is equivalent to existential second-order logic (Fagin's theorem), and P is first-order logic augmented with an ordering and a least fixed point operator.

Yet another nugget that I learned from this article is the existence of "Quantum Logic". Namely,
Quantum Logic really is the quantization of logic - we have replaced truth values with non-commuting Hermitian operators whose eigenvalues are the results we want!
Overall, a very illuminating article.

FOCS 2005

FOCS starts today in Pittsburgh. Alas, I won't be attending. What sounds most intriguing to me (apart from the papers themselves) is the session on 'Exciting the public about (theoretical) computer science' at the business meeting. I'm looking forward to hearing what folks have to say about this. Since I can't be there, I'll merely reiterate what I said earlier about metaphors for computation.

I also note that Bernard Chazelle is giving a tutorial on Algorithmic Techniques and Tools from Computational Geometry, and Subhash Khot is giving a tutorial on the Unique Games Conjecture. I especially regret not being able to attend the second tutorial, given the frenzy of activity that this conjecture has generated in recent years (see Lance's post for a summary of the problem).

Friday, October 21, 2005

Beauty and ugliness

Via BoingBoing comes this gallery of mathematical art from Justin Mullins. Each piece of art is a mathematical concept or set of equations, with an evocative title. I enjoyed browsing the gallery, and even reading the liner notes below each picture.

The first two in the gallery are Beauty and Ugliness. Beauty is represented by the famous equation

e^(iπ) + 1 = 0

and Ugliness is represented by the 633 graphs representing the different cases to check in the proof of the 4-color theorem (this is the "simplified" proof by Robertson, Sanders, Seymour and Thomas).

Wednesday, October 19, 2005

Books that read like a thriller...a new entry.

A while back, I had talked about books that read like a thriller:
Occasionally one finds technical books that, either because of the subject matter, or the style of writing, or some other inexplicable properties, make for such gripping reading that one almost wants to read from cover to cover in a single sitting.
The latest entry in this category is Convex Optimization, by Stephen Boyd and Lieven Vandenberghe. It's a Cambridge University Press book, and covers a wide range of topics in convex programming, including convex duality, semidefinite programming, interior-point methods, and applications. One special case of convex programming is second-order cone programming, which is in turn a special case of semidefinite programming, and which includes the minimum enclosing ball problem and max-norm distance computations as special cases.

Why do I like this book so much ? An underrated feature of a good book is its choice of fonts, and although this seems like a rather shallow reason, I have to say that it makes a big difference. Nothing like a clean layout and nice fonts to make a text easy to read.

The material is organized very well. It starts off with the basic theory of convex functions, and slowly introduces the various convex programming variants, from the simpler kinds to the fully general versions. Applications to problems are given everywhere, and the exercises contain all kinds of interesting variants. The book works both as a text, to be read from cover to cover, and as a reference for specific definitions. It does probably the most difficult thing of all: hitting the right tone in terms of mathematical exposition and intuition. I particularly liked the sections on interior-point methods, as well as its explanation of Legendre duality (the generalization of traditional geometric duality).

Tuesday, October 18, 2005

FWCG: One day and counting

One day and counting: tomorrow is the deadline for submissions to the Fall Workshop on Computational Geometry.

Sunday, October 16, 2005

Sampling from the simplex

This post started off as a call for help, and after some digging, became an answer to a question :).

How do you sample uniformly from the unit simplex ? Specifically, I would like a uniform sample from the set X = { (x_1, x_2, ..., x_D) | 0 <= x_i <= 1, x_1 + x_2 + ... + x_D = 1}, where D is the dimension of the simplex.

The first class of approaches one might try are "rejection" sampling methods: for example, sample an element from the unit hypercube, and reject elements that don't lie in the simplex. This of course gets worse and worse as the dimension increases: the number of rejections per valid point keeps growing. Another type of approach is a "projection" method: for example, sample a point from the unit hypercube, and mark off the point on the simplex that intersects a ray connecting the origin and the sampled point. The problem with these approaches is that they generate non-uniform distributions, and you then have to figure out a way of normalizing correctly. An interesting hybrid is the idea of sampling from the unit sphere, and then mapping the point to the simplex by squaring each coordinate (the inverse of the map x -> sqrt(x)). However, this mapping is also non-uniform, and generating a uniform sample efficiently from the sphere is also non-trivial.
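To see why the rejection approach breaks down, here is a minimal sketch (my own concrete reading of it, not anything from the original discussion, assuming numpy): draw the first D-1 coordinates uniformly from the hypercube, accept only if their sum is at most 1, and set the last coordinate to the remainder. The acceptance probability is 1/(D-1)!, so the expected number of tries explodes with the dimension.

```python
import math
import numpy as np

def rejection_sample_simplex(D, rng):
    """Return (sample, number_of_tries) for one rejection-sampled simplex point."""
    tries = 0
    while True:
        tries += 1
        y = rng.random(D - 1)
        if y.sum() <= 1.0:
            return np.append(y, 1.0 - y.sum()), tries

rng = np.random.default_rng(0)
for D in (3, 5, 7):
    avg = np.mean([rejection_sample_simplex(D, rng)[1] for _ in range(200)])
    print(f"D={D}: about {avg:.1f} tries per sample (expected {math.factorial(D - 1)})")
```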

Generating a point from the D-dimensional simplex is equivalent to sampling D-1 points from the unit line, sorting them, and then taking the intervals between adjacent points. The distribution of the sorted set (the order statistics) can be derived from the distribution the points are sampled from. In general, if the points are taken from a distribution with pdf given by f and cdf given by F, then the pdf of the kth smallest element is given by

f_(k)(x) = n · C(n-1, k-1) · f(x) · F(x)^(k-1) · (1-F(x))^(n-k)

For a uniform distribution, the kth smallest element follows a Beta distribution with parameters k and n-k+1. Of course, what we want are the differences between adjacent elements.
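Here is a small sketch (mine, assuming numpy) of the spacings construction just described: sort D-1 uniform samples, pad with 0 and 1, and take the gaps between consecutive values; the D gaps form a uniform sample from the simplex.

```python
import numpy as np

def sample_simplex_spacings(D, rng):
    """Sort D-1 uniforms, pad with 0 and 1, and return the D gaps."""
    u = np.sort(rng.random(D - 1))
    padded = np.concatenate(([0.0], u, [1.0]))
    return np.diff(padded)   # D nonnegative numbers that sum to 1

rng = np.random.default_rng(0)
print(sample_simplex_spacings(5, rng))
```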

It turns out that there is an elegant way of doing this. We generate IID random samples from an exponential distribution (which you can do by sampling X uniformly from (0,1] and returning -log(X)). Take D samples, and then normalize. Voila, the resulting list of numbers is a uniform sample from the simplex.

A similar idea works to sample from the unit sphere. Sample elements IID from a normal distribution, and normalize so the resulting vector has norm 1.
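A minimal sketch of both tricks, again assuming numpy (the function names are mine):

```python
import numpy as np

def sample_simplex(D, rng):
    """Uniform sample from the unit simplex via normalized exponentials."""
    u = 1.0 - rng.random(D)        # uniform in (0, 1], so the log is finite
    e = -np.log(u)                 # Exponential(1) samples
    return e / e.sum()

def sample_sphere(D, rng):
    """Uniform sample from the unit sphere via normalized Gaussians."""
    g = rng.standard_normal(D)
    return g / np.linalg.norm(g)

rng = np.random.default_rng(0)
print(sample_simplex(4, rng))   # nonnegative coordinates summing to 1
print(sample_sphere(4, rng))    # a vector with unit Euclidean norm
```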

For more on this, check out Luc Devroye's book, 'Non-Uniform Random Variate Generation'. It was originally published by Springer, and is now out of print. He has the entire book available on his website.

Update: As one of the commenters pointed out, uniform sampling from the simplex is a special case of sampling from a Dirichlet distribution (whose main claim to fame is as the conjugate prior for a multinomial distribution). To sample from the Dirichlet distribution, you can take normalized samples from an appropriately constructed gamma distribution, which for the case of sampling from the simplex reduces to sampling from the exponential distribution as above.
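And a sketch of the more general gamma recipe from the update (again my own code, with made-up parameter values): normalized Gamma(α_i, 1) samples give a Dirichlet(α) sample, and setting all α_i = 1 recovers the exponential-based simplex sampler above.

```python
import numpy as np

def sample_dirichlet(alpha, rng):
    """Dirichlet(alpha) sample via normalized Gamma(alpha_i, 1) draws."""
    g = rng.gamma(shape=np.asarray(alpha, dtype=float), scale=1.0)
    return g / g.sum()

rng = np.random.default_rng(0)
print(sample_dirichlet([1.0, 1.0, 1.0, 1.0], rng))  # uniform on the simplex
print(sample_dirichlet([2.0, 5.0, 3.0], rng))       # a non-uniform Dirichlet sample
```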

Saturday, October 15, 2005

Numb3rs

I haven't been posting regular Numb3rs summaries for the new season, because I've been disappointed with the new episodes; they appear to be using math mostly for throwaway comments, rather than actually trying to get into any specifics (Farey sequences, the Art Gallery Theorem, and cellular automata being some examples).

They did mention curvelets though, when doing image reconstruction. This particular reference is interesting because, as Shankar Krishnan informs me, one of the top experts on curvelets, Emmanuel Candes, is actually at Caltech, and Caltech's math department has been providing technical expertise for the show.

Wednesday, October 12, 2005

SoCG 2006: The Vortex edition

The compgeom mailing list has the second CFP for SoCG 2006, to be held in Sedona, Arizona. Paper deadline is Dec 5, 2005, but the abstracts are due Nov 23, 2005. Jeff predicts doom.

However, what he doesn't realize, and what the CFP doesn't mention, is the natural healing powers of the Sedona vortex,
a spot where a concentrated source of energy is flowing from. Vortexes are an accelerator for healing and clearing. Or a place to receive an abundance of energy.
And consider this effect of visiting Sedona:
  • A sense of communion with unseen forces surrounding you, including the intense desire to spend more time with Nature.
Personally, that happens to me whenever I have to review papers.

And for authors, there's even more that Sedona has to offer. Especially if your paper gets rejected:
THE GATEKEEPER VORTEX EXPERIENCE

This experience is designed to accelerate clearing of emotional blocks, letting go of the past, releasing feelings of frustration, anger and injustices.
If that doesn't work, you could always escort reviewers up to
THE EAGLE'S VIEW VORTEX
...follow the course of water up the red rocks to the Eagle’s View Vortex! The view is incredible! [..] stand [reviewers] on the vortex and guide them into shape-shifting into the eagle! They fly ahead into the Dream Weave and explore healing a moment in their life or just fly and fly!!
Once you've done that, and feel better, try out
THE HAWK VORTEX EXPERIENCE

Boulder dancing at it’s purest! Feel yourself let go and fly as you hop from one boulder to the next!
with the extra bonus that
This place has a mystical feel to it and if your spirit is ready, the hawk that lives there might pay you a visit!

7 days and counting...

The deadline for submitting abstracts to the Fall Workshop on Computational Geometry is only 7 days away now.

Saturday, October 08, 2005

The "right" kind of research

Not Even Wrong points to an essay by Sir Michael Atiyah in the Bulletin of the AMS on mathematical beauty. Reading it (he writes and speaks exceedingly well; the interview that he and Isadore Singer gave after winning the Abel prize is a gem to read) reminded me of the constant arguments we have about the "right" kind of theoretical computer science (which the FOCS/STOC/SODA/SoCG brouhaha was really a proxy for).

It reminded me of these arguments because it illustrates how pointless they really are (and I have been as guilty as any in attempting to "define" the right kind of research):
We can identify a long list of desirable qualities: beauty, elegance, importance, originality, usefulness, depth, breadth, brevity, simplicity, clarity. However, a single work can hardly embody them all; in fact some are mutually incompatible. Just as different qualities are appropriate in sonatas, quartets or symphonies, so mathematical compositions of varying types require different treatment.
He goes on to describe the myriad ways in which mathematical beauty (to him) appears in a work. It's interesting to read it, but what's even more important is the disclaimer:
...what follows are my personal preferences and make no claim to universality. In fact I glory in the diversity of mathematics and the lack of a uniform straightjacket [sic]. This is what makes it live.
At the end of the essay, he briefly dwells on the role of applied work in mathematics (another topic of relevance in theoryCS):
Much of mathematics was either initiated in response to external problems or has subsequently found unexpected applications in the real world. This whole linkage between mathematics and science has an appeal of its own, where the criteria must include both the attractiveness of the mathematical theory and the importance of the applications. [...]. Not only can we mathematicians be useful, but we can create works of art at the same time, partly inspired by the outside world.

Friday, October 07, 2005

Priority Sampling...

Today's AT&T Labs math seminar was by Mikkel Thorup, who talked about a new paper with Nick Duffield and Carsten Lund on priority sampling.

Suppose you are given items 1, 2, ..., n with weights w_i. The goal is to be able to answer subset-sum queries: for a given set of indices I (the query), compute the sum of the weights w_i for i ∈ I. In situations of interest, n is huge, so you can't really store all the weights. What you'd like to be able to do is sample from the universe, obtaining a representative set of indices S with appropriate weights, so that when a query comes in, you add up the weights of elements in the intersection of I and S, and return that as your answer.

Now you can sample the items uniformly at random, but you'd miss the (typically few) elements with high weight. You could do weighted sampling, but then you'd constantly be hitting the elements with high weight, and that's no good. What you really need is some kind of weighted-sampling without replacement (which you can simulate by throwing out duplicates, but then you waste time trying to find a reasonable sample).

Their approach is an elegant idea called priority sampling. For each item i, generate a random number a_i between 0 and 1, and assign i a priority of w_i/a_i. With high probability, all priorities are distinct. Take the top k elements (with respect to priority), and let the priority of the (k+1)-st element be t. Now give each sampled element a new weight w'_i = max(w_i, t). All unsampled elements get a new weight of 0.
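Here is a minimal sketch of the scheme as just described (my own code, not the authors'; the heavy-tailed toy weights and the query are invented for illustration), assuming numpy:

```python
import numpy as np

def priority_sample(weights, k, rng):
    """Priority sampling: keep the k highest-priority items and reweight them."""
    w = np.asarray(weights, dtype=float)
    a = 1.0 - rng.random(len(w))        # a_i uniform in (0, 1], avoids division by zero
    priorities = w / a                  # priority of item i is w_i / a_i
    order = np.argsort(-priorities)     # indices sorted by decreasing priority
    top = order[:k]
    t = priorities[order[k]]            # priority of the (k+1)-st element
    new_w = np.zeros_like(w)
    new_w[top] = np.maximum(w[top], t)  # w'_i = max(w_i, t); unsampled items get 0
    return new_w

def subset_sum_estimate(new_w, indices):
    return new_w[list(indices)].sum()

rng = np.random.default_rng(0)
weights = rng.pareto(1.5, size=1000) + 1.0   # heavy-tailed toy weights
query = list(range(0, 1000, 3))              # an arbitrary subset of indices
true_sum = weights[query].sum()
estimates = [subset_sum_estimate(priority_sample(weights, 50, rng), query)
             for _ in range(500)]
print("true subset sum:", true_sum)
print("mean of 500 priority-sampling estimates:", np.mean(estimates))
```

Averaging the printed estimates over many repetitions should land close to the true subset sum, which is exactly the unbiasedness property described next.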

The main bit of magic is the following fact:
E[w'_i] = w_i
In other words, the estimator is unbiased, from which it immediately follows that any subset-sum estimate is also unbiased. An even more surprising fact (surprising because the threshold t appears to depend on k elements) is:

w'_i and w'_j are independent.

which then means that the variance of any subset sum is merely the sum of the individual variances. A conjecture of the authors, recently resolved in the affirmative by Mario Szegedy, is that this sampling scheme (using k+1 samples) provides the minimum total variance unbiased estimator over all such schemes that sample k elements from the input.

A formal expression for the variance of this sampling scheme is a bit complicated to write down in general, but for simple cases it can be written explicitly. The authors have experimental results showing how the estimates provided by priority sampling converge to the true values.

The sampling trick is very neat, and does appear magical even after you see the proof. There is a fascinating connection to a kind of sampling called threshold sampling, where each element is picked with a specific probability (and thus the sample size is not fixed), and a fixed threshold is used to decide whether an element is retained in the sample or not. Informally, threshold sampling provides a lower bound on the variance of such a scheme, and priority sampling is close to matching it.

What I understand less is the apparent connection to range sum histograms. In the data streaming world, a range sum histogram H is an approximation of an array A[1..n] such that range sum queries in A can be answered in H approximately. Muthukrishnan and Strauss have a SODA paper from 2003 on this topic, where they show how to construct histograms on a*B buckets that are within some approximation error of the best B-bucket histogram. One key difference is that in their formulation, the average squared error is minimized rather than the worst-case error of a single query. Note also that the subset sum formulation is more general, since ranges are a special kind of subset.
