Thursday, January 27, 2005

SODA business meeting redux.

Adam Buchsbaum now has slides for his SODA 2005 business meeting presentation. After opening remarks and some basic statistics, he looked at acceptance rates across topics, and found that overall there is no advantage to be gained by focusing on specific areas. He thankfully skipped the now de rigueur "silly" section of the business meeting, where exotic formulae and elaborate paper titles that optimize acceptance are proposed.

The next section discussed diversity on the PC and in accepted papers, recapping the statistics that I talked about earlier.

Why Are Submissions Going Up ?

Next, he addressed what is a growing concern: increased submission levels and how to deal with them (this has become a serious enough problem that the ACM has a task force investigating the matter). His data comprises submission/acceptance information from 1999 forward (although for 1999 he only had long paper submission/acceptance rates). The most striking fact is the steady increase in submissions over the past 4 years. Note that this is not accounted for by the start of short paper submissions in 1999; current submission levels are well beyond that.

Most of the increase can be attributed to an increased number of long submissions (but this will be addressed below). The main takeaway from this graph and from the subsequent plots is that the popular hypothesis "submission increases can be explained by the dot-com bust and lots of people returning to academia" is not supported by the data. The increase in new authors (defined as people who had not submitted before year X) is steady, but not bursty. In fact the increase in paper submissions can be directly attributed to an increase in submitting authors, an increase that comes from both new authors and returning authors. Interestingly, drop-outs (people who never submit after year X) match returning authors quite well.
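For the curious, here is roughly how one might compute these author categories from raw submission records. This is only a sketch under my own assumptions, not Adam's actual method: the function name, the toy data, and the exact reading of "returning" (submitted in year X and in some earlier year) and "drop-out" (submitted in year X and never afterwards) are all hypothetical.

```python
def author_flows(submitters_by_year, year):
    """Classify the authors who submitted in `year`.

    submitters_by_year maps a year to the set of authors who submitted
    that year. The category definitions are assumptions (see lead-in).
    """
    current = submitters_by_year[year]
    # Everyone who submitted in any earlier / later year.
    before = set().union(*(a for y, a in submitters_by_year.items() if y < year))
    after = set().union(*(a for y, a in submitters_by_year.items() if y > year))

    new = current - before        # never submitted before this year
    returning = current & before  # submitted before, and again this year
    dropouts = current - after    # submitted this year, never again
    return new, returning, dropouts


# Made-up toy data, purely illustrative.
toy = {
    2003: {"a", "b", "c"},
    2004: {"b", "c", "d"},
    2005: {"c", "d", "e"},
}
new, returning, dropouts = author_flows(toy, 2004)
```

Note that drop-outs for the most recent year in the data are meaningless (everyone looks like a drop-out), so any real analysis would have to stop a year or two short of the end.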

It does not appear to be the case that individual authors are submitting more papers; both the weighted (by number of authors) and the plain "papers/author" figures show a small increase, but not a significant one.
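The slides don't spell out how the weighting works, so take the following as one plausible reading rather than the actual computation: in the plain version every author gets full credit for each paper, and in the weighted version a paper with k authors gives each of them 1/k. The function name and toy data are mine.

```python
from collections import Counter


def papers_per_author(papers):
    """Return (plain, weighted) average papers per author.

    papers is a list of author lists, one per submission. The 1/k
    weighting is an assumed interpretation, not taken from the slides.
    """
    plain = Counter()
    weighted = Counter()
    for authors in papers:
        share = 1 / len(authors)
        for author in authors:
            plain[author] += 1         # full credit per paper
            weighted[author] += share  # credit split among co-authors
    n_authors = len(plain)
    return sum(plain.values()) / n_authors, sum(weighted.values()) / n_authors


# Made-up toy data: three papers, four distinct authors.
toy_papers = [["a", "b"], ["a"], ["b", "c", "d"]]
plain_avg, weighted_avg = papers_per_author(toy_papers)
```

Under this reading the weighted average is just the number of papers divided by the number of distinct authors, so it can only rise if submissions grow faster than the author pool.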

Short vs Long

The next part of the presentation is where things get really interesting. Two charts display statistics on the scores of papers. The first one shows the expected bell-curve-like distribution. The second chart (suggested by Piotr Indyk) plots score against reverse rank, and shows that except at the top 10% and at the bottom 10% of papers, there is no natural cutoff where an acceptance-rejection line can be drawn. This is significant, because often acceptance rates are claimed to be set by some "natural cutoff": for SODA this year, this appears not to be the case.
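One way to make "natural cutoff" concrete: sort the scores, ignore the top and bottom 10%, and look for an unusually large jump between neighbouring papers. The sketch below does exactly that. It is my own illustration, not something from the slides; the function name, the 10% defaults, and the made-up scores are all assumptions.

```python
def largest_gap(scores, lo_frac=0.1, hi_frac=0.9):
    """Find the biggest jump between adjacent scores in the middle range.

    Returns (gap, i), meaning the jump lies between the papers at sorted
    positions i and i + 1 (position 0 is the lowest-scoring paper).
    Assumes enough scores that the middle range holds at least two papers.
    """
    ranked = sorted(scores)
    n = len(ranked)
    lo, hi = int(n * lo_frac), int(n * hi_frac)
    # Gaps between consecutive papers, restricted to positions lo..hi-1.
    gaps = [(ranked[i + 1] - ranked[i], i) for i in range(lo, hi - 1)]
    return max(gaps)


# Made-up scores with an obvious break between 5 and 8.
toy_scores = [1, 2, 3, 4, 5, 8, 9, 10, 11, 12]
gap, position = largest_gap(toy_scores)
```

If the largest gap in the middle is no bigger than the typical gap, then any acceptance line drawn there is a judgment call rather than something the scores dictate, which is what the second chart suggests for this year.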

What all of this feeds into is a serious debunking of the value of short papers. Adam takes all the reasons that have been proposed for having short papers in the first place, and trashes them one by one:

1) Short papers increase acceptance of discrete math papers
BEEP ! This is not the case. Papers labelled as "discrete math" (which also includes CS papers with a strong discrete math component) in fact were accepted at a higher rate (34.5%) than the overall rate (27%). They also broke down into long and short submission rates the same way as non-DM papers.

2) Short papers provide an anchor for page lengths not at 12 (so that authors don't feel that their 6 page submission is viewed as inferior to a 12 page submission)
PZZT ! Wrong answer. Acceptance rates were not correlated with paper length. In fact many good 7-10 page papers were accepted. Compounding this was the fact that many short submissions had 7 pages of appendices (!).

3) Short papers allow for nascent and inter-disciplinary work.
This is a noble idea, but since we have reached the limit (135) of papers we can accept, any such short paper has to compete directly against a long paper, and invariably loses out. Basically, it is a zero-sum game at this point.

There was much wailing and gnashing of teeth at this point, but IMHO Adam's rather convincing presentation of the data quelled much discussion. After all, it's hard to have a strong opinion when there is clear data that contradicts it.

Will short papers die ? It's not clear. After one of the over 10 billion straw votes that we had, it was decided that the SODA 2006 PC would "take into consideration" the sense of the community, which was:

a) No extra tracks
b) Short papers MUST DIE (err... have problems).

