Tuesday, January 26, 2010

Author Feedback, or "Conference Review process considered harmful"

Author feedback is the latest attempt to put a band-aid on the bleeding carcass of the conference review process. We had author feedback at SoCG, and it's a common feature at many other conferences. The ostensible purpose of author feedback is to allow authors to clarify any misconceptions/confusions the reviewer might have so as to make the review process a bit more orderly (or less random?).

Usually, the process works like this: reviewers submit their reviews and have the option of requesting clarification on specific points from the authors. Authors get the questions, are required to submit a rebuttal/response by a certain date, and then deliberation continues. Variations on this include:
  • Length of the author response
  • When it's asked for (before discussions start, or after)
  • Whether it's called a 'rebuttal' or a 'response' or even just 'feedback' - I think the choice of word is significant
  • Whether the reviewers' current scoring for the paper is revealed or not.
While a good idea in principle, it can cause headaches for program committees, and often devolves into a game of cat and mouse: the reviewer carefully encrypts their questions so as not to tip their hand, the author tries to glean the reviewers' true intent from the questions while trying to estimate which reviewer has the knife in, and so on and so forth.

What I want to rant about is the author feedback system for a conference I recently submitted to. The reviews came back long and vicious: as far as one reviewer is concerned, we should probably go and hide under a rock for the rest of our pathetic (and hopefully short) lives.

That doesn't bother me as much as it used to - I've grown a thick hide for these sorts of things ;). However, a combination of things has sent me into a fury:
  • The reviewer is actually wrong on most counts. This isn't a matter of disagreeing over motivation, relevance etc. It's just a basic "please read section 5, column 1, sentence 3" type problem.
  • The author feedback limit is 2048 characters (which is a rather tiny amount if you're counting at home)
There's a basic issue of fairness here. Why does a reviewer get to go off on a rant for pages, while we have to limit our response to essentially sentences of the form "Yes. No. Maybe"? Especially when the reviewer is basically wrong on a number of points, it takes a while to document the inaccuracies. At the very least, we should get as many characters in our response as the reviewers got in theirs! (Point of note: the set of reviews was 11225 characters long, and the specific reviewer I'm complaining about had a 2500-character review.)

This paper is not getting in, no matter what we say: that much is clear. I've almost never heard of a paper successfully rebutting the reviews, and in all fairness the other reviewers have issues that are matters of opinion and can't be resolved easily. That is a little disappointing, but perfectly fine within the way the review process works. But I'm annoyed that there's no good way to express my dissatisfaction with the reviewing short of emailing the PC chair, and it's not clear to me that this does any good anyway.

Overall, I think that author feedback in the limit gets us to journal reviews, which is a good thing (and my colleague actually suggested that conference reviewing should have more rounds of author feedback and less time for actual paper reviewing). But the way it's done right now, it's hard to see it as anything other than 'reviewing theater', to borrow a Bruce Schneier term. It looks nice, and might make authors and PCs feel good, but has little value overall.

Update: in case it was implied, this conference is NOT SoCG :)

16 comments:

  1. +1. In addition, assume that the terrible reviews one tends to sporadically get are due to reviewer overload (and not malice or laziness). Then what do we expect to happen when we increase the amount of work per reviewer?

  2. There may be a sample size issue here.

    From what I'm hearing of the SoCG author feedback emails, at least, some are just blank (no different than a review process without rebuttals) and some are of the form you describe (long negative reviews where if you're very careful you might be able to nullify some of the reviewer's criticisms in a way that doesn't make them dig in their heels and fight harder against your paper, but there's still very little chance of changing the result). But some may be genuine questions where the committee still has a chance of making a more informed decision by getting a response. Perhaps if there are enough of the last kind it will make up for the pointlessness of the others?

    I wonder if there's any chance of SoCG releasing some sort of aggregate statistics, e.g. to what extent did accept/rejection differ among papers that got no questions vs those that did (I would expect more rejections among those with questions, but I wonder how much more). Or, though this may be harder to measure quantitatively, to what extent did author responses lead to a change in score for any papers?

  3. I'm pretty skeptical of the author response period, too.

    If the PC chair doesn't force reviewers to account for the author responses at the PC meeting, the responses are worthless.

    I also can't recall a case where the author response made a difference.

    "Peer review theater" is a good way to put it.

  4. hi suresh,

    sorry about the review. still, you can probably agree that the option to provide a limited feedback and rebuttal is more fair than no feedback at all.

    also, i think that perhaps the main advantage of the feedback process is that the reviewers are typically more careful in formulating opinions, knowing they will be confronted by the authors. this certainly can help avoid various pitfalls, like pointing out errors which are not really errors etc. of course, nothing works perfectly.

    piotr

  5. There was recently a very interesting post about new venues for publication on John Langford's blog.

    The summary is basically: use arxiv, and have people post comments to arxiv which can be rebutted/discussed. All papers get published. Reviews are "hidden" until at least five are entered to prevent bias.

    I think we should really use a system like this in TOC.

  6. Piotr: I have qualified agreement with the statement that "something is better than nothing" - hence my comment about 'peer review theater'. I agree with David that there's a sample size bias here, and frankly I'm also letting off steam, but I think the basic point I want to make is that it's important to implement such changes properly and not just think that turning a bit on (author feedback, double blind, whatever) is sufficient.

  7. I'm afraid you have a losing argument. I've argued this in several conference business meetings (ICML, NAACL, ACL) and it never goes anywhere. But I haven't learned my lesson yet, so I'll say that I disagree with Piotr: I think he's making an all-else-equal fallacy.

    Namely, saying that something is better than nothing is true when there is no additional cost for that something. But there are many sources of additional cost for author feedback. At least: (1) time for authors to write the response, (2) time for reviewers to read it (hopefully!) and adjust their scores (even more hopefully!), (3) the extension of the entire reviewing process, (4) additional work for area chairs for many papers, (5) additional frustration from authors when their responses are not heeded (or read). The most significant, I think, is (2) because it requires reviewers to swap back in all the papers that they (most likely) reviewed in one day.

    The problem is that for any group of people, you will always find some subset that is (a) famous and (b) believes that they have turned a rejection into an acceptance via feedback. The problem is that we never see the other side, and it's totally unclear whether the time is worth the handful of extra papers that get accepted. (I remember NIPS showing some slides about how much scores changed, but I don't know where you can find these.)

    Okay, that's it for my soap box. It's nice to see that some people (tend to) agree with me, though, despite the fact that I think I was the only person (or maybe one of three) in the ICML 2009 business meeting to vote against author feedback.

  8. For what it's worth, in my experience serving on committees with rebuttals, most people don't dig in their heels on a point-by-point basis. If a reply of the type alluded to by Suresh comes (please read section 5, column 1, sentence 3), reviewers will concede the point. Sometimes they do tend to dig in their heels on the overall opinion of the paper, though. This happens with and without rebuttals, by the way. I've seen PC discussions in which one PC member shoots down every objection from another PC member, yet the negative reviewer never increases his/her score.

    As for the extra work that Hal mentions, in my experience it is not large enough to make any difference. One can easily spend 100+ hours reviewing papers and the rebuttal process adds maybe two or three at most.

    I agree with Suresh that it is a band-aid solution on a flawed process. We give conferences journal-like status while using a rather superficial review process. This is absurd on its face.

  9. hal, i think you misread my post: i only suggested that rebuttal option is more *fair*, not *better* overall (in response to suresh's complaint about fairness).

    as far as the overall balance of pros and cons is concerned, the answer could depend on the area. for example, if the main goal is to reduce factual errors (is there a mistake in the proof or experiment, has the work been done already, etc), then the feedback seems beneficial; in fact, many theory conferences utilize informal paper-specific feedback mechanisms already, and it certainly helps in my experience. so (addressing suresh), i do not think this is just a "theater", the role is much more clear.


    on the other hand, i could imagine that feedback has only limited impact on value judgments (such as "is this work interesting").

    piotr

  10. Can you build a tenure case in computer science having zero conference publications since being hired, only producing journal publications? (of high quality) Or keep your job at a research lab?

    The reason I ask:

    We give conferences journal-like status while using a rather superficial review process.

    This has an easy fix: stop submitting to conferences altogether.

    Anyway, a related link.

  11. Piotr: Sorry, I did misread! You're right! (Can I adjust my review? :P)

    Re timing, my experience is more like 10:2 or 10:3. Maybe this is an area thing (e.g., I'm not routinely checking very involved proofs). But, at 10:2 or 10:3, the tradeoff becomes: I could do author feedback, or I could get rid of the worst 20-25% of the reviewers (and give everyone else -- who's better -- an extra paper). IMO, the latter is far preferable.

    NIPS did the "factual error" thing the first time they did author feedback and my understanding is it didn't work very well: people still wrote about what they wanted to write about, and you still got issues like the one Suresh points out where the line between factual and opinion gets blurred.

    Curious: Surely depends on area... my sense is it's doable, but hard. Since so much seems to be built around letters of reference, you need to get yourself known as well through journal land as you would have in conference land. (Disclaimer: I don't have tenure!)

  12. To Curious: in most areas of computer science, no, you can't build a tenure case from journal pubs alone.

    The reason is more subtle than journals bad conferences good. It's that, while journal pubs can be as good as conference pubs in terms of how the publication itself looks on your vita, they are much worse at attracting attention to your work from others in the field. And you need that attention, both to get good recommendation letters and because people tend to look at citation counts.

    I'd like to think this is changing due to the existence of other online methods of attracting attention to one's papers (arxiv, blogs, etc) but I have no evidence that it actually is.

  13. Curious: I agree with 0x1101110: the reason exclusive journal pubs aren't a viable route to tenure is because of visibility. If you're working on a critical problem, it doesn't matter where you publish, but most research is interesting, but not the kind that draws attention to itself automatically. Conferences help draw that attention.

    Piotr: a narrowly limited rebuttal for factual problems is definitely fairer than not having one. There, as Hal mentions, the problem is not with the process, but with the mission creep that happens. But I have no problem with a fact-determination feedback process.

  14. It strikes me that the current conferences-are-the-end-all-be-all system creates the opposite of the least-publishable-unit problem.

    Say you have a bright insight on a problem, the result of many years of hard work, just the type that Dick Lipton regularly posts on his blog. Then there is no forum in CS in which to publish it.

    In Math, you give a talk in a conference like Frey did when he pointed out his straightforward conjecture connecting elliptic curves with Fermat's Last Theorem.

    In CS, a conference would reject such a paper, since it doesn't have enough blood, sweat and tears (BST) to honestly qualify for a conference publication.

    The conference would also be taking a big chance on whether the observation proves deep and groundbreaking or becomes a dead end.

    So we end up with a large collection of technical results and silly improvements in our best conferences, since those are guaranteed to meet the BST threshold.

  15. I will assume you're talking about NIPS or AISTATS, both of which had stupidly short character limits. I have experienced the same thing you're talking about.

    I just started submitting to CS conferences; my background is in EE, where what matters are journal publications and conferences are excuses to hang out and see what everyone is working on. Of course since it's what I grew up with I like it better, but that's another debate.

    From my perspective, the rebuttal process seems like a poorly applied band-aid on a reviewing process which encourages conservative research ideas and has inconsistent outcomes. That's only the two cents of an interloper, however.

  16. In the db community, author feedback is getting popular, and I find it works well. It is NOT "rebuttal" though. It is not applied for the majority of papers, but only where a reviewer explicitly asks for clarification; usually this means the reviewer likes the paper, but is worried about some aspect and won't put their vote to accept without some reassurance. For example, a reviewer might ask "what size was the cache in the experiment?" (if the answer is unreasonable, the experiments won't be adequate evidence that the idea works well) or "how do you handle the case where [thing] happens?" (when some part of the protocol isn't fully described in the submission, and the correct approach isn't obvious).


Disqus for The Geomblog