In the last post,we saw a lower bound for bit probing (specifically, extracting the lower-most bit of a number) in the PRAM-without-bitops model. Now clearly this can be computed efficiently in P, so why isn't this enough of a proof for the separation of the two classes ?
The answer is subtle. In a PRAM model, excluding bit operations is a nontrivial restriction, because, as we saw, there's no parallel-efficient way of extracting bits from a word without bit operations. However, in P, this is not true. In time linear in the bitlength, we can extract every single bit, and then carry on. Thus, P-with-no-bitops is as powerful as P itself !
Mulmuley argues that the "correct" notion of excluding bit operations in P is that of an algorithm that might run in time that depends on the bit length, but without bitops, in the way that algorithms like the ellipsoid method work. But an easier approach is to exclude algorithms whose run time depends on the bit length, yielding the class of strongly polynomial algorithms. Note that this class (which he calls SP) still contains the P-complete problems max-flow and min-cost flow.
Once we do that, extracting bits is no longer an "easy" operation in SP. The "input size" is 1, and so any bit-extracting algorithm is only permitted a constant number of steps when extracting a bit. Therefore, the apparent separation doesn't really hold.
Formally, what is proved in this paper is that SP strictly contains C (the class of PRAM-without-bitops algorithms), via the separation result for max flow.