Friday, June 19, 2009

SoCG 2009: Local search, geometric approximations and PTASs

I planned on this post coming a lot sooner after the last three, but got caught up in my own version of a Danish flu. sigh....

So let's get greedy. In the last post, I talked about new developments in the epsilon-net route to approximating covering/hitting problems. The problem with these approaches though (as Saurabh Ray pointed out) is that they (at best) get you to "some" constant. They don't get you down to a specific constant (unless you can really optimize the net-construction) and they certainly don't give you a PTAS.

So what's a poor approximator to do if their livelihood depends on getting a 1+epsilon-approximation ? Two papers at SoCG 2009 answer this question independently using essentially the same idea, and there's already a third paper that applies this idea as well. You know what they say: "three applications doth a general technique make" (or at least they should say it !), and the technique is so neat that I thought I'd spend some time outlining the basic principle.

But first, a detour. What follows is my own take on the underlying argument, and I'm the only one responsible for all the inaccuracies in interpretation and exposition. Consider the problem of computing a maximum cardinality matching in a bipartite graph. As we all know, this can solved using a max-flow algorithm. A well-known "spin" on this result is the following:
Suppose we construct a max flow F_k using augmenting paths of length at most 2k+1. Then OPT <= (1 + 1/(k+1)) F_k
Or in other words, F_k is a 1 + 1/(k+1) approximation to the true answer. The proof is fairly easy: consider any augmenting path of length at least 2k+3 that could increase the total matching size. Because we've tried all augmenting paths of length 2k+1, it must be that this path alternates between OPT (red) and algorithm (blue) edges, starting and ending in a red edge. In other words, if X is the number of blue edges along this path, then 2X+1= 2k+3, and X = k+1. This means that the number of optimal edges is at most (1 + 1/(k+1)) times the number of algorithm edges. Note that all augmenting paths must be disjoint, so some algebra yields the above result.

Another way of viewing the above fact about alternation is that if we think of the edges as vertices of a bipartite graph, (colored as the edges were), then each red set is adjacent to a reasonable number of blue vertices (in fact if this were not the case, the analysis would break). The edge-disjointness then allows us to scale things up to the entire graph

Notice that the process of generating longer and longer augmenting paths can be viewed as a local search operation. Starting from a given path, we modify it by constructing an alternating path using the current edges.In other words, we're able to make claims about local search yielding optimal solutions because we can control the way OPT interacts with the results of the algorithm, with the interaction parametrized by the "degree of locality".

Now let's return to the papers, keeping this in mind.

Mustafa and Ray apply this idea to get a PTAS for hitting set problems. Specifically, they show that for r-admissible regions in the plane and for halfspaces in R^3, the hitting set problem admits a PTAS that runs in time O(mn^(1/eps^2)) (m: number of regions, n: number of points). A collection of regions in the plane is r-admissible if any pair intersect in at most r points, and the differences are connected (these are often called non-piercing collections for that reason).

Chan and Har-Peled use the idea to get a PTAS for max-independent-set problems for pseudo-disks in the plane (which are 2-admissible). Their running time is similar.

Both papers run the following algorithm, parametrized by a number b. Take any feasible solution, and call it L. Now check all sets that differ from L in at most b places and see if any of them can be used to improve the quality of L by at least one unit. If so, make the swap and repeat. This process will terminate in at most n steps, and at each stage, you need to do a search over at most n^b sets. We'll call such a solution "b-local"

The idea that roughly governs both analyses is this: Consider the solution you're currently at (let's call it red) and the optimal solution (color it blue). Think of building a bipartite graph (this works slightly differently for the different problems) which contains the red and blue objects as vertices, and connects two (differently colored) objects by an edge if there's a range that contains both (for the hitting set problem) or if they intersect (for the MIS) (call this a 'locality property' as Mustafa and Ray do) Then they show some things:
  1. This graph is planar
  2. It "expands": neighborhoods of blue sets of size at most b are large
  3. As a consequence of 1,2, the size of the blue set is upper bounded by (1+1/sqrt(b))(size of red set)
See the resemblance to the max flow argument ? (or maybe not ;)). A reasonably quick corollary then shows that by setting b = 1/eps^2, the desired PTAS follows (since by 3), the optimal solution is reasonably close to the current solution. 2) holds by the locality argument, i.e if it weren't true, then there'd be some way to locally improve the red set using a "lookahead" of size at most b. 1) is case-by-case, but works for these problems. The proof of 3) itself requires separator machinery for planar graphs.

An upcoming paper by Gibson, Kanade, Krohn and Varadarajan applies the same idea for the 1.5D terrain guarding problem (one of the few nontrivial art gallery problems that has yielded to approximation attacks). They produce a planar graph that satisfies the locality property, and then apply the rest of the local search machinery as before. What's neat is that the machinery really does apply just as before: "all they have to do" (and I'm not trying to minimizing the effort here, just pointing out the smoothness of the technique) is construct the planar graph.

Some thoughts:
  • There's some room for improvement in the running time perhaps, at least down to a linear dependence on 1/epsilon in the exponent. We can't get a poly dependence because of strong-NP-completeness
  • I don't think the weighted versions of these problems would fall to the same "trick": indeed the Chan-Har-Peled paper considers a neat LP-based approach for the weighted MIS.
There have been numerous PTASs for geometric problems: the Arora/Mitchell TSP approximation is the most famous of course. But this technique is a relatively new kind of generic tool, and has demonstrated immediate applicability for different problems, which makes it noteworthy. Secondly, it's a way of reasoning about the efficacy of local search, which is always neat, since local search algorithms are generally not that complicated.
Post a Comment

Disqus for The Geomblog