Kernelization

Summary

In computer science, a kernelization is a technique for designing efficient algorithms that achieve their efficiency by a preprocessing stage in which inputs to the algorithm are replaced by a smaller input, called a "kernel". The result of solving the problem on the kernel should either be the same as on the original input, or it should be easy to transform the output on the kernel to the desired output for the original problem.

Kernelization is often achieved by applying a set of reduction rules that cut away parts of the instance that are easy to handle. In parameterized complexity theory, it is often possible to prove that a kernel with guaranteed bounds on the size of a kernel (as a function of some parameter associated to the problem) can be found in polynomial time. When this is possible, it results in a fixed-parameter tractable algorithm whose running time is the sum of the (polynomial time) kernelization step and the (non-polynomial but bounded by the parameter) time to solve the kernel. Indeed, every problem that can be solved by a fixed-parameter tractable algorithm can be solved by a kernelization algorithm of this type. This is also true for approximate kernelization.

Example: vertex cover edit

A standard example for a kernelization algorithm is the kernelization of the vertex cover problem by S. Buss.[1] In this problem, the input is an undirected graph   together with a number  . The output is a set of at most   vertices that includes an endpoint of every edge in the graph, if such a set exists, or a failure exception if no such set exists. This problem is NP-hard. However, the following reduction rules may be used to kernelize it:

  1. If   and   is a vertex of degree greater than  , remove   from the graph and decrease   by one. Every vertex cover of size   must contain   since otherwise too many of its neighbors would have to be picked to cover the incident edges. Thus, an optimal vertex cover for the original graph may be formed from a cover of the reduced problem by adding   back to the cover.
  2. If   is an isolated vertex, remove it. An isolated vertex cannot cover any edges, so in this case   cannot be part of any minimal cover.
  3. If more than   edges remain in the graph, and neither of the previous two rules can be applied, then the graph cannot contain a vertex cover of size  . For, after eliminating all vertices of degree greater than  , each remaining vertex can only cover at most   edges and a set of   vertices could only cover at most   edges. In this case, the instance may be replaced by an instance with two vertices, one edge, and   , which also has no solution.

An algorithm that applies these rules repeatedly until no more reductions can be made necessarily terminates with a kernel that has at most   edges and (because each edge has at most two endpoints and there are no isolated vertices) at most   vertices. This kernelization may be implemented in linear time. Once the kernel has been constructed, the vertex cover problem may be solved by a brute force search algorithm that tests whether each subset of the kernel is a cover of the kernel. Thus, the vertex cover problem can be solved in time   for a graph with   vertices and   edges, allowing it to be solved efficiently when   is small even if   and   are both large.

Although this bound is fixed-parameter tractable, its dependence on the parameter is higher than might be desired. More complex kernelization procedures can improve this bound, by finding smaller kernels, at the expense of greater running time in the kernelization step. In the vertex cover example, kernelization algorithms are known that produce kernels with at most   vertices. One algorithm that achieves this improved bound exploits the half-integrality of the linear program relaxation of vertex cover due to Nemhauser and Trotter.[2] Another kernelization algorithm achieving that bound is based on what is known as the crown reduction rule and uses alternating path arguments.[3] The currently best known kernelization algorithm in terms of the number of vertices is due to Lampis (2011) and achieves   vertices for any fixed constant  .

It is not possible, in this problem, to find a kernel of size  , unless P = NP, for such a kernel would lead to a polynomial-time algorithm for the NP-hard vertex cover problem. However, much stronger bounds on the kernel size can be proven in this case: unless coNP   NP/poly (believed unlikely by complexity theorists), for every   it is impossible in polynomial time to find kernels with   edges.[4] It is unknown for vertex cover whether kernels with   vertices for some   would have any unlikely complexity-theoretic consequences.

Definition edit

In the literature, there is no clear consensus on how kernelization should be formally defined and there are subtle differences in the uses of that expression.

Downey–Fellows notation edit

In the notation of Downey & Fellows (1999), a parameterized problem is a subset   describing a decision problem.

A kernelization for a parameterized problem   is an algorithm that takes an instance   and maps it in time polynomial in   and   to an instance   such that

  •   is in   if and only if   is in  ,
  • the size of   is bounded by a computable function   in  , and
  •   is bounded by a function in  .

The output   of kernelization is called a kernel. In this general context, the size of the string   just refers to its length. Some authors prefer to use the number of vertices or the number of edges as the size measure in the context of graph problems.

Flum–Grohe notation edit

In the notation of Flum & Grohe (2006, p. 4), a parameterized problem consists of a decision problem   and a function  , the parameterization. The parameter of an instance   is the number  .

A kernelization for a parameterized problem   is an algorithm that takes an instance   with parameter   and maps it in polynomial time to an instance   such that

  •   is in   if and only if   is in   and
  • the size of   is bounded by a computable function   in  .

Note that in this notation, the bound on the size of   implies that the parameter of   is also bounded by a function in  .

The function   is often referred to as the size of the kernel. If  , it is said that   admits a polynomial kernel. Similarly, for  , the problem admits linear kernel.

Kernelizability and fixed-parameter tractability are equivalent edit

A problem is fixed-parameter tractable if and only if it is kernelizable and decidable.

That a kernelizable and decidable problem is fixed-parameter tractable can be seen from the definition above: First the kernelization algorithm, which runs in time   for some c, is invoked to generate a kernel of size  . The kernel is then solved by the algorithm that proves that the problem is decidable. The total running time of this procedure is  , where   is the running time for the algorithm used to solve the kernels. Since   is computable, e.g. by using the assumption that   is computable and testing all possible inputs of length  , this implies that the problem is fixed-parameter tractable.

The other direction, that a fixed-parameter tractable problem is kernelizable and decidable is a bit more involved. Assume that the question is non-trivial, meaning that there is at least one instance that is in the language, called  , and at least one instance that is not in the language, called  ; otherwise, replacing any instance by the empty string is a valid kernelization. Assume also that the problem is fixed-parameter tractable, i.e., it has an algorithm that runs in at most   steps on instances  , for some constant   and some function  . To kernelize an input, run this algorithm on the given input for at most   steps. If it terminates with an answer, use that answer to select either   or   as the kernel. If, instead, it exceeds the   bound on the number of steps without terminating, then return   itself as the kernel. Because   is only returned as a kernel for inputs with  , it follows that the size of the kernel produced in this way is at most  . This size bound is computable, by the assumption from fixed-parameter tractability that   is computable.

More examples edit

  • Vertex cover parametrized by the size of the vertex cover: The vertex cover problem has kernels with at most   vertices and   edges.[5] Furthermore, for any  , vertex cover does not have kernels with   edges unless  .[4] The vertex cover problems in  -uniform hypergraphs has kernels with   edges using the sunflower lemma, and it does not have kernels of size   unless  .[4]
  • Feedback vertex set parametrized by the size of the feedback vertex set: The feedback vertex set problem has kernels with   vertices and   edges.[6] Furthermore, it does not have kernels with   edges unless  .[4]
  •  -path: The  -path problem is to decide whether a given graph has a path of length at least  . This problem has kernels of size exponential in  , and it does not have kernels of size polynomial in   unless  .[7]
  • Bidimensional problems: Many parameterized versions of bidimensional problems have linear kernels on planar graphs, and more generally, on graphs excluding some fixed graph as a minor.[8]

Kernelization for structural parameterizations edit

While the parameter   in the examples above is chosen as the size of the desired solution, this is not necessary. It is also possible to choose a structural complexity measure of the input as the parameter value, leading to so-called structural parameterizations. This approach is fruitful for instances whose solution size is large, but for which some other complexity measure is bounded. For example, the feedback vertex number of an undirected graph   is defined as the minimum cardinality of a set of vertices whose removal makes   acyclic. The vertex cover problem parameterized by the feedback vertex number of the input graph has a polynomial kernelization:[9] There is a polynomial-time algorithm that, given a graph   whose feedback vertex number is  , outputs a graph   on   vertices such that a minimum vertex cover in   can be transformed into a minimum vertex cover for   in polynomial time. The kernelization algorithm therefore guarantees that instances with a small feedback vertex number   are reduced to small instances.

See also edit

Notes edit

  1. ^ This unpublished observation is acknowledged in a paper of Buss & Goldsmith (1993)
  2. ^ Flum & Grohe (2006)
  3. ^ Flum & Grohe (2006) give a kernel based on the crown reduction that has   vertices. The   vertex bound is a bit more involved and folklore.
  4. ^ a b c d Dell & van Melkebeek (2010)
  5. ^ Chen, Kanj & Jia (2001)
  6. ^ Thomassé (2010)
  7. ^ Bodlaender et al. (2009)
  8. ^ Fomin et al. (2010)
  9. ^ Jansen & Bodlaender (2013)

References edit

  • Abu-Khzam, Faisal N.; Collins, Rebecca L.; Fellows, Michael R.; Langston, Michael A.; Suters, W. Henry; Symons, Chris T. (2004), Kernelization Algorithms for the Vertex Cover Problem: Theory and Experiments (PDF), University of Tennessee.
  • Bodlaender, Hans L.; Downey, Rod G.; Fellows, Michael R.; Hermelin, Danny (2009), "On problems without polynomial kernels", Journal of Computer and System Sciences, 75 (8): 423–434, doi:10.1016/j.jcss.2009.04.001.
  • Buss, Jonathan F.; Goldsmith, Judy (1993), "Nondeterminism within  ", SIAM Journal on Computing, 22 (3): 560–572, doi:10.1137/0222038, S2CID 43081484.
  • Chen, Jianer; Kanj, Iyad A.; Jia, Weijia (2001), "Vertex cover: Further observations and further improvements", Journal of Algorithms, 41 (2): 280–301, doi:10.1006/jagm.2001.1186, S2CID 13557005.
  • Dell, Holger; van Melkebeek, Dieter (2010), "Satisfiability allows no nontrivial sparsification unless the polynomial-time hierarchy collapses" (PDF), Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC 2010) (PDF), pp. 251–260, doi:10.1145/1806689.1806725, ISBN 978-1-4503-0050-6, S2CID 1117711.
  • Downey, R. G.; Fellows, M. R. (1999), Parameterized Complexity, Monographs in Computer Science, Springer, doi:10.1007/978-1-4612-0515-9, ISBN 0-387-94883-X, MR 1656112, S2CID 15271852.
  • Flum, Jörg; Grohe, Martin (2006), Parameterized Complexity Theory, Springer, ISBN 978-3-540-29952-3, retrieved 2010-03-05.
  • Fomin, Fedor V.; Lokshtanov, Daniel; Saurabh, Saket; Thilikos, Dimitrios M. (2010), "Bidimensionality and kernels", Proceedings of the 21st ACM-SIAM Symposium on Discrete Algorithms (SODA 2010), pp. 503–510.
  • Jansen, Bart M. P.; Bodlaender, Hans L. (2013), "Vertex Cover Kernelization Revisited - Upper and Lower Bounds for a Refined Parameter", Theory Comput. Syst., 53 (2): 263–299, arXiv:1012.4701, doi:10.1007/s00224-012-9393-4,
  • Lampis, Michael (2011), "A kernel of order 2k − c log k for vertex cover", Information Processing Letters, 111 (23–24): 1089–1091, doi:10.1016/j.ipl.2011.09.003.
  • Thomassé, Stéphan (2010), "A 4k2 kernel for feedback vertex set", ACM Transactions on Algorithms, 6 (2): 1–8, doi:10.1145/1721837.1721848, S2CID 7510317.
  • Niedermeier, Rolf (2006), Invitation to Fixed-Parameter Algorithms, Oxford University Press, CiteSeerX 10.1.1.2.9618, ISBN 0-19-856607-7, archived from the original on 2008-09-24, retrieved 2017-06-01.

Further reading edit

  • Fomin, Fedor V.; Lokshtanov, Daniel; Saurabh, Saket; Zehavi, Meirav (2019), Kernelization: Theory of Parameterized Preprocessing, Cambridge University Press, p. 528, doi:10.1017/9781107415157, ISBN 978-1107057760
  • Niedermeier, Rolf (2006), Invitation to Fixed-Parameter Algorithms, Oxford University Press, Chapter 7, ISBN 0-19-856607-7, archived from the original on 2007-09-29, retrieved 2017-06-01
  • Cygan, Marek; Fomin, Fedor V.; Kowalik, Lukasz; Lokshtanov, Daniel; Marx, Daniel; Pilipczuk, Marcin; Pilipczuk, Michal; Saurabh, Saket (2015), Parameterized Algorithms, Springer, Chapters 2 and 9, ISBN 978-3-319-21274-6