A three-string approach to the closest string problem

Research output: Journal Publications and Reviews (RGC: 21, 22, 62)21_Publication in refereed journalpeer-review

22 Scopus Citations
View graph of relations

Author(s)

Related Research Unit(s)

Detail(s)

Original languageEnglish
Pages (from-to)164-178
Journal / PublicationJournal of Computer and System Sciences
Volume78
Issue number1
Online published28 Jan 2011
Publication statusPublished - Jan 2012

Abstract

Given a set of n strings of length L and a radius d, the closest string problem (CSP for short) asks for a string tsol that is within a Hamming distance of d to each of the given strings. It is known that the problem is NP-hard and its optimization version admits a polynomial time approximation scheme (PTAS). Parameterized algorithms have been then developed to solve the problem when d is small. In this paper, with a new approach (called the 3-string approach), we first design a parameterized algorithm for binary strings that runs in (nL+nd3·6.731d) time, while the previous best runs in (nL+nd·8d) time. We then extend the algorithm to arbitrary alphabet sizes, obtaining an algorithm that runs in time (nL+nd·(1.612(|Σ|+β2+β-2))d), where |Σ| is the alphabet size and β2+1-2α-1-2 with 3√ √|Σ| −1+1. This new time bound is better than the previous best for small alphabets, including the very important case where |Σ|=4 (i.e., the case of DNA strings).

Research Area(s)

  • Computational biology, Fixed-parameter algorithms, The closest string problem