Algorithms and Hardness for the Longest Common Subsequence of Three Strings and Related Problems

Lusheng Wang, Binhai Zhu*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary WorksRGC 32 - Refereed conference paper (with host publication)peer-review

2 Citations (Scopus)

Abstract

A string is called a square (resp. cube) if it is in the form of XX = X2 (resp. XXX = X3). Given a sequence S of length n, a fundamental problem studied in the literature is the problem of computing a longest subsequence of S which is a square or cube (i.e., the longest square/cubic subsequence problem). While the longest square subsequence (LSS) can be computed in O(n2) time, the longest cubic subsequence (LCubS) is only known to be solvable in O(n5) time, using the longest common subsequence of three strings (LCS-3) as a subroutine (which was much less studied compared with LCS for two strings, or LCS-2). To improve the running time for LCubS, we look at its complementary version and also investigate LCS-3 for three strings S1, S2, S3, with input lengths mn1 n2 respectively. Firstly, we generalize an algorithm by Nakatsu et al. for LCS-2 to have an O(n1n2δ) algorithm for computing LCS-3, where δ is the minimum number of letters to be deleted in S1 to have an LCS-3 solution for S1S2 and S3. This results in an O(k3n2) algorithm for LCubS, where k is the minimum number of letters deleted in S to have a feasible solution. Then, let R be the number of triples (i, j, k) that match in the input, i.e., S1[i] = S2[j] = S3[k], we show that LCS-3 can be computed in (n + R log log n + R2) time (n is the maximum length of the three input strings). Finally, we define the t-pseudo-subsequence of S under an integer parameter t, which is a string Z containing a subsequence S of S such that S' can be obtained from Z by deleting at most t letters. Subsequently, we study the longest majority t-pseudo-subsequence (LMtPS) of S'i, i = 1..3, which is a t-pseudo-subsequence T = t1t2 ···tK of Si, i = 1..3, with the maximum length K; moreover, when T is aligned with some subsequence S'is of length K in Si, i = 1..3, each tj matches at least two letters with S'ii  = 1..3. We show that LMtPS of three strings S1S2 and S3 is polynomially solvable, while if we require additionally that all letters in Σ appear in the solution T then it becomes NP-complete, via a reduction to a new SAT instance called Even-(3,B2)-SAT.
Original languageEnglish
Title of host publicationString Processing and Information Retrieval
Subtitle of host publication30th International Symposium, SPIRE 2023, Proceedings
EditorsFranco Maria Nardini, Nadia Pisanti, Rossano Venturini
PublisherSpringer, Cham
Pages367-380
ISBN (Electronic)978-3-031-43980-3
ISBN (Print)978-3-031-43979-7
DOIs
Publication statusPublished - 20 Sept 2023
Event30th International Symposium on String Processing and Information Retrieval, SPIRE 2023 - Pisa, Italy
Duration: 26 Sept 202328 Sept 2023

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume14240 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference30th International Symposium on String Processing and Information Retrieval, SPIRE 2023
PlaceItaly
CityPisa
Period26/09/2328/09/23

Research Keywords

  • Longest common subsequence
  • Longest cubic subsequence
  • NP-completeness
  • Polynomial-time algorithms

RGC Funding Information

  • RGC-funded

Fingerprint

Dive into the research topics of 'Algorithms and Hardness for the Longest Common Subsequence of Three Strings and Related Problems'. Together they form a unique fingerprint.

Cite this