TY - GEN
T1 - Algorithms and Hardness for the Longest Common Subsequence of Three Strings and Related Problems
AU - Wang, Lusheng
AU - Zhu, Binhai
PY - 2023/9/20
Y1 - 2023/9/20
N2 - A string is called a square (resp. cube) if it is in the form
of XX = X2 (resp. XXX = X3). Given a sequence S of length n, a
fundamental problem studied in the literature is the problem of computing a longest subsequence of S which is a square or cube (i.e., the
longest square/cubic subsequence problem). While the longest square
subsequence (LSS) can be computed in O(n2) time, the longest cubic
subsequence (LCubS) is only known to be solvable in O(n5) time, using
the longest common subsequence of three strings (LCS-3) as a subroutine (which was much less studied compared with LCS for two strings, or
LCS-2). To improve the running time for LCubS, we look at its complementary version and also investigate LCS-3 for three strings S1, S2, S3,
with input lengths m ≤ n1 ≤ n2 respectively. Firstly, we generalize an
algorithm by Nakatsu et al. for LCS-2 to have an O(n1n2δ) algorithm
for computing LCS-3, where δ is the minimum number of letters to be
deleted in S1 to have an LCS-3 solution for S1, S2 and S3. This results
in an O(k3n2) algorithm for LCubS, where k is the minimum number of
letters deleted in S to have a feasible solution. Then, let R be the number
of triples (i, j, k) that match in the input, i.e., S1[i] = S2[j] = S3[k], we
show that LCS-3 can be computed in O (n + R log log n + R2) time (n is
the maximum length of the three input strings). Finally, we define the t-pseudo-subsequence of S under an integer parameter t, which is a string
Z containing a subsequence S of S such that S' can be obtained from Z
by deleting at most t letters. Subsequently, we study the longest majority t-pseudo-subsequence (LMtPS) of S'i, i = 1..3, which is a t-pseudo-subsequence T = t1t2 ···tK of Si, i = 1..3, with the maximum length K;
moreover, when T is aligned with some subsequence S'i’s of length K in
Si, i = 1..3, each tj matches at least two letters with S'i, i = 1..3. We
show that LMtPS of three strings S1, S2 and S3 is polynomially solvable, while if we require additionally that all letters in Σ appear in the
solution T then it becomes NP-complete, via a reduction to a new SAT
instance called Even-(3,B2)-SAT.
AB - A string is called a square (resp. cube) if it is in the form
of XX = X2 (resp. XXX = X3). Given a sequence S of length n, a
fundamental problem studied in the literature is the problem of computing a longest subsequence of S which is a square or cube (i.e., the
longest square/cubic subsequence problem). While the longest square
subsequence (LSS) can be computed in O(n2) time, the longest cubic
subsequence (LCubS) is only known to be solvable in O(n5) time, using
the longest common subsequence of three strings (LCS-3) as a subroutine (which was much less studied compared with LCS for two strings, or
LCS-2). To improve the running time for LCubS, we look at its complementary version and also investigate LCS-3 for three strings S1, S2, S3,
with input lengths m ≤ n1 ≤ n2 respectively. Firstly, we generalize an
algorithm by Nakatsu et al. for LCS-2 to have an O(n1n2δ) algorithm
for computing LCS-3, where δ is the minimum number of letters to be
deleted in S1 to have an LCS-3 solution for S1, S2 and S3. This results
in an O(k3n2) algorithm for LCubS, where k is the minimum number of
letters deleted in S to have a feasible solution. Then, let R be the number
of triples (i, j, k) that match in the input, i.e., S1[i] = S2[j] = S3[k], we
show that LCS-3 can be computed in O (n + R log log n + R2) time (n is
the maximum length of the three input strings). Finally, we define the t-pseudo-subsequence of S under an integer parameter t, which is a string
Z containing a subsequence S of S such that S' can be obtained from Z
by deleting at most t letters. Subsequently, we study the longest majority t-pseudo-subsequence (LMtPS) of S'i, i = 1..3, which is a t-pseudo-subsequence T = t1t2 ···tK of Si, i = 1..3, with the maximum length K;
moreover, when T is aligned with some subsequence S'i’s of length K in
Si, i = 1..3, each tj matches at least two letters with S'i, i = 1..3. We
show that LMtPS of three strings S1, S2 and S3 is polynomially solvable, while if we require additionally that all letters in Σ appear in the
solution T then it becomes NP-complete, via a reduction to a new SAT
instance called Even-(3,B2)-SAT.
KW - Longest common subsequence
KW - Longest cubic subsequence
KW - NP-completeness
KW - Polynomial-time algorithms
UR - http://www.scopus.com/inward/record.url?scp=85174450434&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85174450434&origin=recordpage
U2 - 10.1007/978-3-031-43980-3_30
DO - 10.1007/978-3-031-43980-3_30
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 978-3-031-43979-7
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 367
EP - 380
BT - String Processing and Information Retrieval
A2 - Nardini, Franco Maria
A2 - Pisanti, Nadia
A2 - Venturini, Rossano
PB - Springer, Cham
T2 - 30th International Symposium on String Processing and Information Retrieval, SPIRE 2023
Y2 - 26 September 2023 through 28 September 2023
ER -