Abstract
In this paper, we initiate the study on some problems related to multiple protein scaffold filling, with or without references. The objective is to maximize the sum of the Blosum62 scores of the filled sequences when no reference is given, or to maximize the Blosum62 score between the filled sequence and a reference. We present the following results: (1) given n scaffolds generated from the top-down tandem mass spectra, finding k scaffolds whose corresponding contents can be used to fill into a target sequence (or, a sequence whose Blosum62 score with a reference is maximized) takes Ω (nk-ε) time, unless the SETH (Strong Exponential Time Hypothesis) fails. (2) given two or more protein scaffolds and the corresponding multisets of amino acids to be filled accordingly, the corresponding optimization problem can be solved in polynomial time with dynamic programming. (3) Due to the high (and impractical) running times of the algorithms in (2), we implement several heuristic algorithms for the special cases when three scaffolds are given, and the corresponding empirical results are quite promising—although we also find that for the greedy algorithm more biological information needs to be incorporated to generate biologically meaningful results. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2026.
| Original language | English |
|---|---|
| Title of host publication | Bioinformatics Research and Applications |
| Subtitle of host publication | 21st International Symposium, ISBRA 2025, Proceedings, Part 1 |
| Editors | Jing Tang, Xin Lai, Zhipeng Cai, Wei Peng, Yanjie Wei |
| Publisher | Springer Singapore |
| Pages | 249-262 |
| Number of pages | 14 |
| Edition | 1 |
| ISBN (Electronic) | 978-981-95-0698-9 |
| ISBN (Print) | 978-981-95-0697-2 |
| DOIs | |
| Publication status | Published - 2025 |
| Event | 21st International Symposium on Bioinformatics Research and Applications, ISBRA 2025 - Helsinki, Finland Duration: 3 Aug 2025 → 5 Aug 2025 |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| Volume | 15756 LNBI |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 21st International Symposium on Bioinformatics Research and Applications, ISBRA 2025 |
|---|---|
| Place | Finland |
| City | Helsinki |
| Period | 3/08/25 → 5/08/25 |
Funding
This research is supported by NSF under awards 2307571, 2307572, 2307573, and 2434487. LW is supported by a grant from NNSF of China (project 61972329) and GRF grants from Hong Kong SAR, China (CityU 11218423 and CityU 11218821). We also thank anomyous reviewers for helpful comments.
Research Keywords
- Greedy algorithms
- Lower bound
- Protein scaffold filling
- Protein sequencing
- Simulated Annealing
- Top-down and bottom-up methods
RGC Funding Information
- RGC-funded
Fingerprint
Dive into the research topics of 'On Multiple Protein Scaffold Filling'. Together they form a unique fingerprint.Projects
- 2 Active
-
GRF: Algorithms for Proteoform Detection and Multiplexed Protein Sequencing
WANG, L. (Principal Investigator / Project Coordinator)
1/01/24 → …
Project: Research
-
GRF: Algorithms for Identification and Quantification of Proteoforms Using Multiplexed Tandem Mass Spectra and De Novo Sequencing of Mixture Spectra
WANG, L. (Principal Investigator / Project Coordinator)
1/01/22 → …
Project: Research
Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver