A Tri-Character guided exact String-matching Algorithm for Efficient str detection In Forensic DNA Analysis
DOI:
https://doi.org/10.54692/ijeci.2025.0901/251Keywords:
TSLMC (Text scan for left right and middle character),, STR (Short Tandem Repeat), T(Text) and P(Pattern)Abstract
The importance of string-matching algorithms in the world of modern DNA forensic technology cannot be over-stated. Short Tandem Repeats (STRs) play an important role in forensic DNA analysis due to their high variability among individuals. Fast and accurate detection of STRs in large-scale genomic data is most important for criminal investigations, identity verification, and population studies. This study introduces a novel exact string-matching algorithm, the Tri-Scan for Left, Right, and Middle Character (TSLRMC) approach, tailored for efficient pattern detection in forensic DNA sequences. This research addresses limitations found in some famous and widely used exact string-matching algorithms. Proposed algorithm improves the running time for scanning pattern string in a long text string. The novelty of the proposed algorithm is to optimize the scanning by scanning the pattern string left, right and middle characters in the long DNA sequence string and then scanning the remaining characters of the pattern string in that partial text window where the pattern string’s left, right and middle characters are found. The proposed algorithm shows significant improvement compared with the most popular exact string-matching algorithms, based on running time as well as number of characters compared. Time complexity of this proposed novel TSLRMC algorithm is O (n-m) in worst case, O (mn) in average case and O (1) in best case
Downloads
Published
Versions
- 2025-12-25 (2)
- 2025-06-30 (1)