GAZAR

Principal Engineer | Mentor

Substring Search Algorithms

Substring Search Algorithms

Substring search algorithms play a crucial role in various applications, including text processing, data analysis, and information retrieval. Understanding different substring search techniques is essential for efficient text manipulation and pattern matching.

  • Naive Approach: A simple method that involves comparing the pattern with each substring of the text sequentially.
function naiveSubstringSearch(text, pattern) {
    const occurrences = [];
    const n = text.length;
    const m = pattern.length;

    for (let i = 0; i <= n - m; i++) {
        let j;
        for (j = 0; j < m; j++) {
            if (text[i + j] !== pattern[j]) {
                break;
            }
        }
        if (j === m) {
            occurrences.push(i);
        }
    }

    return occurrences;
}
  • Knuth-Morris-Pratt (KMP) Algorithm: An efficient algorithm that preprocesses the pattern to avoid unnecessary comparisons during search.
// Knuth-Morris-Pratt (KMP) Algorithm
function kmpSubstringSearch(text, pattern) {
    const computeLPSArray = (pattern) => {
        const lps = [0];
        let len = 0;
        let i = 1;
        while (i < pattern.length) {
            if (pattern[i] === pattern[len]) {
                len++;
                lps[i] = len;
                i++;
            } else {
                if (len !== 0) {
                    len = lps[len - 1];
                } else {
                    lps[i] = 0;
                    i++;
                }
            }
        }
        return lps;
    };

    const occurrences = [];
    const lps = computeLPSArray(pattern);
    let i = 0;
    let j = 0;
    const n = text.length;
    const m = pattern.length;

    while (i < n) {
        if (pattern[j] === text[i]) {
            i++;
            j++;
        }
        if (j === m) {
            occurrences.push(i - j);
            j = lps[j - 1];
        } else if (i < n && pattern[j] !== text[i]) {
            if (j !== 0) {
                j = lps[j - 1];
            } else {
                i++;
            }
        }
    }

    return occurrences;
}
  • Boyer-Moore Algorithm: Another efficient algorithm that leverages the mismatch information to skip characters during search.
function boyerMooreSubstringSearch(text, pattern) {
    const occurrences = [];
    const n = text.length;
    const m = pattern.length;
    const last = Array(256).fill(-1);

    for (let i = 0; i < m; i++) {
        last[pattern.charCodeAt(i)] = i;
    }

    let i = m - 1;
    let j = m - 1;

    while (i < n) {
        if (text[i] === pattern[j]) {
            if (j === 0) {
                occurrences.push(i);
                i += m;
                j = m - 1;
            } else {
                i--;
                j--;
            }
        } else {
            i += m - Math.min(j, 1 + last[text.charCodeAt(i)]);
            j = m - 1;
        }
    }

    return occurrences;
}

  • Naive Approach: O((n - m + 1) * m)
  • Knuth-Morris-Pratt (KMP) Algorithm: O(n + m)
  • Boyer-Moore Algorithm: O(n + m)

Substring search algorithms are widely used in applications such as text editors, search engines, and bioinformatics, where efficient pattern matching is crucial for data analysis and retrieval.