Substring Search Algorithms
Substring search algorithms play a crucial role in various applications, including text processing, data analysis, and information retrieval. Understanding different substring search techniques is essential for efficient text manipulation and pattern matching.
- Naive Approach: A simple method that involves comparing the pattern with each substring of the text sequentially.
function naiveSubstringSearch(text, pattern) {
const occurrences = [];
const n = text.length;
const m = pattern.length;
for (let i = 0; i <= n - m; i++) {
let j;
for (j = 0; j < m; j++) {
if (text[i + j] !== pattern[j]) {
break;
}
}
if (j === m) {
occurrences.push(i);
}
}
return occurrences;
}
- Knuth-Morris-Pratt (KMP) Algorithm: An efficient algorithm that preprocesses the pattern to avoid unnecessary comparisons during search.
// Knuth-Morris-Pratt (KMP) Algorithm
function kmpSubstringSearch(text, pattern) {
const computeLPSArray = (pattern) => {
const lps = [0];
let len = 0;
let i = 1;
while (i < pattern.length) {
if (pattern[i] === pattern[len]) {
len++;
lps[i] = len;
i++;
} else {
if (len !== 0) {
len = lps[len - 1];
} else {
lps[i] = 0;
i++;
}
}
}
return lps;
};
const occurrences = [];
const lps = computeLPSArray(pattern);
let i = 0;
let j = 0;
const n = text.length;
const m = pattern.length;
while (i < n) {
if (pattern[j] === text[i]) {
i++;
j++;
}
if (j === m) {
occurrences.push(i - j);
j = lps[j - 1];
} else if (i < n && pattern[j] !== text[i]) {
if (j !== 0) {
j = lps[j - 1];
} else {
i++;
}
}
}
return occurrences;
}
- Boyer-Moore Algorithm: Another efficient algorithm that leverages the mismatch information to skip characters during search.
function boyerMooreSubstringSearch(text, pattern) {
const occurrences = [];
const n = text.length;
const m = pattern.length;
const last = Array(256).fill(-1);
for (let i = 0; i < m; i++) {
last[pattern.charCodeAt(i)] = i;
}
let i = m - 1;
let j = m - 1;
while (i < n) {
if (text[i] === pattern[j]) {
if (j === 0) {
occurrences.push(i);
i += m;
j = m - 1;
} else {
i--;
j--;
}
} else {
i += m - Math.min(j, 1 + last[text.charCodeAt(i)]);
j = m - 1;
}
}
return occurrences;
}
- Naive Approach: O((n - m + 1) * m)
- Knuth-Morris-Pratt (KMP) Algorithm: O(n + m)
- Boyer-Moore Algorithm: O(n + m)
Substring search algorithms are widely used in applications such as text editors, search engines, and bioinformatics, where efficient pattern matching is crucial for data analysis and retrieval.