You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substrings in s that are a concatenation of each word in words exactly once, without any intervening characters.
For example, given:
s: "barfoothefoobarman"
words: ["foo", "bar"]
You should return the indices: [0,9].
(order does not matter).
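To make the expected result concrete, here is a minimal brute-force sketch that checks every start index by counting the words in the window beginning there. It is not the approach used in the solution below, only a slow reference for cross-checking; the class name NaiveConcatSearch is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the original solution.
class NaiveConcatSearch {
    // Checks every start index by counting the words in the window that begins there.
    static List<Integer> findSubstring(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        if (words.length == 0 || words[0].isEmpty()) {
            return result;
        }
        int wordLen = words[0].length();
        int windowLen = wordLen * words.length;

        // How many times each word must appear in a valid window.
        Map<String, Integer> need = new HashMap<>();
        for (String w : words) {
            need.merge(w, 1, Integer::sum);
        }

        for (int start = 0; start + windowLen <= s.length(); start++) {
            Map<String, Integer> seen = new HashMap<>();
            boolean ok = true;
            for (int k = 0; k < words.length && ok; k++) {
                int from = start + k * wordLen;
                String piece = s.substring(from, from + wordLen);
                int count = seen.merge(piece, 1, Integer::sum);
                // Fail on an unknown word or on a word used more often than required.
                ok = count <= need.getOrDefault(piece, 0);
            }
            if (ok) {
                result.add(start);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findSubstring("barfoothefoobarman", new String[] {"foo", "bar"}));
        // prints [0, 9]
    }
}
```

This runs in roughly O(N * words.length) substring operations, which is fine for checking small inputs but is the cost the solution below is designed to avoid.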
public List<Integer> findSubstring(String s, String[] words) {
    int N = s.length();
    List<Integer> indexes = new ArrayList<Integer>(s.length());
    if (words.length == 0) {
        return indexes;
    }
    int M = words[0].length();
    if (N < M * words.length) {
        return indexes;
    }
    int last = N - M + 1;
    // Very simple data preprocessing
    // map each string in the words array to an index and compute the target counters
    Map<String, Integer> mapping = new HashMap<String, Integer>(words.length);
    int[][] table = new int[2][words.length];
    int failures = 0, index = 0;
    for (int i = 0; i < words.length; ++i) {
        Integer mapped = mapping.get(words[i]);
        if (mapped == null) {
            ++failures;
            mapping.put(words[i], index);
            mapped = index++;
        }
        ++table[0][mapped];
    }
    // Put the problem data into the target containers
    // find all occurrences in string s and map them to their integer index; -1 means no such string is in the words array
    int[] smapping = new int[last];
    for (int i = 0; i < last; ++i) {
        String section = s.substring(i, i + M);
        Integer mapped = mapping.get(section);
        if (mapped == null) {
            smapping[i] = -1;
        } else {
            smapping[i] = mapped;
        }
    }
    // At this point the problem has become plain array processing; I have to say the original author's idea is really clever
    // fix the number of linear scans
    for (int i = 0; i < M; ++i) {
        // reset scan variables
        int currentFailures = failures; // number of current mismatches
        int left = i, right = i;
        Arrays.fill(table[1], 0);
        // here, simply solve the minimum-window-substring problem
        while (right < last) {
            while (currentFailures > 0 && right < last) {
                int target = smapping[right];
                if (target != -1 && ++table[1][target] == table[0][target]) {
                    --currentFailures;
                }
                right += M;
            }
            while (currentFailures == 0 && left < right) {
                int target = smapping[left];
                if (target != -1 && --table[1][target] == table[0][target] - 1) {
                    int length = right - left;
                    // instead of checking every window, we know exactly the length we want
                    if ((length / M) == words.length) {
                        indexes.add(left);
                    }
                    ++currentFailures;
                }
                left += M;
            }
        }
    }
    return indexes;
}
// If the process is unclear, run it on real test data and debug it or print intermediate output.
// This took me quite a long time~
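For readers following the advice to print intermediate output, the sketch below isolates the step that turns s into an integer array (smapping in the code above) so it can be inspected on its own. The class and method names are hypothetical, and it assumes words is non-empty with all words the same length.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the names here are illustrative only.
class WordIdMappingDemo {
    // Returns, for each start position, the id of the word that begins there, or -1 if none does.
    static int[] mapPositions(String s, String[] words) {
        int wordLen = words[0].length();
        Map<String, Integer> ids = new HashMap<>();
        for (String w : words) {
            // Give each distinct word the next unused id (0, 1, 2, ...).
            ids.putIfAbsent(w, ids.size());
        }
        int last = Math.max(0, s.length() - wordLen + 1);
        int[] mapped = new int[last];
        for (int i = 0; i < last; i++) {
            mapped[i] = ids.getOrDefault(s.substring(i, i + wordLen), -1);
        }
        return mapped;
    }

    public static void main(String[] args) {
        int[] mapped = mapPositions("barfoothefoobarman", new String[] {"foo", "bar"});
        System.out.println(Arrays.toString(mapped));
        // prints [1, -1, -1, 0, -1, -1, -1, -1, -1, 0, -1, -1, 1, -1, -1, -1]
    }
}
```

With "foo" mapped to 0 and "bar" to 1, the non-negative entries sit at positions 0, 3, 9, and 12, all on the offset-0 scan, which is why that scan alone produces both answers for the example.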
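This article presents an efficient string-matching algorithm that finds all starting indices of substrings in a given string that are formed by concatenating a given set of words. Using techniques such as mapping and counting, it explains how the algorithm works and provides a concrete implementation.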