Problem:
Given a list of words (without duplicates), please write a program that returns all concatenated words in the given list of words.
A concatenated word is defined as a string that is comprised entirely of at least two shorter words in the given array.
Example:
Input: ["cat","cats","catsdogcats","dog","dogcatsdog","hippopotamuses","rat","ratcatdogcat"]
Output: ["catsdogcats","dogcatsdog","ratcatdogcat"]
Explanation: "catsdogcats" can be concatenated by "cats", "dog" and "cats";
"dogcatsdog" can be concatenated by "dog", "cats" and "dog";
"ratcatdogcat" can be concatenated by "rat", "cat", "dog" and "cat".
Note:
- The number of elements of the given array will not exceed 10,000.
- The length sum of elements in the given array will not exceed 600,000.
- All the input string will only include lower case letters.
- The returned elements order does not matter.
Approach:
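This problem can be solved with either DFS or DP, but I find the DP code more concise and elegant. The idea: for each word w in words, define an array dp[n+1], where n is the length of w; dp[i] == true means that w.substr(0, i) can be formed by concatenating words from words. The state transition is dp[i] = || {dp[j] && w.substr(j, i - j) is in words} over all j < i (the piece starts at index j, right after the prefix of length j), and the piece must not be w itself. Finally we check whether dp[n] is true and, if so, add w to the result. To speed up lookups in words, we store the words in a hash table. This brings the time complexity down to O(n * m^2) hash lookups, where n is the number of words and m is the maximum word length (the maximum, rather than the average, is what makes this a safe upper bound). Each lookup also builds and hashes a substring of up to m characters, so the character-level cost is closer to O(n * m^3).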
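To make the recurrence concrete, here is a minimal sketch that fills the prefix table for a single word. The helper name `prefixTable` and the small dictionary in the note below are illustrative assumptions, not part of the original solution.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: dp[i] is true when the first i characters of w
// can be split into dictionary words, each strictly shorter than w.
std::vector<bool> prefixTable(const std::string& w,
                              const std::unordered_set<std::string>& dict) {
    const int n = static_cast<int>(w.size());
    std::vector<bool> dp(n + 1, false);
    dp[0] = true;  // the empty prefix needs no words
    for (int i = 1; i <= n; ++i) {
        for (int j = 0; j < i && !dp[i]; ++j) {
            // The piece w[j..i) starts at index j and has length i - j.
            // i - j < n rules out using w itself as the only piece.
            if (dp[j] && i - j < n && dict.count(w.substr(j, i - j))) {
                dp[i] = true;
            }
        }
    }
    return dp;
}
```

For w = "catsdogcats" and the dictionary {"cat", "cats", "dog"}, the true entries are dp[0], dp[3], dp[4], dp[7], dp[10] and dp[11], so dp[n] is true and the word counts as a concatenated word.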
Code:
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findAllConcatenatedWordsInADict(vector<string>& words) {
        unordered_set<string> hash(words.begin(), words.end());  // hash set for fast word lookup
        vector<string> res;
        for (const auto& w : words) {
            int n = w.size();
            vector<bool> dp(n + 1, false);  // dp[i]: whether w.substr(0, i) can be formed by existing words
            dp[0] = true;                   // empty string is always valid
            for (int i = 0; i < n; ++i) {
                if (!dp[i]) {               // cannot start a new piece from here
                    continue;
                }
                for (int j = i + 1; j <= n; ++j) {  // try every piece w.substr(i, j - i) starting at i
                    if (j - i < n && hash.count(w.substr(i, j - i))) {  // j - i < n: the piece cannot be w itself
                        dp[j] = true;
                    }
                }
                if (dp[n]) {
                    res.push_back(w);
                    break;
                }
            }
        }
        return res;
    }
};
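
The approach section mentions that DFS also works. As a rough sketch of that alternative, assuming memoization on the start index to avoid re-exploring failed suffixes, something like the following could be used. This is not the author's code; the names `canSplit` and `findConcatenatedDfs` are made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: can w[start..] be split into dictionary words that are
// all strictly shorter than w? memo[k] is 0 = unknown, 1 = yes, -1 = no.
static bool canSplit(const std::string& w, std::size_t start,
                     const std::unordered_set<std::string>& dict,
                     std::vector<int>& memo) {
    if (start == w.size()) return true;  // consumed the whole word
    if (memo[start] != 0) return memo[start] == 1;
    for (std::size_t len = 1; start + len <= w.size(); ++len) {
        // len < w.size() keeps w from matching itself as a single piece.
        if (len < w.size() && dict.count(w.substr(start, len)) &&
            canSplit(w, start + len, dict, memo)) {
            memo[start] = 1;
            return true;
        }
    }
    memo[start] = -1;
    return false;
}

// Hypothetical entry point mirroring the DP solution's interface.
std::vector<std::string> findConcatenatedDfs(const std::vector<std::string>& words) {
    std::unordered_set<std::string> dict(words.begin(), words.end());
    std::vector<std::string> res;
    for (const auto& w : words) {
        if (w.empty()) continue;  // an empty word would trivially "split"
        std::vector<int> memo(w.size(), 0);
        if (canSplit(w, 0, dict, memo)) res.push_back(w);
    }
    return res;
}
```

The recursion depth can reach the word length, so for very long words the iterative DP version is probably the safer choice.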