Sicily 1008 Translations

light_14

License: CC 4.0 BY-SA

Category: sicily. Tags: sicily, graph theory

Article link: https://blog.youkuaiyun.com/light_14/article/details/50822148

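This article presents a solution to a problem that reduces to graph isomorphism: it builds a graph of the relationships between the words in the phrases and uses the VF2 algorithm to match words with their translations automatically. It explains how the translation problem is turned into a graph isomorphism problem, which resolves the misalignment caused by sorting the translation files.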

Description

Bob Roberts is in charge of performing translations of documents between various languages. To aid him in this endeavor his bosses have provided him with translation files. These files come in twos -- one containing sample phrases in one of the languages and the other containing their translations into the other language. However, some over-zealous underling, attempting to curry favor with the higher-ups with his initiative, decided to alphabetically sort the contents of all of the files, losing the connections between the phrases and their translations. Fortunately, the lists are comprehensive enough that the original translations can be reconstructed from these sorted lists. Bob has found this is most usually the case when the phrases all consist of two words. For example, given the following two lists:

Language 1 Phrases    Language 2 Phrases
arlo zym              bus seat
flub pleve            bus stop
pleve dourm           hot seat
pleve zym             school bus

Bob is able to determine that arlo means hot, zym means seat, flub means school, pleve means bus, and dourm means stop. After doing several of these reconstructions by hand, Bob has decided to automate the process. And if Bob can do it, then so can you.

Input

Input will consist of multiple input sets. Each input set starts with a positive integer n, n ≤ 250, indicating the number of two-word phrases in each language. This is followed by 2n lines, each containing one two-word phrase: the first n lines are an alphabetical list of phrases in the first language, and the remaining n lines are an alphabetical list of their translations into the second language. Only upper and lower case alphabetic characters are used in the words. No input set will involve more than 25 distinct words. No word appears as the first word in more than 10 phrases for any given language; likewise, no word appears as the last word in more than 10 phrases. A line containing a single 0 follows the last problem instance, indicating end of input.

Output

For each input set, output lines of the form

word1/word2

where word1 is a word in the first language and word2 is the translation of word1 into the second language, and a slash separates the two. The output lines should be sorted according to the first language words, and every first language word should occur exactly once. There should be no white space in the output, apart from a single blank line separating the outputs from different input sets. Imitate the format of the sample output, below. There is guaranteed to be a unique correct translation corresponding to each input instance.

Sample Input

4
arlo zym
flub pleve
pleve dourm
pleve zym
bus seat
bus stop
hot seat
school bus
2
iv otas
otas re
ec t
eg ec
0

Sample Output

arlo/hot
dourm/stop
flub/school
pleve/bus
zym/seat

iv/eg
otas/ec
re/t

Solution

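We are given n, followed by n phrases written in one language and then n phrases that are their translations. The words correspond one-to-one, but the pairing between phrases and translations has been scrambled: each group of n phrases is sorted alphabetically on its own. From these phrases we have to work out the translation of every word.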

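At first glance this looks like a matter of counting how often each word occurs and pairing words with equal counts. A closer look shows that this does not work: many words occur the same number of times, so which translation belongs to which word? The relationships between words are a much better starting point. Build a graph from them: every word is a node, and every phrase is a directed edge from its first word to its second word. The problem then becomes graph isomorphism, and all we need is the one-to-one correspondence between the nodes of the two graphs. For the VF2 algorithm for graph isomorphism, this article was a great help: http://blog.youkuaiyun.com/luo123n/article/details/49787183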
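To make the graph construction concrete, here is a minimal sketch of it in isolation. The names WordGraph, buildGraph, inDegree and outDegree are hypothetical helpers introduced for illustration; they are not part of the solution code in this post.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type: a directed graph over the distinct words of one language.
struct WordGraph {
    std::vector<std::string> words;     // index -> word, in order of first appearance
    std::vector<std::vector<int>> adj;  // adj[u][v] == 1 iff the phrase "words[u] words[v]" exists
};

// Every phrase (first, second) becomes the directed edge first -> second.
WordGraph buildGraph(const std::vector<std::pair<std::string, std::string>>& phrases) {
    WordGraph g;
    std::map<std::string, int> id;
    // Return the index of w, assigning a fresh one on first sight.
    auto indexOf = [&](const std::string& w) -> int {
        auto it = id.find(w);
        if (it != id.end()) return it->second;
        int k = static_cast<int>(g.words.size());
        id[w] = k;
        g.words.push_back(w);
        return k;
    };
    std::vector<std::pair<int, int>> edges;
    for (const auto& p : phrases) {
        int u = indexOf(p.first);
        int v = indexOf(p.second);
        edges.emplace_back(u, v);
    }
    const std::size_t n = g.words.size();
    g.adj.assign(n, std::vector<int>(n, 0));
    for (const auto& e : edges) g.adj[e.first][e.second] = 1;
    return g;
}

// Number of phrases in which word u appears first.
int outDegree(const WordGraph& g, int u) {
    int d = 0;
    for (int x : g.adj[u]) d += x;
    return d;
}

// Number of phrases in which word v appears second.
int inDegree(const WordGraph& g, int v) {
    int d = 0;
    for (const auto& row : g.adj) d += row[v];
    return d;
}
```

On the first sample, arlo and flub both have in-degree 0 and out-degree 1, so counts alone cannot tell them apart. What separates them is where their edges lead: arlo points to zym (in-degree 2, out-degree 0), while flub points to pleve (in-degree 1, out-degree 2).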

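From here it is just a matter of applying VF2 step by step. Because the data is small, the model can be simplified, and it works fine even without pruning. The code is as follows: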

#include <cstring>
#include <iostream>
#include <map>
#include <string>
#include <vector>

using namespace std;

// l1 / l2: adjacency matrices of the word graphs for language 1 / language 2.
// res[i]: node of graph 2 matched to node i of graph 1 (-1 if unmatched).
// nodeCount: number of distinct words (renamed from `size`, which is ambiguous
// with std::size under `using namespace std` in C++17).
int l1[55][55], l2[55][55], res[55], nodeCount;
bool used[55];  // used[j]: node j of graph 2 is already matched

// Every already-matched node in a must map to some node in b.
bool check(const vector<int>& a, const vector<int>& b)
{
  for (size_t i = 0; i < a.size(); ++i) if (res[a[i]] != -1)
  {
    size_t j;
    int m = res[a[i]];
    for (j = 0; j < b.size(); ++j) if (m == b[j]) break;
    if (j == b.size()) return false;
  }
  return true;
}

// Try to match node n of graph 1, assuming nodes 0..n-1 are already matched.
bool find(int n)
{
  if (n == nodeCount) return true;
  vector<int> npred, nsucc, mpred, msucc;
  for (int i = 0; i < nodeCount; ++i)
  {
    if (l1[i][n]) npred.push_back(i);
    if (l1[n][i]) nsucc.push_back(i);
  }

  for (int i = 0; i < nodeCount; ++i) if (!used[i])
  {
    mpred.clear();
    msucc.clear();
    for (int j = 0; j < nodeCount; ++j)
    {
      if (l2[j][i]) mpred.push_back(j);
      if (l2[i][j]) msucc.push_back(j);
    }
    // Candidate i needs the same in/out degree as n and the same self-loop
    // status; a self-loop is invisible to check() because res[n] is still -1.
    if (npred.size() == mpred.size() && nsucc.size() == msucc.size() && l1[n][n] == l2[i][i])
    {
      if (check(npred, mpred) && check(nsucc, msucc))
      {
        res[n] = i; used[i] = true;
        if (find(n + 1)) return true;
        res[n] = -1; used[i] = false;  // backtrack
      }
    }
  }
  return false;
}

int main()
{
  int n;
  bool flag = true;  // true until the first input set has been printed

  while (cin >> n && n)
  {
    memset(l1, 0, sizeof(l1));
    memset(l2, 0, sizeof(l2));
    memset(res, -1, sizeof(res));
    memset(used, 0, sizeof(used));
    map<string, int> lm1, lm2;  // word -> node index, per language
    string s1, s2, words1[55], words2[55];
    int count = 0;

    // Language 1: each phrase "s1 s2" is the directed edge s1 -> s2.
    for (int i = 0; i < n; ++i)
    {
      cin >> s1 >> s2;
      if (lm1.count(s1) == 0) lm1[s1] = count, words1[count++] = s1;
      if (lm1.count(s2) == 0) lm1[s2] = count, words1[count++] = s2;
      l1[lm1[s1]][lm1[s2]] = 1;
    }
    // Language 2: same construction.
    count = 0;
    for (int i = 0; i < n; ++i)
    {
      cin >> s1 >> s2;
      if (lm2.count(s1) == 0) lm2[s1] = count, words2[count++] = s1;
      if (lm2.count(s2) == 0) lm2[s2] = count, words2[count++] = s2;
      l2[lm2[s1]][lm2[s2]] = 1;
    }

    nodeCount = count;
    if (find(0))
    {
      // std::map keeps the output sorted by the language-1 word.
      map<string, string> ans;
      for (int i = 0; i < nodeCount; ++i) ans[words1[i]] = words2[res[i]];
      if (flag) flag = false;
      else cout << endl;  // blank line between input sets
      map<string, string>::iterator it;
      for (it = ans.begin(); it != ans.end(); ++it) cout << it->first << '/' << it->second << endl;
    }
    else
      cout << "Can not find!" << endl;
  }

  return 0;
}
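
As a sanity check on a mapping such as the res array above, one can verify directly that it is an isomorphism. The sketch below is not part of the original solution: isIsomorphism is a hypothetical helper, and it assumes both graphs are given as square 0/1 adjacency matrices of the same size.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true iff res is a bijection on {0..n-1} and
// u -> v is an edge of g1 exactly when res[u] -> res[v] is an edge of g2.
// Assumes g1 and g2 are square adjacency matrices.
bool isIsomorphism(const std::vector<std::vector<int>>& g1,
                   const std::vector<std::vector<int>>& g2,
                   const std::vector<int>& res) {
    const std::size_t n = g1.size();
    if (g2.size() != n || res.size() != n) return false;

    // Each node of g2 must be hit exactly once.
    std::vector<bool> seen(n, false);
    for (std::size_t u = 0; u < n; ++u) {
        if (res[u] < 0 || static_cast<std::size_t>(res[u]) >= n) return false;
        if (seen[res[u]]) return false;
        seen[res[u]] = true;
    }

    // Edges and non-edges must both be preserved.
    for (std::size_t u = 0; u < n; ++u)
        for (std::size_t v = 0; v < n; ++v)
            if ((g1[u][v] != 0) != (g2[res[u]][res[v]] != 0)) return false;
    return true;
}
```

For the second sample, numbering words in order of appearance gives iv=0, otas=1, re=2 with edges 0->1 and 1->2 in the first language, and ec=0, t=1, eg=2 with edges 0->1 and 2->0 in the second. The expected answer iv/eg, otas/ec, re/t corresponds to res = {2, 0, 1}, which passes the check, while the identity mapping does not.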