Sicily 1008 Translations

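This post presents a solution to a specific graph isomorphism problem: it builds a graph of the relationships between the words in the phrases and uses the VF2 algorithm to match each word with its translation automatically. It explains in detail how the translation problem can be recast as a graph isomorphism problem, which resolves the mismatch between phrases and translations caused by sorting the translation files.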


Description
Bob Roberts is in charge of performing translations of documents between various languages. To aid him in this endeavor his bosses have provided him with translation files. These files come in twos -- one containing sample phrases in one of the languages and the other containing their translations into the other language. However, some over-zealous underling, attempting to curry favor with the higher-ups with his initiative, decided to alphabetically sort the contents of all of the files, losing the connections between the phrases and their translations. Fortunately, the lists are comprehensive enough that the original translations can be reconstructed from these sorted lists. Bob has found this is most usually the case when the phrases all consist of two words. For example, given the following two lists:

Language 1 Phrases    Language 2 Phrases
arlo zym              bus seat
flub pleve            bus stop
pleve dourm           hot seat
pleve zym             school bus

Bob is able to determine that arlo means hot, zym means seat, flub means school, pleve means bus, and dourm means stop. After doing several of these reconstructions by hand, Bob has decided to automate the process. And if Bob can do it, then so can you.
Input
Input will consist of multiple input sets. Each input set starts with a positive integer n, n ≤ 250, indicating the number of two-word phrases in each language. This is followed by 2n lines, each containing one two-word phrase: the first n lines are an alphabetical list of phrases in the first language, and the remaining n lines are an alphabetical list of their translations into the second language. Only upper and lower case alphabetic characters are used in the words. No input set will involve more than 25 distinct words. No word appears as the first word in more than 10 phrases for any given language; likewise, no word appears as the last word in more than 10 phrases. A line containing a single 0 follows the last problem instance, indicating end of input.
Output
For each input set, output lines of the form
word1/word2
where word1 is a word in the first language and word2 is the translation of word1 into the second language, and a slash separates the two. The output lines should be sorted according to the first language words, and every first language word should occur exactly once. There should be no white space in the output, apart from a single blank line separating the outputs from different input sets. Imitate the format of the sample output, below. There is guaranteed to be a unique correct translation corresponding to each input instance.
Sample Input
4
arlo zym
flub pleve
pleve dourm
pleve zym
bus seat
bus stop
hot seat
school bus
2
iv otas
otas re
ec t
eg ec
0
Sample Output
arlo/hot
dourm/stop
flub/school
pleve/bus
zym/seat

iv/eg
otas/ec
re/t

Solution

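Given n, the input contains n phrases written in one language, followed by n phrases that are their translations. The words correspond one to one, but the pairing between phrases and translations has been scrambled, because the phrases of each language are sorted alphabetically within their own list. The task is to use these phrases to work out the translation of every word.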

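At first glance it looks as if counting how often each word occurs and pairing up equal counts would be enough. A closer look shows otherwise: many words occur the same number of times, so how do we decide which translation belongs to which word? The relationships between words are a good way in. Treat each word as a node and each phrase as a directed edge from its first word to its second word. The problem then becomes a graph isomorphism problem: we only need to find the one-to-one correspondence between the nodes of the two graphs. For the VF2 graph isomorphism algorithm, this article was very helpful: http://blog.youkuaiyun.com/luo123n/article/details/49787183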
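To make the point about occurrence counts concrete, here is a minimal sketch that computes, for each word, how often it appears as the first word (out-degree) and as the second word (in-degree) of a phrase. The helper name `degreeSignature` is hypothetical and is not part of the solution.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: maps each word to (out-degree, in-degree), where a
// phrase "u v" is treated as a directed edge u -> v.
std::map<std::string, std::pair<int, int>> degreeSignature(
    const std::vector<std::pair<std::string, std::string>>& phrases)
{
  std::map<std::string, std::pair<int, int>> deg;
  for (const auto& p : phrases)
  {
    assert(!p.first.empty() && !p.second.empty()); // phrases have two words
    ++deg[p.first].first;    // the edge leaves the first word
    ++deg[p.second].second;  // the edge enters the second word
  }
  return deg;
}
```

For the first-language phrases of the sample (arlo zym, flub pleve, pleve dourm, pleve zym), arlo and flub both get out-degree 1 and in-degree 0, so degrees alone cannot tell them apart. The edges can: arlo points to zym, which has in-degree 2, while flub points to pleve.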

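Then apply the VF2 algorithm step by step. Because the input is small, the model can be simplified, and it works even without pruning. The code is as follows: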

#include <cstring>
#include <iostream>
#include <map>
#include <string>
#include <vector>

using namespace std;

// l1 and l2 are the adjacency matrices of the two word graphs:
// l1[u][v] == 1 means the phrase "u v" exists in the first language.
// res[u] is the graph-2 node matched to graph-1 node u (-1 if unmatched).
// The node count is called nodeCount rather than size, because a global
// named size is ambiguous with std::size under C++17.
int l1[55][55], l2[55][55], res[55], nodeCount;
bool used[55]; // used[v]: graph-2 node v is already matched

// Every already-matched node in a must map to some node in b.
bool check(const vector<int>& a, const vector<int>& b)
{
  for (size_t i = 0; i < a.size(); ++i) if (res[a[i]] != -1)
  {
    int m = res[a[i]];
    size_t j;
    for (j = 0; j < b.size(); ++j) if (m == b[j]) break;
    if (j == b.size()) return false;
  }
  return true;
}

// Backtracking search: match graph-1 nodes 0..nodeCount-1 in order.
bool find(int n)
{
  if (n == nodeCount) return true;
  vector<int> npred, nsucc, mpred, msucc;
  for (int i = 0; i < nodeCount; ++i)
  {
    if (l1[i][n]) npred.push_back(i);
    if (l1[n][i]) nsucc.push_back(i);
  }

  for (int i = 0; i < nodeCount; ++i) if (!used[i])
  {
    mpred.clear();
    msucc.clear();
    for (int j = 0; j < nodeCount; ++j)
    {
      if (l2[j][i]) mpred.push_back(j);
      if (l2[i][j]) msucc.push_back(j);
    }
    // Candidate i needs the same in-degree, out-degree and self-loop as n.
    if (npred.size() == mpred.size() && nsucc.size() == msucc.size() && l1[n][n] == l2[i][i])
    {
      if (check(npred, mpred) && check(nsucc, msucc))
      {
        res[n] = i; used[i] = true;
        if (find(n + 1)) return true;
        res[n] = -1; used[i] = false; // undo and try the next candidate
      }
    }
  }
  return false;
}

int main()
{
  int n;
  bool flag = true; // true until the first input set has been printed

  while (cin >> n && n)
  {
    memset(l1, 0, sizeof(l1));
    memset(l2, 0, sizeof(l2));
    memset(res, -1, sizeof(res));
    memset(used, 0, sizeof(used));
    map<string, int> lm1, lm2; // word -> node index, one map per language
    string s1, s2, words1[55], words2[55];
    int count = 0;

    // Build the graph of the first language.
    for (int i = 0; i < n; ++i)
    {
      cin >> s1 >> s2;
      if (lm1.count(s1) == 0) lm1[s1] = count, words1[count++] = s1;
      if (lm1.count(s2) == 0) lm1[s2] = count, words1[count++] = s2;
      l1[lm1[s1]][lm1[s2]] = 1;
    }
    // Build the graph of the second language.
    count = 0;
    for (int i = 0; i < n; ++i)
    {
      cin >> s1 >> s2;
      if (lm2.count(s1) == 0) lm2[s1] = count, words2[count++] = s1;
      if (lm2.count(s2) == 0) lm2[s2] = count, words2[count++] = s2;
      l2[lm2[s1]][lm2[s2]] = 1;
    }

    nodeCount = count;
    if (find(0))
    {
      // std::map keeps its keys sorted, which gives the required output order.
      map<string, string> ans;
      for (int i = 0; i < nodeCount; ++i) ans[words1[i]] = words2[res[i]];
      if (flag) flag = false;
      else cout << endl; // blank line between input sets
      map<string, string>::iterator it;
      for (it = ans.begin(); it != ans.end(); ++it) cout << it->first << '/' << it->second << endl;
    }
    else
      cout << "Can not find!" << endl;
  }

  return 0;
}
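
As a sanity check on a result, the sketch below tests whether a word-to-word dictionary really is an isomorphism between the two phrase graphs: the mapping must be one-to-one, and translating every first-language phrase word by word must reproduce exactly the second-language phrase list. The names `Phrase` and `isValidTranslation` are hypothetical and are not part of the solution above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Phrase = std::pair<std::string, std::string>;

// Hypothetical checker: returns true if dict maps the first-language words
// to distinct second-language words and turns lang1 into lang2 (as sets).
bool isValidTranslation(const std::vector<Phrase>& lang1,
                        const std::vector<Phrase>& lang2,
                        const std::map<std::string, std::string>& dict)
{
  assert(lang1.size() == lang2.size()); // each language has n phrases

  // The mapping must be one-to-one.
  std::set<std::string> targets;
  for (const auto& kv : dict)
    if (!targets.insert(kv.second).second) return false;

  // Translate every phrase word by word.
  std::set<Phrase> translated;
  for (const auto& p : lang1)
  {
    auto first = dict.find(p.first);
    auto second = dict.find(p.second);
    if (first == dict.end() || second == dict.end()) return false; // untranslated word
    translated.emplace(first->second, second->second);
  }
  return translated == std::set<Phrase>(lang2.begin(), lang2.end());
}
```

With the first sample, the dictionary from the sample output passes this check, while swapping the translations of arlo and flub fails it, because "school seat" is not a second-language phrase.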

