单词距离

题目描述

来源于 https://leetcode-cn.com/

有个内含单词的超大文本文件,给定任意两个单词,找出在这个文件中这两个单词的最短距离(相隔单词数)。如果寻找过程在这个文件中会重复多次,而每次寻找的单词不同,你能对此优化吗?

示例:

输入:words = ["I","am","a","student","from","a","university","in","a","city"], word1 = "a", word2 = "student"
输出:1

提示:

  • words.length <= 100000

解法:

单次运行,下面这样的方法就可以了。

class Solution {
public:
    int findClosest(vector<string>& words, string word1, string word2) {
        int pos1 = -1, pos2 = -1;
        int min_distance = words.size();
        for(int i=0;i<words.size();i++){
            if(words[i] == word1){
                pos1 = i;
                if(pos2 != -1){
                    min_distance = min(min_distance, abs(pos1 - pos2));
                }
            }else if(words[i] == word2){
                pos2 = i;
                if(pos1 != -1){
                    min_distance = min(min_distance, abs(pos1 - pos2));
                }
            }
        }
        return min_distance;
    }
};

如果要多次运行,那可以先把所有单词出现的下标统计出来。然后给定两个单词,在两个下标数组中,寻找距离最近的一对下标即可。

在 (16.06. 最小差)[.\16.06-smallest-difference-lcci.md] 中给出了找两个数组中最相近的两个元素的方法。

class Text {
public:
    Text(vector<string>& words){
        for(int i=0;i<words.size();i++){
            mp[words[i]].push_back(i);
        } 
    }
    int findClosest(string word1, string word2) {
        vector<int> a = mp[word1];
        vector<int> b = mp[word2];
        int i = 0, j = 0;

        int min_distance = words.size();
        while(i < a.size() && j < b.size()){
            min_distance = min(min_distance, abs(a[i] - b[j]));
            if(a[i] < b[j]){
                i++;
            }else{
                j++;
            }
        }
        return min_distance;
    }
private:
    unordered_map<string, vector<int>> mp;
};