Wednesday, November 12, 2014

Word Ladder II -- Leetcode

[Question]

Given two words (start and end), and a dictionary, find all shortest transformation sequence(s) from start to end, such that:
  • Only one letter can be changed at a time
  • Each intermediate word must exist in the dictionary
For example,

Given:
    start = "hit"
    end = "cog"
    dict = ["hot","dot","dog","lot","log"]

Return:
    [ ["hit","hot","dot","dog","cog"], ["hit","hot","lot","log","cog"] ]

Note:
All words have the same length.
All words contain only lowercase alphabetic characters.

[Analysis]
The problem is about to find all possible paths between two nodes in a graph. To get the paths, it is normal to use DFS (Depth First Search). The drawback to apply DFS is, unlike tree, there exist many circles in a graph. Most depth first routes will not reach the target and will be extremely time-consuming.

Meanwhile, BFS (Breadth First Search) will help to reach the target quickly (i.e. find the shortest path from start to end), like what we have seen in Word Ladder question. However, BFS visits nodes layer by layer and does not maintain the paths of nodes from start to end.

The idea is to combine the benefits of BFS and DFS together:
  1. Apply BFS to reach target starting from "start" layer by layer. So that we can know by the number of steps in the shortest path from start to end. Any nodes outside the visited nodes during this procedure need no further consideration.
  2. During the above BFS procedure, we construct a (directional) sub-graph where each node A in layer i+1 points to the connected nodes (where A comes from) in layer i. If A is the target ("end"), we know back-trace to the start using the path in this sub-graph. Any nodes outside this sub-graph are irrelevant to the shortest path from "start" to "end". 
  3. Apply DFS starting from "end" using the sub-graph above. All shortest paths will be generated.
It is interesting to consider the data structure to use for the algorithm. For the BFS, queue is not a fit in this case because all connections between layer i and layer i+1 need to be collected. Queue-based BFS like in Word Ladder question actually will lose those information. It is recommended to use set in this case.



[Solution]

class Solution {
public:
    unordered_set<string> getChildren2(string s, unordered_set<string> &dict) {
        unordered_set<string> rslt;
        for (int i=0; i<s.size(); i++) {
            string tmp = s;
            for (char c='a'; c<='z'; c++) {
                tmp[i]=c;
                if (dict.find(tmp)!=dict.end())
                    rslt.insert(tmp);
            }
        }
        return rslt;
    }
 
    vector<vector<string>> pathsInGraph(string start, string end, unordered_map<string, unordered_set<string>> &graph, int depth) {
        vector<vector<string>> rslt;
        if (depth==0 || graph.find(end)==graph.end()) return rslt;
        if (depth==1) {
            rslt.push_back({start, end});
            return rslt;
        }
        for (auto w: graph[end]) {
            vector<vector<string>> paths = pathsInGraph(start, w, graph, depth-1);
            for (auto& v: paths) {
                v.push_back(end);
                rslt.push_back(v);
            }
        }
        return rslt;
    }
 
    vector<vector<string>> findLadders(string start, string end, unordered_set<string> &dict) {
        unordered_set<string> unvisited = dict;
        unvisited.erase( start );
        unvisited.insert(end);
     
        unordered_map<string, unordered_set<string>> graph;
     
        unordered_set<string> que1, que2;
        que1.insert(start);
     
        int len = 0;
        while (!unvisited.empty() && !que1.empty()) {
            for (auto w: que1) {
                unordered_set<string> children = getChildren2(w, unvisited);
                for (auto chld : children) {
                    graph[chld].insert( w ); //reverse route
                    que2.insert( chld );
                }
            }
            len ++;
            if (que2.find(end) != que2.end())    //found
                break;
         
            que1 = que2;          
            for (auto w: que2) unvisited.erase( w );
            que2.clear();  // ... must clear que2; waste hours because of missing this line.
        }
     
        if (que1.empty()) return vector<vector<string>>();
        return pathsInGraph(start, end, graph, len);
    }
};

No comments:

Post a Comment