Thursday, November 27, 2014

Total of Subarray

[Question]
Given an array A of numbers and a target number N, check if there is a contiguous sequence in array which sums to the target N.

[Analysis]
To match a target number, the best way is to use hash-table. The sum of contiguous sub-array Sum[i,j] = Sum[0,j] - Sum[0, i-1]. Assume all Sum[0, i] , N needs to be the difference of  Sum[0, k] and Sum[0, l]. Time complexity is O(N).

[Solution]
bool maxOfSubArray ( vector<int> vec, int target ) {
    vector<int> sum (vec.size(), vec[0]);
    for (int i=1; i<vec.size(); i++)
         sum[i] =sum[i-1]+vec[i];
 
    unordered_set<int> sum_set ( sum.begin(), sum.end() );
    for (auto s: sum_set) {
        if ( sum_set.find(s+target) != sum_set.end() )
             return true;
    }
    return false;
}

Maximum Subarray -- Leetcode 53

[Question]:
Find the contiguous subarray within an array (containing at least one number) which has the largest sum.
For example, given the array [−2,1,−3,4,−1,2,1,−5,4],
the contiguous subarray [4,−1,2,1] has the largest sum = 6.

[Analysis]:
Assume S[i] is the maximum sub-array which ends at A[i]. Then S[i] = max (S[i-1]+A[i], A[i]), i>0; S[0] = A[0]. The maximum of all S[i] is the result.  Time complexity is O(N).

Another solution is made by simple math.
[Solution]:
//-- DP--
class Solution {
public:
    int maxSubArray(int A[], int n) {
        assert(n>0);
     
        vector<int> sum(n, A[0]);
        for (int i=1; i<n; i++) {
            sum[i] = max (sum[i-1]+A[i], A[i]);
        }
        return *max_element( sum.begin(), sum.end() );
    }
};

//-- Running sum --
class Solution {
public:
    int maxSubArray(int A[], int n) {
        int ans=A[0],i,j,sum=0;
        for(i=0;i<n;i++){
            sum+=A[i];
            ans=max(sum,ans);
            sum=max(sum,0);
        }
        return ans;
    }
};

Wednesday, November 12, 2014

Word Ladder II -- Leetcode

[Question]

Given two words (start and end), and a dictionary, find all shortest transformation sequence(s) from start to end, such that:
  • Only one letter can be changed at a time
  • Each intermediate word must exist in the dictionary
For example,

Given:
    start = "hit"
    end = "cog"
    dict = ["hot","dot","dog","lot","log"]

Return:
    [ ["hit","hot","dot","dog","cog"], ["hit","hot","lot","log","cog"] ]

Note:
All words have the same length.
All words contain only lowercase alphabetic characters.

[Analysis]
The problem is about to find all possible paths between two nodes in a graph. To get the paths, it is normal to use DFS (Depth First Search). The drawback to apply DFS is, unlike tree, there exist many circles in a graph. Most depth first routes will not reach the target and will be extremely time-consuming.

Meanwhile, BFS (Breadth First Search) will help to reach the target quickly (i.e. find the shortest path from start to end), like what we have seen in Word Ladder question. However, BFS visits nodes layer by layer and does not maintain the paths of nodes from start to end.

The idea is to combine the benefits of BFS and DFS together:
  1. Apply BFS to reach target starting from "start" layer by layer. So that we can know by the number of steps in the shortest path from start to end. Any nodes outside the visited nodes during this procedure need no further consideration.
  2. During the above BFS procedure, we construct a (directional) sub-graph where each node A in layer i+1 points to the connected nodes (where A comes from) in layer i. If A is the target ("end"), we know back-trace to the start using the path in this sub-graph. Any nodes outside this sub-graph are irrelevant to the shortest path from "start" to "end". 
  3. Apply DFS starting from "end" using the sub-graph above. All shortest paths will be generated.
It is interesting to consider the data structure to use for the algorithm. For the BFS, queue is not a fit in this case because all connections between layer i and layer i+1 need to be collected. Queue-based BFS like in Word Ladder question actually will lose those information. It is recommended to use set in this case.



[Solution]

class Solution {
public:
    unordered_set<string> getChildren2(string s, unordered_set<string> &dict) {
        unordered_set<string> rslt;
        for (int i=0; i<s.size(); i++) {
            string tmp = s;
            for (char c='a'; c<='z'; c++) {
                tmp[i]=c;
                if (dict.find(tmp)!=dict.end())
                    rslt.insert(tmp);
            }
        }
        return rslt;
    }
 
    vector<vector<string>> pathsInGraph(string start, string end, unordered_map<string, unordered_set<string>> &graph, int depth) {
        vector<vector<string>> rslt;
        if (depth==0 || graph.find(end)==graph.end()) return rslt;
        if (depth==1) {
            rslt.push_back({start, end});
            return rslt;
        }
        for (auto w: graph[end]) {
            vector<vector<string>> paths = pathsInGraph(start, w, graph, depth-1);
            for (auto& v: paths) {
                v.push_back(end);
                rslt.push_back(v);
            }
        }
        return rslt;
    }
 
    vector<vector<string>> findLadders(string start, string end, unordered_set<string> &dict) {
        unordered_set<string> unvisited = dict;
        unvisited.erase( start );
        unvisited.insert(end);
     
        unordered_map<string, unordered_set<string>> graph;
     
        unordered_set<string> que1, que2;
        que1.insert(start);
     
        int len = 0;
        while (!unvisited.empty() && !que1.empty()) {
            for (auto w: que1) {
                unordered_set<string> children = getChildren2(w, unvisited);
                for (auto chld : children) {
                    graph[chld].insert( w ); //reverse route
                    que2.insert( chld );
                }
            }
            len ++;
            if (que2.find(end) != que2.end())    //found
                break;
         
            que1 = que2;          
            for (auto w: que2) unvisited.erase( w );
            que2.clear();  // ... must clear que2; waste hours because of missing this line.
        }
     
        if (que1.empty()) return vector<vector<string>>();
        return pathsInGraph(start, end, graph, len);
    }
};

Monday, November 10, 2014

Word Ladder -- Leetcode

[Question]

Given two words (start and end), and a dictionary, find the length of shortest transformation sequence from start to end, such that:
  1. Only one letter can be changed at a time 
  2. Each intermediate word must exist in the dictionary 
For example,

Given:
  • start = "hit"
  • end = "cog"
  • dict = ["hot","dot","dog","lot","log"]
As one shortest transformation is "hit" -> "hot" -> "dot" -> "dog" -> "cog",
return its length 5.

Note:
  • Return 0 if there is no such transformation sequence. 
  • All words have the same length. 
  • All words contain only lowercase alphabetic characters.
[Analysis]
The dictionary of words composed a word graph where two nodes of words are (bi-directionally, undirected) connected when there is only one character difference between two words: for example, "hot" and "dot". 

A BFS (Breadth First Search) will give the shortest path between two nodes in a graph. Please note, there is no need to construct the graph physically -- that will be time-consuming. Using Set/unordered_set can give the siblings of a node, which is sufficient to get the result.

Another catch is, how to find the sibling words of a given word. The straightforward solution is to compare the given word with every word unvisited in dictionary. The comparison will be costly when the dictionary is large (O(N), N is the number of words in dictionary). Due to the assumptions in the question, the number of possible sibling words is 25*L, where L is the length of the word -- see "getUnvisitedSiblings()" below.

[Solution]
//-- C++ --
class Solution {
    unordered_set<string> getUnvisitedSiblings( string& str, unordered_set<string>& unvisited ) {
        unordered_set<string> rslt;
        for (int i=0; i<str.size(); i++) {
            string tmp = str;
            for (char c='a'; c<='z'; c++) {
                tmp[i] = c;
                if (unvisited.count(tmp))
                    rslt.insert( tmp );
            }
        }
        return rslt;
    } 

public:
    int ladderLength(string beginWord, string endWord, vector<string>& wordList) {
        queue<string> q;
        unordered_set<string> unvisited(wordList.begin(), wordList.end());
        unvisited.erase(beginWord); //make sure no redundant steps.
    
        q.push( beginWord );
        int len = q.size();    //size of current queue.
       
        int step = 1;
        while (!q.empty() && q.front()!=endWord) {
            string cur = q.front();
            q.pop(), len--;
           
            unordered_set<string> siblings = getUnvisitedSiblings( cur, unvisited );
            for (auto s: siblings) {
                q.push(s);
                unvisited.erase(s);
            }
        
            if (len==0) { // new level of BFS is reached.
                step++;
                len = q.size();
            }
        }

        if (!q.empty() && q.front()==endWord) return step;
        return 0;       
    }
};