3

Im looking at a way of doing a partial match using a binary search. Here is my code:

public void checkCardIndexForMatches(List<String> wordsToCheck) throws IOException {
    String[] cardIndexCache = cardIndexCreator.getCardIndexCache();

    for (String text: wordsToCheck){
        int i = Arrays.binarySearch(cardIndexCache, text.getText().toLowerCase().trim());

        if (i > 0){
            text.setCardIndexMatch(true);
        }
        //check if partial match
        //                  else if 
    }
}

So its pretty simple stuff so far - there is basically an external file which gets fed in, each line in the file is stored as an array in the cardIndexCache. The problem comes when the user wants to be able to match a 'phrase' in the array (a phrase being more than one word e.g. Mohammed Ali). The words in the wordsToCheck parameter are only passed in as individual words. So the first word will be Mohammed but the binary search fails as it doesnt know what the second word is. I cant think of a simple way of getting the binary search to indicate that a word has the potential to be a match (first part matches, just append the next word and see if that matches).

Any ideas much appreciated!

4

1 回答 1

0

这是我编写的一个 Trie 以及一个基本上可以找到最大公共前缀的搜索功能,相似性搜索是可能的,但成本很高..


class TNode
{
    public MapList<Char,TNode> next;

    public TNode ()
    {
        next = new MapList<Char,TNode>();
    }
}

class Trie
{
    TNode head;

    public Trie ()
    {
        head = new TNode();
    }

    public void insert (String t)
    {
        TNode cur = head;

        for (Char c : t.toCharArray())
        {
            if (!cur.next.containsKey(c))
            {
                cur.next.put(c,new TNode());
            }
            cur = cur.next.get(c);
        }
    }

    public boolean remove (String t)
    {
        Stack<Pair<Char,TNode>> path = new Stack<Pair<Char,TNode>>();
        TNode cur = head;
        Pair<Char,TNode> n = null;

        for (Char c : t.toCharArray())
        {
            if (!cur.next.containsKey(c))
            {
                return false;
            }
            path.push(c,cur);
            cur = cur.next.get(c);
        }

        while (path.size() > 0)
        {
            n = path.pop();
            if (n.getSecond().next.get(n.getFirst()).next.size() > 1)
            {
                break;
            }
            else
            {
                n.getSecond().next.remove(n.getFirst());
            }
        }
    }

    public boolean search (String t)
    {
        TNode cur = head;

        for (Char c : t.toCharArray())
        {
            if (!cur.next.containsKey(c))
            {
                return false;
            }
            cur = cur.next.get(c);
        }

        return true;
    }

    public String searchMaxPrefix (String t)
    {
        TNode cur = head;
        String match = "";

        for (Char c : t.toCharArray())
        {
            if (cur.next.containsKey(c))
            {
                match += c;
            }
            else
            {
                break;
            }
        }

        return match;
    }
}
于 2013-06-18T05:26:53.083 回答