算法（第4版） Chapter 5.2 单词查找树_C++教程

上一篇: Playing with __attributes__ （... 下一篇:C语言学习资料汇集助你成为更好...

算法（第4版） Chapter 5.2 单词查找树

发布时间：2019-08-06 发布网站：脚本宝典

脚本宝典收集整理的这篇文章主要介绍了算法（第4版） Chapter 5.2 单词查找树，脚本宝典觉得挺不错的，现在分享给大家，也给大家做个参考。

AlgorIThms Fourth Edition
Written By Robert SEdgewick &amp; Kevin Wayne
Translated By 谢路云
Chapter 5 Section 2 单词查找树


查找所需要的单词的时间和键的长度成正比
查找未命中只需检查若干个单词

单词查找树
单词查找树API

基本性质


每个链接对应一个字符

每个结点可能有一个值

有值，说明存在从根结点到这个结点的字符串。
没有值，说明不存在从根结点到这个结点的字符串。没有对应的键值。它的存在是为了简化查询。





查找

命中： 对应结点有值（注意不单单是指向该对应结点d链接存在，并且该结点一定要有值！）
未命中： 没有值 or 链接为空




结点的表示

每个结点下面有R个链接，一个链接对应一个字符
键隐式地保存在结点里


TrieST 代码

      
      
      PE="button" class="copyCode code-tool" data-toggle="tooltip" data-placement="top" data-clipboard-text="public class TrieST {
    PRivate static int R = 256; // radix
    private Node root; // root of trie

    private static class Node {
        private Object val;
        private Node[] next = new Node[R];
    }

    public Value get(String key) {
        Node x = get(root, key, 0);
        if (x == null)
            return null;
        return (Value) x.val;
    }
    
    // Return value associated with key in the suBTrie rooted at x.
    private Node get(Node x, String key, int d) { 
        if (x == null)
            return null;
        if (d == key.length())
            return x;
        char c = key.charAt(d); // Use dth key char to identify subtrie.
        return get(x.next[c], key, d + 1);
    }

    public void put(String key, Value val) {
        root = put(root, key, val, 0);
    }
    
    // Change value associated with key if in subtrie rooted at x.
    private Node put(Node x, String key, Value val, int d) { //神一般的递归思想
        if (x == null)
            x = new Node();
        if (d == key.length()) {
            x.val = val;
            return x;
        }
        char c = key.charAt(d); // Use dth key char to identify subtrie.
        x.next[c] = put(x.next[c], key, val, d + 1); //神来之笔
        return x;
    }

    public Iterable keys() {
        return keysWithPrefix("");
    }

    public Iterable keysWithPrefix(String pre) {
        Queue q = new Queue();
        collect(get(root, pre, 0), pre, q);
        return q;
    }

    private void collect(Node x, String pre, Queue q) {
        if (x == null)
            return;
        if (x.val != null)
            q.enqueue(pre);
        for (char c = 0; c < R; c++) //这个效率简直了。。。烂到爆炸
            collect(x.next[c], pre + c, q);
    }
}" title="" data-original-title="复制">
      
      
public class TrieST<Value> {
    private static int R = 256; // radix
    private Node root; // root of trie

    private static class Node {
        private Object val;
        private Node[] next = new Node[R];
    }

    public Value get(String key) {
        Node x = get(root, key, 0);
        if (x == null)
            return null;
        return (Value) x.val;
    }
    
    // Return value associated with key in the subtrie rooted at x.
    private Node get(Node x, String key, int d) { 
        if (x == null)
            return null;
        if (d == key.length())
            return x;
        char c = key.charAt(d); // Use dth key char to identify subtrie.
        return get(x.next[c], key, d + 1);
    }

    public void put(String key, Value val) {
        root = put(root, key, val, 0);
    }
    
    // Change value associated with key if in subtrie rooted at x.
    private Node put(Node x, String key, Value val, int d) { //神一般的递归思想
        if (x == null)
            x = new Node();
        if (d == key.length()) {
            x.val = val;
            return x;
        }
        char c = key.charAt(d); // Use dth key char to identify subtrie.
        x.next[c] = put(x.next[c], key, val, d + 1); //神来之笔
        return x;
    }

    public Iterable<String> keys() {
        return keysWithPrefix("");
    }

    public Iterable<String> keysWithPrefix(String pre) {
        Queue<String> q = new Queue<String>();
        collect(get(root, pre, 0), pre, q);
        return q;
    }

    private void collect(Node x, String pre, Queue<String> q) {
        if (x == null)
            return;
        if (x.val != null)
            q.enqueue(pre);
        for (char c = 0; c < R; c++) //这个效率简直了。。。烂到爆炸
            collect(x.next[c], pre + c, q);
    }
}

方法keys：返回一个Queue，里面有所有的字符串
方法keysWithPrefix(String pre)：返回一个Queue，里面有所有以给定字符串pre开头的所有字符串
方法collect：递归用

通配符匹配


结构差异

不含通配符的结构。keysWithPrefix(); get(); collect();
含通配符的结构。keysWithPrefix(); collect();


为什么结构和之前不一样呢？因为get方法要重写。直接把重写的get方法归并进collect方法里去了。


      
      
      hatMatch(String pat) {
        Queue q = new Queue();
        collect(root, "", pat, q);
        return q;
    }

    public void collect(Node x, String pre, String pat, Queue q) {
        int d = pre.length(); 
        // begin修改于原get方法()和collect方法()
        if (x == null) // get&collect
            return;
        if (d == pat.length() && x.val != null) // 前collect，后get
            q.enqueue(pre);
        if (d == pat.length())// get
            return; 
        // end 修改于原get方法()和collect方法()
        
        char next = pat.charAt(d);
        for (char c = 0; c < R; c++)
            if (next == '.' || next == c) //如果next=='.' 就要遍历接在当前字符后面的所有字符！！！
                collect(x.next[c], pre + c, pat, q); //还是神一般的递归想法
    }" title="" data-original-title="复制">
      
      
    public Iterable<String> keysThatMatch(String pat) {
        Queue<String> q = new Queue<String>();
        collect(root, "", pat, q);
        return q;
    }

    public void collect(Node x, String pre, String pat, Queue<String> q) {
        int d = pre.length(); 
        // begin修改于原get方法()和collect方法()
        if (x == null) // get&collect
            return;
        if (d == pat.length() && x.val != null) // 前collect，后get
            q.enqueue(pre);
        if (d == pat.length())// get
            return; 
        // end 修改于原get方法()和collect方法()
        
        char next = pat.charAt(d);
        for (char c = 0; c < R; c++)
            if (next == '.' || next == c) //如果next=='.' 就要遍历接在当前字符后面的所有字符！！！
                collect(x.next[c], pre + c, pat, q); //还是神一般的递归想法
    }

      
      
      
      
      
private Node get(Node x, String key, int d) { 
        if (x == null)
            return null;
        if (d == key.length())
            return x;
        char c = key.charAt(d); // Use dth key char to identify subtrie.
        return get(x.next[c], key, d + 1);
    }
最长前缀

      
      
      
      
      
    public String longestPrefixOf(String s) {
        int length = seArch(root, s, 0, 0);
        return s.substring(0, length);
    }

    //当前搜索的是第d位字符，返回的是具有最长length位前缀的子字符
    private int search(Node x, String s, int d, int length) {
        if (x == null) // 查到单词树的尽头了
            return length;
        if (x.val != null) // 如果存在这个单词，就更新length值
            length = d;
        if (d == s.length()) // 查到字符串的尽头了（一定要先做上一步）
            return length;
        char c = s.charAt(d);
        return search(x.next[c], s, d + 1, length); //递归搜索下一位
    }
删除操作

依旧是神一般的递归思路

如果它的链接不为空
删去这个结点的值即可


如果它的所有链接都为空

删去这个结点

检查这个结点的父结点的所有链接是否为空

不为空，结束
为空，删去父结点并检查父结点的父结点是否为空
  ……循环如此






      
      
      lete(String key) {
        root = delete(root, key, 0);
    }

    private Node delete(Node x, String key, int d) {
        if (x == null)
            return null;
        if (d == key.length()) //找到了的话就将该结点的值删去
            x.val = null;
        else {
            char c = key.charAt(d);
            x.next[c] = delete(x.next[c], key, d + 1); //依旧是神一般的递归思路
        }
        if (x.val != null) //如果该结点有值，则不管当前结点链接是否为空，该结点都在树里，不能被删去。
            return x;
        for (char c = 0; c < R; c++) //否则就看该结点链接是否为空（即该结点没有值）
            if (x.next[c] != null) //如果当前结点链接不为空，则返回当前结点
                return x; 
        return null; //否则返回为空。（因为是返回上一层递归， 即把自己置为空，也即删除了自己）
    }" title="" data-original-title="复制">
      
      
    public void delete(String key) {
        root = delete(root, key, 0);
    }

    private Node delete(Node x, String key, int d) {
        if (x == null)
            return null;
        if (d == key.length()) //找到了的话就将该结点的值删去
            x.val = null;
        else {
            char c = key.charAt(d);
            x.next[c] = delete(x.next[c], key, d + 1); //依旧是神一般的递归思路
        }
        if (x.val != null) //如果该结点有值，则不管当前结点链接是否为空，该结点都在树里，不能被删去。
            return x;
        for (char c = 0; c < R; c++) //否则就看该结点链接是否为空（即该结点没有值）
            if (x.next[c] != null) //如果当前结点链接不为空，则返回当前结点
                return x; 
        return null; //否则返回为空。（因为是返回上一层递归， 即把自己置为空，也即删除了自己）
    }
复杂度
字母表大小为R，N个随机键组成的单词查找树中

时间：

查找和插入最差时间为：键的长度+1
查找未命中的平均时间为：LOG_RN   （查找未命中的时间和键的长度无关）




空间

与 R和所有键的字符总数之积 成正比

链接

一棵单词查找树的链接总数为RN到RNw之间，w为键的平均长度
当所有键较短时，链接总数接近RN
当所有键较长时，链接总数接近RNw
缩小R能节省大量空间




三向单词查找树

避免空间消耗
每个结点含有一个字符，三个链接（小于，等于，大于），可能含有一个值
字符是显式保存在每个结点中

TST 代码

      
      
      
      
      
public class TST<Value> {
    private Node root; // root of trie

    private class Node {
        char c; // character
        Node left, mid, right; // left, middle, and right subtries
        Value val; // value associated with string
    }

    public Value get(String key) {
        Node x = get(root, key, 0);
        if (x == null)
            return null;
        return (Value) x.val;
    }

    private Node get(Node x, String key, int d) {
        if (x == null)
            return null;
        char c = key.charAt(d);
        if (c < x.c)
            return get(x.left, key, d);
        else if (c > x.c)
            return get(x.right, key, d);
        else if (d < key.length() - 1)
            return get(x.mid, key, d + 1);
        else
            return x;
    }

    public void put(String key, Value val) {
        root = put(root, key, val, 0);
    }

    private Node put(Node x, String key, Value val, int d) {
        char c = key.charAt(d);
        if (x == null) {
            x = new Node();
            x.c = c;
        }
        if (c < x.c)
            x.left = put(x.left, key, val, d);
        else if (c > x.c)
            x.right = put(x.right, key, val, d);
        else if (d < key.length() - 1)
            x.mid = put(x.mid, key, val, d + 1);
        else
            x.val = val;
        return x;
    }
}