java - Fastest way to get all values from a Map whose keys start with a given expression

Question

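Suppose you have a Map<String, Object> myMap.

Given the expression "some.string.*", I need to retrieve all values from myMap whose keys start with that expression.

I am trying to avoid for loops, because I will be given a set of expressions rather than a single one, and running a for loop over the map for every expression becomes cumbersome performance-wise.

What is the fastest way to do this?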

score 42 · Accepted Answer

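If you use a NavigableMap (e.g. a TreeMap), you can take advantage of the underlying tree data structure and do something like this (the range lookup has O(lg(N)) complexity):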

public SortedMap<String, Object> getByPrefix( 
        NavigableMap<String, Object> myMap, 
        String prefix ) {
    return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
}

A more extended example:

import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

public class Test {

    public static void main( String[] args ) {
        TreeMap<String, Object> myMap = new TreeMap<String, Object>();
        myMap.put( "111-hello", null );
        myMap.put( "111-world", null );
        myMap.put( "111-test", null );
        myMap.put( "111-java", null );

        myMap.put( "123-one", null );
        myMap.put( "123-two", null );
        myMap.put( "123--three", null );
        myMap.put( "123--four", null );

        myMap.put( "125-hello", null );
        myMap.put( "125--world", null );

        System.out.println( "111 \t" + getByPrefix( myMap, "111" ) );
        System.out.println( "123 \t" + getByPrefix( myMap, "123" ) );
        System.out.println( "123-- \t" + getByPrefix( myMap, "123--" ) );
        System.out.println( "12 \t" + getByPrefix( myMap, "12" ) );
    }

    private static SortedMap<String, Object> getByPrefix(
            NavigableMap<String, Object> myMap,
            String prefix ) {
        return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
    }
}

The output is:

111     {111-hello=null, 111-java=null, 111-test=null, 111-world=null}
123     {123--four=null, 123--three=null, 123-one=null, 123-two=null}
123--   {123--four=null, 123--three=null}
12      {123--four=null, 123--three=null, 123-one=null, 123-two=null, 125--world=null, 125-hello=null}
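
Since the question asks for the values, and for several expressions at once, the same range lookup can be repeated per prefix. The following is only a minimal sketch: the names PrefixValues and valuesByPrefixes are made up for illustration, and it assumes that no key contains Character.MAX_VALUE directly after a prefix (the same limitation as getByPrefix above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class PrefixValues {

    // Hypothetical helper: for each prefix, collect the values whose keys start with it.
    // Uses the same subMap range as getByPrefix, so it shares that method's assumption
    // that no key has Character.MAX_VALUE directly after the prefix.
    public static <V> Map<String, List<V>> valuesByPrefixes(
            NavigableMap<String, V> map, Collection<String> prefixes) {
        Map<String, List<V>> result = new LinkedHashMap<>();
        for (String prefix : prefixes) {
            // subMap returns a live view; copy its values so the result is independent.
            result.put(prefix, new ArrayList<>(
                    map.subMap(prefix, prefix + Character.MAX_VALUE).values()));
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> myMap = new TreeMap<>();
        myMap.put("some.string.a", 1);
        myMap.put("some.string.b", 2);
        myMap.put("some.byte.a", 3);
        myMap.put("other.key", 4);

        System.out.println(
                valuesByPrefixes(myMap, Arrays.asList("some.string.", "some.")));
    }
}
```

Each prefix costs one tree lookup plus the number of matching entries, rather than a pass over the whole map.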

score 5 · Accepted Answer

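I recently wrote a MapFilter for exactly this need. You can also filter a filtered map, which is very useful.

If your expressions share common roots, such as "some.byte" and "some.string", then filtering by the common root first ("some." in this case) will save you a lot of time. See main for some trivial examples.

Note that changes made to the filtered map change the underlying map.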

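import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class MapFilter<T> implements Map<String, T> {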

    // The enclosed map -- could also be a MapFilter.
    final private Map<String, T> map;

    // Use a TreeMap for predictable iteration order.
    // Store Map.Entry to reflect changes down into the underlying map.
    // The Key is the shortened string. The entry.key is the full string.
    final private Map<String, Map.Entry<String, T>> entries = new TreeMap<>();
    // The prefix they are looking for in this map.
    final private String prefix;

    public MapFilter(Map<String, T> map, String prefix) {
        // Store my backing map.
        this.map = map;
        // Record my prefix.
        this.prefix = prefix;
        // Build my entries.
        rebuildEntries();
    }

    public MapFilter(Map<String, T> map) {
        this(map, "");
    }

    private synchronized void rebuildEntries() {
        // Start empty.
        entries.clear();
        // Build my entry set.
        for (Map.Entry<String, T> e : map.entrySet()) {
            String key = e.getKey();
            // Retain each one that starts with the specified prefix.
            if (key.startsWith(prefix)) {
                // Key it on the remainder.
                String k = key.substring(prefix.length());
                // Entries k always contains the LAST occurrence if there are multiples.
                entries.put(k, e);
            }
        }

    }

    @Override
    public String toString() {
        return "MapFilter(" + prefix + ") of " + map + " containing " + entrySet();
    }

    // Constructor from a properties file.
    public MapFilter(Properties p, String prefix) {
        // Properties extends HashTable<Object,Object> so it implements Map.
        // I need Map<String,T> so I wrap it in a HashMap for simplicity.
        // Java-8 breaks if we use diamond inference.
        this(new HashMap<>((Map) p), prefix);
    }

    // Helper to fast filter the map.
    public MapFilter<T> filter(String prefix) {
        // Wrap me in a new filter.
        return new MapFilter<>(this, prefix);
    }

    // Count my entries.
    @Override
    public int size() {
        return entries.size();
    }

    // Are we empty.
    @Override
    public boolean isEmpty() {
        return entries.isEmpty();
    }

    // Is this key in me?
    @Override
    public boolean containsKey(Object key) {
        return entries.containsKey(key);
    }

    // Is this value in me.
    @Override
    public boolean containsValue(Object value) {
        // Walk the values.
        for (Map.Entry<String, T> e : entries.values()) {
            if (value.equals(e.getValue())) {
                // Its there!
                return true;
            }
        }
        return false;
    }

    // Get the referenced value - if present.
    @Override
    public T get(Object key) {
        return get(key, null);
    }

    // Get the referenced value - if present.
    public T get(Object key, T dflt) {
        Map.Entry<String, T> e = entries.get((String) key);
        return e != null ? e.getValue() : dflt;
    }

    // Add to the underlying map.
    @Override
    public T put(String key, T value) {
        T old = null;
        // Do I have an entry for it already?
        Map.Entry<String, T> entry = entries.get(key);
        // Was it already there?
        if (entry != null) {
            // Yes. Just update it.
            old = entry.setValue(value);
        } else {
            // Add it to the map.
            map.put(prefix + key, value);
            // Rebuild.
            rebuildEntries();
        }
        return old;
    }

    // Get rid of that one.
    @Override
    public T remove(Object key) {
        // Do I have an entry for it?
        Map.Entry<String, T> entry = entries.get((String) key);
        if (entry != null) {
            entries.remove(key);
            // Change the underlying map.
            return map.remove(prefix + key);
        }
        return null;
    }

    // Add all of them.
    @Override
    public void putAll(Map<? extends String, ? extends T> m) {
        for (Map.Entry<? extends String, ? extends T> e : m.entrySet()) {
            put(e.getKey(), e.getValue());
        }
    }

    // Clear everything out.
    @Override
    public void clear() {
        // Just remove mine.
        // This removes only the filtered entries from the underlying map; its other keys stay.
        for (String key : entries.keySet()) {
            map.remove(prefix + key);
        }
        entries.clear();
    }

    @Override
    public Set<String> keySet() {
        return entries.keySet();
    }

    @Override
    public Collection<T> values() {
        // Roll them all out into a new ArrayList.
        List<T> values = new ArrayList<>();
        for (Map.Entry<String, T> v : entries.values()) {
            values.add(v.getValue());
        }
        return values;
    }

    @Override
    public Set<Map.Entry<String, T>> entrySet() {
        // Roll them all out into a new TreeSet.
        Set<Map.Entry<String, T>> entrySet = new TreeSet<>();
        for (Map.Entry<String, Map.Entry<String, T>> v : entries.entrySet()) {
            entrySet.add(new Entry<>(v));
        }
        return entrySet;
    }

    /**
     * An entry.
     *
     * @param <T> The type of the value.
     */
    private static class Entry<T> implements Map.Entry<String, T>, Comparable<Entry<T>> {

        // Note that entry in the entry is an entry in the underlying map.

        private final Map.Entry<String, Map.Entry<String, T>> entry;

        Entry(Map.Entry<String, Map.Entry<String, T>> entry) {
            this.entry = entry;
        }

        @Override
        public String getKey() {
            return entry.getKey();
        }

        @Override
        public T getValue() {
            // Remember that the value is the entry in the underlying map.
            return entry.getValue().getValue();
        }

        @Override
        public T setValue(T newValue) {
            // Remember that the value is the entry in the underlying map.
            return entry.getValue().setValue(newValue);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Entry)) {
                return false;
            }
            Entry e = (Entry) o;
            return getKey().equals(e.getKey()) && getValue().equals(e.getValue());
        }

        @Override
        public int hashCode() {
            return getKey().hashCode() ^ getValue().hashCode();
        }

        @Override
        public String toString() {
            return getKey() + "=" + getValue();
        }

        @Override
        public int compareTo(Entry<T> o) {
            return getKey().compareTo(o.getKey());
        }

    }

    // Simple tests.
    public static void main(String[] args) {
        String[] samples = {
                "Some.For.Me",
                "Some.For.You",
                "Some.More",
                "Yet.More"};
        Map map = new HashMap();
        for (String s : samples) {
            map.put(s, s);
        }
        Map all = new MapFilter(map);
        Map some = new MapFilter(map, "Some.");
        Map someFor = new MapFilter(some, "For.");
        System.out.println("All: " + all);
        System.out.println("Some: " + some);
        System.out.println("Some.For: " + someFor);

        Properties props = new Properties();
        props.setProperty("namespace.prop1", "value1");
        props.setProperty("namespace.prop2", "value2");
        props.setProperty("namespace.iDontKnowThisNameAtCompileTime", "anothervalue");
        props.setProperty("someStuff.morestuff", "stuff");
        Map<String, String> filtered = new MapFilter(props, "namespace.");
        System.out.println("namespace props " + filtered);
    }

}
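
For comparison, the JDK's own subMap views can be nested in a similar way and are also write-through. This is only a sketch, not a replacement for MapFilter: the names NestedPrefixViews and view are made up for illustration, keys are not shortened the way MapFilter shortens them, and it assumes that no key contains Character.MAX_VALUE.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

class NestedPrefixViews {

    // Hypothetical helper: a live view of the entries whose keys start with prefix.
    // Assumes no key contains Character.MAX_VALUE.
    public static <V> NavigableMap<String, V> view(NavigableMap<String, V> map, String prefix) {
        return map.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> base = new TreeMap<>();
        base.put("Some.For.Me", "a");
        base.put("Some.For.You", "b");
        base.put("Some.More", "c");
        base.put("Yet.More", "d");

        // Narrow by the common root first, then narrow the result further.
        NavigableMap<String, String> some = view(base, "Some.");
        NavigableMap<String, String> someFor = view(some, "Some.For.");

        // Writing through the narrowest view changes the base map as well.
        someFor.put("Some.For.Them", "e");
        System.out.println(base.containsKey("Some.For.Them")); // prints true
        System.out.println(some.size());                       // prints 4
    }
}
```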

score 2 · Accepted Answer

Remove all keys that do not start with the desired prefix (note that this modifies the map itself):

yourMap.keySet().removeIf(key -> !key.startsWith(keyPrefix));
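
If the original map must stay intact, the same startsWith test can be applied to a stream instead. A minimal sketch, where the names PrefixFilter and valuesWithPrefix are made up for illustration; it still visits every entry, so it is a single O(N) pass per prefix:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PrefixFilter {

    // Hypothetical helper: returns the values whose keys start with prefix,
    // leaving the source map unchanged.
    public static <V> List<V> valuesWithPrefix(Map<String, V> map, String prefix) {
        return map.entrySet().stream()
                .filter(e -> e.getKey().startsWith(prefix))
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Integer> yourMap = new HashMap<>();
        yourMap.put("some.string.a", 1);
        yourMap.put("some.string.b", 2);
        yourMap.put("other.key", 3);

        // The order of the values follows the map's iteration order,
        // which is unspecified for a HashMap.
        System.out.println(valuesWithPrefix(yourMap, "some.string."));
        System.out.println(yourMap.size()); // prints 3: nothing was removed
    }
}
```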

score 2 · Accepted Answer

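The accepted answer works in 99% of all cases, but the devil is in the details.

Specifically, the accepted answer does not work when the map has a key that starts with the prefix, followed by Character.MAX_VALUE, followed by anything else. The comments posted on the accepted answer yield small improvements, but still do not cover all cases.

The following solution also uses a NavigableMap to pick out the sub map for a given key prefix. The solution is the subMapFrom() method, and the trick is not to bump (increment) the last character of the prefix, but the last character that is not MAX_VALUE, while cutting off all trailing MAX_VALUEs. For example, if the prefix is "abc", we increment it to "abd". But if the prefix is "ab" + MAX_VALUE, we drop the last character and bump the preceding one instead, yielding "ac".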

import static java.lang.Character.MAX_VALUE;

import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Stream;

public class App
{
    public static void main(String[] args) {
        NavigableMap<String, String> map = new TreeMap<>();
        
        String[] keys = {
                "a",
                "b",
                "b" + MAX_VALUE,
                "b" + MAX_VALUE + "any",
                "c"
        };
        
        // Populate map
        Stream.of(keys).forEach(k -> map.put(k, ""));
        
        // For each key that starts with 'b', find the sub map
        Stream.of(keys).filter(s -> s.startsWith("b")).forEach(p -> {
            System.out.println("Looking for sub map using prefix \"" + p + "\".");
            
            // Always returns expected sub maps with no misses:
            // [b, b+MAX, b+MAX+any], [b+MAX, b+MAX+any] and [b+MAX+any]
            // (MAX stands for Character.MAX_VALUE)
            System.out.println("My solution: " +
                    subMapFrom(map, p).keySet());
            
            // WRONG! Prefix "b" misses b+MAX+any
            System.out.println("SO answer:   " +
                    map.subMap(p, true, p + MAX_VALUE, true).keySet());
            
            // WRONG! Prefix b+MAX misses b+MAX and b+MAX+any
            System.out.println("SO comment:  " +
                    map.subMap(p, true, tryIncrementLastChar(p), false).keySet());
            
            System.out.println();
        });
    }
    
    private static <V> NavigableMap<String, V> subMapFrom(
            NavigableMap<String, V> map, String keyPrefix)
    {
        final String fromKey = keyPrefix, toKey; // undefined
        
        // Alias
        String p = keyPrefix;
        
        if (p.isEmpty()) {
            // No need for a sub map
            return map;
        }
        
        // ("ab" + MAX_VALUE + MAX_VALUE + ...) returns index 1
        final int i = lastIndexOfNonMaxChar(p);
        
        if (i == -1) {
            // Prefix is all MAX_VALUE through and through, so grab rest of map
            return map.tailMap(p, true);
        }
        
        if (i < p.length() - 1) {
            // Target char for bumping is not last char; cut out the residue
            // ("ab" + MAX_VALUE + MAX_VALUE + ...) becomes "ab"
            p = p.substring(0, i + 1);
        }
        toKey = bumpChar(p, i);
        
        return map.subMap(fromKey, true, toKey, false);
    }
    
    private static int lastIndexOfNonMaxChar(String str) {
        int i = str.length();
        
        // Walk backwards, while we have a valid index
        while (--i >= 0) {
            if (str.charAt(i) < MAX_VALUE) {
                return i;
            }
        }
        
        return -1;
    }
    
    private static String bumpChar(String str, int pos) {
        assert !str.isEmpty();
        assert pos >= 0 && pos < str.length();
        
        final char c = str.charAt(pos);
        assert c < MAX_VALUE;
        
        StringBuilder b = new StringBuilder(str);
        b.setCharAt(pos, (char) (c + 1));
        return b.toString();
    }
    
    private static String tryIncrementLastChar(String p) {
        char l = p.charAt(p.length() - 1);
        return l == MAX_VALUE ?
                // Last character already max, do nothing
                p :
                // Bump last character
                p.substring(0, p.length() - 1) + ++l;
    }
}

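Output (Character.MAX_VALUE, U+FFFF, has no visible glyph, so in this listing the prefixes and keys containing it appear as "b" and "bany"):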

Looking for sub map using prefix "b".
My solution: [b, b, bany]
SO answer:   [b, b]
SO comment:  [b, b, bany]

Looking for sub map using prefix "b".
My solution: [b, bany]
SO answer:   [b, bany]
SO comment:  []

Looking for sub map using prefix "bany".
My solution: [bany]
SO answer:   [bany]
SO comment:  [bany]

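Perhaps I should add that I also tried various other approaches, including code I found elsewhere on the internet. All of them failed, either by producing incorrect results or by crashing outright with various exceptions.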
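
A different technique that avoids computing an upper bound at all is to start at the prefix with tailMap and stop at the first key that no longer matches. This is only a sketch, and it is not the method used above: the names PrefixScan and scanByPrefix are made up for illustration, it assumes the map uses the natural String ordering (no custom comparator), and it returns a copy rather than a live view.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class PrefixScan {

    // Hypothetical helper: copies all entries whose keys start with prefix.
    // Under natural String ordering, those keys form one contiguous run that
    // begins at the first key >= prefix, so the scan can stop at the first miss.
    public static <V> Map<String, V> scanByPrefix(NavigableMap<String, V> map, String prefix) {
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<String, V> e : map.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // past the end of the run
            }
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> map = new TreeMap<>();
        String max = String.valueOf(Character.MAX_VALUE);
        for (String k : new String[] {"a", "b", "b" + max, "b" + max + "any", "c"}) {
            map.put(k, "");
        }
        System.out.println(scanByPrefix(map, "b").size());       // prints 3
        System.out.println(scanByPrefix(map, "b" + max).size()); // prints 2
    }
}
```

The cost is one tree lookup plus one step per matching entry, and keys containing Character.MAX_VALUE need no special handling because the match is decided by startsWith.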

score 1 · Accepted Answer

The key set of a map has no special structure, so I think you have to check every key anyway. So you will not find anything faster than a single loop...

score 1 · Accepted Answer

I ran a speed trial with this code:

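import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;

public class KeyFinder {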

    private static Random random = new Random();

    private interface Receiver {
        void receive(String value);
    }

    public static void main(String[] args) {
        for (int trials = 0; trials < 10; trials++) {
            doTrial();
        }
    }

    private static void doTrial() {

        final Map<String, String> map = new HashMap<String, String>();
        giveRandomElements(new Receiver() {
            public void receive(String value) {
                map.put(value, null);
            }
        }, 10000);

        final Set<String> expressions = new HashSet<String>();
        giveRandomElements(new Receiver() {
            public void receive(String value) {
                expressions.add(value);
            }
        }, 1000);

        int hits = 0;
        long start = System.currentTimeMillis();
        for (String expression : expressions) {
            for (String key : map.keySet()) {
                if (key.startsWith(expression)) {
                    hits++;
                }
            }
        }
        long stop = System.currentTimeMillis();
        System.out.printf("Found %s hits in %s ms\n", hits, stop - start);
    }

    private static void giveRandomElements(Receiver receiver, int count) {
        for (int i = 0; i < count; i++) {
            String value = String.valueOf(random.nextLong());
            receiver.receive(value);
        }

    }
}

The output was:

Found 0 hits in 1649 ms
Found 0 hits in 1626 ms
Found 0 hits in 1389 ms
Found 0 hits in 1396 ms
Found 0 hits in 1417 ms
Found 0 hits in 1388 ms
Found 0 hits in 1377 ms
Found 0 hits in 1395 ms
Found 0 hits in 1399 ms
Found 0 hits in 1357 ms

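This counts how many of the 10000 random keys start with any one of the 1000 random String values (10M checks).

So it takes about 1.4 seconds on a simple dual-core laptop; is that too slow for you?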
