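Consider that you have a Map<String, Object> myMap.
Given the expression "some.string.*", I have to retrieve all the values from myMap whose keys start with this expression.
I am trying to avoid for loops because myMap will be given a set of expressions, not just one, and running a for loop for each expression becomes cumbersome performance-wise.
What is the fastest way to do this?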
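If you use a NavigableMap (e.g. TreeMap), you can take advantage of the underlying tree data structure and do something like this (the range boundaries are located in O(log N); iterating over the matches then costs time proportional to their number):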
public SortedMap<String, Object> getByPrefix(
        NavigableMap<String, Object> myMap,
        String prefix ) {
    return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
}
A more extended example:
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

public class Test {
    public static void main( String[] args ) {
        TreeMap<String, Object> myMap = new TreeMap<String, Object>();
        myMap.put( "111-hello", null );
        myMap.put( "111-world", null );
        myMap.put( "111-test", null );
        myMap.put( "111-java", null );
        myMap.put( "123-one", null );
        myMap.put( "123-two", null );
        myMap.put( "123--three", null );
        myMap.put( "123--four", null );
        myMap.put( "125-hello", null );
        myMap.put( "125--world", null );
        System.out.println( "111 \t" + getByPrefix( myMap, "111" ) );
        System.out.println( "123 \t" + getByPrefix( myMap, "123" ) );
        System.out.println( "123-- \t" + getByPrefix( myMap, "123--" ) );
        System.out.println( "12 \t" + getByPrefix( myMap, "12" ) );
    }

    private static SortedMap<String, Object> getByPrefix(
            NavigableMap<String, Object> myMap,
            String prefix ) {
        return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
    }
}
The output is:
111 {111-hello=null, 111-java=null, 111-test=null, 111-world=null}
123 {123--four=null, 123--three=null, 123-one=null, 123-two=null}
123-- {123--four=null, 123--three=null}
12 {123--four=null, 123--three=null, 123-one=null, 123-two=null, 125--world=null, 125-hello=null}
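Since the question mentions a whole set of expressions rather than a single one, here is a minimal sketch of running several prefix lookups against the same sorted map. The class name PrefixLookup, its method names, and the rule of stripping a trailing "*" from each expression are assumptions made for illustration. Instead of the subMap call it walks tailMap(prefix) and stops at the first key that no longer starts with the prefix, so no upper-bound key has to be constructed. It assumes a map with natural String ordering (no custom comparator).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class PrefixLookup {

    // Hypothetical helper: turns "some.string.*" into the prefix "some.string.".
    static String toPrefix(String expression) {
        return expression.endsWith("*")
                ? expression.substring(0, expression.length() - 1)
                : expression;
    }

    // Collects every entry whose key starts with the prefix.
    // With natural ordering, all such keys sit next to each other,
    // starting at the first key that is >= prefix.
    static <V> Map<String, V> byPrefix(NavigableMap<String, V> map, String prefix) {
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<String, V> e : map.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // left the block of matching keys
            }
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    // Runs one lookup per expression and keeps the results keyed by expression.
    static <V> Map<String, Map<String, V>> byExpressions(
            NavigableMap<String, V> map, List<String> expressions) {
        Map<String, Map<String, V>> results = new LinkedHashMap<>();
        for (String expression : expressions) {
            results.put(expression, byPrefix(map, toPrefix(expression)));
        }
        return results;
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> myMap = new TreeMap<>();
        myMap.put("some.string.a", 1);
        myMap.put("some.string.b", 2);
        myMap.put("some.byte.x", 3);
        myMap.put("other.key", 4);
        System.out.println(
                byExpressions(myMap, Arrays.asList("some.string.*", "some.byte.*")));
    }
}
```

Each lookup costs O(log N) to find the starting key plus the number of matches, so the total work grows with the number of expressions and hits rather than with the size of the map times the number of expressions.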
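I recently wrote a MapFilter for just such a need. You can also filter filtered maps, which can be really useful.
If your expressions have common roots, like "some.byte" and "some.string", then filtering by the common root first ("some." in this case) will save you a great deal of time. See main for some simple examples.
Note that making changes to the filtered map changes the underlying map.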
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class MapFilter<T> implements Map<String, T> {
    // The enclosed map -- could also be a MapFilter.
    final private Map<String, T> map;
    // Use a TreeMap for predictable iteration order.
    // Store Map.Entry to reflect changes down into the underlying map.
    // The Key is the shortened string. The entry.key is the full string.
    final private Map<String, Map.Entry<String, T>> entries = new TreeMap<>();
    // The prefix they are looking for in this map.
    final private String prefix;

    public MapFilter(Map<String, T> map, String prefix) {
        // Store my backing map.
        this.map = map;
        // Record my prefix.
        this.prefix = prefix;
        // Build my entries.
        rebuildEntries();
    }

    public MapFilter(Map<String, T> map) {
        this(map, "");
    }

    private synchronized void rebuildEntries() {
        // Start empty.
        entries.clear();
        // Build my entry set.
        for (Map.Entry<String, T> e : map.entrySet()) {
            String key = e.getKey();
            // Retain each one that starts with the specified prefix.
            if (key.startsWith(prefix)) {
                // Key it on the remainder.
                String k = key.substring(prefix.length());
                // Entries k always contains the LAST occurrence if there are multiples.
                entries.put(k, e);
            }
        }
    }
    @Override
    public String toString() {
        return "MapFilter(" + prefix + ") of " + map + " containing " + entrySet();
    }

    // Constructor from a properties file.
    @SuppressWarnings({"unchecked", "rawtypes"})
    public MapFilter(Properties p, String prefix) {
        // Properties extends Hashtable<Object,Object> so it implements Map.
        // I need Map<String,T> so I wrap it in a HashMap for simplicity.
        // Java-8 breaks if we use diamond inference, so the type arguments are spelled out.
        this(new HashMap<String, T>((Map) p), prefix);
    }

    // Helper to fast filter the map.
    public MapFilter<T> filter(String prefix) {
        // Wrap me in a new filter.
        return new MapFilter<>(this, prefix);
    }

    // Count my entries.
    @Override
    public int size() {
        return entries.size();
    }

    // Are we empty.
    @Override
    public boolean isEmpty() {
        return entries.isEmpty();
    }

    // Is this key in me?
    @Override
    public boolean containsKey(Object key) {
        return entries.containsKey(key);
    }

    // Is this value in me.
    @Override
    public boolean containsValue(Object value) {
        // Walk the values; the comparison is null-safe on both sides.
        for (Map.Entry<String, T> e : entries.values()) {
            T v = e.getValue();
            if (value == null ? v == null : value.equals(v)) {
                // It's there!
                return true;
            }
        }
        return false;
    }

    // Get the referenced value - if present.
    @Override
    public T get(Object key) {
        return get(key, null);
    }

    // Get the referenced value - if present, otherwise the given default.
    public T get(Object key, T dflt) {
        Map.Entry<String, T> e = entries.get((String) key);
        return e != null ? e.getValue() : dflt;
    }
    // Add to the underlying map.
    @Override
    public T put(String key, T value) {
        T old = null;
        // Do I have an entry for it already?
        Map.Entry<String, T> entry = entries.get(key);
        // Was it already there?
        if (entry != null) {
            // Yes. Just update it.
            old = entry.setValue(value);
        } else {
            // Add it to the map.
            map.put(prefix + key, value);
            // Rebuild.
            rebuildEntries();
        }
        return old;
    }

    // Get rid of that one.
    @Override
    public T remove(Object key) {
        // Do I have an entry for it?
        Map.Entry<String, T> entry = entries.get((String) key);
        if (entry != null) {
            entries.remove(key);
            // Change the underlying map.
            return map.remove(prefix + key);
        }
        return null;
    }

    // Add all of them.
    @Override
    public void putAll(Map<? extends String, ? extends T> m) {
        for (Map.Entry<? extends String, ? extends T> e : m.entrySet()) {
            put(e.getKey(), e.getValue());
        }
    }

    // Clear everything out.
    @Override
    public void clear() {
        // Just remove mine.
        // Only the filtered entries are removed from the underlying map;
        // entries outside this filter are left alone.
        for (String key : entries.keySet()) {
            map.remove(prefix + key);
        }
        entries.clear();
    }

    @Override
    public Set<String> keySet() {
        return entries.keySet();
    }

    @Override
    public Collection<T> values() {
        // Roll them all out into a new ArrayList.
        List<T> values = new ArrayList<>();
        for (Map.Entry<String, T> v : entries.values()) {
            values.add(v.getValue());
        }
        return values;
    }

    @Override
    public Set<Map.Entry<String, T>> entrySet() {
        // Roll them all out into a new TreeSet.
        Set<Map.Entry<String, T>> entrySet = new TreeSet<>();
        for (Map.Entry<String, Map.Entry<String, T>> v : entries.entrySet()) {
            entrySet.add(new Entry<>(v));
        }
        return entrySet;
    }
    /**
     * An entry.
     *
     * @param <T> The type of the value.
     */
    private static class Entry<T> implements Map.Entry<String, T>, Comparable<Entry<T>> {
        // Note that entry in the entry is an entry in the underlying map.
        private final Map.Entry<String, Map.Entry<String, T>> entry;

        Entry(Map.Entry<String, Map.Entry<String, T>> entry) {
            this.entry = entry;
        }

        @Override
        public String getKey() {
            return entry.getKey();
        }

        @Override
        public T getValue() {
            // Remember that the value is the entry in the underlying map.
            return entry.getValue().getValue();
        }

        @Override
        public T setValue(T newValue) {
            // Remember that the value is the entry in the underlying map.
            return entry.getValue().setValue(newValue);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Entry)) {
                return false;
            }
            Entry<?> e = (Entry<?>) o;
            // Values may be null, so compare them null-safely.
            Object mine = getValue();
            Object theirs = e.getValue();
            return getKey().equals(e.getKey())
                    && (mine == null ? theirs == null : mine.equals(theirs));
        }

        @Override
        public int hashCode() {
            // A null value contributes 0, as in the Map.Entry contract.
            T value = getValue();
            return getKey().hashCode() ^ (value == null ? 0 : value.hashCode());
        }

        @Override
        public String toString() {
            return getKey() + "=" + getValue();
        }

        @Override
        public int compareTo(Entry<T> o) {
            return getKey().compareTo(o.getKey());
        }
    }
    // Simple tests.
    public static void main(String[] args) {
        String[] samples = {
            "Some.For.Me",
            "Some.For.You",
            "Some.More",
            "Yet.More"};
        Map<String, String> map = new HashMap<String, String>();
        for (String s : samples) {
            map.put(s, s);
        }
        Map<String, String> all = new MapFilter<String>(map);
        Map<String, String> some = new MapFilter<String>(map, "Some.");
        // Filtering an already filtered map.
        Map<String, String> someFor = new MapFilter<String>(some, "For.");
        System.out.println("All: " + all);
        System.out.println("Some: " + some);
        System.out.println("Some.For: " + someFor);
        Properties props = new Properties();
        props.setProperty("namespace.prop1", "value1");
        props.setProperty("namespace.prop2", "value2");
        props.setProperty("namespace.iDontKnowThisNameAtCompileTime", "anothervalue");
        props.setProperty("someStuff.morestuff", "stuff");
        Map<String, String> filtered = new MapFilter<String>(props, "namespace.");
        System.out.println("namespace props " + filtered);
    }
}
Remove all keys that do not start with the desired prefix:
yourMap.keySet().removeIf(key -> !key.startsWith(keyPrefix));
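Note that keySet() is a view of the map, so the removeIf call above deletes the non-matching entries from yourMap itself. If the original map has to stay intact, one option is to copy the matching entries instead. The following is a minimal sketch; the class name PrefixCopy and the method name copyByPrefix are hypothetical. It still checks every key, so it costs O(n) per prefix.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixCopy {

    // Returns a new map holding only the entries whose key starts with keyPrefix.
    // The source map is left untouched. A plain loop is used rather than
    // Collectors.toMap so that entries with null values are copied as well.
    static <V> Map<String, V> copyByPrefix(Map<String, V> source, String keyPrefix) {
        Map<String, V> result = new HashMap<>();
        for (Map.Entry<String, V> e : source.entrySet()) {
            if (e.getKey().startsWith(keyPrefix)) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> yourMap = new HashMap<>();
        yourMap.put("some.string.a", "A");
        yourMap.put("some.byte.b", "B");
        Map<String, Object> matches = copyByPrefix(yourMap, "some.string.");
        System.out.println("matches: " + matches);
        System.out.println("original size: " + yourMap.size());
    }
}
```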
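The accepted answer works in 99% of all cases, but the devil is in the details.
Specifically, the accepted answer does not work when the map has a key that starts with the prefix, followed by Character.MAX_VALUE, followed by anything else. The comments posted on the accepted answer yield small improvements, but still do not cover all cases.
The following solution also uses a NavigableMap to pick out the sub map for a given key prefix. The solution is the subMapFrom() method, and the trick is not to bump the last character of the prefix as such, but to bump the last character that is not MAX_VALUE while cutting off all trailing MAX_VALUE characters. For example, if the prefix is "abc", we increment it to "abd". But if the prefix is "ab" + MAX_VALUE, we drop the last character and bump the preceding one instead, yielding "ac".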
import static java.lang.Character.MAX_VALUE;

import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Stream;

public class App
{
    public static void main(String[] args) {
        NavigableMap<String, String> map = new TreeMap<>();
        String[] keys = {
            "a",
            "b",
            "b" + MAX_VALUE,
            "b" + MAX_VALUE + "any",
            "c"
        };
        // Populate map
        Stream.of(keys).forEach(k -> map.put(k, ""));
        // For each key that starts with 'b', find the sub map
        Stream.of(keys).filter(s -> s.startsWith("b")).forEach(p -> {
            System.out.println("Looking for sub map using prefix \"" + p + "\".");
            // Always returns expected sub maps with no misses:
            // all three "b" keys, then the last two, then only the last one
            System.out.println("My solution: " +
                    subMapFrom(map, p).keySet());
            // WRONG! Prefix "b" misses "b" + MAX_VALUE + "any"
            System.out.println("SO answer: " +
                    map.subMap(p, true, p + MAX_VALUE, true).keySet());
            // WRONG! Prefix "b" + MAX_VALUE misses itself and "b" + MAX_VALUE + "any"
            System.out.println("SO comment: " +
                    map.subMap(p, true, tryIncrementLastChar(p), false).keySet());
            System.out.println();
        });
    }

    private static <V> NavigableMap<String, V> subMapFrom(
            NavigableMap<String, V> map, String keyPrefix)
    {
        final String fromKey = keyPrefix, toKey; // toKey is assigned below
        // Alias
        String p = keyPrefix;
        if (p.isEmpty()) {
            // No need for a sub map
            return map;
        }
        // ("ab" + MAX_VALUE + MAX_VALUE + ...) returns index 1
        final int i = lastIndexOfNonMaxChar(p);
        if (i == -1) {
            // Prefix is all MAX_VALUE through and through, so grab rest of map
            return map.tailMap(p, true);
        }
        if (i < p.length() - 1) {
            // Target char for bumping is not last char; cut out the residue
            // ("ab" + MAX_VALUE + MAX_VALUE + ...) becomes "ab"
            p = p.substring(0, i + 1);
        }
        toKey = bumpChar(p, i);
        return map.subMap(fromKey, true, toKey, false);
    }

    private static int lastIndexOfNonMaxChar(String str) {
        int i = str.length();
        // Walk backwards, while we have a valid index
        while (--i >= 0) {
            if (str.charAt(i) < MAX_VALUE) {
                return i;
            }
        }
        return -1;
    }

    private static String bumpChar(String str, int pos) {
        assert !str.isEmpty();
        assert pos >= 0 && pos < str.length();
        final char c = str.charAt(pos);
        assert c < MAX_VALUE;
        StringBuilder b = new StringBuilder(str);
        b.setCharAt(pos, (char) (c + 1));
        return b.toString();
    }

    private static String tryIncrementLastChar(String p) {
        char l = p.charAt(p.length() - 1);
        return l == MAX_VALUE ?
                // Last character already max, do nothing
                p :
                // Bump last character
                p.substring(0, p.length() - 1) + ++l;
    }
}
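Output (the MAX_VALUE character, U+FFFF, does not render here, so the key "b" + MAX_VALUE shows up as "b" and "b" + MAX_VALUE + "any" as "bany"):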
Looking for sub map using prefix "b".
My solution: [b, b, bany]
SO answer: [b, b]
SO comment: [b, b, bany]
Looking for sub map using prefix "b".
My solution: [b, bany]
SO answer: [b, bany]
SO comment: []
Looking for sub map using prefix "bany".
My solution: [bany]
SO answer: [bany]
SO comment: [bany]
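Perhaps I should add that I also tried various other approaches, including code I found elsewhere on the internet. All of them failed, either by producing incorrect results or by crashing outright with various exceptions.
The key set of a map has no special structure, so I think you have to check each of the keys anyway. So you will not find a way that is faster than a single loop...
I ran a speed trial with this code: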
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;

public class KeyFinder {
    private static Random random = new Random();

    private interface Receiver {
        void receive(String value);
    }

    public static void main(String[] args) {
        for (int trials = 0; trials < 10; trials++) {
            doTrial();
        }
    }

    private static void doTrial() {
        // 10000 random keys.
        final Map<String, String> map = new HashMap<String, String>();
        giveRandomElements(new Receiver() {
            public void receive(String value) {
                map.put(value, null);
            }
        }, 10000);
        // 1000 random expressions (prefixes) to look for.
        final Set<String> expressions = new HashSet<String>();
        giveRandomElements(new Receiver() {
            public void receive(String value) {
                expressions.add(value);
            }
        }, 1000);
        int hits = 0;
        long start = System.currentTimeMillis();
        // Brute force: test every key against every expression.
        for (String expression : expressions) {
            for (String key : map.keySet()) {
                if (key.startsWith(expression)) {
                    hits++;
                }
            }
        }
        long stop = System.currentTimeMillis();
        System.out.printf("Found %s hits in %s ms\n", hits, stop - start);
    }

    private static void giveRandomElements(Receiver receiver, int count) {
        for (int i = 0; i < count; i++) {
            String value = String.valueOf(random.nextLong());
            receiver.receive(value);
        }
    }
}
The output is:
Found 0 hits in 1649 ms
Found 0 hits in 1626 ms
Found 0 hits in 1389 ms
Found 0 hits in 1396 ms
Found 0 hits in 1417 ms
Found 0 hits in 1388 ms
Found 0 hits in 1377 ms
Found 0 hits in 1395 ms
Found 0 hits in 1399 ms
Found 0 hits in 1357 ms
This counts how many of 10,000 random keys start with any of 1,000 random String values (10M checks).
So that is about 1.4 seconds on a simple dual-core laptop; is that too slow for you?
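For comparison, the same hit count can be computed against a sorted key set, which only visits the keys that share the prefix instead of every key for every expression. The sketch below is an illustration under stated assumptions: the class name SortedKeyFinder, the fixed seed, and the tailSet walk are not part of the trial above, and shorter random numbers are used so that some prefixes actually match. It makes no claim about timings on any particular machine; it only shows that both approaches count the same hits.

```java
import java.util.HashSet;
import java.util.NavigableSet;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

class SortedKeyFinder {

    // Brute force: test every key against every expression.
    static int countBruteForce(Set<String> keys, Set<String> expressions) {
        int hits = 0;
        for (String expression : expressions) {
            for (String key : keys) {
                if (key.startsWith(expression)) {
                    hits++;
                }
            }
        }
        return hits;
    }

    // Sorted lookup: jump to the first key >= expression, then walk forward
    // only while the keys still start with the expression.
    static int countSorted(NavigableSet<String> keys, Set<String> expressions) {
        int hits = 0;
        for (String expression : expressions) {
            for (String key : keys.tailSet(expression, true)) {
                if (!key.startsWith(expression)) {
                    break;
                }
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed: an assumption, for repeatability
        NavigableSet<String> keys = new TreeSet<>();
        Set<String> expressions = new HashSet<>();
        for (int i = 0; i < 10000; i++) {
            keys.add(String.valueOf(random.nextInt(1000000)));
        }
        for (int i = 0; i < 1000; i++) {
            expressions.add(String.valueOf(random.nextInt(1000)));
        }
        long t0 = System.nanoTime();
        int brute = countBruteForce(keys, expressions);
        long t1 = System.nanoTime();
        int sorted = countSorted(keys, expressions);
        long t2 = System.nanoTime();
        System.out.printf("brute force: %d hits in %d ms%n", brute, (t1 - t0) / 1000000);
        System.out.printf("sorted set:  %d hits in %d ms%n", sorted, (t2 - t1) / 1000000);
    }
}
```

A single run like this is not a proper benchmark (no warm-up, one measurement), so the printed times should be read as a rough indication only.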