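I'm writing a random indexing implementation in Java. It has to process a large corpus and somehow store the context and index vectors for individual tokens. A HashMap seemed like the natural choice (String -> Token object), but when I run with -Xprof, a disproportionately large share of the processing time appears to go to adding tokens to the HashMap.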
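For concreteness, here is a minimal sketch of the kind of String -> Token lookup described above. It is not the actual code: `Token`, `TokenStore`, and the `occurrences` field are hypothetical stand-ins, and the real Token also carries the context and index vectors. The method name `getToken` and the call to `toLowerCase` only mirror names that show up in the profile.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the real token class, which would also
// hold the context vector and the index vector.
class Token {
    final String text;
    int occurrences;

    Token(String text) {
        this.text = text;
    }
}

// Hypothetical wrapper around the String -> Token map.
class TokenStore {
    private final Map<String, Token> tokens = new HashMap<>();

    // Returns the Token for a word, creating and storing it on first sight.
    Token getToken(String raw) {
        String key = raw.toLowerCase(Locale.ROOT);
        Token token = tokens.get(key);
        if (token == null) {
            token = new Token(key);
            tokens.put(key, token); // one put per distinct word
        }
        token.occurrences++;
        return token;
    }

    // Number of distinct (lower-cased) words seen so far.
    int size() {
        return tokens.size();
    }
}
```

In this sketch `put` is called only once per distinct word, so the number of `put` calls grows with the vocabulary rather than with the corpus length.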
Am I reading the output correctly? Why would this be the case, and is there anything I can do to speed it up?
Flat profile of 16.18 secs (606 total ticks): main
Interpreted + native Method
6.9% 0 + 42 java.io.FileInputStream.readBytes
5.0% 0 + 30 java.lang.Object.getClass
1.8% 11 + 0 java.lang.String.toLowerCase
1.5% 9 + 0 java.util.HashMap.resize
1.3% 8 + 0 opennlp.tools.tokenize.AbstractTokenizer.tokenize
1.3% 0 + 8 java.util.zip.ZipFile.read
1.2% 0 + 7 java.util.zip.ZipFile.open
0.8% 5 + 0 java.util.Arrays.copyOfRange
0.5% 0 + 3 java.io.FileInputStream.available
0.3% 2 + 0 java.util.HashMap.put
0.3% 0 + 2 sun.misc.Unsafe.compareAndSwapLong
0.3% 2 + 0 java.lang.CharacterDataLatin1.toLowerCase
0.3% 2 + 0 java.util.ArrayList.grow
0.3% 2 + 0 semanticspace.SparseVector.get
0.3% 2 + 0 java.lang.CharacterData.of
0.2% 1 + 0 java.util.HashMap.createEntry
0.2% 1 + 0 java.util.Arrays.copyOf
0.2% 1 + 0 java.lang.Integer.valueOf
0.2% 1 + 0 java.lang.Integer.toString
0.2% 1 + 0 sun.misc.JarIndex.addToList
0.2% 1 + 0 java.util.ArrayList.toArray
0.2% 1 + 0 java.net.URL.toString
0.2% 1 + 0 semanticspace.SparseVector.add
0.2% 1 + 0 sun.reflect.NativeMethodAccessorImpl.invoke0
0.2% 1 + 0 java.io.BufferedInputStream.read1
26.2% 65 + 94 Total interpreted (including elided)
Compiled + native Method
36.5% 217 + 4 java.util.HashMap.put
24.3% 133 + 14 semanticspace.SparseVector.add
2.6% 15 + 1 semanticspace.RandomIndexing.getToken
1.3% 8 + 0 java.lang.String.toLowerCase
1.3% 8 + 0 semanticspace.RandomIndexing.read
0.5% 0 + 3 java.util.HashMap.newKeyIterator
0.2% 0 + 1 semanticspace.SparseVector.get
0.2% 1 + 0 java.util.HashMap.containsKey
66.8% 382 + 23 Total compiled
Stub + native Method
6.9% 0 + 42 java.lang.System.arraycopy
6.9% 0 + 42 Total stub
Flat profile of 0.00 secs (1 total ticks): DestroyJavaVM
Thread-local ticks:
100.0% 1 Blocked (of total)
Flat profile of 16.17 secs (608 total ticks): Monitor Ctrl-Break
Interpreted + native Method
98.2% 0 + 597 java.net.PlainSocketImpl.socketAccept
1.0% 0 + 6 java.net.PlainSocketImpl.initProto
0.7% 0 + 4 java.net.NetworkInterface.getAll
0.2% 0 + 1 java.lang.ClassLoader$NativeLibrary.load
100.0% 0 + 608 Total interpreted
Global summary of 16.33 seconds:
100.0% 1326 Received ticks
53.2% 706 Received GC ticks
6.8% 90 Compilation
0.1% 1 Other VM operations