
I have Neo4j 1.9.4 installed on a 24-core, 24 GB RAM (CentOS) machine, and for most queries CPU usage spikes to 200% with only a few concurrent requests.

Domain:

Some sort of social application: a few types of nodes (profiles) with 3-30 text/array properties, and 36 relationship types with at least 3 properties each. Most nodes currently have ~300-500 relationships.

Current data set footprint (from console):

LogicalLogSize=4294907 (32MB)
ArrayStoreSize=1675520 (12MB)
NodeStoreSize=1342170 (10MB)
PropertyStoreSize=1739548 (13MB)
RelationshipStoreSize=6395202 (48MB)
StringStoreSize=1478400 (11MB)

which is IMHO really small. Most queries look like this one (with more or fewer WITH .. MATCH .. statements; a few queries use variable-length relationships, but those are usually fast):

START
    targetUser=node({id}),
    currentUser=node({current})
MATCH
    targetUser-[contact:InContactsRelation]->n,
    n-[:InLocationRelation]->l,
    n-[:InCategoryRelation]->c
WITH
    currentUser, targetUser, n, l, c, contact.fav is not null as inFavorites
MATCH
    n<-[followers?:InContactsRelation]-()
WITH
    currentUser, targetUser, n, l, c, inFavorites, COUNT(followers) as numFollowers
RETURN
    id(n) as id,
    n.name? as name,
    n.title? as title,
    n._class as _class,
    n.avatar? as avatar,
    n.avatar_type? as avatar_type,
    l.name as location__name,
    c.name as category__name,
    true as isInContacts,
    inFavorites as isInFavorites,
    numFollowers

It runs in ~1-3 s on the first run and ~70 ms-1 s on consecutive runs (depending on the query), and about 5-10 such queries run for each impression. Another interesting behavior: when I run a query from the Neo4j console on my local machine many times in a row (just holding Ctrl+Enter for a few seconds), the execution time stays almost constant, but when I do the same on the server it gets exponentially slower, and I guess this is somehow related to my problem.

Problem:

So my problem is that Neo4j is very CPU-greedy (on a 24-core machine that may not be an issue, but it is obviously overkill for a small project). At first I used an AWS EC2 m1.large instance, but overall performance was bad: during testing, CPU was always over 100%.

Some relevant parts of the configuration:

neostore.nodestore.db.mapped_memory=1280M
wrapper.java.maxmemory=8192

Note: I already tried a configuration where all memory-related parameters were set high, and it didn't help (no change at all).
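
For reference, the other store-specific mapped_memory knobs in neo4j.properties I have been experimenting with look like this (values are illustrative only, roughly sized against the store files listed above):

# illustrative values, roughly matched to the store sizes above
neostore.relationshipstore.db.mapped_memory=100M
neostore.propertystore.db.mapped_memory=50M
neostore.propertystore.db.strings.mapped_memory=50M
neostore.propertystore.db.arrays.mapped_memory=25M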

Question:

Where should I dig? Configuration? Schema? Queries? What am I doing wrong?

If you need more info (logs, configs), just ask ;)


1 Answer


Caching easily explains why subsequent invocations of the same query are much faster. A common strategy is to run a cache-warming query at startup, e.g.

start n=node(*) match n--m return count(n)
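
A slightly heavier warm-up that also pulls relationship records into the cache could be something along these lines (just a sketch):

start n=node(*) match n-[r]->() return count(n), count(r)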

200% CPU usage on 24 cores means the machine is pretty lazy, since only 2 cores are busy. It is perfectly normal for the CPU to go to 100% while a query is running.

The Cypher statement above uses an optional match (in the second MATCH clause). Optional matches are known to be potentially slow. Check whether the runtime changes if you make this a non-optional match.
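
For example, the second match clause without the optional marker would look roughly like this (note that profiles with no followers would then drop out of the result entirely):

MATCH
    n<-[followers:InContactsRelation]-()
WITH
    currentUser, targetUser, n, l, c, inFavorites, COUNT(followers) as numFollowers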

When larger result sets are returned, keep in mind that transferring the response is bound by network speed. Consider using streaming in that case, see http://docs.neo4j.org/chunked/milestone/rest-api-streaming.html
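
If I remember correctly, streaming is switched on per request with the X-Stream header, so a streamed call to the Cypher endpoint would look roughly like this:

POST /db/data/cypher HTTP/1.1
Host: localhost:7474
Content-Type: application/json
X-Stream: true

{"query": "...", "params": {}}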

You should also set wrapper.java.minmemory to the same value as wrapper.java.maxmemory.

Another approach for your rather small graph is to switch off the MMIO cache and use cache_type=strong to keep the full dataset in the object cache. In that case you might need to increase wrapper.java.minmemory and wrapper.java.maxmemory.
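
In neo4j.properties that could look roughly like this (a sketch only):

# sketch: keep the full graph in the object cache instead of the MMIO cache
cache_type=strong
use_memory_mapped_buffers=false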

Answered 2013-11-09T09:49:54.537