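I am populating a graph in neo4j from a SQLite3 database, using py2neo with Python 3.2 on Ubuntu Linux. Speed is not the main concern, but out of 5 million rows in total, only about 40K rows (one relationship per SQL row) made it into the graph in roughly 3 hours.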
Here is the main loop:
from py2neo import neo4j as neo
import sqlite3 as sql

# select all 5M rows from the SQL database
sql_str = """select * from bigram_with_number"""

# loop through each row
for (freq, first, firstfreq, second, secondfreq) in sql_cursor.execute(sql_str):
    # create the Cypher query string using Cypher 2.0 with merge
    # so that nodes are created only if needed
    query = neo.CypherQuery(neo4j_db, """
        CYPHER 2.0
        merge (n:word {form: {firstvar}, freq: {freqfirst}})
        merge (m:word {form: {secondvar}, freq: {freqsecond}})
        create unique (n)-[:bigram {freq: {freqbigram}}]->(m) return n, m""")
    # execute the query with parameters from the SQL row
    result = query.execute(freqbigram=freq, firstvar=first, freqfirst=firstfreq,
                           secondvar=second, freqsecond=secondfreq)
The database is being populated correctly, but at this rate it would take weeks to finish. I suspect this can be done much faster.
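One direction that might help is sending rows in chunks rather than one query per row, since each `query.execute` call in the loop is a separate round trip to the server. Below is a minimal sketch of the chunking side only, assuming that per-row overhead is a significant part of the cost (not measured here). The names `iter_batches`, `load_in_batches`, and `send_batch` are hypothetical helpers, not part of py2neo or sqlite3; `send_batch` stands in for whatever batch or transaction mechanism the installed py2neo version provides.

```python
import sqlite3


def iter_batches(cursor, batch_size=1000):
    """Yield lists of up to batch_size rows from an already executed cursor."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def load_in_batches(cursor, send_batch, batch_size=1000):
    """Hand rows to send_batch one chunk at a time; return the row count.

    send_batch is a hypothetical callable that receives a list of parameter
    dicts (one per SQL row) and submits them to neo4j in a single request.
    """
    total = 0
    for rows in iter_batches(cursor, batch_size):
        # Same parameter names as the Cypher query in the main loop.
        params = [
            {"freqbigram": freq, "firstvar": first, "freqfirst": firstfreq,
             "secondvar": second, "freqsecond": secondfreq}
            for (freq, first, firstfreq, second, secondfreq) in rows
        ]
        send_batch(params)  # assumption: one round trip per chunk
        total += len(params)
    return total
```

With this shape, the main loop would call `sql_cursor.execute(sql_str)` once and then `load_in_batches(sql_cursor, send_batch)`, so the number of requests drops from one per row to one per chunk. Whether that is enough on its own is uncertain; the cost of the two `merge` lookups per row is a separate question.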