I want to use PyJulia to speed up parts of my code:
import numpy as np
import julia
import pandas as pd
import random
from julia import Base
from julia import Main
from julia import DataFrames
n = 100000
randomlist = []
for i in range(0,n):
    num = random.randint(1,100)
    randomlist.append(num)
data = {
    'Score': list(randomlist),
    'ScoreBin': list(np.zeros(n))
}
df = pd.DataFrame(data, columns = ['Score', 'ScoreBin'])
Main.dfj = df
Main.eval("""
for i = 1:10
    #println(i)
    if dfj.Score[i] >= 10
        println(dfj.Score[i])
    end
end
"""
)
But I get the following error message:
JuliaError: Exception 'TypeError: non-boolean (PyObject) used in boolean context' occurred while calling julia code:
In addition, the following command:
Main.eval("""
println(dfj.Score[1])
"""
)
prints this output (which does not look like it comes from a Julia DataFrame):
PyObject 84
Is there a way to convert a pandas DataFrame to a Julia DataFrame?
Edit 1
Thanks to @PrzemyslawSzufel's answer, the following code now works:
import numpy as np
import julia
import pandas as pd
import random
import copy
from julia import Base
from julia import Main
from julia import DataFrames
from julia import Pandas
#julia.install(DataFrame)
%load_ext julia.magic
n = 100000
randomlist = []
for i in range(0,n):
    num = random.randint(1,100)
    randomlist.append(num)
data = {
    'Score': list(randomlist),
    'ScoreBin': list(np.zeros(n))
}
df = pd.DataFrame(data, columns = ['Score', 'ScoreBin'])
Main.df = df;
Main.eval("""
dfj = df |> Pandas.DataFrame|> DataFrames.DataFrame;
""")
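However, although I put a ; at the end of the line, I always get a printout of dfj, which I do not need, is very long (100000 rows), and takes about a second. Is there a way to avoid the printout?
Also, if I now modify the data frame in Julia (which is much faster than doing it in Python, and is the whole point of this question) and want to convert it back to a Python pandas DataFrame, I get another error message: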
Main.eval("""
for i = 1:length(dfj[:, :Score])
    if dfj[i, :Score] > 50
        dfj[i, :ScoreBin] = 1
    end
end
"""
)
dfjpy = pd.DataFrame(Main.dfj)
dfjpy
RuntimeError: Julia exception: MethodError: no method matching iterate(::DataFrames.DataFrame)
Closest candidates are:
iterate(!Matched::Core.SimpleVector) at essentials.jl:568
iterate(!Matched::Core.SimpleVector, !Matched::Any) at essentials.jl:568
iterate(!Matched::ExponentialBackOff) at error.jl:199
...
Stacktrace:
[1] jlwrap_iterator(::DataFrames.DataFrame) at /Users/mymac/.julia/packages/PyCall/zqDXB/src/pyiterator.jl:144
[2] pyjlwrap_getiter(::Ptr{PyCall.PyObject_struct}) at /Users/mymac/.julia/packages/PyCall/zqDXB/src/pyiterator.jl:125
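By the way, the command type(dfjpy) outputs PyCall.jlwrap.
Edit 2
To convert a Julia DataFrame to a Python pandas DataFrame, you first have to convert it to a Julia Pandas DataFrame. Here is the latest working code: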
n = 100000
randomlist = []
for i in range(0,n):
    num = random.randint(1,100)
    randomlist.append(num)
data = {
    'Score': list(randomlist),
    'ScoreBin': list(np.zeros(n))
}
df = pd.DataFrame(data, columns = ['Score', 'ScoreBin'])
Main.df = df;
Main.eval("""
dfj = df |> Pandas.DataFrame|> DataFrames.DataFrame;
for i = 1:length(dfj[:, :Score])
    if dfj[i, :Score] > 50
        dfj[i, :ScoreBin] = 1
    end
end
dfjp = dfj |> Pandas.DataFrame;
"""
)
dfjpy = Main.dfjp
dfjpy
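To sanity-check the round trip, the rule applied in the Julia loop (ScoreBin set to 1 where Score > 50, otherwise left at 0) can be reproduced in plain Python and compared with the converted frame. This is only a minimal sketch using the standard library; the helper name score_bins is hypothetical, and the commented-out comparison assumes dfjpy behaves like an ordinary pandas DataFrame with the same two columns.

```python
import random


def score_bins(scores, threshold=50):
    """Mirror the Julia loop: 1.0 where score > threshold, else 0.0."""
    return [1.0 if s > threshold else 0.0 for s in scores]


random.seed(0)  # fixed seed so the check is reproducible
n = 100000
scores = [random.randint(1, 100) for _ in range(n)]
expected = score_bins(scores)

# Assumption: dfjpy is the pandas DataFrame returned from Julia above.
# The comparison could then look like this:
# assert list(dfjpy["ScoreBin"]) == score_bins(list(dfjpy["Score"]))
```

The values are floats because ScoreBin was created with np.zeros, so the column should still hold floating-point numbers after the Julia loop writes 1 into it.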