I am new to Julia programming.

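I have a folder containing 14 CSV files that I joined into one large data frame (262,673,020 rows x 77 columns), and I am trying to save the result as a single large CSV. When I use CSV.write, I get this error: BoundsError: attempt to access 4194304-element Array{UInt8,1} at index [1:4194305].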

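So I tried saving it to a Feather file instead, but I got this error: InexactError: trunc(Int32, 2147483662). It looks like a value is exceeding the 32-bit integer maximum, but I don't know why.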

I'm not sure what is going on and just need some help understanding what to do.

Package versions - Julia version 1.5.2, - Glob v1.3.0, - CSV v0.5.23, - Tables v0.2.11, - Feather v0.5.4

Updated package versions - Julia version 1.5.2 - CSV 0.7.7 - DataFrames v0.21.8 - Glob v1.3.0 - Tables v1.1.0 - Feather v0.5.6

using Glob, CSV, DataFrames, Tables, Feather

fileDirectory = "location/CSV"
files = glob("*.csv", fileDirectory)

list_df = [DataFrame(CSV.read(f)) for f in files]
Join_DF = join(list_df[3], list_df[4], list_df[5], list_df[6], list_df[7], list_df[8], list_df[9], list_df[10], list_df[11], list_df[12], list_df[13], list_df[14], on = :INC_KEY, kind = :outer)


Feather.write("location/join_files.feather", Join_DF) 
# ERROR: InexactError: trunc(Int32, 2147483662)

CSV.write("location/join_files.csv", Join_DF) 
# ERROR: BoundsError: attempt to access 4194304-element Array{UInt8,1} at index [1:4194305].
CSV - 
Stacktrace:
 [1] throw_boundserror(::Array{UInt8,1}, ::Tuple{UnitRange{Int64}}) at ./abstractarray.jl:541
 [2] checkbounds at ./abstractarray.jl:506 [inlined]
 [3] view at ./subarray.jl:158 [inlined]
 [4] writecell(::Array{UInt8,1}, ::Int64, ::Int64, ::IOStream, ::Int64, ::CSV.Options{UInt8,UInt8,Nothing,Tuple{}}) at /Users/.julia/packages/CSV/4GOjG/src/write.jl:147
 [5] #64 at /Users/.julia/packages/CSV/4GOjG/src/write.jl:182 [inlined]
 [6] macro expansion at /Users/.julia/packages/Tables/FXXeK/src/utils.jl:54 [inlined]
 [7] eachcolumn at /Users/.julia/packages/Tables/FXXeK/src/utils.jl:48 [inlined]
 [8] writerow(::Array{UInt8,1}, ::Base.RefValue{Int64}, ::Int64, ::IOStream, ::Tables.Schema{(
 [9] #55 at /Users/.julia/packages/CSV/4GOjG/src/write.jl:80 [inlined]
 [10] (::CSV.var"#62#63"{CSV.var"#55#56"{Bool,Tables.Schema{(
 [12] open(::Function, ::String, ::String) at ./io.jl:323
 [13] with at /Users/.julia/packages/CSV/4GOjG/src/write.jl:139 [inlined]
 [14] #write#54 at /Users/.julia/packages/CSV/4GOjG/src/write.jl:73 [inlined]
 [15] write(::Tables.Schema{(
 [16] write(::String, ::DataFrame; delim::Char, quotechar::Char, openquotechar::Nothing, closequotechar::Nothing, escapechar::Char, newline::Char, decimal::Char, dateformat::Nothing, quotestrings::Bool, missingstring::String, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /Users/.julia/packages/CSV/4GOjG/src/write.jl:60
 [17] write(::String, ::DataFrame) at /Users/.julia/packages/CSV/4GOjG/src/write.jl:53
 [18] top-level scope at none:1

Feather -
ERROR: InexactError: trunc(Int32, 2147483662)
Stacktrace:
 [1] throw_inexacterror(::Symbol, ::Type{Int32}, ::Int64) at ./boot.jl:558
 [2] checked_trunc_sint at ./boot.jl:580 [inlined]
 [3] toInt32 at ./boot.jl:617 [inlined]
 [4] Int32 at ./boot.jl:707 [inlined]
 [5] convert at ./number.jl:7 [inlined]
 [6] setindex! at ./array.jl:847 [inlined]
 [7] offsets(::Type{Int32}, ::Type{UInt8}, ::PooledArrays.PooledArray{Union{Missing, String},UInt32,1,Array{UInt32,1}}) at /Users/.julia/packages/Arrow/q3tEJ/src/lists.jl:300
 [8] Arrow.NullableList{String,Int32,P} where P<:Arrow.AbstractPrimitive(::Type{UInt8}, ::PooledArrays.PooledArray{Union{Missing, String},UInt32,1,Array{UInt32,1}}) at /Users/.julia/packages/Arrow/q3tEJ/src/lists.jl:243
 [9] NullableList at /Users/.julia/packages/Arrow/q3tEJ/src/lists.jl:251 [inlined]
 [10] arrowformat at /Users/.julia/packages/Arrow/q3tEJ/src/arrowvectors.jl:242 [inlined]
 [11] getarrow(::PooledArrays.PooledArray{Union{Missing, String},UInt32,1,Array{UInt32,1}}) at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:40
 [12] write(::IOStream, ::DataFrame; description::String, metadata::String) at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:18
 [13] #20 at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:32 [inlined]
 [14] open(::Feather.var"#20#21"{String,String,DataFrame}, ::String, ::Vararg{String,N} where N; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at ./io.jl:325
 [15] open at ./io.jl:323 [inlined]
 [16] #write#19 at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:31 [inlined]
 [17] write(::String, ::DataFrame) at /Users/.julia/packages/Feather/y64Pt/src/sink.jl:31
 [18] top-level scope at none:1
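
To make the Feather error easier to reason about, here is a minimal sketch (not part of my original run) that checks the number in the error message against the Int32 range. The names `reported` and `limit` are just illustrative. My reading that the `offsets(::Type{Int32}, ...)` frame in the trace stores offsets into a string column as Int32 is an assumption based on the stack trace alone.

```julia
# Value reported in the InexactError from Feather.write.
reported = 2147483662

# Largest value a signed 32-bit integer can hold.
limit = typemax(Int32)

println("Int32 limit:      ", limit)
println("Reported value:   ", reported)
println("Exceeds limit by: ", reported - limit)

# Converting a value beyond the limit raises the same kind of error.
try
    Int32(reported)
catch err
    println("Caught: ", typeof(err))
end
```

If that reading is right, the failure would come from the total size of one string column rather than from the row count itself, but I have not confirmed this.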