StackOverflowError with tuple

Posted 2019-03-02 14:18

I have written a recursive function for getting objects in larger arrays in Julia. The following error occurred:

ERROR: LoadError: StackOverflowError:
     in cat_t at abstractarray.jl:831
     in recGetObjChar at /home/user/Desktop/program.jl:1046
     in recGetObjChar at /home/user/Desktop/program.jl:1075 (repeats 9179 times)
     in getImChars at /home/user/Desktop/program.jl:968
     in main at /home/user/Desktop/program.jl:69
     in include at ./boot.jl:261
     in include_from_node1 at ./loading.jl:304
     in process_options at ./client.jl:308
     in _start at ./client.jl:411
    while loading /home/user/Desktop/program.jl, in expression starting on line 78

If you want to have a look at the code, I have already opened an issue (Assertion failed, process aborted). After debugging my code under Julia v0.4, it is more obvious what causes the problem. The tuple locObj grows much larger than 9000 entries, because a single object can be, e.g., 150 x 150 in size, which would give locObj a length of 22500. How big can tuples get, and how can I avoid a stack overflow? Is there another way to save my values?

Tags: julia
1 answer
beautiful°
#2 · 2019-03-02 15:01

As noted in the comments, I think there are better approaches for working with big arrays of data. This answer mainly addresses this part of your question:

Is there another way to save my values?

I prepared a test to show how mmap helps when dealing with large arrays of data. The two functions below do the same thing: each creates a vector of 3 x 10^6 Float64 values, fills it, computes the sum, and prints the result. The first (mmaptest()) stores the Vector{Float64} in a memory-mapped file, while the second (ramtest()) does the work in ordinary RAM:

function mmaptest()
  s = open("./tmp/mmap.bin","w+") # the tmp folder must already exist under pwd()
  A = Mmap.mmap(s, Vector{Float64}, 3_000_000)
  for j=1:3_000_000
    A[j]=j
  end
  println("sum = $(sum(A))")
  close(s)
end

function ramtest()
  A = Vector{Float64}(3_000_000)
  for j=1:3_000_000
    A[j]=j
  end
  println("sum = $(sum(A))")
end
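
The two functions above use Julia 0.4 syntax, which matches the version mentioned in the question. On Julia 1.x the same idea might look roughly like the sketch below. The names mmap_sum and ram_sum and the temporary file path are assumptions made for illustration; Mmap is now a standard library that has to be loaded explicitly, and an uninitialized vector is created with undef.

```julia
using Mmap

# Fill a file-backed vector with 1..n and return its sum (Julia 1.x syntax).
function mmap_sum(path::AbstractString, n::Integer)
    open(path, "w+") do io
        # Map the file as a Vector{Float64}; the file grows to n * 8 bytes.
        A = Mmap.mmap(io, Vector{Float64}, (n,))
        for j in 1:n
            A[j] = j
        end
        Mmap.sync!(A)   # flush the mapped pages to disk
        return sum(A)
    end
end

# Same computation with an ordinary heap-allocated vector.
function ram_sum(n::Integer)
    A = Vector{Float64}(undef, n)
    for j in 1:n
        A[j] = j
    end
    return sum(A)
end

path = joinpath(mktempdir(), "mmap.bin")   # hypothetical scratch location
println(mmap_sum(path, 3_000_000))
println(ram_sum(3_000_000))
```

Returning the sum instead of printing it inside the function keeps the two variants easy to compare and to time with @time or @allocated.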

I then called mmaptest() and ramtest() and measured how much memory each one allocated:

julia> gc(); # => remove old handles to closed stream

julia> @allocated mmaptest()
  sum = 4.5000015e12
  861684

julia> @allocated ramtest()
  sum = 4.5000015e12
  24072791

These tests show that the memory-mapped version allocates far less memory: @allocated reports about 0.86 MB (861684 bytes) for mmaptest() versus about 24 MB (24072791 bytes) for ramtest().
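
As a rough sanity check on those figures, the size of the vector itself can be worked out by hand. The short sketch below (the variable names are only illustrative) suggests that nearly all of the roughly 24 MB reported for ramtest() is the 3_000_000-element Float64 array:

```julia
n = 3_000_000
bytes = n * sizeof(Float64)   # 8 bytes per Float64
mib = bytes / 2^20            # convert bytes to MiB

println(bytes)   # 24000000
println(mib)     # roughly 22.9
```

The remaining few tens of kilobytes in the measured number presumably come from printing and other small allocations.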

julia> gc()

julia> @time ramtest()
  sum = 4.5000015e12
  0.012584 seconds (29 allocations: 22.889 MB, 3.43% gc time)

julia> @time mmaptest()
  sum = 4.5000015e12
  0.019602 seconds (58 allocations: 2.277 KB)

As the @time results show, the mmap version is slower here (about 0.020 s versus 0.013 s) but needs far less memory.
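
The StackOverflowError itself comes from the depth of the recursion (the trace shows recGetObjChar repeating 9179 times), so changing where the values are stored may not be enough on its own. One common way around this is to replace the recursion with an explicit work list and to collect results in a growable Vector instead of a tuple. Since the original recGetObjChar is not shown, the function below is a hypothetical stand-in that assumes the object is a 4-connected region in a Bool mask; it is a sketch in Julia 1.x syntax, not the asker's actual algorithm.

```julia
# Hypothetical replacement for a recursive object collector: gather all
# entries of the connected region containing `start` (4-connectivity).
function collect_object(mask::AbstractMatrix{Bool}, start::Tuple{Int,Int})
    visited = falses(size(mask))
    found = Tuple{Int,Int}[]   # growable Vector instead of an ever-larger tuple
    stack = [start]            # explicit work list replaces the call stack
    while !isempty(stack)
        i, j = pop!(stack)
        # Skip positions that are out of bounds, background, or already seen.
        checkbounds(Bool, mask, i, j) || continue
        (mask[i, j] && !visited[i, j]) || continue
        visited[i, j] = true
        push!(found, (i, j))
        # Queue the four neighbours; they are filtered when popped.
        push!(stack, (i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1))
    end
    return found
end

mask = trues(150, 150)                 # one object of 150 x 150 entries
found = collect_object(mask, (1, 1))
println(length(found))                 # 22500
```

Because the work list lives on the heap, its size is limited by available memory rather than by the call stack, so a 150 x 150 object with 22500 entries is handled without deep recursion.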

I hope this helps. Regards.
