How can you improve the speed of nested arrays in Julia?

The following function nested_arrays

generates a (surprisingly) nested "depth" array n

. However, when working with small values n

( 2

, 3

etc.), it takes a long time for the output to run and display.

julia> nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)

julia> nested_arrays(1)
1-element Array{Int64,1}:
 1

julia> nested_arrays(2)
1-element Array{Array{Int64,1},1}:
 [1]

julia> nested_arrays(3)
1-element Array{Array{Array{Int64,1},1},1}:
 Array{Int64,1}[[1]]

julia> nested_arrays(10)
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}:
 Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]

      

Interestingly, when using a macro @time

or ;

at the end of a line, the result takes relatively little time to compute. Instead, it takes most of the time to actually display the result in the REPL.

This strange behavior is not shown, for example, in Python.

In [1]: def nested_lists(n):
   ...:     if n == 1:
   ...:         return [1]
   ...:     return [nested_lists(n - 1)]
   ...: 

In [2]: nested_lists(10)
Out[2]: [[[[[[[[[[1]]]]]]]]]]

In [3]: %time nested_lists(100)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 37.7 µs
Out[3]: [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[1]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]

      

Why is this feature so slow in Julia? Julia recompiles the function display

for different types T

in Array{T, 1}

? If so, why?

Could the speed of this code be improved or it just won't be done in Julia? My main concern for this in a practical sense would be, for example, loading a complex, nested JSON file where simply using a n

-dimensional array would not be possible.

+3


source to share


1 answer


Yes, this is entirely due to compile time. You can see this at @time

-in << 21>. The second time you show it quickly:

julia> nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)

julia> @time display(nested_arrays(15));
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1},1},1}:
 Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]]]]]]
 11.682721 seconds (8.83 M allocations: 371.698 MB, 1.82% gc time)

julia> @time display(nested_arrays(15));
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1},1},1}:
 Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]]]]]]
  0.001688 seconds (2.38 k allocations: 102.766 KB)

      

So why is it so slow? The display here recursively loops through all the arrays and prints them nested within each other. This calls recursively show

with 14 different types - one with 14 nested arrays, then its element with 13 nested arrays, then its element with 12 ... and so on! Each of these methods is show

automatically compiled. Compiling specialized methods for specific element types is a key part of how Julia can create very efficient code. This means that it can specialize each operation performed on each element without checking or sending the runtime type. Unfortunately, in this case it gets in the way.



You can work around this with an array Any[]

; in the context of a JSON file, this makes a lot of sense as you don't know if they will contain strings or arrays or numbers, etc. This is much faster because you only need to compose the show method on the Any[]

array once, and then it recursively uses it.

# new session
julia> nested_arrays(n) = n == 1 ? Any[1] : Any[nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)

julia> @time display(nested_arrays(15));
1-element Array{Any,1}:
 Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]
  1.571632 seconds (767.12 k allocations: 32.472 MB, 1.04% gc time)

julia> @time display(nested_arrays(15));
1-element Array{Any,1}:
 Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]
  0.000606 seconds (839 allocations: 30.859 KB)

julia> @time display(nested_arrays(100));
1-element Array{Any,1}:
 Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
  0.002523 seconds (17.76 k allocations: 579.297 KB)

      

+5


source







All Articles