Julia uninitialize an array at a specific index

Question

I am writing a neural network in Julia that tests random topologies. I have left undefined every index in the node array that is not occupied by a node (but which may be occupied under a future topology), as this saves memory. When a node in the old topology is no longer connected to other nodes in the new topology, is there a way to uninitialize the index that the node belongs to? Also, are there any reasons not to do it this way?

local preSyns = Array(Vector{Int64}, (2,2)) 
println(preSyns)

preSyns[1] = [3]
println(preSyns)

Output

[#undef #undef
 #undef #undef]

[[3] #undef
 #undef #undef]

How do I make the first index undefined again, as it was in the first printout?

In case you doubt the memory savings, see the comparison below:

function memtest()
    y = Array(Vector{Int64}, 100000000)
end

function memtestF()
    y = fill!(Array(Vector{Int64}, 100000000),[])
end

@time memtest()
@time memtestF()

Output

elapsed time: 0.468254929 seconds (800029916 bytes allocated)
elapsed time: 30.801266299 seconds (5600291712 bytes allocated, 69.42% gc time)

An uninitialized array takes 0.8 GB, and an initialized one takes more than 5 GB. Activity Monitor confirms this.

arrays julia-lang

James beezho 06 May '15 at 12:11

1 answer

Matt B. · Accepted Answer · 2015-05-11T17:20:29+0000

Undefined values are essentially null pointers, and there is no first-class way to "unset" an array element back to a null pointer. This gets trickier with very large arrays, as you don't want much (or any) overhead for your sentinel values, which represent unconnected nodes. On a 64-bit system, an array of 100 million elements uses ~800 MB for the pointers alone, and an empty array takes 48 bytes for its header metadata. So if you assign a separate empty array to each element, you end up with ~5 GB of array headers.
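
As a rough check of those figures, the arithmetic can be sketched as follows. The 8-byte pointer size and the 48-byte header are the values quoted above for a 64-bit system and Julia 0.3; the variable names are only illustrative.

```julia
# Back-of-the-envelope estimate of the figures quoted above (64-bit system).
n = 100_000_000        # number of elements in the outer array
pointer_bytes = 8      # one pointer per element on a 64-bit system
header_bytes = 48      # per-array header size quoted in the answer

pointers_total = n * pointer_bytes   # storage for the pointers alone
headers_total = n * header_bytes     # one separate empty array per element

println(pointers_total / 1e9, " GB for pointers")
println(headers_total / 1e9, " GB for array headers")
println((pointers_total + headers_total) / 1e9, " GB in total")
```

The total of roughly 5.6 GB is in line with the 5600291712 bytes reported by @time in the question.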

The behavior of fill! with [] in Julia 0.3 is a bit flaky (and was fixed in 0.4). If, instead of filling your array with [], you fill! it with an explicitly typed Int64[], every element will point to the same empty array. This means that your array and its elements will take up only 48 bytes more than the uninitialized array. But it also means that modifying that subarray for one of your nodes (for example with push!) will make all nodes receive that connection. This is probably not what you want. You can still use the empty array as a sentinel value, but you must be very careful not to mutate it.
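
The aliasing pitfall can be illustrated with a small sketch. It assumes Julia 1.x syntax (the thread itself targets 0.3/0.4), and the names shared and nodes are made up for the example.

```julia
# Assumes Julia 1.x; illustrates what happens when every slot shares one vector.
shared = Int64[]               # a single empty vector
nodes = fill(shared, 2, 2)     # every slot refers to that same vector

push!(nodes[1, 1], 3)          # mutate "one" node's connection list

println(nodes[2, 2])                   # the other slots see the change too
println(nodes[1, 1] === nodes[2, 2])   # both slots are the very same object
```

Because all four slots are the same object, the cheap memory footprint comes at the price of shared mutation.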

If your array will be densely packed with subarrays, then there is no easy way around this overhead for array headers. A more robust and forward-compatible way to initialize the array with independent empty arrays is to use a comprehension: Vector{Int64}[Int64[] for i=1:2, j=1:2]. It will also be more efficient on 0.3, since it doesn't need to convert [] to Int64[] for every element. If each element is likely to contain a non-empty array, you will need to pay the array overhead anyway. To remove a node from the topology, you simply call empty! on its array.
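
A minimal sketch of that approach, assuming Julia 1.x (where the same typed comprehension still works); the intermediate variables are only there to make the behavior visible.

```julia
# Assumes Julia 1.x; each slot gets its own, independent empty vector.
nodes = Vector{Int64}[Int64[] for i = 1:2, j = 1:2]

push!(nodes[1, 1], 3)                    # connect node (1,1) to node 3
after_push = length(nodes[1, 1])         # this slot now holds one connection
others_untouched = isempty(nodes[2, 2])  # the other slots are unaffected

empty!(nodes[1, 1])                      # "remove" the node by clearing its list
println(after_push, " ", others_untouched, " ", isempty(nodes[1, 1]))
```

Note that empty! keeps the (now empty) array in the slot, so the header overhead remains; the slot does not go back to #undef.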

If, however, your node array will be sparsely packed, you can try a different data structure that directly supports unset elements. Depending on your use case, you could use a default dictionary that maps an index tuple to your vectors (from DataStructures.jl; use a function to ensure that the "default" is a newly allocated and independent empty array each time), or try a package dedicated to topologies, such as Graphs.jl or LightGraphs.jl.
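
As a rough illustration of the dictionary idea, the sketch below uses only a plain Dict from Base together with get!, standing in for the default dictionary from DataStructures.jl. It assumes Julia 1.x; pre_syns and connections are hypothetical names.

```julia
# Assumes Julia 1.x and Base only; a plain Dict stands in for a default dictionary.
pre_syns = Dict{Tuple{Int,Int},Vector{Int64}}()   # only occupied slots use memory

# get! with a function argument allocates a fresh vector on first access,
# so every key gets its own independent array.
connections(d, idx) = get!(() -> Int64[], d, idx)

push!(connections(pre_syns, (1, 1)), 3)
push!(connections(pre_syns, (2, 1)), 7)

delete!(pre_syns, (1, 1))      # "unset" a node by removing its key

println(haskey(pre_syns, (1, 1)))
println(pre_syns[(2, 1)])
```

Here removing a node is an ordinary, supported operation, at the cost of hashing on every lookup.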

Finally, to answer the actual question you asked: yes, there is a hacky way to reset an array element to #undef. It is unsupported and may break at any time:

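# Reset element A[I...] to #undef by overwriting its pointer with null.
function unset!{T}(A::Array{T}, I::Int...)
    # Bits types are stored inline rather than as pointers, so they cannot be unset.
    isbits(T) && error("cannot unset! an element of a bits array")
    # View A's storage as an array of raw pointers sharing the same memory.
    P = pointer_to_array(convert(Ptr{Ptr{T}}, pointer(A)), size(A))
    P[I...] = C_NULL  # a null pointer is what Julia reports as #undef
    return A
end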
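
Whether slots are left unset from the start or reset with a hack like the function above, reading an unset slot throws an error, so code that relies on #undef has to check each slot first. A minimal sketch of that check, assuming Julia 1.x syntax for allocating the array:

```julia
# Assumes Julia 1.x; slots of a non-bits element type start out as #undef.
pre_syns = Array{Vector{Int64}}(undef, 2, 2)

before = isassigned(pre_syns, 1)   # nothing has been stored in slot 1 yet
pre_syns[1] = [3]
after = isassigned(pre_syns, 1)    # slot 1 now holds a vector

println(before, " ", after, " ", isassigned(pre_syns, 2))
```

This extra bookkeeping is one practical reason to prefer a structure with supported "unset" semantics over raw #undef slots.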
