Fast tensor initialization in Julia

I would like to initialize a 3d tensor (multidimensional array) with "diagonal gaussian" values

exp(-32*(u^2 + 16*(v^2 + w^2)))

      

where u = 1/sqrt(3)*(x+y+z)

and v,w

are any two coordinates orthogonal to k u

, discretized on a uniform grid on [-1,1]^3

. The following code achieves this:

function gaussian3d(n)
    q = qr(ones(3,1), thin=false)[1]
    x = linspace(-1.,1., n)
    p = Array(Float64,(n,n,n))
    square(x) = x*x
    Base.@nloops 3 i p begin
        @inbounds p[i_1,i_2,i_3] = 
            exp(
                -32*(
                    square(q[1,1]*x[i_1] + q[2,1]*x[i_2] + q[3,1]*x[i_3])
                    + 16*(
                        square(q[1,2]*x[i_1] + q[2,2]*x[i_2] + q[3,2]*x[i_3]) + 
                        square(q[1,3]*x[i_1] + q[2,3]*x[i_2] + q[3,3]*x[i_3])
                    )
                )
            )
    end
    return p
end

      

However, it seems to be quite slow. For example, if I replace the defining function with exp(x*y*z)

, the code runs 50 times faster. Also, the macro @time

reports ~ 20% GC time for the above code, which I don't understand where they are from. (These numerical values ​​were obtained using n = 128

.) My questions are therefore

  • How can I speed up this piece of code?
  • Where is the memory allocation hidden that causes the GC overhead?
+3


source to share


1 answer


Not knowing anything about 3D tensors with "diagonal gauss" values, using a comment square

from the original post, "typing" q

( @code_warntype

helps here
: big leap in performance!), And further specializing in @nloops

, this works much faster on platforms I've tried it on.



julia> square(x::Float64) = x * x
square (generic function with 1 method)

julia> function my_gaussian3d(n)
           q::Array{Float64,2} = qr(ones(3,1), thin=false)[1]
           x = linspace(-1.,1., n)
           p = Array(Float64,(n,n,n))
           Base.@nloops 3 i p d->x_d=x[i_d] begin
               @inbounds p[i_1,i_2,i_3] =
                   exp(
                       -32*(
                           square(q[1,1]*x_1 + q[2,1]*x_2 + q[3,1]*x_3)
                           + 16*(
                               square(q[1,2]*x_1 + q[2,2]*x_2 + q[3,2]*x_3) +
                               square(q[1,3]*x_1 + q[2,3]*x_2 + q[3,3]*x_3)
                           )
                       )
                   )
           end
           return p
       end
my_gaussian3d (generic function with 1 method)

julia> @time gaussian3d(128);
elapsed time: 3.952389337 seconds (1264 MB allocated, 4.50% gc time in 57 pauses with 0 full sweep)

julia> @time gaussian3d(128);
elapsed time: 3.527316699 seconds (1264 MB allocated, 4.42% gc time in 58 pauses with 0 full sweep)

julia> @time my_gaussian3d(128);
elapsed time: 0.285837566 seconds (16 MB allocated)

julia> @time my_gaussian3d(128);
elapsed time: 0.28476448 seconds (16 MB allocated, 1.22% gc time in 0 pauses with 0 full sweep)

julia> my_gaussian3d(128) == gaussian3d(128)
true

      

+2


source







All Articles