How to store data with tree structure in Julia

I want to keep high frequency financial data in memory while I work with it in Julia.

My data is in lots of Float64 arrays. Each array stores high frequency data for one day, for one security, in one market. For example, for IBM listed on the NYSE (New York Stock Exchange) on 2010-01-04, there is one Float64 array.

I have many of these arrays, spanning multiple dates, markets and securities. I want to store them all in one object so that any given array is easy to retrieve (perhaps using some kind of metadata tree structure).

In Matlab, I kept this in a nested structure in which the first level is the market, the next level is the security, the next level is the date, and the leaf of the tree is the corresponding array. At each level, I also kept a list of the fields at that level.

Julia doesn't really have an equivalent to Matlab structures, so what's the best way to do this in Julia?

Currently, the best I can find is a sequence of nested composite types, each with two fields. For example:

using Dates

mutable struct HighFrequencyData
    dateList::Array{Date, 1}
    dataArray::Array{Any, 1}
end

where dateList stores a list of dates corresponding to the sequence of Float64 arrays stored in dataArray (i.e. dateList and dataArray will have the same length). Then:

mutable struct securitiesData
    securityList::Array{String, 1}
    highFrequencyArray::Array{Any, 1}
end

where securityList stores a list of securities corresponding to the sequence of HighFrequencyData values stored in highFrequencyArray. Then:

mutable struct marketsData
    marketList::Array{String, 1}
    securitiesArray::Array{Any, 1}
end

where marketList stores a list of markets corresponding to the sequence of securitiesData values stored in securitiesArray.

With this in place, all the data can be stored in a single variable of type marketsData and looked up using marketList, securityList and dateList at each nesting level.
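
For illustration, the three-level lookup described above might look like the following sketch. It assumes Julia 1.x, capitalised type names, and concretely typed containers instead of Array{Any, 1}; the lookup helper and the sample values are hypothetical and not part of the original setup.

```julia
using Dates

# The same three levels as above, with concretely typed fields (an assumption).
struct HighFrequencyData
    dateList::Vector{Date}
    dataArray::Vector{Vector{Float64}}
end

struct SecuritiesData
    securityList::Vector{String}
    highFrequencyArray::Vector{HighFrequencyData}
end

struct MarketsData
    marketList::Vector{String}
    securitiesArray::Vector{SecuritiesData}
end

# Hypothetical helper: search each level's list with findfirst and
# return `nothing` as soon as a key is missing.
function lookup(m::MarketsData, market, security, date)
    i = findfirst(==(market), m.marketList)
    i === nothing && return nothing
    s = m.securitiesArray[i]
    j = findfirst(==(security), s.securityList)
    j === nothing && return nothing
    h = s.highFrequencyArray[j]
    k = findfirst(==(date), h.dateList)
    k === nothing && return nothing
    return h.dataArray[k]
end

# Placeholder data for one market, one security, one day.
ibm = HighFrequencyData([Date(2010, 1, 4)], [[1.05, 10.6]])
nyse = SecuritiesData(["IBM"], [ibm])
all_markets = MarketsData(["NYSE"], [nyse])

lookup(all_markets, "NYSE", "IBM", Date(2010, 1, 4))
```

Every level repeats the same search-then-index step, and each search is a linear scan of that level's list.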

But that seems a little cumbersome ...

1 answer


Your type hierarchy looks ok, but maybe dictionaries are all you need?

all_data = Dict(
    "Market1" => Dict(
        "Sec1" => ([20140827, 20140825], [1.05, 10.6]),
        "Sec2" => ([20140827, 20140825], [1.05, 10.6])),
    "Market2" => Dict(
        "Sec1" => ([20140827, 20140825], [1.05, 10.6]),
        "Sec2" => ([20140827, 20140825], [1.05, 10.6])),
    # ...
)

# Each entry is a (dates, values) tuple; [2] selects the values
println(all_data["Market1"]["Sec1"][2] ./ all_data["Market2"]["Sec1"][2])

      

If you could post what the MATLAB code looks like, that might be helpful too.
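
As a rough sketch of how the dictionary idea could be extended to the market / security / date tree from the question, the nested dictionaries can be given concrete key and value types. The alias names and the store! helper below are assumptions made for illustration, and the sample values are placeholders.

```julia
using Dates

# Assumed aliases, one per level of the tree.
const DayData      = Dict{Date, Vector{Float64}}   # date => one day's array
const SecurityData = Dict{String, DayData}         # security => its days
const MarketData   = Dict{String, SecurityData}    # market => its securities

all_data = MarketData()

# Hypothetical helper: get! creates the intermediate levels on demand.
function store!(d::MarketData, market, security, date, values)
    securities = get!(() -> SecurityData(), d, market)
    days = get!(() -> DayData(), securities, security)
    days[date] = values
    return d
end

store!(all_data, "NYSE", "IBM", Date(2010, 1, 4), [1.05, 10.6])

# Retrieve one day's array, and list the keys stored at one level.
ibm_day = all_data["NYSE"]["IBM"][Date(2010, 1, 4)]
nyse_securities = collect(keys(all_data["NYSE"]))
```

With this layout, keys() at each level plays the role of the per-level field list, so no separate list needs to be kept in sync with the data.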



I would reformulate your types a bit, perhaps into something simpler like:

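using Dates

mutable struct TimeSeries
    dates::Vector{Date}
    data::Vector{Any}
end

# A security is a (name, series) pair; a market is a list of securities
const Security = Tuple{String, TimeSeries}
const Market = Vector{Security}

markets = Market[]

# `...` stands for the dates and data vectors of each series
push!(markets, [("Sec1", TimeSeries(...)), ("Sec2", TimeSeries(...))])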
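
To make the idea concrete, here is a minimal runnable sketch of the same layout with placeholder data. It assumes Julia 1.x and that each date maps to one Float64 array (so data is typed more tightly than Vector{Any}); the day_data helper is hypothetical.

```julia
using Dates

struct TimeSeries
    dates::Vector{Date}
    data::Vector{Vector{Float64}}   # assumption: one Float64 array per date
end

const Security = Tuple{String, TimeSeries}
const Market = Vector{Security}

# Hypothetical helper: find a security by name, then a day by date.
# Returns `nothing` if either is missing.
function day_data(market::Market, name, date)
    for (sec_name, ts) in market
        sec_name == name || continue
        k = findfirst(==(date), ts.dates)
        return k === nothing ? nothing : ts.data[k]
    end
    return nothing
end

# Placeholder series reused for both securities.
ts = TimeSeries([Date(2014, 8, 25), Date(2014, 8, 27)], [[1.05], [10.6]])

markets = Market[]
push!(markets, [("Sec1", ts), ("Sec2", ts)])

day_data(markets[1], "Sec1", Date(2014, 8, 27))
```

Note that in this layout the markets themselves are unnamed and addressed by position, so a Dict keyed by market name may be more convenient if markets need to be looked up by name.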

      

Also, don't forget to check out https://github.com/JuliaStats/TimeSeries.jl
