Tokyo Cabinet and variable-length C++ objects

I am building a system in C++ that uses Tokyo Cabinet (whose original API is in C). The problem is that I want to store a class, for example:

    class Entity {
      public:
        string entityName;
        short type;
        vector<another_struct> x;
        vector<another_struct> y;
        vector<string> z;
    };

      

The problem is that vectors and strings are of variable length. When I pass void * (my object) to Tokyo Cabinet so that it can store it, I also need to pass the size of the object in bytes. But this cannot be done trivially.

What's the best way to determine the number of bytes in an object? Or what is the best way to store variable-length objects in Tokyo Cabinet?

I am already considering looking for serialization libraries.

Thanks.

+2




5 answers


You cannot portably treat a non-POD C++ struct/class as a raw sequence of bytes. That holds regardless of whether it uses pointers or std::string and std::vector, although the latter virtually guarantee that it will break in practice. You need to serialize the object into a character sequence first. I would suggest Boost.Serialization as a nice, flexible, cross-platform serialization framework.
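To make "serialize into a character sequence" concrete, here is a minimal sketch that does not use Boost.Serialization; it swaps in a hand-written length-prefixed encoding so that it stays self-contained. The layout of another_struct (two 32-bit integers) and the helper names put, get, serialize and deserialize are assumptions made for illustration. Values are written in host byte order, so the result is not portable across machines with different endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in: the question does not show another_struct's fields.
struct another_struct {
    std::int32_t a;
    std::int32_t b;
};

struct Entity {
    std::string entityName;
    short type = 0;
    std::vector<another_struct> x;
    std::vector<another_struct> y;
    std::vector<std::string> z;
};

// Append the raw bytes of a fixed-size value (host byte order).
template <typename T>
void put(std::string& out, const T& v) {
    out.append(reinterpret_cast<const char*>(&v), sizeof v);
}

// Strings and vectors are written as a 32-bit count followed by the payload.
void put_string(std::string& out, const std::string& s) {
    assert(s.size() <= 0xFFFFFFFFu);
    put(out, static_cast<std::uint32_t>(s.size()));
    out.append(s);
}

void put_structs(std::string& out, const std::vector<another_struct>& v) {
    put(out, static_cast<std::uint32_t>(v.size()));
    for (const another_struct& e : v) { put(out, e.a); put(out, e.b); }
}

std::string serialize(const Entity& e) {
    std::string out;
    put_string(out, e.entityName);
    put(out, static_cast<std::int16_t>(e.type));
    put_structs(out, e.x);
    put_structs(out, e.y);
    put(out, static_cast<std::uint32_t>(e.z.size()));
    for (const std::string& s : e.z) put_string(out, s);
    return out;
}

// Read a fixed-size value at pos and advance pos; throws on short input.
template <typename T>
T get(const std::string& in, std::size_t& pos) {
    if (in.size() - pos < sizeof(T)) throw std::runtime_error("truncated record");
    T v;
    std::memcpy(&v, in.data() + pos, sizeof v);
    pos += sizeof v;
    return v;
}

std::string get_string(const std::string& in, std::size_t& pos) {
    const std::uint32_t n = get<std::uint32_t>(in, pos);
    if (in.size() - pos < n) throw std::runtime_error("truncated record");
    std::string s(in, pos, n);
    pos += n;
    return s;
}

std::vector<another_struct> get_structs(const std::string& in, std::size_t& pos) {
    const std::uint32_t n = get<std::uint32_t>(in, pos);
    std::vector<another_struct> v;
    for (std::uint32_t i = 0; i < n; ++i) {
        another_struct e;
        e.a = get<std::int32_t>(in, pos);
        e.b = get<std::int32_t>(in, pos);
        v.push_back(e);
    }
    return v;
}

Entity deserialize(const std::string& in) {
    std::size_t pos = 0;
    Entity e;
    e.entityName = get_string(in, pos);
    e.type = get<std::int16_t>(in, pos);
    e.x = get_structs(in, pos);
    e.y = get_structs(in, pos);
    const std::uint32_t n = get<std::uint32_t>(in, pos);
    for (std::uint32_t i = 0; i < n; ++i) e.z.push_back(get_string(in, pos));
    return e;
}
```

The string returned by serialize() gives exactly the pointer and byte count the question asks for: bytes.data() and bytes.size() can be passed as the value buffer and value size to a Tokyo Cabinet store call such as tchdbput.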



+9




I think it is worse than that. The actual storage for the vectors is not contiguous with the rest of the object. A std::vector<> stores its data in a separate allocation on the heap (so that it can grow if needed). You would need an API that understands C++ and the STL.
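A small sketch can make this visible. The another_struct layout and the helper names inside_object, elements_live_elsewhere and payload_bytes are assumptions for illustration only. The point is that sizeof(Entity) is a compile-time constant that never reflects how many elements the vectors hold, and that the elements sit at an address outside the object itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in: the question does not show another_struct's fields.
struct another_struct {
    int a;
    int b;
};

struct Entity {
    std::string entityName;
    short type = 0;
    std::vector<another_struct> x;
    std::vector<another_struct> y;
    std::vector<std::string> z;
};

// True when p points inside the sizeof(Entity) bytes occupied by e itself.
bool inside_object(const Entity& e, const void* p) {
    const auto addr  = reinterpret_cast<std::uintptr_t>(p);
    const auto begin = reinterpret_cast<std::uintptr_t>(&e);
    return addr >= begin && addr < begin + sizeof(Entity);
}

// True when the elements of e.x are stored outside the Entity object,
// i.e. copying sizeof(Entity) bytes would copy only a pointer to them.
bool elements_live_elsewhere(const Entity& e) {
    return !e.x.empty() && !inside_object(e, e.x.data());
}

// Bytes of element data owned by e.x; this grows with the vector,
// while sizeof(Entity) stays fixed.
std::size_t payload_bytes(const Entity& e) {
    return e.x.size() * sizeof(another_struct);
}
```

Storing sizeof(Entity) bytes starting at &e would therefore save the vectors' internal pointers, which are meaningless once the process exits, and none of the elements.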



In short: it won't work.

+4




I had a similar problem, although I was using HDF5. In my case there was an additional requirement that I be able to read sub-parts of an object, so serialization was not really an option.

HDF is very similar to a large array in which an index is used to access the data. The solution I am using is to add a "previous index" column to the table that holds the type another_struct.

Taking your example, if "x" and "y" had 3 and 2 elements each, then the data would be stored like this:

[ index ] [ another_struct data here ] [ previous_index ]
[   0   ] [       x data 0           ] [ -1 ]
[   1   ] [       x data 1           ] [  0 ]
[   2   ] [       x data 2           ] [  1 ]
[   3   ] [       y data 0           ] [ -1 ]
[   4   ] [       y data 1           ] [  3 ]

      

And then the last added index is stored in the main Entity table:

[ index ] [ Entity data here ] [ x ] [  y ]
[   0   ] [        ...       ] [ 2 ] [  4 ]
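The two tables above can be modeled in memory as a rough sketch. This is not HDF5 or Tokyo Cabinet code; the names Row, EntityRow, append and read_chain, and the single-int payload of another_struct, are assumptions for illustration. Each row keeps a back-link to the previous element of the same vector, and the entity row keeps only the index of the last element.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical payload: the question does not show another_struct's fields.
struct another_struct {
    int value;
};

// One row of the another_struct table: the data plus a back-link.
struct Row {
    another_struct data;
    long previous_index;  // -1 marks the first element of a chain
};

// One row of the Entity table: index of the last row of each chain.
struct EntityRow {
    long x_last;  // -1 when x is empty
    long y_last;  // -1 when y is empty
};

// Append item to the chain ending at last_index; returns the new last index.
long append(std::vector<Row>& table, long last_index, const another_struct& item) {
    table.push_back(Row{item, last_index});
    return static_cast<long>(table.size()) - 1;
}

// Follow the back-links from last_index to -1, then restore insertion order.
std::vector<another_struct> read_chain(const std::vector<Row>& table, long last_index) {
    std::vector<another_struct> out;
    for (long i = last_index; i != -1;
         i = table[static_cast<std::size_t>(i)].previous_index) {
        out.push_back(table[static_cast<std::size_t>(i)].data);
    }
    std::reverse(out.begin(), out.end());
    return out;
}
```

Appending three x elements and then two y elements to an empty table reproduces the layout shown: the x chain ends at index 2, the y chain ends at index 4, and row 3 has previous index -1 because it starts a new chain.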

      

I don't know how Tokyo Cabinet works. This approach should work, but it may not be optimal for that data format. Ideally, if you can hold references to actual Tokyo Cabinet objects, you can store those instead of the indices I used above.

0




Yes, you are better off using Boost.Serialization or protobuf to serialize the object and put it in the cabinet.

0




I am using Protocol Buffers to store my C++ objects as data values in Tokyo Cabinet.

In Protocol Buffers, you specify the structure and generate marshalling/unmarshalling code for C++, Python, and Java. In your case, the .proto file would look like this:

message Entity {
    optional string entityName = 1;
    optional int32 type = 2; // protobuf has no short type
    repeated AnotherStruct x = 4;
    repeated AnotherStruct y = 5;
    repeated string z = 6;
}

      

Especially if the database will exist for a long time, a format that can be evolved, for example to add new fields, is very valuable. Unlike XML and similar formats, protobuf is also pretty fast.

0

