F# GPU programming vs kdb+ for data crunching: which is fastest?

Hi, I would like to ask someone who knows: what is the more economical and efficient way of crunching huge amounts of data, F# on GPUs (for example via an NVIDIA C-style GPU API such as CUDA) or programming against kdb+?

I know the two approaches are completely different; I just want advice from people who have worked with them before I invest in one or both technologies.

For the GPU side, I am planning to work with either a relational DB or a NoSQL DB like MongoDB, using separate tables and simple joins across 2-3 other tables.
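
To make the join shape concrete, here is a minimal F# sketch over in-memory records. The Trade/Instrument names and fields are invented for illustration; in practice the tables would live in the relational or MongoDB store:

    // Two hypothetical tables as F# record types.
    type Trade = { Sym: string; Price: float; Qty: int }
    type Instrument = { Sym: string; Name: string }

    let trades =
        [ { Sym = "AAPL"; Price = 190.0; Qty = 100 }
          { Sym = "MSFT"; Price = 410.0; Qty = 50 } ]

    let instruments =
        [ { Sym = "AAPL"; Name = "Apple" }
          { Sym = "MSFT"; Name = "Microsoft" } ]

    // Index the smaller table by key, then join trade rows against it.
    let bySym = instruments |> List.map (fun i -> i.Sym, i) |> Map.ofList
    let joined =
        trades
        |> List.choose (fun t ->
            Map.tryFind t.Sym bySym
            |> Option.map (fun i -> i.Name, t.Price, t.Qty))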

Does anyone know of any metrics or comparisons (mostly on speed) between the two approaches?



2 answers


As others have said, which is faster depends very much on your use case. I previously helped build a test framework of 15 queries, plus some algorithmic strategies, run against several different stock-market databases:

  • PostgreSQL
  • MySQL (in-memory version)
  • MongoDB (for the queries it supported)
  • kdb+
  • a few other newer NoSQL and column-oriented databases



kdb+ was significantly faster than the databases above on most queries. One database came close in performance, but it was significantly harder to get it to do the calculations I wanted.

Note: I cannot give hard numbers because doing so is against the terms of some of the database vendors. But I would emphasize that if you are going to build a system, your team's skills should influence the choice, as should how quickly you can change the system and its programming later.



In my honest opinion, it is much easier to write complex queries in kdb+ (and to understand them later) than in something like MongoDB.

I'm an F# fan too.

Now, either F# or kdb+ can help you think in a GPU-friendly way (arrays, whole-dataset operations, less linear control flow, parallelism). Whichever you choose, think about the path that will get you there and whether it locks you into one particular worldview.
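
To make that concrete, here is what the whole-array, data-parallel style looks like in plain F#. This is a sketch only: Array.Parallel runs on CPU cores, not a GPU, but the shape, one pure function applied across an entire array, is exactly what GPU libraries (Alea GPU, for example) want from you. The data is synthetic.

    // A million synthetic prices; the point is the shape, not the data.
    let prices = Array.init 1_000_000 (fun i -> 100.0 + float i * 0.01)

    // One pure function applied across the whole array at once:
    // the "whole task" mindset that maps well onto a GPU kernel.
    let returns =
        prices
        |> Array.Parallel.mapi (fun i p ->
            if i = 0 then 0.0 else (p - prices.[i - 1]) / prices.[i - 1])

The same computation written as a loop over mutable state would be much harder to hand to a GPU; written this way, swapping the execution target is mostly a library question.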

When it comes to modeling, context matters a lot. It really depends on what models you want to run and how bandwidth factors in.



The agility and speed of kdb+ are amazing. Likewise, F# is great for type safety and for research-driven fields such as the life sciences.

Nothing prevents you from using both together. Oh, and the 32-bit version of kdb+ can now be used both commercially and non-commercially.
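
As a rough sketch of what "both together" can look like, here is F# querying a kdb+ process over IPC. This assumes Kx's open-source c.cs .NET client is referenced; the class kx.c and its k method are from that client as I remember it (verify against the source), and the host, port and query are invented:

    // Open a connection to a kdb+ process (hypothetical host/port).
    let conn = new kx.c("localhost", 5001)

    // Run a q query server-side: kdb+ does the heavy lifting,
    // and F# receives the result as a boxed .NET object to unpack.
    let result = conn.k("select avg price by sym from trade")
    printfn "%A" result
    conn.Close()

The division of labor is the point: set-oriented crunching stays in kdb+, while the typed, compositional layer on top lives in F#.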

Like John, I have also tried many alternatives, from BerkeleyDB upwards. In particular, the non-kdb+ columnar options fell short in several ways, not just performance. I looked at them from a kernel perspective, and even spoke to some of the engineers who worked on those kernels once the sales teams had bowed out. There are underlying reasons, beyond benchmarks, why kdb+ is a smart way forward.

Speed is a factor whose weight varies from application to application. The other factors, and how they weigh against each other, are probably more universal.







