Parallel loading odbc in Stata

Question

Parallel loading odbc in Stata

I am trying to parallelize loading data over ODBC connection to Stata using a parallel command from SSC.

Here is a sample file myloop.do that iterates over two IDs:

#delimit;

foreach id in 77860668 1124831359 {;
    odbc load, exec(`"
    SELECT
        user_slctd_id
        ,user_id
        FROM dw_table
    WHERE user_id =`id'
    "') dsn("my_dsn") lowercase sqlshow clear;

    save "id_`id'.dta", replace;
};

It works well. However, when I try to run this code in parallel, follow these steps:

parallel setclusters 2
parallel do "myloop.do"

It hangs on "Waiting for clusters to complete".

How do I diagnose what's going on? Is there a better way to get data in parallel and add all files together?

I am using parallel module version 1.14.6.17 17jun2014.

+3

parallel-processing odbc stata

Dimitriy V. Masterov 04 Aug 14 at 22:21

source to share

2 answers

Which ODBC driver are you using? What type of database?

This is almost like a situation where a driver can only handle one statement per connection. It's hard to tell without knowing what's going on at the driver level.

I would suggest creating an ODBC trace to see what the ODBC calls are doing while hanging. Check for any underlying errors that might be the root cause of the "wait" message.

Check your driver documentation for the specific surrounding number of connection and stream statements; in case this could be a factor.

+2

PRGSTony 21 Aug 14 at 14:25

source to share

gvegayon · Accepted Answer · 2014-10-04T08:17:36+0000

If you want to know what's going on with the concurrent behind the scenes, you should use the keep option, like this:

parallel setclusters 2
parallel do "myloop.do", keep

This will mean that it should not delete auxiliary files (among them the log file).

Also, if you do this, you can always check out the latest version in the project repo http://github.com/gvegayon/parallel

Please let me know if this helps.

Parallel loading odbc in Stata

More articles: