BigQuery JDBC Driver will not return more than 100,000 rows

I am using the StarSchema JDBC driver for Google BigQuery in Pentaho PDI:

http://code.google.com/p/starschema-bigquery-jdbc/

My query through the BigQuery web console returns 129993 rows, but when I run the same query using the JDBC driver, it only returns 100,000 rows. Is there some option or limitation that I am not aware of?
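
For context, here is roughly how the query goes through the driver (a simplified sketch; the driver class name and connection URL format are assumptions based on the StarSchema project, so check its documentation for the exact syntax):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RowCountCheck {
    public static void main(String[] args) throws Exception {
        // Driver class name and URL format are assumptions -- see the
        // starschema-bigquery-jdbc documentation for the exact form.
        Class.forName("net.starschema.clouddb.jdbc.BQDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:BQDriver:<project-id>?withServiceAccount=false",
                "<account-email-or-client-id>", "<key-path-or-secret>");

        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("<the same query run in the web console>");

        // Count how many rows the driver actually hands back.
        int count = 0;
        while (rs.next()) {
            count++;
        }
        // The web console reports 129,993 rows; through the driver this stops at 100,000.
        System.out.println("Rows returned: " + count);

        rs.close();
        stmt.close();
        conn.close();
    }
}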

2 answers


The StarSchema code looks like it only returns the first page of results.

The code here needs to be updated to fetch the rest of the results. It should look something like this:



public static GetQueryResultsResponse getQueryResults(Bigquery bigquery,
        String projectId, Job completedJob) throws IOException {
    // Fetch the first page of results for the completed query job.
    GetQueryResultsResponse queryResult = bigquery.jobs()
            .getQueryResults(projectId,
                    completedJob.getJobReference().getJobId()).execute();
    // Keep requesting further pages, starting at the row offset we already
    // have, until every row reported by totalRows has been collected.
    while (queryResult.getTotalRows() > queryResult.getRows().size()) {
        queryResult.getRows().addAll(
            bigquery.jobs()
                .getQueryResults(projectId,
                        completedJob.getJobReference().getJobId())
                .setStartIndex(queryResult.getRows().size())
                .execute()
                .getRows());
    }
    return queryResult;
}
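
An equivalent approach, if you would rather not compute row offsets yourself, is to follow the page token returned by getQueryResults. This is only a sketch against the same google-api-services-bigquery client classes used above (setPageToken/getPageToken are the paging fields on that API; verify them against the client version the driver bundles):

public static GetQueryResultsResponse getAllQueryResults(Bigquery bigquery,
        String projectId, Job completedJob) throws IOException {
    String jobId = completedJob.getJobReference().getJobId();
    // First page of results.
    GetQueryResultsResponse result = bigquery.jobs()
            .getQueryResults(projectId, jobId).execute();
    String pageToken = result.getPageToken();
    // Keep fetching pages until the service stops returning a token.
    while (pageToken != null) {
        GetQueryResultsResponse page = bigquery.jobs()
                .getQueryResults(projectId, jobId)
                .setPageToken(pageToken)
                .execute();
        if (page.getRows() != null) {
            result.getRows().addAll(page.getRows());
        }
        pageToken = page.getPageToken();
    }
    return result;
}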

      



Modified code based on Jordan's answer; the solution looks like this:

// Requires: import java.math.BigInteger;
public static GetQueryResultsResponse getQueryResults(Bigquery bigquery,
        String projectId, Job completedJob) throws IOException {
    GetQueryResultsResponse queryResult = bigquery.jobs()
            .getQueryResults(projectId,
                    completedJob.getJobReference().getJobId()).execute();
    long totalRows = queryResult.getTotalRows().longValue();
    if (totalRows == 0) {
        // If there are no results, getRows() is null and calling
        // queryResult.getRows().size() would throw a NullPointerException.
        return queryResult;
    }
    while (totalRows > (long) queryResult.getRows().size()) {
        queryResult.getRows().addAll(
            bigquery.jobs()
                .getQueryResults(projectId,
                        completedJob.getJobReference().getJobId())
                .setStartIndex(BigInteger.valueOf(queryResult.getRows().size()))
                .execute()
                .getRows());
    }
    return queryResult;
}

      



This should fix the problem. A new build, bqjdbc-1.3.1.jar, has also been uploaded to Google Code.
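
For anyone wiring this helper into their own code, a rough usage sketch (bigquery, projectId and completedJob are assumed to come from your existing job-submission code; TableRow and TableCell are the model classes from com.google.api.services.bigquery.model):

GetQueryResultsResponse result = getQueryResults(bigquery, projectId, completedJob);
System.out.println("Total rows: " + result.getTotalRows());

// Each TableRow holds its cell values in column order; getV() returns the raw value.
for (TableRow row : result.getRows()) {
    StringBuilder line = new StringBuilder();
    for (TableCell cell : row.getF()) {
        line.append(cell.getV()).append('\t');
    }
    System.out.println(line);
}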
