R to sqldf

I need to loop over the sqldf statements, and for that I need to call the loop variable inside the sqldf code:

My data tables can be:

data <- read.table(text ="
    loaddate DaysRange DaysRangeNext
1 2014-03-16        30            30
2 2014-03-16         0             0
3 2014-03-16         0             0
4 2014-03-16        60            NA
5 2014-04-16        30            30
6 2014-04-16         0            30
"
,header = TRUE)

      

then i format loaddate as date:

data$loaddate<-as.Date(as.character(data$loaddate), format='%Y-%m-%d')

      

Let's say I have a vector "loaddates":

loaddates<- unique(sort(data$loaddate))

      

And I need to run the following code for each loaddate:

for (i in loaddates) {

sqldf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
COUNT(*) AS clientes
FROM data AS D
WHERE D.loaddate = i
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
")        }

      

But I am getting followng error:

Error in sqliteSendQuery (con, statement, bind.data): error in statement: no such column: i

Is there a way to store the value of a variable and use it inside a loop?

Thank.

EDITION:

I tried:

sqldf(
strwrap(sprintf("
SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
COUNT(*) AS clientes 
FROM data AS D
WHERE D.LoadDate = '%s'
GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext
",i),simplify=TRUE,width=1000000))

      

But I got:

> [1] loaddate      DaysRange     DaysRangeNext clientes      <0 rows>
> (or 0-length row.names)

      

+3


source to share


4 answers


The variable i

will not be replaced in the request as is. You need sprintf

to assign a value to it. (I also don't know if you need to account for line breaks, but to be sure I provide it below. Maybe you don't need it for sqldf, in which case just remove strwrap).

#let assume loaddates is the following:
loaddates <- 'something'

      

One of the ways to get the request according to your desire, i.e. without break lines and i

given the loaddates values:

strwrap(sprintf("
                SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
                COUNT(*) AS clientes
                FROM deuda AS D
                WHERE D.loaddate = '%s'
                GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
                ORDER BY D.DaysRange, D.DaysRangeNext
                ",i),simplify=TRUE,width=1000000)

      

which will output:

[1] "SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, COUNT(*) AS clientes FROM deuda AS D WHERE D.CodEmp = 'TGG' and D.loaddate = something GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext ORDER BY D.DaysRange, D.DaysRangeNext"

      

This is what you need in one line with no break lines or with no variable i

assigned.



In your loop, it should be:

for (i in loaddates) {

strwrap(sprintf("
                SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
                COUNT(*) AS clientes
                FROM deuda AS D
                WHERE D.loaddate = '%s'
                GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
                ORDER BY D.DaysRange, D.DaysRangeNext
                ",i),simplify=TRUE,width=1000000)

}

      

With your dataset:

library(sqldf)
data <- read.table(text ="
    loaddate DaysRange DaysRangeNext
1 2014-03-16        30            30
2 2014-03-16         0             0
3 2014-03-16         0             0
4 2014-03-16        60            NA
5 2014-04-16        30            30
6 2014-04-16         0            30
"
                   ,header = TRUE,stringsAsFactors=F)

loaddates<- unique(sort(data$loaddate))

for (i in loaddates) {

  print(sqldf(
  strwrap(sprintf("
                SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
                COUNT(*) AS clientes
                FROM data AS D
                WHERE D.loaddate = '%s'
                GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
                ORDER BY D.DaysRange, D.DaysRangeNext
                ",i),simplify=TRUE,width=1000000) ))
}

      

Output:

    loaddate DaysRange DaysRangeNext clientes
1 2014-03-16         0             0        2
2 2014-03-16        30            30        1
3 2014-03-16        60            NA        1
    loaddate DaysRange DaysRangeNext clientes
1 2014-04-16         0            30        1
2 2014-04-16        30            30        1

      

+3


source


You can get this by specifying the SQL statement in a loop, but outside the function call.



for (i in loaddates) {

statement = paste( " SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext,
              COUNT(*) AS clientes
              FROM data AS D
              WHERE D.loaddate = ", i,
"GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
ORDER BY D.DaysRange, D.DaysRangeNext " )

sqldf(statement)
}

      

+1


source


First, create a new data frame. Then join it:

num_Pcode <- as.numeric("3550")
df_Pcode_0 <- as.data.frame(num_Pcode)
df_Pcode_0
...

      

returns num_Pcode

.

0


source


fn$sqldf

allows to use $

inside sql statement to interpolate R variables. See Example 5 on the sqldf github home page and see the bottom of the man page ?fn

for many more examples. We could reduce it setNames(loaddates, loaddates)

to just loaddates

if we don't need the output names.

Map(function(i)
  fn$sqldf("
    SELECT D.LoadDate,D.DaysRange, D.DaysRangeNext, 
    COUNT(*) AS clientes
    FROM data AS D
    WHERE D.loaddate = $i
    GROUP BY D.LoadDate,D.DaysRange, D.DaysRangeNext
    ORDER BY D.DaysRange, D.DaysRangeNext
  "), setNames(loaddates, loaddates))

      

giving:

$`2014-03-16`
    loaddate DaysRange DaysRangeNext clientes
1 2014-03-16         0             0        2
2 2014-03-16        30            30        1
3 2014-03-16        60            NA        1

$`2014-04-16`
    loaddate DaysRange DaysRangeNext clientes
1 2014-04-16         0            30        1
2 2014-04-16        30            30        1

      

0


source







All Articles