How can I read part of a dBase file

I have a very large dBase file (1.64Gb). It takes a very long time to load an entire file into R using a standard function foreign::read.dbf()

. I would like to load only a few variables in the dataset. Anyone have a solution?

+3


source to share


1 answer


I think the function read.dbf(...)

in the package foreign

is for reading part of the *.dbf

shapefile, in which case reading in part of the file really doesn't make sense. You seem to want to do something different.

Usage RODBC

may work depending on your system configuration. If you are using Windows and if you have the ODBC dBASE drivers installed, this will probably work for you (note: when you install MSOffice it installs a custom dsn called "dBase files" which should be accessible from RODBC

. So if you have MSOffice, this should work ...).

Important note . This will only work if you are using the 32-bit version of R. This is because there are no 64-bit ODBC dBASE drivers. Typically, when you download 64-bit R, you get both 32- and 64-bit versions, so it just switches between them.

library(RODBC)
# setwd("< directory with your files >")
conn <- odbcConnect(dsn="dBASE Files")
df   <- sqlFetch(conn,"myTable",max=10)   # grab first ten rows
head(df)
#       LENGTH COASTLN010
# 1 0.02482170          1
# 2 0.01832134          2
# 3 0.03117752          3
# 4 0.04269755          4
# 5 0.02696307          5
# 6 0.05047828          6

sqlQuery(conn,"select * from myTable where LENGTH<0.008")
#       LENGTH COASTLN010
# 1 0.00625200        186
# 2 0.00634897        379
# 3 0.00733319       1583
# 4 0.00369786       1617
# 5 0.00722233       1618
# 6 0.00524176       1636

      



The above example is only meant to give you an idea of ​​how to use RODBC

. In this example, I have a file myTable.dbf

in the "with all your files" directory and this dbf has two columns: LENGTH

and COASTLN010

(this file is actually part of the shoreline shapefile, but that doesn't matter ...).

If that doesn't work, try:

 conn <- odbcConnectDbase("myTable.dbf")

      

+7


source







All Articles