Using R to read excel-colorinfo

Is there a way to read cell colorindex from excelfiles with R? While I can set the cell color with packages like XLConnect or XLSX, I haven't found a way to extract color information from existing books.

Thanks for any help Ben

+4


source to share


2 answers


R-Bloggers have provided a feature that will do the job for you. I've included the answer here for future reference.

Read the Excel file using the package xlsx

:

library(xlsx)
wb     <- loadWorkbook("test.xlsx")
sheet1 <- getSheets(wb)[[1]]

# get all rows
rows  <- getRows(sheet1)
cells <- getCells(rows)

      

This part extracts information that will later be used to get the background color (or other style information) of the cells:

styles <- sapply(cells, getCellStyle) #This will get the styles

      

This function

, which defines / retrieves the background color of the cell:

cellColor <- function(style) 
   {
    fg  <- style$getFillForegroundXSSFColor()
    rgb <- tryCatch(fg$getRgb(), error = function(e) NULL)
    rgb <- paste(rgb, collapse = "")
    return(rgb)
   }

      



error

will handle cells with no background color.

Using sapply

, you can get the background color for all cells:

sapply(styles, cellColor)

      

You can also classify / identify them by knowing the RGb codes:

mycolor <- list(green = "00ff00", red = "ff0000")
m     <- match(sapply(styles, cellColor), mycolor)
labs  <-names(mycolor)[m]

      

You can read more and learn how to apply this in R bloggers

You can get RGB codes at RapidTables.com

+6


source


Old question, but maybe it can help someone in the future.

The POI library (Java) has strange behavior (at least on my computer). He's not getting the colors right. The code provided in @ M --'s answer works well when the color is the foreground color (indexed color), but doesn't work when the color is, for example, grayscale. To get around this, you can use the following code using a function getTint ()

. Hue is a number between -1 (dark) and 1 (light), and by combining it with the RGB ( getRgb ()

) function , you can completely restore the color.

cell_color <- function(style){
  fg  <- style$getFillForegroundXSSFColor()

  hex <- tryCatch(fg$getRgb(), error = function(e) NULL)
  hex <- paste0("#", paste(hex, collapse = ""))
  tint <- tryCatch(fg$getTint(), error = function(e) NULL)

  if(!is.null(tint) & !is.null(hex)){   # Tint varies between -1 (dark) and 1 (light)
    rgb_col <- col2rgb(col = hex)

    if(tint < 0) rgb_col <- (1-abs(tint))*rgb_col
    if(tint > 0) rgb_col <- rgb_col + (255-rgb_col)*tint

    hex <- rgb(red = rgb_col[1, 1], 
               green = rgb_col[2, 1], 
               blue = rgb_col[3, 1], 
               maxColorValue = 255)
  }

  return(hex)
}

      

Some links to help:



https://poi.apache.org/apidocs/dev/org/apache/poi/hssf/usermodel/HSSFExtendedColor.html#getTint--

https://bz.apache.org/bugzilla/show_bug.cgi?id=50787

Getting Excel fill colors using Apache POI

0


source







All Articles