Processing CSV files from the Internet using an embedded Java database

Short version: given that I don't need to keep the data around for long, how can I programmatically create an HSQLDB database and load some CSV data into it? My schema will match the files exactly, and the files have matching column names.

The whole thing is a throwaway process.

More details:

I need to apply some simple SQL techniques to three CSV files downloaded over the internet, then build some DTOs that I can feed into existing code for further processing and saving via REST. I don't really want to set up a database, but the CSV files are related by foreign keys, so I thought I'd use an embedded in-memory database to do the work and then throw it all away.

I'm thinking of a command-line application that works like this:

  1. Create a new in-memory database in HSQLDB.
  2. Run three HTTP GETs on three threads using Apache HttpClient.
  3. Import the CSVs into three HSQLDB MEMORY tables.
  4. Run some SQL.
  5. Parse the results into my existing DTOs.
  6. Etc.

I could use code pointers and utilities for items 1 and 3. Also, is there an alternative to HSQLDB that I should consider?



2 answers


Check out opencsv. It will help you parse the CSV files.
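For example, a minimal sketch, assuming the opencsv jar is on the classpath (the file name and data here are made up; your real input would come from the HTTP downloads):

```java
import com.opencsv.CSVReader;

import java.io.FileReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class CsvParseExample {

    // Read every record of a CSV file; CSVReader handles quoting
    // and embedded commas for you.
    static List<String[]> readAll(Path csv) throws Exception {
        List<String[]> rows = new ArrayList<>();
        try (CSVReader reader = new CSVReader(new FileReader(csv.toFile()))) {
            String[] line;
            while ((line = reader.readNext()) != null) {
                rows.add(line);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for one of the three downloaded files (hypothetical data).
        Path csv = Files.createTempFile("prices", ".csv");
        Files.write(csv, List.of("id,name,price", "1,widget,9.99", "2,gadget,19.50"));

        List<String[]> rows = readAll(csv);
        // First row is the header, matching the column names in the schema.
        System.out.println("Columns: " + String.join(", ", rows.get(0)));
        System.out.println("Data rows: " + (rows.size() - 1));
    }
}
```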





For a command-line workflow, HSQLDB ships with the SqlTool utility. Your procedure can be completed as follows:

  1. Create a new in-memory HSQLDB database (just connect to an in-memory JDBC URL).
  2. Run three HTTP GETs using Apache HttpClient to fetch the CSV files.
  3. Create three HSQLDB TEXT tables and set the SOURCE of each table to the corresponding CSV file.
  4. Run some SQL and parse the results into your existing DTOs.


Creating TEXT tables in a purely in-memory database was not possible when this question was asked; it is now fully supported in HSQLDB 2.x.







