Using Google Cloud Dataflow to Merge Flat Files and Import to Cloud SQL

We need to read data from CSV files, join two files on a shared column, and load the result into Cloud SQL using Google Cloud Dataflow.

We can already read the CSV files, but we are stuck on the next steps. Please point me to information or links on the following:

  • Joining or merging the flat files on a single column, or on a condition spanning multiple columns
  • Writing the merged PCollection to a Cloud SQL database
google-app-engine flat-file google-cloud-platform google-cloud-sql google-cloud-dataflow




1 answer


Here are some pointers that might be helpful:



  • https://cloud.google.com/dataflow/model/joins describes ways to join PCollections in Dataflow
  • There is currently no built-in sink for writing to Cloud SQL. However, you can either process the join results in a ParDo that writes each record individually or in batches (flushing periodically or in finishBundle()), or, if your needs are more complex than that, consider writing a custom Cloud SQL sink; see https://cloud.google.com/dataflow/model/sources-and-sinks
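To make the two pointers above concrete, here is a minimal plain-Python sketch of the logic only; it does not use the Dataflow SDK. The names `key_rows`, `inner_join`, and `BatchedSqlWriter` are hypothetical, `sqlite3` stands in for a Cloud SQL connection, and the method names `start_bundle`, `process`, and `finish_bundle` are chosen to mirror the lifecycle of a DoFn run by ParDo. Treat it as an illustration of join-by-key followed by batched writes, not as a finished sink.

```python
import csv
import io
import sqlite3
from collections import defaultdict


def key_rows(csv_text, key_column):
    """Parse CSV text and yield (key, row_dict) pairs, ready for a keyed join."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield row[key_column], row


def inner_join(left_pairs, right_pairs):
    """Group both inputs by key, then emit one merged row per matching pair."""
    grouped = defaultdict(lambda: ([], []))
    for key, row in left_pairs:
        grouped[key][0].append(row)
    for key, row in right_pairs:
        grouped[key][1].append(row)
    for lefts, rights in grouped.values():
        for left in lefts:
            for right in rights:
                yield {**left, **right}


class BatchedSqlWriter:
    """Buffers rows and writes them in batches (hypothetical helper).

    `connect` is any zero-argument callable returning a DB-API connection.
    The placeholder style in `insert_sql` depends on the driver
    ("?" for sqlite3, "%s" for most MySQL drivers).
    """

    def __init__(self, connect, insert_sql, columns, batch_size=500):
        self.connect = connect
        self.insert_sql = insert_sql
        self.columns = columns
        self.batch_size = batch_size

    def start_bundle(self):
        # Open one connection per bundle rather than one per record.
        self.conn = self.connect()
        self.buffer = []

    def process(self, row):
        self.buffer.append(tuple(row[c] for c in self.columns))
        if len(self.buffer) >= self.batch_size:
            self.flush()  # periodic flush

    def flush(self):
        if self.buffer:
            cursor = self.conn.cursor()
            cursor.executemany(self.insert_sql, self.buffer)
            self.conn.commit()
            self.buffer = []

    def finish_bundle(self):
        # Write whatever is left, then release the connection.
        self.flush()
        self.conn.close()
```

In a real pipeline the join would be done by the SDK's own grouping transforms and the writer logic would live inside the function passed to ParDo. Because a bundle can be retried, the insert statement should be idempotent (for example an upsert keyed on the join column) if duplicate rows would be a problem.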






