ETL / Data Warehousing Approach Using API

I have about 20 different data sources, all small applications that have APIs (some soaps and some others). I need to combine all this data, transform it and store the data in a data warehouse. I am considering an Azure SQL Database. However, I'm not sure which tool I can use to get the API and data. Each data source has restrictions on the connections that can be made in one day, and most of these APs require running multiple APIs to get the data.

I looked at Azure Data Factory ---- It doesn't support SOAP API I looked at Logic apps ---- I'm not sure if it can do complex ETLs and can fetch data

Another option I was thinking about was having an Azure VM with SQL Server installed and then using Talend to get the data through REST AND SOAP Connectors and doing ETL. Another approach I can take is instead of using Talend using SSIS. But I believe I need third party connectors for rest and soap.

I am looking for the most cost effective and scalable solution.

Any suggestion would be very helpful.


source to share

1 answer

Azure Data Factory is (currently) a larger transform operations developer than a full-fledged ETL tool, and it can run custom actions (in C # for example) that can work with web services APIs. Its movement all the time. See here:

Using Azure Factory Data to Get Data from REST API

However, you are correct that given the scale of your transformations, you might consider an alternative. During the conversation, I don't think a lot of people have cracked the halo of how to properly replicate the complex ETL experience in the cloud, but that will change quickly.

The other option you mentioned would work (IaaS Azure VM with ETL software installed). You don't need SQL Server for Talend unless you want to store your data there and not in Azure SQL Database. It comes pre-loaded with many connectors and is a good option if you have Talend skills.

Also SSIS will work as well: its preloaded web service task covers your SOAP APIs -services / and you can use a third party task / connector for REST or use Script Task / Transform i.e. your own C # code.



All Articles