Long-running stored procedure not keeping the connection to an Azure database open

We have a very lengthy stored procedure doing ETL work to load data from a source table into a star schema (Fact-Dimensions) in an Azure database.

This stored procedure takes 10 to 20 hours to run over 10 million rows (using the MERGE statement).

We currently run the stored procedure from C# (ADO.NET) code with CommandTimeout = 0 (infinite). But sometimes the connection is dropped, because the connection to the Azure database is unstable.

Is it possible to run the stored procedure at the database level, without keeping a connection open, and have the stored procedure log to a Progress table so we can track its progress?

I see several recommendations:

  • SQL Server Agent is not an option, as it is currently not supported in Azure SQL Database.

  • SqlCommand.BeginExecuteNonQuery: I'm not 100% sure whether BeginExecuteNonQuery still keeps the connection open under the hood or not.

Is there any other way to do this?



2 answers




Azure Data Factory has a Stored Procedure activity that could do this. It has an optional timeout property in its policy section. If you leave it out, it defaults to infinite:

"policy": {
           "concurrency": 1,
           "retry": 3
           },

      

If you specify the timeout as 0 when creating the activity, you will see it disappear when you view the task in the portal. You could also try specifying the timeout as 1 day (24 hours), for example "timeout": "1.00:00:00", although I haven't verified that it times out correctly.
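
For illustration, a minimal sketch of what that could look like on the activity itself. The activity and procedure names (RunEtlProc, dbo.usp_LoadStarSchema) are placeholders, and in a real pipeline the activity would also carry its input/output dataset references:

{
  "name": "RunEtlProc",
  "type": "SqlServerStoredProcedure",
  "typeProperties": {
    "storedProcedureName": "dbo.usp_LoadStarSchema"
  },
  "policy": {
    "concurrency": 1,
    "retry": 3,
    "timeout": "1.00:00:00"
  }
}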

You could also set the timeout to 0 in the connection string, although again I haven't tested this option, for example:

{
  "name": "AzureSqlLinkedService",
  "properties": {
    "type": "AzureSqlDatabase",
    "typeProperties": {
      "connectionString": "Server=tcp:<servername>.database.windows.net,1433;Database=<databasename>;User ID=<username>@<servername>;Password=<password>;Trusted_Connection=False;Encrypt=True;Connection Timeout=0"
    }
  }
}

      



I find this to be simpler than Azure Automation, but it's a personal choice. Maybe try both and see which one works best for you.

I agree with some of the other comments that a MERGE over this volume of records is taking too long. I suspect either your table does not have the proper indexing to support the MERGE, or your service tier is too low. What service tier are you using, for example Basic, Standard, Premium (P1-P15)? Consider creating a separate question with the DDL of your table including indexes, some sample data, your MERGE statement and your service tier; I'm sure this could go faster.

As a test / quick fix, you could always refactor the MERGE into the equivalent INSERT / UPDATE / DELETE statements, as sketched below; I'm fairly sure it would go faster. Let us know.
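
For example, a minimal sketch of that refactor, assuming a hypothetical dimension table dbo.DimCustomer loaded from a staging table stg.Customer (all table and column names here are placeholders):

-- Update rows that already exist in the dimension.
UPDATE d
SET    d.Name = s.Name,
       d.City = s.City
FROM   dbo.DimCustomer AS d
INNER JOIN stg.Customer AS s
        ON s.CustomerKey = d.CustomerKey;

-- Insert rows that are new.
INSERT INTO dbo.DimCustomer (CustomerKey, Name, City)
SELECT s.CustomerKey, s.Name, s.City
FROM   stg.Customer AS s
WHERE  NOT EXISTS (SELECT 1
                   FROM dbo.DimCustomer AS d
                   WHERE d.CustomerKey = s.CustomerKey);

-- Delete rows no longer in the source, if the load requires it.
DELETE d
FROM   dbo.DimCustomer AS d
WHERE  NOT EXISTS (SELECT 1
                   FROM stg.Customer AS s
                   WHERE s.CustomerKey = d.CustomerKey);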

The connection between Azure Data Factory and the Azure database should be stable. If it isn't, you can raise support tickets. However, with a cloud architecture (and indeed any architecture) you need to design for failure. This means your design must allow for the possibility of a connection failure or a job failure: for example, make sure your job is restartable from the point where it crashed, make sure the error reporting is good, and so on. One sketch of a restartable pattern follows below.
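
As an illustration of restartability (and of the Progress table idea from the question), a minimal sketch assuming the same hypothetical stg.Customer staging table: the load runs in keyed batches and records each completed batch, so a rerun resumes after the last completed batch instead of starting over:

-- Hypothetical progress table recording the last key each completed batch reached.
CREATE TABLE dbo.EtlProgress
(
    BatchId     int IDENTITY(1,1) PRIMARY KEY,
    LastKey     bigint    NOT NULL,
    CompletedAt datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
);

-- Resume from the last completed batch (0 on a fresh run).
DECLARE @LastKey   bigint = (SELECT ISNULL(MAX(LastKey), 0) FROM dbo.EtlProgress);
DECLARE @NextKey   bigint;
DECLARE @BatchSize int = 100000;

WHILE 1 = 1
BEGIN
    -- Find the upper key of the next batch.
    SELECT @NextKey = MAX(k.CustomerKey)
    FROM (SELECT TOP (@BatchSize) CustomerKey
          FROM stg.Customer
          WHERE CustomerKey > @LastKey
          ORDER BY CustomerKey) AS k;

    IF @NextKey IS NULL BREAK;   -- no rows left to process

    -- ... load rows with CustomerKey > @LastKey AND CustomerKey <= @NextKey ...

    -- Record the completed batch so a rerun can skip it.
    INSERT INTO dbo.EtlProgress (LastKey) VALUES (@NextKey);
    SET @LastKey = @NextKey;
END;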

Also, based on experience and given your data volumes (which I would consider low), this job is taking far too long. There must be a problem with the job or with its design, and I highly recommend you fix that first.
