Importing process maxing SQL memory

I have an importer process that runs as a Windows service (run as an ordinary application when debugging) and processes various XML documents and CSVs, importing them into a SQL database. Everything was fine until I had to process a large amount of data (120k rows) from another table, which I handle the same way as the XML documents.

Now I've found that the SQL Server's memory usage climbs to the point where it just hangs. My application never gets a timeout from the server and everything simply stops.

I can still make separate calls to the database server, but the application's connection is just stuck, with no visible activity in SQL Activity Monitor and nothing showing up in Profiler.

Any ideas on where to start solving this problem would be greatly appreciated as we have been battling it for over a week now.

The main architecture is C# 2.0 using NHibernate as the ORM; data is pulled into the C# logic, processed, and then written back to the same database, along with logs to other tables.


The only other problem that sometimes happens is that, for some reason, a cursor is opened on this massive table, which I can only assume is generated by ADO.NET; for example, sp_cursorfetch 180153005,16,113602,100 is called thousands of times according to Profiler.



5 answers


I found out that NHibernate was creating a cursor on the large table. I haven't figured out why yet, but in the meantime I have replaced the data access for that large table with direct ADO.NET calls.
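A minimal sketch of what that direct ADO.NET access can look like is below; the table and column names are placeholders, not from the original post. A plain forward-only SqlDataReader streams the rows to the client, so no server-side cursor (and no sp_cursorfetch storm) gets created:

    using System.Data.SqlClient;

    class LargeTableReader
    {
        // Reads the big table with a forward-only SqlDataReader instead of the ORM,
        // so the rows stream straight to the client without a server-side cursor.
        public static void Process(string connectionString)
        {
            using (SqlConnection conn = new SqlConnection(connectionString))
            using (SqlCommand cmd = new SqlCommand(
                "SELECT Id, Payload FROM dbo.ImportSource", conn)) // placeholder table/columns
            {
                conn.Open();
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        int id = reader.GetInt32(0);
                        string payload = reader.GetString(1);
                        // ... run the existing import logic on each row here ...
                    }
                }
            }
        }
    }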





How often are you COMMITting the data? Are there any locks or blocking (sp_who)? If 120,000 rows is considered large, how much RAM does SQL Server have? When the application freezes, is there any indication of where it hangs (is it on an INSERT, a lookup SELECT, or something else)?

It sounds to me like the commit batch size is too small. Typically in SSIS ETL tasks I use a batch size of 100,000 for narrow rows when the source has more than 1,000,000 rows, but I never go below 10,000 even for very wide rows.
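If you stay with NHibernate, a rough sketch of committing in batches is below; the batch size, entity list and session factory are assumptions for illustration, not taken from the question. Flushing, clearing and committing every N rows keeps both the transaction and NHibernate's first-level cache small:

    using System.Collections;
    using NHibernate;

    class BatchImporter
    {
        public static void Import(ISessionFactory sessionFactory, IList rows)
        {
            const int batchSize = 10000; // tune upward for narrow rows, as described above

            using (ISession session = sessionFactory.OpenSession())
            {
                ITransaction tx = session.BeginTransaction();
                for (int i = 0; i < rows.Count; i++)
                {
                    session.Save(rows[i]);

                    if ((i + 1) % batchSize == 0)
                    {
                        session.Flush();  // push the pending inserts to SQL Server
                        session.Clear();  // drop the first-level cache so memory stays flat
                        tx.Commit();      // keep each transaction (and its locks) small
                        tx = session.BeginTransaction();
                    }
                }
                session.Flush();
                tx.Commit();              // commit the final partial batch
            }
        }
    }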



I wouldn't use an ORM for a large ETL job unless the transformations were extremely complex with a lot of business rules. Even with a lot of relatively simple business transformations, I would consider loading the data into simple staging tables and using T-SQL to do all the inserts, lookups, etc.
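As an illustration of the staging-table approach: once the raw rows are in a staging table, a single set-based statement can do the lookups and the final insert. The table and column names below are invented for the example.

    using System.Data.SqlClient;

    class StagingLoad
    {
        public static void MergeFromStaging(string connectionString)
        {
            // Hypothetical target, staging and lookup tables; one INSERT...SELECT
            // replaces row-by-row lookups and inserts from the application.
            const string sql = @"
    INSERT INTO dbo.ImportTarget (CustomerId, Amount)
    SELECT c.CustomerId, s.Amount
    FROM   dbo.ImportStaging AS s
    JOIN   dbo.Customer      AS c ON c.ExternalRef = s.ExternalRef;

    TRUNCATE TABLE dbo.ImportStaging;";

            using (SqlConnection conn = new SqlConnection(connectionString))
            using (SqlCommand cmd = new SqlCommand(sql, conn))
            {
                cmd.CommandTimeout = 0; // large set-based statements can exceed the 30 s default
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }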



Are you loading this into SQL using BCP? If not, the transaction log may not be able to keep up with the input. On a test machine, try switching the recovery model to Simple, or use BCP/bulk-load methods to get the data in (bulk loads can be minimally logged, which takes most of the pressure off the transaction log).
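For example, a character-format bcp load can be shelled out from the importer roughly like this; the server, database, table and file names are placeholders, and bcp.exe is assumed to be on the PATH. On the test box you could first run ALTER DATABASE ImportDb SET RECOVERY SIMPLE.

    using System.Diagnostics;

    class BcpLoad
    {
        public static void Run()
        {
            // bcp <table> in <file>: -S server, -T trusted connection, -c character mode,
            // -t field terminator, -b batch size
            ProcessStartInfo psi = new ProcessStartInfo();
            psi.FileName = "bcp";
            psi.Arguments = "ImportDb.dbo.ImportStaging in C:\\import\\rows.csv -S MYSERVER -T -c -t, -b 10000";
            psi.UseShellExecute = false;

            using (Process bcp = Process.Start(psi))
            {
                bcp.WaitForExit();
            }
        }
    }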



To add to StingyJack's answer...

If you are unable to use straight BCP due to the processing requirements, have you considered running the import on a separate SQL Server instance (a separate box) using your tool, and then using BCP to move the results across?

The key to making this work is keeping the staging machine clean, i.e. holding no data other than the current working set. That should keep RAM usage low enough for the import to work, since you are not contending with tables of (I assume) millions of records. The end result would be a single view or table in this second database that can easily be BCP'ed over to the real database when all the processing is complete.

The downside is, of course, having another box... and a considerably more complex architecture. And it all depends on your schema and whether that kind of thing could be maintained easily...

I've had to do this with a very large and complex import, and it has worked well in the past. Expensive, but effective.



Since you are rewriting anyway, you may not be aware that you can invoke bulk copy directly from .NET through the System.Data.SqlClient.SqlBulkCopy class. See this article for some interesting information about performance.
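A minimal SqlBulkCopy sketch, assuming the importer can materialise its processed rows into a DataTable; the destination table name is a placeholder:

    using System.Data;
    using System.Data.SqlClient;

    class BulkCopyExample
    {
        public static void Load(string connectionString, DataTable rows)
        {
            using (SqlBulkCopy bulk = new SqlBulkCopy(connectionString))
            {
                bulk.DestinationTableName = "dbo.ImportTarget"; // placeholder
                bulk.BatchSize = 10000;   // rows sent to the server per batch
                bulk.BulkCopyTimeout = 0; // no timeout for a long-running load
                bulk.WriteToServer(rows); // columns map by ordinal unless ColumnMappings are set
            }
        }
    }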
