Recommended structure for data aggregation

We have an application that will collect data and store it on local WinXP PCs using Microsoft SQL Server Compact. We want to consolidate this data onto one full-blown SQL Server for reporting and archiving. The data transport should be fairly continuous (i.e., not batched), although some latency is acceptable.

Data is a one-way push from the collectors to the server. Collectors never need to know what other collectors are doing, and the central server will never update data on a collector. Current plans call for 5 collectors, but the design should scale to an essentially unlimited number.

We have to assume that we will be "mostly connected", but we cannot guarantee connectivity from the collectors to the server. If the server or network goes down, we will keep collecting locally, and the data will be pushed up once the server is available again.

Ideally, we would like a non-programmer engineer to be able to set up new collectors once we have done the infrastructure work. That is, we would write the code and wizards, but the end user cannot be assumed to know anything about writing code, although they will have reasonable technical computer literacy.

We now have two technology candidates:

  • SQL replication
  • Microsoft Sync Services

We have little experience with the former, but we know that setting up subscriptions and the like on SQL Server is painful and no fun to debug, so we are looking for alternatives.

We know almost nothing about #2, only that it has been suggested as an alternative way to get device data to a server.

Does anyone have experience with this type of scenario, or with either of these technologies, that they can share? SQL Server Compact on the collectors is a fixed requirement. SQL Server on the server side is not required, but it is desirable since the client already has it.





4 answers


I ended up going with option 3: neither. Instead, we just use the SqlBulkCopy class periodically (at a user-configurable interval, 5 seconds by default) to copy the new rows across. This works well because SqlBulkCopy.WriteToServer accepts an IDataReader: we open the table locally with TableDirect, look up the highest RowID on the remote table, and then pass a reader over the newer rows to WriteToServer.
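The approach above can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: the table and column names (`Readings`, `RowId`, etc.) and the connection strings are assumptions, and it uses a plain `WHERE RowId > @max` query rather than the TableDirect range seek the answer mentions.

```csharp
// Hedged sketch: push rows newer than the server's high-water mark
// from a local SQL Server Compact database to the central SQL Server.
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlServerCe;

static class Pusher
{
    public static void PushNewRows(string localConnStr, string remoteConnStr)
    {
        using (var remote = new SqlConnection(remoteConnStr))
        {
            remote.Open();

            // Find the highest RowId already present on the server.
            long maxRowId;
            using (var cmd = new SqlCommand(
                "SELECT COALESCE(MAX(RowId), 0) FROM Readings", remote))
            {
                maxRowId = Convert.ToInt64(cmd.ExecuteScalar());
            }

            using (var local = new SqlCeConnection(localConnStr))
            {
                local.Open();
                using (var localCmd = new SqlCeCommand(
                    "SELECT RowId, SensorId, Value, CollectedAt FROM Readings " +
                    "WHERE RowId > @max ORDER BY RowId", local))
                {
                    localCmd.Parameters.AddWithValue("@max", maxRowId);
                    using (IDataReader reader = localCmd.ExecuteReader())
                    using (var bulk = new SqlBulkCopy(remote))
                    {
                        bulk.DestinationTableName = "Readings";
                        bulk.WriteToServer(reader); // streams rows to the server
                    }
                }
            }
        }
    }
}
```

Because the server never writes back to the collectors and RowIDs only grow, the high-water-mark query makes the push idempotent: if the network drops mid-cycle, the next run simply resumes from the last row that made it across.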







I used Microsoft Sync Services before it was fully released. I liked it and it looks like a perfect fit for your application.

If you want to make life easier for yourself, I recommend using GUIDs (SQL Server's uniqueidentifier type) as the primary keys for all tables that you want to sync with the central server. This prevents key conflicts and a lot of extra coding.
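To illustrate why this helps: with IDENTITY integer keys, every collector would hand out 1, 2, 3, … independently, so rows from different collectors collide when merged. Generating the key on the collector with `Guid.NewGuid()` sidesteps that. The table and column names below are hypothetical.

```csharp
// Illustrative sketch, assuming a local table whose primary key column
// "Id" is a uniqueidentifier. Keys minted with Guid.NewGuid() on any
// number of collectors will not collide when merged on the server.
using System;
using System.Data.SqlServerCe;

static class Collector
{
    public static void InsertReading(SqlCeConnection local, int sensorId, double value)
    {
        using (var cmd = new SqlCeCommand(
            "INSERT INTO Readings (Id, SensorId, Value, CollectedAt) " +
            "VALUES (@id, @sensor, @value, @at)", local))
        {
            cmd.Parameters.AddWithValue("@id", Guid.NewGuid()); // globally unique key
            cmd.Parameters.AddWithValue("@sensor", sensorId);
            cmd.Parameters.AddWithValue("@value", value);
            cmd.Parameters.AddWithValue("@at", DateTime.UtcNow);
            cmd.ExecuteNonQuery();
        }
    }
}
```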



One caveat: I have heard that Sync Services has changed significantly since that first release, so my information may be out of date.





Try Sync Services and let us know how it works :) I saw the MSFT event where the presenters said, "I added these 3 lines of code and everything just syncs ... wooohhooo."

Sounds like the way to go to me.





The problem with replication is that when you change your schema, you will have manual work to do on each client to get replication running again. I have no experience with Sync Services, but I would ask the same question of it: what happens when the schema changes? If you have to touch every client, that can be a challenge.









