Task pool system for .NET

I am currently writing a bulk processing algorithm that determines the pitch of an audio stream read from disk. I have tuned my algorithm to the point where it runs in near real time when the data is processed serially.

Ideally, I would like the system to run faster than real time, so that I can feed data in as it arrives and produce the pitch track data after only a short delay.

Now, what bothers me is that serial processing leaves a lot of potential speedup on the table. I am running a quad-core i7 (with 8 hardware threads), so I should be able to go significantly faster by spreading the processing across multiple windows.

As it happens, I am currently doing the following:

  • Stream data from disk.
  • Buffer the data until I have a full analysis window.
  • Process the window.
  • Shift the data back by n samples, where n is the hop size (this can be as little as 1 ms in an 80 ms window!).
  • Rinse and repeat.
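The serial pipeline above can be sketched roughly as follows. This is a hypothetical illustration: the window and hop sizes, the file format, and `ProcessWindow` are all placeholders, not the asker's actual code.

```csharp
using System;
using System.IO;

class SerialPitchPipeline
{
    const int WindowSize = 3840;   // e.g. 80 ms at 48 kHz (assumed values)
    const int HopSize    = 48;     // e.g. 1 ms hop (assumed value)

    static void Run(string path)
    {
        var window = new float[WindowSize];
        using var reader = new BinaryReader(File.OpenRead(path));

        // Fill the initial window from the disk stream.
        for (int i = 0; i < WindowSize; i++)
            window[i] = reader.ReadSingle();

        while (true)
        {
            ProcessWindow(window);   // pitch analysis on one window

            // Slide: copy the window back by HopSize samples...
            Array.Copy(window, HopSize, window, 0, WindowSize - HopSize);

            // ...then read HopSize fresh samples into the tail.
            for (int i = WindowSize - HopSize; i < WindowSize; i++)
            {
                if (reader.BaseStream.Position >= reader.BaseStream.Length)
                    return;
                window[i] = reader.ReadSingle();
            }
        }
    }

    // Placeholder for the asker's pitch-detection routine.
    static void ProcessWindow(float[] w) { }
}
```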

Now, it seems to me that once I have a window, I could simply copy the data into a per-thread buffer (along with a memory location for the result to be written to). That way I could queue up to 7 windows for the thread pool to handle (keeping the 8th thread free to pump data from the stream).

When I try to submit the 8th audio window, I want the pool to block until a worker becomes available to process it, and so on. The idea is that I would keep 7 threads constantly busy processing data. From previous experience, I would expect around a 5x speedup from this.
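For what it's worth, the "block on the 8th window" behaviour described above can be had with built-in types: a `BlockingCollection<T>` bounded at 7 plus 7 long-running consumer tasks. This is a minimal sketch; `ProcessWindow` and `ReadWindows` are hypothetical placeholders for the asker's analysis and disk streaming.

```csharp
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // Bounded at 7: Add() blocks when 7 windows are already queued.
        var windows = new BlockingCollection<float[]>(boundedCapacity: 7);

        // 7 consumers; the producer acts as the 8th thread pumping data in.
        var workers = new Task[7];
        for (int i = 0; i < workers.Length; i++)
            workers[i] = Task.Factory.StartNew(() =>
            {
                foreach (var w in windows.GetConsumingEnumerable())
                    ProcessWindow(w);
            }, TaskCreationOptions.LongRunning);

        foreach (var w in ReadWindows())
            windows.Add(w);          // blocks on the 8th outstanding window

        windows.CompleteAdding();    // lets GetConsumingEnumerable finish
        Task.WaitAll(workers);
    }

    // Placeholders for the asker's pitch analysis and disk streaming.
    static void ProcessWindow(float[] w) { }
    static IEnumerable<float[]> ReadWindows() { yield break; }
}
```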

In the past, I have written my own task-based system in C++ that would do the job just fine, but this application is being developed in C#. To get good parallelism with low overhead in C++, I spent a significant amount of time building a good wait-free locking mechanism.

I was rather hoping that, in C#, someone would have taken that pain away for me. However, I can't find anything that fits. I have looked at System.Threading.ThreadPool, and it doesn't seem to offer a way to check how many threads are currently running; besides, the overhead looks prohibitive. The big problem is that I cannot reuse an existing preallocated structure (which matters for my processing), so I am forced to recreate it every time I submit a work item. This has the huge downside that I then generate work faster than I can process it: not only am I wasting a lot of time setting up structures and workspaces I don't really need, but memory usage spirals out of control.

Then I found System.Threading.Tasks, but that doesn't seem to offer the functionality I'm after either.

I suppose I could just use my C++ task manager via interop, but I really assumed that in this day and age someone would already have built something like this. So, am I missing something? Or can anyone point me to such a task management mechanism?

+3




3 answers


The Task Parallel Library (TPL) was designed and implemented for exactly the kind of problem you are trying to solve! You can also pipeline the process.

So it is definitely worth checking out.
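As a minimal TPL sketch (assumed, not from the answer): cap the degree of parallelism at 7 and let the partitioner hand windows to worker threads. `ReadWindows` and `ProcessWindow` are hypothetical placeholders for the asker's streaming and analysis code.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

class TplSketch
{
    static void Main()
    {
        // Leave one hardware thread free for streaming data from disk.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 7 };

        Parallel.ForEach(ReadWindows(), options, w => ProcessWindow(w));
    }

    // Placeholders for the asker's code.
    static IEnumerable<float[]> ReadWindows() { yield break; }
    static void ProcessWindow(float[] w) { }
}
```

Note that `Parallel.ForEach` alone does not bound how far the producer can run ahead of the consumers; for strict back-pressure, a bounded collection or TPL Dataflow (mentioned in another answer) is a better fit.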



+4




Well, as always in these cases, I recommend looking at ZeroMQ. It makes it easy to control the number of consumers.



Regarding your scratch areas: first of all, 0.5 GB is not much memory in this day and age. I think my phone has more RAM, never mind my desktop... If you want to keep memory consumption in check, just create one scratch area per thread, put them all in a pool, and have the producer take a scratch area from the pool before enqueuing a task, attaching it to that task. When the consumer is done, it returns the scratch area to the pool.
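The pooling scheme described above might be sketched like this (an assumed illustration, not code from the answer). Using a `BlockingCollection<T>` as the pool means `Rent` blocks automatically when every scratch area is in use, which doubles as the desired throttle on the producer.

```csharp
using System.Collections.Concurrent;

class ScratchPool
{
    readonly BlockingCollection<float[]> pool = new();

    public ScratchPool(int count, int size)
    {
        // Preallocate every scratch area up front; no allocation per task.
        for (int i = 0; i < count; i++)
            pool.Add(new float[size]);
    }

    // Blocks when all scratch areas are checked out, throttling the producer.
    public float[] Rent() => pool.Take();

    // Consumers hand their scratch area back when finished.
    public void Return(float[] area) => pool.Add(area);
}
```

Typical use: the producer calls `Rent()` before enqueuing a work item, stores the array in the work item, and the worker calls `Return()` when done.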

+3




I would use the TPL Dataflow library. It is designed for building processing blocks that can be chained together, with explicit control over the degree of parallelism and blocking semantics.
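A hedged sketch of that approach (TPL Dataflow lives in the System.Threading.Tasks.Dataflow NuGet package): a bounded `TransformBlock` gives both the 7-way parallelism and the back-pressure the question asks for. `ReadWindows`, `DetectPitch`, and `StorePitch` are hypothetical placeholders.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class DataflowSketch
{
    static async Task Main()
    {
        var analyse = new TransformBlock<float[], double>(
            w => DetectPitch(w),
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = 7,  // 7 concurrent workers
                BoundedCapacity = 7          // SendAsync waits when full
            });

        var collect = new ActionBlock<double>(p => StorePitch(p));
        analyse.LinkTo(collect,
            new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var w in ReadWindows())
            await analyse.SendAsync(w);      // back-pressure at capacity

        analyse.Complete();
        await collect.Completion;
    }

    // Placeholders for the asker's streaming, analysis, and output code.
    static IEnumerable<float[]> ReadWindows() { yield break; }
    static double DetectPitch(float[] w) => 0.0;
    static void StorePitch(double pitch) { }
}
```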

+1








