Have a fixed number of threads work through a larger set of tasks

I have a predefined number of tasks (Task objects) and also a given number of threads.

Let's assume 8 threads and 32 tasks.

My current solution allows me to run 8 tasks in parallel, and once they are finished, run 8 more in parallel.

With this solution, it can happen that 7 finished threads sit idle waiting for the slowest one.

I would like a solution where, as soon as a thread has finished executing its task, it picks up a new one, until there are no tasks left.

My current implementation:

vector<Task> tasks; // the 32 tasks
unsigned int maxThreadCount(8);

// note: the loop bound must be tasks.size(), not tasks.size() - maxThreadCount,
// or the last batch of tasks is never run (and the subtraction underflows
// when there are fewer tasks than threads)
for (unsigned int i = 0; i < tasks.size(); i += maxThreadCount)
{
    vector<thread> threads;
    for (unsigned int j = 0; j < maxThreadCount && i + j < tasks.size(); ++j)
        threads.push_back(thread(std::bind(&Task::some_function, &tasks[i + j])));

    for (auto& t : threads)
        t.join();
}

      



4 answers


This is a bit of a hack, and obviously it depends on your needs. The idea is that the threads keep pulling tasks off the shared list and invoking the member function until the list is exhausted, at which point each thread simply exits.

Something like this:

// requires: <algorithm>, <atomic>, <thread>, <vector>
static void ThreadProc(
    std::vector<Task>& tasks,
    std::atomic_uint& idx,
    void (Task::*member)())
{
    unsigned int i = 0;
    while ((i = idx++) < tasks.size())   // atomically claim the next index
        (tasks[i].*member)();
}

void RunTasks(
    std::vector<Task>& tasks, 
    unsigned int n_threads = std::max(std::thread::hardware_concurrency(),1U))
{
    // adjust number of threads to be no larger than the task set
    //  there is no reason to spin up new threads that will have
    //   nothing to do but exit.
    if (tasks.size() < n_threads)
        n_threads = tasks.size();

    // shared index into the task container        
    std::atomic_uint idx{0};

    // spin up the work crew
    std::vector<std::thread> threads;
    threads.reserve(n_threads);
    while (n_threads--)
    {
        threads.emplace_back(ThreadProc, 
                             std::ref(tasks), 
                             std::ref(idx), 
                             &Task::some_function);
    }

    for (auto &t : threads)
        t.join();
}

      



For a fixed set of tasks, this is about as simple as the thread-proc model gets. It needs no additional queue container; the task vector itself is enough. And it needs no lock objects; a much lighter atomic is used instead.

Obviously all of this goes out the window if you need fancier access to the container, such as adding new items to the middle of the task vector while the threads are running. That would require a different approach, but for a one-shot job over a fixed, enumerable set of tasks, this is hard to beat. From your description, that's all you need.

Good luck.



Have you considered using a queue to store the jobs, with each thread popping a job off the queue and processing it? Something like this:



http://vichargrave.com/multithreaded-work-queue-in-c/



This is very easy to do with Boost.Asio. Just call io_service::run from 8 threads and then io_service::post your work items into the io_service. An example can be found here .



Why not use a full-featured parallel library or language extension such as OpenMP or Cilk Plus? They are all fairly portable now, supporting the major OSes and compilers.

That way you don't have to reinvent the wheel: the library keeps track of the threads and the load balancing for you, and you avoid the pitfalls discussed in: Why is OpenMP better than threads?

For example, a vector of tasks can be run in parallel on the default number of threads as simply as:

cilk_for (int i = 0; i < tasks.size(); i++)
    tasks[i].some_function();

      

You can also change the number of threads if needed.







