Perl - data access synchronization

I am starting to learn parallel programming and I want to compare a single threaded program with a multi threaded one.

I need to make a very simple algorithm that calculates the largest number of possible primes in one minute and shows me the last calculated prime and its prime position.

For example, prime 23 will display as 23 and its position 9 because it is the 9th number.

Without using streams, the number of primes found was 233,596, and the last one was 3,246,107. But with strings, 229,972 primes were found, and the last prime was 3,192,463.

I think this is wrong, because multithreading should have gotten excellent results for one thread. I believe this is a fairly simple error, but I cannot solve it as I still don't understand a lot of Perl parallelism.

This is the code. It calculates prime numbers in one minute of single-threading, and then the same four-thread calculation using shared variables.

use threads;
use threads::shared;

my $seconds = 60;

 # WITHOUT THREAD #
print "\n\n Calc without Threads:\n";
my $endTime = time() + $seconds;
calcWithoutThread();


print "\n\n ----------------===========================---------------- \n";

 # WITH THREAD #
print "\n\n Calc with Threads:\n";
my $prime :shared = 5;          # Starts from the 5th prime
my $totalPrime :shared = 2; # Starts with prime 2 and prime 3
my $lastPrime :shared = 0;
my $endTime1 = time() + $seconds;
my $thread1 = threads->create(\&calcWithThread);
my $thread2 = threads->create(\&calcWithThread);
my $thread3 = threads->create(\&calcWithThread);
my $thread4 = threads->create(\&calcWithThread);
$thread1->join();
$thread2->join();
$thread3->join();
$thread4->join();
print " Was found $totalPrime prime numbers. Last prime: $lastPrime.";


# SUB #

sub calcWithoutThread{
    $prime = 5;            # Starts from the 5th prime
    $totalPrime = 2;           # Starts with prime 2 and prime 3
    $lastPrime = 0;
    while (time() < $endTime){
        if(calcPrime($prime)){  
            $totalPrime ++;
            $lastPrime = $prime;
        }
        $prime ++;
    }
    print " Was found $totalPrime prime numbers. Last prime: $lastPrime.";
}

sub calcWithThread{
    while (time() < $endTime1) {
        lock($prime);           
        if(calcPrime($prime)){
            $totalPrime ++;
            $lastPrime = $prime;
        }
        $prime ++;
    }   
}

sub calcPrime{
    for($i=2 ; $i< sqrt ($prime) ; $i++){
        if( $prime % $i == 0){
            return 0;
        }
    }
    return 1;   
}

      

The logic is that threads perform this calculation synchronously, if it is a prime number or not, and also do not overlap values ​​in the calculation.

+3


source to share


2 answers


The problem is that your threads are in sync with each other using a locked variable $prime

. This means they do not have the ability to run concurrently and will also suffer from thread reallocation and synchronization of variable access

Symbols are not a great test for parallel processing, since the computation of each stroke depends on the previous result. However, you can do this by keeping a separate variable $prime

but starting at 5, 6, 7 and 8 for four threads and adding 4 between tests. Thus, they do not duplicate each other, but together they cover all integers

With this, the problem immediately arises in that none of the even numbers will ever be prime, so two of the four threads will never produce a result. This is still a valid parallelism test, but clearly very inefficient. You can fix this by running threads 5, 7, 9 and 11 and increasing by 8 for each test case. Then every stream will be profitable

Don't forget that you will have to code the same algorithm for single threaded code, otherwise the parallel partition will have an unfair advantage




Update

Perhaps the best way would be to block $prime

just to get the next number to test and increase it. Thus, all threads can perform their computations in parallel and only in a queue to get another job

Then you need to block $total_prime

to prevent two threads from growing at the same time, and also refresh $last_prime

while this block is in effect. Since parallel streams can generate prime numbers from a sequence, you will also need to check, the one you just found is simply greater than the last one found by any thread

Your subroutine will look like this. I apologize for changing the identifiers, but Perl has traditionally used snake_case

it and I find it camelCase

frustrating to read. Snake's case is also much easier for many people whose first language is not English and who cannot easily capitalize.

sub calc_with_thread {

    while ( time() < $end_time_1 ) {

        my $test = do {
            lock $prime;
            $prime++;
        };

        if ( calc_prime($test) ) {
            lock $total_prime;
            ++$total_prime;
            $last_prime = $test if $test > $last_prime;
        }
    }   
}

      

+4


source


Multithreading is not an automatic speed boost.

Shared variables are bound and bound variables are slow.

Here's a simpler, improved version that gets closer to the maximum number rather than timing. This avoids blocking. Threads are still significantly slower because they each have to do get and set on this sluggish shared variable for every count checked.

#!/usr/bin/env perl

use strict;
use warnings;
use Time::HiRes qw(time);
use v5.10;

use threads;
use threads::shared;

my $Max_Number = 50_000;

{
    print "\n\nCalc without Threads:\n";
    my $start_time = time;
    calcWithoutThread();
    my $end_time = time;
    say "$Max_Number took @{[ $end_time - $start_time ]} seconds.";
}

my $Shared_Prime :shared = 5;
{
    print "\n\nCalc with Threads:\n";
    my $start_time = time;
    my $thread1 = threads->create(\&calcWithThread);
    my $thread2 = threads->create(\&calcWithThread);
    my $thread3 = threads->create(\&calcWithThread);
    my $thread4 = threads->create(\&calcWithThread);
    $thread1->join();
    $thread2->join();
    $thread3->join();
    $thread4->join();
    my $end_time = time;
    say "$Max_Number took @{[ $end_time - $start_time ]} seconds.";
}

sub calcWithoutThread {
    my $prime = 5;                # Starts from the 5th prime

    while( $prime <= $Max_Number ) {
        calcPrime($prime++);
    }
}

sub calcWithThread {
    while( $Shared_Prime <= $Max_Number ) {
        calcPrime($Shared_Prime++);
    }
}

sub calcPrime {
    my $prime = shift;

    for( my $i=2 ; $i < sqrt($prime) ; $i++){
        if( $prime % $i == 0){
            return 0;
        }
    }

    return 1;   
}

      



Instead, avoid sharing or coordinating workers. For example, you can cut a piece into pieces. If you have 4 employees, give them each starting number and each one increases by 4. Then they can cover the whole space without a shared state.

  • Worker 1: 5, 9, 13, ...
  • Worker 2: 6, 10, 14, ...
  • Worker 3: 7, 11, 15, ...
  • Worker 4: 8, 12, 16, ...
#!/usr/bin/env perl

use strict;
use warnings;
use Time::HiRes qw(time);
use v5.10;

use threads;

my $Max_Number = 7_000;

{
    print "\n\nCalc without Threads:\n";
    my $start_time = time;
    calcWithoutThread(5, 1);
    my $end_time = time;
    say "$Max_Number took @{[ $end_time - $start_time ]} seconds.";
}

{
    print "\n\nCalc with Threads:\n";
    my $start_time = time;
    my $thread1 = threads->create(\&calcWithThread, 5, 4 );
    my $thread2 = threads->create(\&calcWithThread, 6, 4 );
    my $thread3 = threads->create(\&calcWithThread, 7, 4 );
    my $thread4 = threads->create(\&calcWithThread, 8, 4 );
    $thread1->join();
    $thread2->join();
    $thread3->join();
    $thread4->join();
    my $end_time = time;
    say "$Max_Number took @{[ $end_time - $start_time ]} seconds.";
}

sub calcWithoutThread {
    my($start, $inc) = @_;

    my @primes;
    for( my $prime = $start; $prime <= $Max_Number; $prime += $inc ) {
        push @primes, $prime if calcPrime($prime);
    }

    return \@primes;
}

sub calcWithThread {
    my($start, $inc) = @_;

    my @primes;
    for( my $prime = $start; $prime <= $Max_Number; $prime += $inc ) {
        push @primes, $prime if calcPrime($prime);
    }

    return \@primes;
}

sub calcPrime {
    my $prime = shift;

    for( my $i=2 ; $i < sqrt($prime) ; $i++){
        if( $prime % $i == 0){
            return 0;
        }
    }

    return 1;   
}

      

For me on my i7 MacBook (2 real cores, 4 virtual) threads are interrupted even around 6000. To get the results, $thread->join

will return a list of primes from calcWithThread

.

+2


source







All Articles