Need help understanding Perl code - Multi process / fork

Question

I was looking for an example of how to limit the number of forked processes running at the same time, and I came across this old code:

#!/usr/bin/perl
#total forks, max childs, what to run
#function takes 2 scalars and a reference to code to run
sub mfork ($$&) {
        my ($count, $max, $code) = @_;
        # total number of processes to spawn
        foreach my $c (1 .. $count) {
                #what is happening here? why wait vs waitpid?
                wait unless $c <= $max;
                die "Fork failed: $!\n" unless defined (my $pid = fork);
                # I don't understand the arrow notation here and how it calls a function.
                # Also, is 'unless $pid' saying: run the function unless you're the parent?
                exit $code -> ($c) unless $pid;
        }
        # No idea what is happening here. Why are we waiting twice? For the last process?
        # Why '1 until (-1 == wait)'? What is the 1 doing here?
        1 until -1 == wait;
}


#code to run
mfork 10, 3, sub {
        print "$$: " . localtime() . ": Starting\n";
        select undef, undef, undef, rand 2;
        print "$$: " . localtime() . ": Exiting\n";
};

perl fork multiprocess

genx1mx6 07 jul. 15 at 6:55

1 answer

simbabque · Accepted Answer · 2015-07-07T07:56:02+0000

Let's look at the code. The code is yours, with most of your comments removed. All other comments are mine.

#!/usr/bin/perl
# total forks, max childs, what to run
# function takes 2 scalars and a reference to code to run

sub mfork ($$&) {
        my ($count, $max, $code) = @_;

        # total number of processes to spawn
        foreach my $c (1 .. $count) {

                # wait waits for any child to return,
                # waitpid for a specific one
                wait unless $c <= $max;

                die "Fork failed: $!\n" unless defined (my $pid = fork);

                # The arrow is used to call the coderef in $code
                # with $c as the argument. It is confusing because of
                # the spaces. It is a deref arrow, but looks like OOP.
                # You're right about the 'unless $pid' part.
                # If $pid is set, we are in the parent, so it does
                # nothing. In the child, fork returned 0, so it runs
                # the code and exits.

                exit $code -> ($c) unless $pid;
        }

        # This is reached after the parent is done with the foreach.
        # Inside the foreach it waits at the first line whenever $max
        # children have already been started. Once it has spawned all
        # $count tasks (some finish and exit and make room for new ones
        # inside the loop) it gets here, where it waits for the rest.
        # wait will return -1 when there are no more children.
        # The '1 until' is just short for having an until loop that
        # doesn't have the block. The 1; is not a costly operation.
        # When wait == -1 it passes the line, returning from the sub.
        1 until -1 == wait;
}


# mfork is already defined above, so no () are needed here; the
# ($$&) prototype checks for two scalars and a code reference
mfork 10, 3, sub {
        print "$$: " . localtime() . ": Starting\n";
        select undef, undef, undef, rand 2;
        print "$$: " . localtime() . ": Exiting\n";
};
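
For comparison, here is a minimal long-form sketch of the same throttling logic, with each one-liner spelled out. The name mfork_verbose and the returned count of reaped children are hypothetical additions for illustration; they are not part of the original code. One deliberate difference: the original exits the child with whatever the coderef returns, while this sketch exits with an explicit 0.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical long-form equivalent of mfork, for illustration only.
sub mfork_verbose {
    my ($count, $max, $code) = @_;
    my $reaped = 0;

    for my $c (1 .. $count) {
        # Same as: wait unless $c <= $max;
        if ($c > $max) {
            wait();        # block until any one child exits, freeing a slot
            $reaped++;
        }

        my $pid = fork;
        die "Fork failed: $!\n" unless defined $pid;

        if ($pid == 0) {   # fork returns 0 in the child
            $code->($c);
            exit 0;        # explicit status instead of the coderef's return value
        }
    }

    # Same as: 1 until -1 == wait;
    until (wait() == -1) {
        $reaped++;
    }
    return $reaped;        # number of children collected by the parent
}

# Five tasks, at most two at a time; each child just sleeps for 0.1 s.
my $collected = mfork_verbose(5, 2, sub { select undef, undef, undef, 0.1 });
print "collected $collected children\n";
```

Every child is reaped exactly once, either inside the loop to free a slot or in the final until loop, so the count returned should equal the number of tasks requested.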

Let's look at the details.

wait and waitpid: wait will wait until any of the children returns. This is useful because the program doesn't care which slot is freed. As soon as one child finishes, a new one can be spawned. waitpid takes a specific $pid as its argument, which is not helpful here.
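
As a rough illustration of that difference, the sketch below forks three children and reaps one of them by pid with waitpid, then collects the rest with wait. The variable names and the exit codes 1 to 3 are assumptions made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fork three children that exit at once with distinct status codes.
my @pids;
for my $n (1 .. 3) {
    my $pid = fork;
    die "Fork failed: $!\n" unless defined $pid;
    exit $n unless $pid;      # child: exit immediately with status $n
    push @pids, $pid;         # parent: remember the child's pid
}

# waitpid blocks until one specific child has exited.
my $last   = pop @pids;
my $reaped = waitpid $last, 0;
print "waitpid reaped $reaped (exit status ", $? >> 8, ")\n";

# wait reaps whichever remaining child is done first and
# returns -1 once there are no children left.
while ((my $any = wait()) != -1) {
    print "wait reaped $any (exit status ", $? >> 8, ")\n";
}
```

In mfork the parent only needs a free slot, not a particular child, which is why plain wait is enough there.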

The syntax $code->($c) runs a coderef. Just like ${ $foo }{bar} will dereference a hashref, &{ $baz }() will dereference (and run, that's the ()) a coderef. An easier way to write the former is $foo->{bar}, and the same is true for $baz->(). The arrow dereferences it. See perlref and perlreftut.

While this is nice and useful, it might be more sensible to use Parallel::ForkManager, which lets you do this in far fewer lines of code, and you don't have to worry about how it works.

use strict;
use warnings;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(3); # max 3 at the same time

DATA_LOOP:
foreach my $data (1 .. 10) {
  # Forks and returns the pid for the child:
  my $pid = $pm->start and next DATA_LOOP;

  # ... do some work with $data in the child process ...
  print "$$: " . localtime() . ": Starting\n";
  select undef, undef, undef, rand 2;
  print "$$: " . localtime() . ": Exiting\n";

  $pm->finish; # Terminates the child process
}

$pm->wait_all_children; # parent blocks until every child has finished

That's it. Way clearer to read. :)
