Being able to reuse code is a Good Thing, and Perl makes it easy. It's easy to use other people's code, and it's not too hard to write code so that you (and maybe others) can reuse it. This discussion will concentrate on how to reuse your own code; once you understand that, it will be simple to use other people's. As with most of Perl, there's more than one way to do it; we'll start from the least complex, where that means something like ``least fiddling with existing code'', and work up to full-fledged modules.
This is just an introduction, so be warned that we won't cover all there is to know about modules. See the references section for more. Although many modules are Object-Oriented, you don't need to know Perl OO to write modules. We'll only discuss ``regular'' non-OO modules.
Terminology: I'll refer to the code we want to reuse as the ``source'' (library is a better choice but that already has a meaning in Perl). The code we want to use the source in will be referred to as the ``script.''
To motivate the discussion, imagine this scenario: we're writing a package that will parse lots of different log files. Each type of log file gets its own script, but the scripts will have many parts in common, and so we want to put those common bits in a separate file and reuse them when necessary. For instance, we want to use a common config file format in each script. The sample code implements this function.
We'll refer to this sample code throughout the discussion. It's large
enough that I wouldn't want to cut and paste it into each script! More
importantly, I don't want to make changes in each and every script. Note
that there are two subroutines and a global variable. You'd use it like
this: %config=&read_config('/etc/blah.rc'). Here's the source:
$Mode = 022;    # Test for writeability; use 066 for read or write
sub read_config {    # parse file with "var=value" format
    my ($file) = @_;
    my %hash = ();    # left side of = will be key
    if (is_safe($file)) {    # check permissions and ownership
        open(my $fh, '<', $file) or die "Can't open $file: $!\n";
        while (<$fh>) {
            next if /^#/;    # ignore comments
            s/#.*//;         # remove trailing comments
            s/^\s*//;        # remove leading space
            s/\s*$//;        # remove trailing space
            # key is everything before the first =, without trailing space
            $hash{$1} = $2 if /^([^=]+?)\s*=\s*(.*)$/;
        }
        close $fh;
    }
    return %hash;    # return this to the caller
}
sub is_safe {    # test file for group or other write permissions
    return 1 unless defined $Mode;    # undef $Mode to skip this check
    my ($file) = @_;
    my ($file_mode, $uid) = (stat($file))[2,4]    # stat file
        or die "Can't stat $file: $!\n";
    return 0 if ($file_mode & 0+$Mode);    # bitwise and
    return 0 if ( ($uid != 0) && ($uid != $<) );    # $< is current UID
    return 1;    # If get here, it's ok
}
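To see what the parsing rules in read_config() do to individual lines, here is a small sketch that applies the same substitutions to an in-memory string instead of a file. The helper name parse_config_text and the sample settings are made up for illustration, and the permission check is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: same line handling as read_config(), but it reads
# from a string so it can be tried without a config file on disk.
sub parse_config_text {
    my ($text) = @_;
    my %hash;
    for (split /\n/, $text) {
        next if /^#/;      # ignore comment lines
        s/#.*//;           # remove trailing comments
        s/^\s*//;          # remove leading space
        s/\s*$//;          # remove trailing space
        # key is everything before the first =, without trailing space
        $hash{$1} = $2 if /^([^=]+?)\s*=\s*(.*)$/;
    }
    return %hash;
}

# Sample settings, invented for this example.
my %config = parse_config_text(<<'END_TEXT');
# sample settings
logdir = /var/log/blah   # trailing comment
  level=3
END_TEXT

print "$_ => $config{$_}\n" for sort keys %config;
```

Lines that end up empty or contain no = are simply skipped, so blank lines in a config file are harmless.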
The simplest way to turn some subroutines into a source file is to put that
code into its own file, and add 1; as the last line. Any true value would do, but 1; is traditional. This is necessary because the last statement executed in the
imported code must return a true value. Thus the last few lines of the
above code would read:
    return 0 if ($file_mode & 0+$Mode);
    return 0 if ( ($uid != 0) && ($uid != $<) );
    return 1;
}
1;
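The following sketch shows what happens when the trailing true value is missing. It writes a throwaway module to a temporary directory and tries to load it; the module name NoTrue and its contents are invented for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a throwaway module (hypothetical name NoTrue) whose last
# statement is false instead of the customary "1;".
my $dir = tempdir(CLEANUP => 1);
open(my $out, '>', "$dir/NoTrue.pm") or die "Can't write NoTrue.pm: $!\n";
print $out "package NoTrue;\nsub hello { return 'hi' }\n0;\n";
close $out;

unshift @INC, $dir;                        # let require find the file

# Trap the failure with eval instead of letting the script die.
my $loaded = eval { require NoTrue; 1 };
my $error  = $@;

my $message = $loaded ? "loaded\n" : "require failed: $error";
print $message;
```

Changing the final 0; to 1; in the generated file is enough to make the require succeed.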
What you call your source file depends upon how you will import it into the script. The 'use' directive requires that the file name end in .pm, and so does 'require' when it is given a bare module name; only 'require' with a quoted file name (require "conf.pl";) lets you pick any name. It's safest and most flexible to use the .pm suffix, so we'll call the sample code ``Conf.pm''. The first letter is capitalized because traditionally only pragmas like 'use strict' or 'use vars' are all lower-case.
Save it either in the system Perl directories or in your own private
library, like /home/fprice/perllib/.
Realize that your source code must compile first. You can check this with perl -c Conf.pm.
Now that the source file is prepared, we need to modify the script to
import it. For this you can use 'use' or 'require' (You can also use 'do',
but don't!). We can load /home/fprice/perllib/Conf.pm like this:
use lib '/home/fprice/perllib';    # Add private library to path
require Conf;                      # Load source at runtime,
use Conf;                          # Or load at compile time
Neither directive will reload a file that has already been loaded (Perl records loaded files in %INC). Both
look in @INC to find files; the 'use lib' directive prepends a directory to @INC so Perl will look in non-standard places. Sometimes you see a use directive
that looks like this:
use Conf::Read;
In this case, Perl would look in the @INC locations for a directory named Conf, and then try to load a file named Read.pm. If you want to create such a ``hierarchical'' module, see the section on
h2xs for an easy way to set it up.
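As a rough illustration of the lookup described above, this sketch prints the front of @INC after a 'use lib' line and shows how a module name is turned into a relative file path. The helper module_to_path is hypothetical, and File::Spec is used only because it is a core module that is certain to be installed.

```perl
use strict;
use warnings;
use lib '/home/fprice/perllib';    # prepended to @INC at compile time

# Hypothetical helper: the relative path Perl searches for in each
# @INC directory when asked for a given module name.
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

print "first \@INC entry: $INC[0]\n";
print "Conf::Read -> ", module_to_path('Conf::Read'), "\n";

# %INC records what has been loaded, keyed by that relative path.
require File::Spec;                # a core module, used only as an example
print "File::Spec loaded from $INC{'File/Spec.pm'}\n";
```

The directory given to 'use lib' does not have to exist for this to run; Perl only notices a missing directory when a module cannot be found in any @INC entry.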
In either case, call the function just as if it were explicitly written in the script:
%config=&read_config('/etc/blah.rc');
The primary difference between require and use is that require loads the source at runtime, while use loads it at compile time. So if your source file doesn't compile or can't be found, a script that uses require will start running and die only when it reaches the require line, whereas a script that uses use will fail at compile time and never start running. For this reason, it's recommended to use 'use'.
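A minimal sketch of the runtime behavior of require: the script below prints a line first and only then fails to load a module. The module name is deliberately made up so that it cannot be found, and the failure is trapped with eval so the script can report it.

```perl
use strict;
use warnings;

# This line runs, because require is not attempted until later.
print "script is already running\n";

# Hypothetical module name that should not exist on any system.
my $ok  = eval { require No::Such::Module::Here; 1 };
my $err = $@;

if (!$ok) {
    print "require failed at runtime: $err";
}

# Writing "use No::Such::Module::Here;" instead would stop the script
# at compile time, before the first print had a chance to run.
```

Wrapping require in eval like this is also a common way to make a module optional, which 'use' cannot do as directly.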
It's important to realize that there is a possibility of variable conflicts
with this method. For instance, if you have a global variable
$Mode in your script, it is the very same variable as $Mode in the source, so whichever assignment runs last wins. Still, this is a quick and dirty way to import a function into
a script.
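The conflict can be reproduced in a few lines. In this sketch a string eval stands in for loading a source file that has no package declaration (an assumption made so the example fits in one file); the code in the string runs in namespace main, just as such a file would.

```perl
use strict;
use warnings;

our $Mode = 'verbose';    # the script's own global, in namespace main

# Stand-in for the package-less source file: it assigns to $Mode too.
my $source = q{ $Mode = 022; 1; };
eval $source or die "source failed: $@";

# The script's value is gone; 022 is octal for 18.
print "script's \$Mode is now $Mode\n";
```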
To keep variables distinct, Perl has a concept called ``namespace.'' Every global symbol, by default and until you say otherwise, is in the namespace called 'main'. You can change the namespace by issuing the 'package' declaration. A namespace is in effect until another package declaration is encountered, or until the end of the block or file in which package is declared.
For example, switch namespaces to get distinct variables with the same apparent names:
#!/usr/bin/perl -w
$blah = 20;       # by default, in namespace main
package foo;      # switch to namespace foo
print $blah;      # undefined; no var $blah in namespace foo
$blah = 100;      # NOT redefining $blah in namespace main
package main;     # back in namespace main
print $blah;      # yields 20
But you don't have to explicitly switch namespaces to access a symbol in a different namespace. Use the syntax <namespace>::<variable>, and prefix it with the appropriate notation for the symbol type ($ for scalars, @ for arrays, etc). We can call this the Fully Qualified Name (FQN). So for instance:
$blah = 20;           # created in namespace main
$foo::blah = 100;     # creates $blah in namespace foo
print $blah;          # prints 20
print $foo::blah;     # FQN; prints 100
$foo::blah = 200;     # FQN; change value
print $main::blah;    # FQN; prints 20, redundant but ok.
Note that lexical variables -- those created with 'my' -- are outside any
namespace and thus cannot be referred to with a FQN. Also note that special
Perl variables like $_ and $| are visible and modifiable from any namespace; if you need to change one temporarily, localize it with
'local'.
Finally, realize that packages just provide a grouping mechanism, not enforced privacy! It is quite possible from any namespace to use, create, and modify a variable declared in another package. Packages just make it harder to step on your own toes without realizing it.
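Here is a short sketch of both points: a package variable can be changed from outside its package, and a special variable can be given a temporary value with 'local'. The package name Demo::Settings is hypothetical.

```perl
use strict;
use warnings;

{
    package Demo::Settings;    # hypothetical package name
    our $Mode = 022;           # package variable $Demo::Settings::Mode
    sub mode { return $Mode }
}

# Back in namespace main: nothing stops us from changing the other
# package's variable through its fully qualified name.
$Demo::Settings::Mode = 066;
printf "mode is now %03o\n", Demo::Settings::mode();

# Special variables such as $_ are shared by every namespace; local
# gives a temporary value that is restored when the block ends.
$_ = 'outer';
{
    local $_ = 'inner';
    print "inside block: $_\n";
}
print "after block:  $_\n";
```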
Packages form the heart of most Perl modules. The key idea is for each module to declare its own namespace, and then to carefully declare to the script which variables are meant to be used. In turn, the calling script cooperates by only importing recommended variables into its operating namespace. Of course, the module writer needs to document the intended usage of the module!
An obvious modification we can make to Conf.pm is to put its code into a
package. Since the file is called Conf.pm, it's stylistically nice to call the namespace ``Conf''. We could use
another name -- Perl doesn't enforce any connection between file names and
package names -- but that might confuse you as the author, not to mention those
who use your module! However, if the script is going to import symbols, the
package name in the module and the name on the 'use' line must match exactly
(including :: if there are any), because 'use Conf' calls the import method of package Conf.
Here's the change to Conf.pm:
package Conf; # start new namespace; scope extends to EOF
$Mode = 022; # Test for writeability; use 066 for read or write
sub read_config { # parse file with "var=value" format
Once we've done this, we must change the way we call read_config() from the calling script. Since it is now in a different package from
main::, the Conf package must be explicitly specified, either by switching
to the Conf namespace or by using the FQN:
use lib '/home/fprice/perllib';    # Add private library to path
use Conf;                          # Load module at compile time
$Conf::Mode = 066;                 # Change value of $Conf::Mode
# Call read_config in Conf package
%config = &Conf::read_config('/etc/blah.rc');
Now we've minimized the chances of variable conflict between the script and the module.
Sometimes you may want to import certain symbols from a module into your script's namespace (typically into main). You can give an optional list of symbols to the 'use' directive in your script, and then just access those symbols as if they were defined in the current package. Although this does reintroduce the problem of variable conflict, the fact that you specifically request symbols tends to minimize it.
use lib '/home/fprice/perllib';
use Conf qw($Mode); # load Conf.pm and import $Mode into
# namespace main
$Mode = 066; # $Mode is not FQN, but is from Conf::
In turn, the module must be equipped to export this symbol. The easiest way to do this is to tell the module to inherit from the Exporter module. Add this to Conf.pm:
package Conf;          # start new namespace; scope extends to EOF
use Exporter;          # load Exporter module
@ISA=qw(Exporter);     # Inherit from Exporter
# Export $Mode and read_config on request
@EXPORT_OK=qw($Mode read_config);
$Mode = 022; # Test for writeability; use 066 for read or write
sub read_config { # parse file with "var=value" format
Putting a symbol in @EXPORT_OK means that the script must specifically request that it be imported. If
instead you'd like some symbols to be automatically imported when the
module is loaded, use @EXPORT:
@EXPORT=qw(read_config); # export by default
And then in the script:
use Conf; # loads Conf.pm and imports all symbols in @EXPORT
Note that if you give a list to 'use', thus requesting symbols to be
imported, only those specific symbols will be imported no matter what is in @EXPORT. If you put a symbol in @EXPORT, you don't have to also put it in @EXPORT_OK to request it specifically. But don't go overboard with @EXPORT since the importing script might not realize that all those symbols will be
imported.
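The interplay of @EXPORT, @EXPORT_OK and an explicit import list can be tried out in a single file. In this sketch the module is defined inline in a BEGIN block rather than in its own .pm file; the name Demo::Conf and the stub subroutines are assumptions made for the example, not the real Conf.pm.

```perl
use strict;
use warnings;

# Hypothetical inline module standing in for a file Demo/Conf.pm.
# BEGIN makes it exist before the "use" line below is compiled.
BEGIN {
    package Demo::Conf;
    require Exporter;
    our @ISA       = ('Exporter');
    our @EXPORT    = ('read_config');         # imported by a plain "use"
    our @EXPORT_OK = ('$Mode', 'is_safe');    # imported only when requested
    our $Mode      = 022;
    sub read_config { return (mode => $Mode) }    # stub, not the real parser
    sub is_safe     { return 1 }                  # stub
    $INC{'Demo/Conf.pm'} = __FILE__;    # mark the "file" as already loaded
}

use Demo::Conf qw(read_config $Mode);   # explicit list: only these two

$Mode = 066;                            # same variable as $Demo::Conf::Mode
my %config = read_config();
printf "mode seen by module: %03o\n", $config{mode};
print  "is_safe imported? ", (defined &main::is_safe ? "yes" : "no"), "\n";
```

Because the import list names read_config and $Mode only, is_safe stays out of namespace main even though it is listed in @EXPORT_OK; it is still reachable as Demo::Conf::is_safe().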
To request that no symbols be imported into your script, give an empty list to 'use'. You can still access these symbols with their FQN.
use Conf ();    # load Conf.pm but don't import ANY symbols
Another symbol you can put in a module that uses Exporter is
$VERSION. This represents the version number of the module, and should be something
like 1.01 (not 1.0.1). Then the script can specify a minimal version number
and fail if it isn't high enough.
use Conf 2.0 qw( $Mode read_config); # fail if $VERSION < 2.0
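The version test can be exercised without a separate module file by calling the VERSION method directly, which is what a 'use' line with a version number does behind the scenes. The package Demo::Versioned below is hypothetical.

```perl
use strict;
use warnings;

{
    package Demo::Versioned;    # hypothetical module
    our $VERSION = '1.01';
}

# "use Demo::Versioned 2.0;" would make this same check at compile time.
my $ok  = eval { Demo::Versioned->VERSION(2.0); 1 };
my $err = $@;

my $message = $ok ? "version is new enough\n" : "version check failed: $err";
print $message;

# Asking for 1.00 succeeds, because 1.01 is at least 1.00.
my $old_ok = eval { Demo::Versioned->VERSION(1.00); 1 };
print "1.00 is satisfied\n" if $old_ok;
```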
To ensure that symbols in your module can't be accessed or imported into a script, mark them as lexical with 'my'. Since lexical variables are never part of any namespace, and since their scope cannot extend past file boundaries, they will be private to that module or subroutine.
For example, we could make $Mode usable only by the code inside Conf.pm by changing Conf.pm like so:
package Conf;                  # start new namespace; scope extends to EOF
use Exporter;                  # load Exporter module
@ISA=qw(Exporter);             # Inherit from Exporter
@EXPORT_OK=qw(read_config);    # Export read_config() upon request
my $Mode = 022; # $Mode is now lexical and can't be exported or
# changed from the script.
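A quick way to convince yourself that a lexical really is private is to try to change it from outside. In this sketch a bare block plays the role of the module file (a lexical's scope ends at the closing brace just as it would at the end of a file), and the package name Demo::Private is made up.

```perl
use strict;
use warnings;

{
    package Demo::Private;         # hypothetical package name
    my $Mode = 022;                # lexical: not part of any namespace
    sub mode { return $Mode }      # only code in this scope can see it
}

# This creates a brand-new package variable; the lexical is untouched.
$Demo::Private::Mode = 066;

printf "package variable:  %03o\n", $Demo::Private::Mode;
printf "module still uses: %03o\n", Demo::Private::mode();
```

If the script does need to read the value, the module can offer a subroutine such as mode() above, which keeps control over the variable inside the module.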
If you are creating an object-oriented module, it is recommended that you
not export any symbols; i.e., leave @EXPORT and @EXPORT_OK
blank. Instead, you should provide access methods for your object
attributes.
If you are creating a regular old module -- a set of functions -- it's
probably best to put symbols in @EXPORT_OK rather than @EXPORT. Then the module user must explicitly import symbols into her namespace,
and presumably won't be surprised by variable conflicts.
Sometimes you'd like to execute some code when your module loads; or perhaps when it exits. The setup code is easy: just put regular Perl commands outside any subroutine definitions, and they will be executed when the module is loaded.
For exit code, use an END subroutine. This subroutine will run when the script finishes, or on an error (like a call to die). It is defined just like any other subroutine, except that it must be called 'END' (all upper-case) and you can omit the 'sub' part of the definition. For example:
END {
    print "Executing module cleanup now ...\n";
    # Do more interesting things here
}
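The order of events can be seen in a small sketch like the one below, where the "module" is inlined in a block for convenience. The package name Demo::Hooks and the @events array are invented for the example; in a real module file the setup statement would run when the file is loaded.

```perl
use strict;
use warnings;

my @events;    # records the order in which things happen

{
    package Demo::Hooks;       # hypothetical module, inlined here
    push @events, 'setup';     # ordinary statement outside any subroutine
    END { print "Executing module cleanup now ...\n" }
}

push @events, 'main script';
print "so far: @events\n";

# The END block's message is printed after this point, when the
# script exits.
```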
Perl comes with a script called h2xs which can create a standard module skeleton for you to fill out with your
code. To start a new module, run
h2xs -XA -n mod_name. This makes stubs for the named module, including Makefile.PL. It is the
easiest and safest way to get all the details correct! The '-X' option
keeps it from creating C extension stubs, while the '-A' option doesn't use
the Autoloader.
To create a skeleton for the Conf module, run h2xs -XA -n Conf, which creates a directory called Conf containing these files:
Changes      -- revision history
Conf.pm      -- skeleton for the module
MANIFEST     -- list of files in the module
Makefile.PL  -- Perl to make a Makefile
test.pl      -- skeleton for testing the module
If you want to distribute this module, first edit Conf.pm and fill in the stubs with the working code. If you want people to be able
to test the module, put some testing code in test.pl. Once you're done, run these commands from your shell:
perl Makefile.PL # creates a Makefile
make dist # tars everything up in a nice package.
# Uses the value of $VERSION for naming.
Everything that was mentioned here is covered in available documentation. Good references include:
Perl Cookbook, Ch. 12. Tom Christiansen & Nathan Torkington. O'Reilly, 1998.
Advanced Perl Programming, Ch. 6. Sriram Srinivasan. O'Reilly, 1997.
The perlmod section of the Perl manual. 'perldoc perlmod'
The perlmodlib section of the Perl manual. 'perldoc perlmodlib'