## TO DO: # 1) Check to see if backslashing returns works. # 2) Handle errors somehow. =pod =head1 NAME Include =head1 SYNOPSIS This module initializes the program from an external configuration file. This allows many different programs files to have the same configuration and eases the minds of those who find the lack of header files in Perl to be blaspheme. A possible secondary use of Include might come from using C and runtime C statements: you could seperate Cled code out into seperate files and run it was you would like, easing the eyes and bughunting. This could be particularly useful in things (like CGI scripts) which are often based on huge switch-like statements. However, this particular use of Include has not been tested or verified, as it is largely an unintended process. =head1 DESCRIPTION =head2 Use Directive =over 4 =item Basic (One file, No Configuration) Include is called from each program using it as C, just like any other module. If an empty list is passed, Include returns quickly and does nothing. Should you want more functionality, pass Include a pathname to the configuration file to be loaded. This file will be opened and processed according to the rules in L. =item Multiple Files If the first element is an array reference, each element in the array will be treated as the location of a header file to be opened and processed. =item Configuration All elements except the first will be compressed into a hash which controls the behavior of Include as described under L. You may create your own hash and pass a reference. In that case, you may only have two elements in the parameter list: the file location (or file location array reference) and the configuration hash. =back =head2 Configuration File First and foremost, the configuration file needs to be a text file readable by the calling program and must be properly identified by the L. =head1 SIMPLE MODE Z<>For the Truly Impatient, the simplest way to use the configuration file is to just create the file and put into it a series of commands to be executed at compile-time. Commands will be executed one at a time, where "one at a time" is defined as "in groups terminated by end-of-file, C<__COMMANDS__>, or a semicolon followed by optional whitespace and optional #-preceded comment". =head2 Formatting notes =over 4 =item The # symbol The system will assume any # following a semicolon is the beginning of a comment. If you are generating code or other twistedness, prepend the # with a backslash. =item POD Directives For ease of adaptation, any line beginning with an equal sign will be treated as POD directives. All text up to and including the following line beginning with "C<=cut>" will be summarilly ignored. Unless you intend to be using POD, do not start a line with an equals sign. =item C<__WORDS__> Any line beginning with C<__ANYTEXT__> is treated as an instruction to Include and is otherwise ignored. For more information, see L. =item Empty lines, spaces, etc. Empty lines or lines which contain whitespace followed immediately by a # are ignored. Leading whitespace is trimmed from every line and all terminating whitespace is compressed to a single carriage return. =back =head1 PROCESS CONTROL The Truly Lazy have a number of short-cuts for common behaviors. These behaviors are triggered by a line beginning with C<__> and followed immediately by the command to be used and ending with C<__> (similar to Perl's built-in C<__END__>, C<__PACKAGE__> and C<__DATA__> codes). A C<__> command continues until changed by another. =over 4 =item C<__CODE__> (default) This reads the code and, excepting the formatting notes provided above, passes the following code directly to Perl's built-in parser via C< >>. =item C<__COMMENT__> Every line after C<__COMMENT__> until the next C<__> command is ignored. You have three guesses as to why one might want to do this. =item C<__CONST__> The C keyword provides a short-hand for C. The first "word" (as defined by Perl's regex engine's C<\w> and delimited by whitespace) is treated as the name of a new constant. The rest of the line is treated as the body and should end with a semicolon. To aid Perl's built-in optimizations and be kind to the carpals, the body of the constant is pre-pended with "C" unless "C" or "C" is already prepended. For the sake of memory, "C", "C", and "C" are synonyms for "C", and for the sake of code compilation, "C" and its synonyms are removed from the actual code. To make a constant more than one line long, backslash the carriage return. Examples: =over 4 C< MYCONST "Yay, baby!"; NEWCONST "This is a constant, too."; # And has a comment! BIGCONST (MYCONST.NEWCONST)."is okay, provided MYCONST and NEWCONST are defined earlier."; MYLIST noscalar (foo,bar); LISTCONST NOT_scalar (1,2,3,4,5); > =back =item C<__SUBDEFINE__> This defines one subroutine in the caller's subroutine. To define multiple, divide them up by multiple C<__SUBDEFINE__> tags. The first line is the name of the program including the prototype (if any), and the rest of the section is processed as the body of the subroutine. The subroutine will be imported at compile-time, so assuming you don't try to use the subroutine before you import the header file, you should be able to use C<__SUBDEFINE__> functions precisely the same as built-ins. =item C<__META__> and C<__NEWCMD__> See L for a description of this tag. =item Custom Commands Yes, that's right, you can even have custom commands. Note that you will need to define the custom tags in a C<__META__> head or in the C directive. See L. =back =head1 CHANGING BEHAVIOR This section is for the Truly Hubristic. =head2 The C command The first parameter must be the location of the file, but the rest of the parameters become a way to change the behavior of the module. The rest of the parameters become a series of key-values pay =head2 C<__META__> processing This changes the behavior of all the processing following this point by setting the flags defined below. Commands should be passed in whitespace-delimited pairs optionally ending with a semicolon (it will be stripped before processing). If you want to span multiple lines, backslash the carriage return. Example: =over 4 C< SEEK sub { print "FOO"; }; ListConst 1; GETRIDOFME sub { print "I have a semicolon here, but not in reality!" }; > =back =head2 C<__NEWCMD__> creation For complicated command creation, the NEWCMD header is provided. The first line is the name of the new command and will be checked against existant flags for any collisions. The rest of the section is passed directly into an anonymous subroutine generator as code. Examples: =over 4 C< MYCMNDISHAPPY print "Yay, $_" foreach(@_); > C< FOOBAR_SNAFU foreach(@_) { eval $_ =~ s/foobar/snafu/; die $@ if($@); } > =back =head2 Setting Flags and Specifying Custom Commands If C is recognized (case-insensitive) as a flag in Behavior List, the appropriate flag is set to C. Otherwise, C is treated as a new custom command. When C<__name__> is found in a config file, C is checked to be a coderef. The handler receives a list consisting of code divided up by lines and is expected to C on failure. If the handler dies, the error message will be appened with information about the location in the config file. The code is guaranteed to have no leading spaces, nothing that resembles a POD directive, and each ends in a newline. The first line will be a C statement to the caller and the last line will be a C statement returning to the module. There are no other guarantees made about the code. Custom commands must be made up of only word characters as defined by regex's C<\w> flag and cannot have the same case-insensitive name as a flag. These are the only limitations on the definition of custom commands, so if you want to redefine what C<__META__> means in a file or if you get a kick out of a 75-character command or if you think all user-defined C<__> commands should be lower-case, go right ahead. =head2 Behavior List All behaviors default to 0/false. Behavior list names are not case-sesntive. =over 4 =item ListConst If set to true, the scalar check is not done in __CONST__ sections. =item Force If false, the program dies when there is an error in the config file. If set to 1, a warning is emitted. If set to 2, the warning is also neglected. See also the Optional flag. =item Optional When false, Include dies if the file cannot be opened. If set to 1, the file is skipped and a warning is emitted. If set to 2, the warning is also neglected. =item AutoUC Does nothing special if things are false. If set to true, however, the program will shift all constant names into upper case automatically. =item NoSecurity This controls errors such as taintedness and write-controls. The code comes directly from Chapter 23 of the Camel Book and checks to see if the parameters are tainted, followed by verifying that the file and its directory are not world-writeable. If this option is left at zero, these checks are left in place. If this option is turned to 1, only warnings are emitted. If this option is 2, any taintedness errors or writability errors are blatantly disregarded. Note, however, that this option does not let you turn off taint for any code in the configuration file -- that is still your business. =item EveryLineIsGood If this is set to 1, empty lines and lines beginning with # are processed as code. If set to 2, POD documentation is also processed in. =back =head1 AUTHOR Robert Fischer (C), who is only slightly older than Perl itself. =head1 BUGS =over 4 =item * Include called at runtime (via C) needs to be tested. =back If bugs are found, please contact the current maintainer, noted under L. =head1 SEE ALSO L =head1 COPYRIGHT Copyright 2002, Robert Fischer. All Rights Reserved. This code is free software. You may copy or redistribute it under the same terms as Perl itself. If you do use this code, please contact the author with your results. =cut # Initial package handling info package Include; $VERSION = 0.01; use Carp; use IO::Handle (); use File::stat; use Symbol 'qualify_to_ref'; # Declare the simple variables our $sec_count = 0; our $newstate = ""; our $state = "CODE"; our $from_pkg = "package ".caller.";\n"; our $to_pkg = "package ".__PACKAGE__";\n"; #our $term_re = qr/;\s*(?:[^\\]\#.*)?$/x; # Regex defining the end of a statement # Create subroutines for CONST and META here, because...yeah... my $const_ref = sub { # Prep @_ my($pack_in,$pack_out) = (shift,pop); for my $line_ref (\(@_)) { my($name,$body) = split /\s+/,$$line_ref,2; unless($flag{"listconst"}) { unless($body =~ /^ (?:\(\s*)? # Ignore any leading parenthesis and WS ( # Begin grouping (?:not?_?)?scalar # Allow scalar, not_scalar, noscalar, etc. \s* # Also collect any following spaces ) # End grouping /xi) # Ignore case and eXtend. { substr($body,0,0) = "scalar " ; } else { $body =~ tr/$1// if($1 and $1 =~ /^not?_?/i); } } uc($name) if($flag{"autouc"}); $$line_ref = "use constant $name => $body"; } unshift(@_,$pack_in); push(@_,$pack_out); # Run the line eval join("",@_); die $@ if($@); return 1; }; my $meta_ref = sub { (shift,pop); # Gets rid of package statements for my $line_ref (\(@_)) { my($key,$val) = split /\s+/, $$line_ref, 2; if (exists($flag{lc($key)})) { $flag{lc($key)} = $val; } else { $cmnd{$key} = eval $val; die if($@); } } return 1; }; my $newcmd_ref = sub { (shift,pop); my $name = shift; return die_right("force","Collision in NEWCMD \'$name\'") if (exists($flag{$lc(name)})); $cmnd{$name} = eval "return sub { ".join("",@_)." };"; die $@ if($@); }; my $subdef_ref = sub { my($pack_in,$name,$pack_out) = (shift,shift,pop); eval { eval $pack_in."sub $name { ".join("",@_)." }".$pack_out; }; die $@ if($@); }; # Define the rest of the variables that will be used our %flag = ( # Lower case is very important here. listconst => 0, force => 0, optional => 0, autouc => 0, nosecurity => 0, ); our %cmnd = ( COMMENT => sub { return 1; }, CODE => sub { eval join("",@_); die $@ if($@); return 1; }, CONST => $const_ref, META => $meta_ref, NEWCMD => $newcmd_ref, ); sub die_right($$;) { my($key,$msg) = @_; $msg =~ s/\beval\b\s?//ig; # Remove any mention to "eval" statements. $msg =~ s/\s*propgated\s+at.*//ig; # Remove the "propogated at..." crap. $msg .= " in the section before line $linecount in header file." if($linecount); $flag{lc($key)} == 0 ? croak($msg) and return 0 : $flag{lc($key)} == 1 ? carp($msg) and return 1 : return 1 ; } sub import { ## Get the import list and do error processing my $file_loc = shift; return die_right("optional","No file provided") unless(defined($file_loc)); return die_right("optional","Odd number of elements in hash") unless(scalar(@_) == 1 or(scalar(@_)%2)== 0); # Work out configuration my %rest; return die_right("nosecurity","Configuration hash is tainted") if(is_tained(@_)); unless(scalar(@_) == 1) { %rest = @_; } else { my $rest_ref = shift; return die_right("optional","Second element in pair is not a hash reference") unless(ref($rest_ref) eq "HASH"); %rest = %{$rest_ref}; } while(my($key,$val) = each %rest) { if (exists($flag{lc($key)})) { $flag{lc($key)} = $val; } else { return die_right("optional","No codref provided by \'$val\' for key \'$key\'") unless(ref($val) eq "CODE"); $cmnd{$key} = $val; } } # Get all the files as references (saves memory). # Also attempt to undo any minor booboos. # Note that the reference allows "" or "0" as file names. my @file_pile; for(ref($file_loc)) { unless ($_) { push(@file_pile,\$file_loc); } elsif (/ARRAY/i) { @file_pile = \(@{$file_loc}); } elsif (/SCALAR/i) { push(@file_pile,$file_loc); } elsif (/HASH/i) { @file_pile = \(%{$file_loc}); } # Unspool hash else { @file_pile = ($file_loc); } # See if this works... last; } ### Process the files my $file; while ($file_loc = shift(@file_pile)) { $file_loc = $$file_loc; is_tainted($file_loc) and return die_right("nosecurity","Location \'$file_loc\' came from a tainted source"); open($file,"<",$file_loc) or return die_right("optional","Could not open file at \'".$file_loc."\': ".$!); is_safish($file) or return die_right("nosecurity","Location \'$file_loc\' permissions are too loose"); IO::Handle::untaint($file); ## Process each line my @code = (); while(defined(my $line = <$file>)) { ++$linecount; # Check for POD and invalid lines unless($flag{"everylineisgood"} >= 1) { next if $line =~ /^\s*$/; next if $line =~ /^\s*\#/x; unless($flag{"everylineisgood"} >= 2) { if($line = /^=/) { ++$linecount until(<$file> =~ /^=cut/ or eof($file)); next; } } } # Clean up the line $line =~ s/^\s+//g; # Remove leading whitespace -- trailing is handled with next line. $line =~ s/\s*$/\n/g; # Compress trailing whitespace and ensure a trailing \n. # If this is the end of a block, process it. if($line =~ /^__(\w+)__/) { $newstate = $1; eval { $cmnd{$state}->($from_pkg,@code,$to_pkg) if(scalar(@code)); }; return die_right("force",$@) if($@); ($state,@code) = ($newstate,()); } else { # Otherwise, get the rest of the section. push(@code,$line); } } if(scalar(@code)) { # Make sure all the code is processed eval { $cmnd{$state}->($from_pkg,@code,$to_pkg) if(scalar(@code)); }; return die_right("force",$@) if($@); } } return 1; } sub is_safish { my $info = stat(qualify_to_ref(shift,callar)); return unless $info; return 0 if($info->uid != 0 and $info->uid != $<); return 0 if($info->mode & 022); return 1; } sub is_tainted { my $rv = 0; while(my $arg = shift) { my $nada = substr($arg,0,0); local $@; eval { eval "# $nada" }; $rv ||= $@; last if $rv; } return $rv; } return 1;