A perl Tutorial
by Will Morse, BHP Petroleum
Landmark Graphics Corporation
World Wide Technology Conference
Houston, Texas
December 1, 1994
Copyright 1994 Will Morse. Permission is given to freely copy and
distribute this paper as long as there is no charge except real and
actual mechanical copying costs and as long as this notice is kept
with each copy so others may copy it as well.
The opinions expressed are the author's own and do not necessarily
reflect the opinions or policies of The Broken Hill Proprietary
Company, Limited, or its various divisions. Your mileage may vary.
INTRODUCTION:
perl is the "Swiss Army Chainsaw of Systems Administration".
This tutorial covers:
* What is perl?
* What does a perl program look like?
* A brief, hopelessly incomplete, overview of perl syntax
and features.
- Variable naming
- Assignments
- Arithmetic
- If-then-else
- Loops
- Files
- Special functions
- Regular expressions
- Report Writer
- Debugger
* An example using an exported horizon file from SeisWorks.
* Information on how to get perl.
* A list of books on perl.
* Internet contacts and support for perl.
* Some handy Landmark-related perl programs:
histograms, SEG-Y Headers, nulls to spaces, HPGL
* A little about perl version 5.
* Other free software you should know about:
expect, tcl/tk, GNUtar, gzip
This paper specifically addresses perl 4.x. perl 5.x is out
now, but most people still use version 4. The current books
and documentation for perl all address version 4.
There are versions of perl, such as tkperl and oraperl, that
address X-windows programming, Oracle database access, and
other specific features. There is not enough time to cover
these versions in this short tutorial.
WHAT IS PERL?
perl stands for Practical Extraction and Report Language.
perl is a high level programming language combining elements
of C-shell, awk, and many other programming languages and
utilities.
perl is free. There are no license fees. There are no
royalties. It is NOT "Public Domain". When you get perl, you
will find a README file that explains your license. If you
give perl to someone else, you have to give them this README
file. Please read and understand this file.
There are at least two publicly available books explaining the
use of perl in detail (see BOOKS ON PERL). There is excellent
support for perl on the internet in the comp.lang.perl
newsgroup (see SUPPORT FOR PERL).
perl is better than shell script programming and awk or sed
programming because:
* It does not have all the shell initiation overhead, which
makes it faster.
* It can read and write binary data files.
* It can have many files for input or output at one time.
* It has a report writer.
* It has extended regular expressions.
* It has both linear and associative arrays.
* It has powerful defaults that simplify programming.
* It can process very large files without record size
limits.
perl is better than C, C++, or Fortran programming (for
sysadmin and data admin tasks) because:
* It does not have visible compile or link stages. The
program is kept in a source module, like a shell script.
* There is a very rich set of character string manipulation
and array handling commands.
* It has a more forgiving and easier to use syntax.
* It has both linear and associative arrays.
* It has a report writer.
* Some versions of perl include X-windows features (tkperl)
or access to Oracle databases (oraperl).
In fairness, perl has some down points.
* perl programs typically take 1.7 times as long to execute
as an equivalent C program. This is okay for utility
programs, but would be a problem for a program like
SeisWorks.
* There are a number of "gotchas" that are pretty typical
Unix characteristics, but would confuse non-programmers.
For instance, a number starting with 0 is assumed to be
octal.
$num = 010;
print "The number is $num. \n";
will print
The number is 8.
* Mostly of interest to professional programmers, perl does
not have a CASE construct. It does not have pointers or
let you take the address of anything (until perl 5).
* perl has a dozen different ways to do anything. Some
people don't like this, others do.
* perl has a lot of capabilities to do things most
sysadmins and data admins will not understand. Don't
worry about it, just use what you need. You don't have
to know everything about a computer language to make good
use of it.
perl does not come on the standard SunOS 4.x distribution, but
it is a very standard language. perl is listed in the job
descriptions compiled by the Systems Administrators Guild
(SAGE) of Usenix, the Unix User's Group. There are currently
negotiations to include perl in the Solaris 2.5 release.
WHAT DOES A SIMPLE PERL PROGRAM LOOK LIKE?
This simple program copies records from a file and prefixes
each line with a sequential number. The line numbers at the left
are not in the program, they just help the explanation.
1 #! /usr/local/bin/perl
2 while (<>)
3 {
4 print STDOUT ++$i, $_;
5 }
Explanation:
1 #! is the Unix method for specifying a shell program.
/usr/local/bin/perl is the standard place to put perl.
2 while () {} creates a loop that continues while the
statement in the () is true. The statements in the loop
are enclosed in {}.
<> is a special default. It tells perl to look at the
calling command line to see if any files are specified.
If they are, read each file in turn. If no files are
specified, read from standard input. In either case, put
the characters read into the special variable $_. When
<> reaches end-of-file, it returns false, which
terminates the while loop.
4 print is a simple, unformatted, printing method.
STDOUT is the standard filehandle for Standard Output.
Filehandles are specified in all caps in perl.
++$i says to increment the value of $i and make that
value available to the print statement. All scalar
values (anything but a command, linear array, associative
array, filehandle, or procedure name) start with $.
$_ is the default operand of any command. In this case,
$_ contains the last record read by the <> statement.
; terminates each command in perl.
A BRIEF OVERVIEW OF PERL SYNTAX:
In perl, as in Unix generally, character case is significant.
X and x are not the same character. It is common to name
variables and other items in mixed case:
$thisIsMixedCase
It is also permissible to use underscores:
$variable_with_underscores.
Do not use names that start with a number, as these are often
perl special symbols, $1, $2, etc.
All perl commands end with a semicolon, ;.
Variables:
perl identifies each type of variable - or data name - with a
prefix character or identifying style. These characters are:
$     scalar             a single number (integer or real) or
                         character string
@     linear array       an array referenced by an index number
%     associative array  an array referenced by a textual key
UC    file-handle        a file handle is uppercase
&     procedure          a subroutine
xx:   label              object of goto, or marker for escape
                         from a loop.
"Subscripts" enclosed in [] apply to linear arrays.
@items refers to the entire array items.
$items[1] refers to the scalar value which is the
second item in the array items. Linear
arrays start with the index 0.
$#items is the index of the last item in @items.
Because indexes start from 0, this is one
less than the number of items.
Subscripts enclosed in {} apply to associative arrays.
%items refers to the entire associative array
items.
$items{"x"} refers to the scalar value matching the
key "x"
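As a small sketch of the notation above, the following fragment builds one linear array and one associative array and pulls values out of each. The variable names and values are invented for illustration only.

```perl
#! /usr/local/bin/perl
# Illustrative values only; none of these names come from a real program.
@items = ("alpha", "beta", "gamma");

$second = $items[1];      # second element, because indexes start at 0
$last   = $#items;        # index of the last element
$count  = @items;         # an array assigned to a scalar gives the count

%owner = ("proj1", "will", "proj2", "larry");
$who = $owner{"proj2"};   # look up a value by its textual key

print "second=$second last=$last count=$count who=$who\n";
```

Note that the same name is written with @ or % for the whole array and with $ for a single element.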
Values enclosed in () are lists. Lists are often used as
arguments to a subroutine or built-in function call. It is
not necessary to enclose arguments in () if there is only one
argument or the program knows the limit of the list.
There can be completely separate and unrelated variables $x,
@x, %x, and &x, not to mention $X, @X, %X and &X.
There are special variables, the most important of which are
$_, @_, and @ARGV.
$_ is the default scalar value. If you do not specify a
variable name in a function where a scalar variable goes,
the variable $_ will be used. This is a very heavily
used feature of perl.
@_ is the list of arguments to a subroutine.
@ARGV is the list of arguments specified on the command
line when the program is executed.
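A minimal sketch of @_ and @ARGV working together might look like this. The subroutine name &average and the fallback list of numbers are assumptions made up for the example.

```perl
#! /usr/local/bin/perl
# &average is a made-up subroutine used only to show @_ at work.
sub average
{
    local ($sum, $n) = (0, 0);
    foreach $value (@_)        # @_ holds the arguments passed to the sub
    {
        $sum += $value;
        $n++;
    }
    return $n ? $sum / $n : 0;
}

# Average the numbers given on the command line, or a sample list if none.
@numbers = @ARGV ? @ARGV : (2, 4, 9);
$avg = &average(@numbers);
print "average is $avg\n";
```

Run with no arguments it averages the sample list; run as "average 10 20" it would average the two command line values instead.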
Basic Commands and Control:
Braces, {}, are used to contain a block of program statements.
It is possible to have local variables within a block. Blocks
are used for the objects of most control commands.
Simple Assignment:
Simple, scalar, assignment is what you might expect:
$var = 1;
$str = "This is a string.";
One can also assign lists of scalars in one statement:
($rock, $jock, $crock) =
("Plymouth", "Warren Moon", "Solaris 2.x");
One can assign a list to an array:
@items = (1, 2, "Cambodia", 4);
or an array to a list:
($a, $b, $c, $d) = @items;
Associative arrays need a key, but otherwise work as you
would expect:
$aa{"able"} = "x";
%aa = ("able", "x", "baker", "y", "aardvark", "z");
Assigning an ARRAY to a SCALAR will give the number of
items in the ARRAY.
@items = (10, 20, 30);
$i = @items;
print "$i";
will print "3".
Arithmetic Operations:
perl has the usual operations, and many more:
$c = $a + $b     addition
$c = $a - $b     subtraction
$c = $a * $b     multiplication
$c = $a / $b     division
$c = $a % $b     remainder
$c = $a ** $b    exponentiation
$c = $a . $b     concatenation
++$a, $a++       increment by 1
--$a, $a--       decrement by 1
$a += $b         increment by $b
$a -= $b         decrement by $b
$a .= $b         append $b to $a
$c = "*" x $b    make $b *'s
Of course, there are many more.
There are also case modifiers, which are used inside
double-quoted strings:
$a = "Big And Little";
$c = "\L$a\E";
print $c;
prints "big and little".
\l convert the next character to lower case
\u convert the next character to upper case
\L lowercase until \E
\U uppercase until \E
\E end case modification
There are functions for math including:
log($x)
exp($x)
sqrt($x)
sin($x)
cos($x)
atan2($y,$x)
The only trig functions are sin, cos, and atan2; however,
these can easily be used to compute the others. The
ERUUG Unix Cookbook (see BOOKS ON PERL) has a list of the
formulas for the conversions.
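As a sketch of how the missing functions can be built from sin, cos, and atan2, the fragment below uses the usual trigonometric identities. The subroutine names are our own, and these are not necessarily the formulas as listed in the Cookbook.

```perl
#! /usr/local/bin/perl
# Sketch of the usual identities; the subroutine names are invented here.
$pi = atan2(1, 1) * 4;

sub tangent   { local ($x) = @_; sin($x) / cos($x); }
sub arcsine   { local ($x) = @_; atan2($x, sqrt(1 - $x * $x)); }
sub arccosine { local ($x) = @_; atan2(sqrt(1 - $x * $x), $x); }

printf "tan(pi/4) = %.4f\n", &tangent($pi / 4);
printf "asin(0.5) = %.4f degrees\n", &arcsine(0.5) * 180 / $pi;
printf "acos(0.5) = %.4f degrees\n", &arccosine(0.5) * 180 / $pi;
```

All angles are in radians, as with the built-in functions. The arcsine and arccosine sketches assume an argument between -1 and 1.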
If-Then-Else:
The basic if-then-else command is fairly typical of all
computer languages.
if ( condition )
{
true branch
}
else
{
false branch
}
There is also
if (condition) {commands}
elsif (condition) {commands}
elsif (condition) {commands}
which simplifies a lot of complex nested if statements.
Note that it is elsif, not elseif or else if.
Both the true and false branches may contain any number
of nested if statements.
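Since perl has no CASE construct, an elsif chain is the usual substitute. A minimal sketch, with thresholds and labels invented for the example:

```perl
#! /usr/local/bin/perl
# Classify an amplitude; the thresholds and labels are invented.
sub classify
{
    local ($amp) = @_;
    local ($label);
    if    ($amp < 0)    { $label = "negative"; }
    elsif ($amp == 0)   { $label = "zero"; }
    elsif ($amp < 100)  { $label = "small"; }
    else                { $label = "large"; }
    $label;             # the last value evaluated is the return value
}

foreach $value (-5, 0, 42, 1000)
{
    print "$value is ", &classify($value), "\n";
}
```

Only the first branch whose condition is true is executed, so the order of the tests matters.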
There is also another form of if statement, which runs
its block when the condition is false:
unless (condition)
{
false branch
}
The condition has a wide range of comparison operators.
It is important to observe the distinction between
numeric comparisons and string comparisons.
numeric   string   meaning
==        eq       equals
!=        ne       not equal
>         gt       greater than
<         lt       less than
Strings that do not consist of numbers have a value of
zero.
if ("abc" == "def")
is TRUE, because the strings are numerically zeros. To
make this work right you have to have
if ("abc" eq "def")
perl has file test operators like shell scripts. perl
has an extended set of tests such as:
-T true if file is text
-B true if file is binary
-M days since file modified
-A days since file accessed
-C days since file inode changed
Other forms of the if-command are not common in other
computer languages, but can be quite useful. A good
example is the postfix if.
next if $var == 1;
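A useful form of logic uses || or && in a command:
open (IN,"<$file") || die "Cannot open $file: $!\n";
Files:
Opening:
Files are opened with the open command, which takes a
filehandle and a string of the form "XF" or "FY":
X = < to open file F for read only.
X = > to open file F for write only.
X = >> to append to file F.
X = | to WRITE to a pipe to PROGRAM F.
Y = | to READ from a pipe from PROGRAM F.
If only the filename is provided, the file is
opened for read only.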
Reading:
The most basic reading mechanism is to enclose the
filehandle in <>, like this
$record = <IN>;
A special case of this goes like this:
$record = <>;
This special case looks for filenames on the program
command line and reads any files it finds, one after the
other. If it finds no filenames on the program command
line, the program will assign <> to STDIN.
It is important NOT to use the array form:
@record = <IN>;
as this will read the entire file into the array @record,
which may take up an awful lot of memory.
Reading is often done using a while loop, like this:
while (<IN>)
{
commands
}
After the last record has been read, <IN> returns
the value FALSE, which terminates the while loop. Since
a scalar variable has not been supplied for the record,
the record is stored in $_.
Writing:
Most writing is done using the print or the printf
commands. These commands are used to write to files even
if the results are never actually printed on a hardcopy
device.
print writes a line with default line spacing. It is
used when the output has no particular column spacing to
comply with:
print STDOUT "The X is $x and Y is $y\n";
printf is just like the printf in C and other similar
languages. It is a formatted print. The first variable
or string contains the format.
$fmt = " X = %8.2f Y = %8.2f Flag = %s\n";
printf STDOUT ($fmt, $x, $y, $flag);
The \n is the new line character. The % indicates the
beginning of a format character, the f is the format for
floating point numbers. The 8.2 indicates the number is
8 characters long with a decimal point in the sixth
character, and two decimal places in the seventh and
eighth characters. The %s is a character string with no
length specified.
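A quick way to see what a format does is to build the string with sprintf, which takes the same format as printf but returns the result instead of printing it. This is only a sketch; the values of $x, $y, and $flag are made up.

```perl
#! /usr/local/bin/perl
# sprintf builds the same string printf would print, so the width is easy to see.
($x, $y, $flag) = (1234.5, 3.14159, "ok");
$fmt = " X = %8.2f Y = %8.2f Flag = %s\n";
$text = sprintf($fmt, $x, $y, $flag);
print STDOUT $text;

# Brackets make the 8-character field visible.
printf STDOUT ("[%8.2f]\n", $y);
```

The last line prints [    3.14]: four leading spaces, then the digit, the decimal point in the sixth position, and two decimal places.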
Closing:
perl will automatically close any open files when it
exits. There are some occasions where it is useful to
close a file before perl exits, so there is an
explicit close.
close FILEHANDLE;
Other Important Functions:
Error Messages:
die is used to print an error message and then exit.
warn is used to print an error message, but continue.
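The difference between the two can be sketched as follows. The file name is hypothetical and chosen so that the open fails. The die is wrapped in an eval only so that this sketch can show the message and keep running; in a real program die would normally be left to end the program.

```perl
#! /usr/local/bin/perl
# $file is a hypothetical name chosen so that the open fails.
$file = "/no/such/directory/horizon.dat";

# warn reports the problem on STDERR and carries on.
open (IN, "<$file") || warn "Cannot open $file: $!\n";

# die would stop the program; eval traps it here so the sketch keeps running.
eval 'die "Giving up on $file\n";';
print "die said: $@";
```

Ending the message with \n keeps perl from appending the script name and line number to it.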
String Handling:
split is used to split tokens (fields) from a character
string into an array.
If you have a line:
$line = "Now is the time for all good men";
you can put each word into an array with the command:
@token = split(/\s+/,$line);
sort sorts a list or array.
study, an instruction I issue many times to my 12-year-old,
can speed up repeated pattern matches on the same string.
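split and sort are often used together. The sketch below splits the sample line, sorts the words, and glues them back together with join, a built-in function that is the reverse of split.

```perl
#! /usr/local/bin/perl
$line = "Now is the time for all good men";
@token = split(/\s+/, $line);      # one word per array element
$words = @token;                   # array in scalar context gives the count
@sorted = sort @token;             # default sort is by character code
$joined = join(",", @sorted);
print "$words words: $joined\n";
```

Because the default sort compares character codes, the capitalized "Now" sorts ahead of all the lower case words.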
Binary Encoding:
pack packs values into a string using a template.
$pi = pack("f",3.1415926);
packs pi into a 4-byte string holding a binary floating point number.
unpack extracts values from a string using a
template.
$pi2 = unpack("f",$pi);
There is a long list of templates you can use. You can
use more than one template at a time to build up or
extract binary data from a record.
l   long     32 bit signed integer
L   long     32 bit unsigned integer
s   short    16 bit signed integer
S   short    16 bit unsigned integer
f   float    32 bit floating point
d   double   64 bit floating point
A   ASCII    ASCII string
c   char     a single byte (character)
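As a sketch of using more than one template at a time, the fragment below builds a small binary record and takes it apart again. The record layout (a long, a short, and an 8-character name) is invented for the example.

```perl
#! /usr/local/bin/perl
# A made-up 14-byte record: a long, a short, and an 8-character name.
$template = "lsA8";
$record = pack($template, 1994, 12, "HORIZON");
$length = length($record);

($year, $month, $name) = unpack($template, $record);
print "length=$length year=$year month=$month name=$name\n";
```

The A8 template pads the name with spaces to 8 characters when packing and strips the trailing spaces again when unpacking.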
System:
There are many system oriented functions including:
chmod change file permissions
fcntl sets file control options
fork creates an independent sub-process.
mkdir make a directory
Regular Expressions:
Regular expressions and pattern matching are an important part
of all Unix programming. perl adds a set of extended regular
expression characters to the standard set.
There are two ways regular expressions are used:
Match m/regexp/
m is optional, you can use /regexp/
next if m/^\s*$/; will skip blank lines.
Substitute s/regexp/new/
If the regexp matches, replace it with new.
s/\s*$//; will trim trailing spaces from a line.
Standard Set (not complete)
a match a
a* match zero or more character a's
. match any character
.* match zero or more of .
[a-m] match characters a through m only
[^n-z] do not match letters n to z
[a-m]* match zero or more letters a to m
^ match the beginning of the line
$ match the end of the line
\t matches a tab character
perl extensions (not complete)
\d same as [0-9]
\D same as [^0-9]
\s matches white space (space or tab)
\S matches anything but white space
\w same as [0-9a-zA-Z_] (word characters)
\W same as [^0-9a-zA-Z_]
.+ same as ..*
[a-m]+ match one or more letters a to m
a{n,m} at least n a's, not more than m a's
a? zero or one a, not more than one
\cD matches control-D
An important use of regular expressions is the use of () to
select subsets of the regular expression. This is actually a
standard part of regular expressions and can be used in vi,
awk, sed, and anywhere regular expressions are found. perl
makes it especially easy to use the ().
For instance, if you had the character string:
"SeisWorks 3D" "s3d 2> /dev/null"
as is found in launcher.dat, you could use the regular
expression:
$_ = <IN>;
if ( m/^\t"(.+)"\s*"(\S+)\s+2>\s*(.+)"$/ )
{
($title, $program, $errorFile) = ($1, $2, $3);
}
to extract the title, program name, and the error file name.
The way this works is:
$_ = <IN>; reads a record and stores it in $_. A bare
<IN>; statement would discard the record;
only while (<IN>) stores it in $_ for you.
m/.../ matches a regular expression. Since it
doesn't say what variable to use, it uses
$_.
^ matches the beginning of the line
\t matches the initial tab.
" matches the first "
( starts the first extracted string
.+ matches one or more of any character
) closes the first extraction, placing it
in $1
" matches the second "
\s* matches zero or more spaces or tabs
" matches the third "
( starts the second extraction
\S+ matches any characters but space or tab
) closes the second extraction, placing it
in $2
\s+ matches one or more spaces or tabs.
2 matches 2
> matches >
\s* matches zero or more spaces or tabs
( starts the third extraction
.+ matches one or more characters
) closes the third extraction, placing it
in $3
" matches the fourth "
$ matches the end of the line
$title = $1; puts the value from $1 into $title.
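A self-contained sketch of the same extraction follows, with the sample line supplied in the program rather than read from launcher.dat (the leading tab is an assumption about the file layout). The closing quote is matched just before the $ so that $3 does not pick up the trailing quote mark.

```perl
#! /usr/local/bin/perl
# A sample line in the launcher.dat style described above (leading tab assumed).
$_ = "\t\"SeisWorks 3D\" \"s3d 2> /dev/null\"\n";

if ( m/^\t"(.+)"\s*"(\S+)\s+2>\s*(.+)"$/ )
{
    ($title, $program, $errorFile) = ($1, $2, $3);
    print "title=$title program=$program errorFile=$errorFile\n";
}
else
{
    print "no match\n";
}
```

Copying $1, $2, and $3 into named variables right away is a good habit, because the next successful match will overwrite them.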
Report Writer:
The report writer feature lets you define how your page should
look and do all the necessary assignments with a single
command. The report writer takes care of page breaks, page
numbers, and other issues for you.
format STDOUT_TOP =
Projects Using Too Much Disk              page @##
$%
Project        Owner        Last Used        Cost
-------------- -------- ------------- -----------
.
format STDOUT =
@<<<<<<<<<<<<< @<<<<<<< @>>>>>>>>>>>> @#######.##
$project, $owner, $lastUsed, $cost
.
while (<>)
{
($project, $owner, $lastUsed, $cost) = split;
write;
}
_TOP indicates a heading
. ends a format description
$% is the page number
@<<<< is a left justified field
@>>>> is a right justified field
@###.## is a right justified, two decimal number
Debugger:
perl has a built-in debugging system.
To use the debugger, all you have to do is add a -d to the
first line of the program.
#! /usr/local/bin/perl -d
commands
When you run the program, it will start in debug mode. You
then have many debugging commands you can use including:
h           help on debugger
s           step
c           continue to next break
c <line>    continue until line <line>
n           next (does not step into subroutines)
l <range>   list program statements in the <range>
b <line>    sets a breakpoint at line <line>
p <expr>    prints <expr>, which is usually a variable
AN EXAMPLE USING AN EXPORTED HORIZON FILE:
Background:
A typical exported horizon file from SeisWorks is in the form:
Line Trace X Y Z
where Z is often the time, but in this example, we are going
to export the amplitude as Z.
What we want to do in this example is to clip the amplitudes
to some specific range of values. Anything below the range
will be set to the lowest value in the range, anything above
will be clipped back to the highest value in the range.
This can also be done using bcm. We chose this example
because it is easy to follow and can be extended to do things
bcm cannot do. This simplified example is taken from a
program used by BHP Petroleum (Americas) to suppress tuning
effects resulting from a formation thickness being close to
the size of a seismic wave length.
This example is also kept simple. An experienced perl
programmer would use more sophisticated programming to write
a shorter, faster, program.
Usage:
Before using this program the first time, you must use
chmod +x horizonClip
to make it an executable file. There is no compile step or
link step as in C, Fortran or other languages.
You have to extract the file using the data export feature of
SeisWorks.
The program is called by typing:
horizonClip low high filein fileout
You can then re-import the horizon using the data import
feature of SeisWorks.
Program:
Note: The line numbers do not appear in the file or in the
program, they are just used in this paper to help you follow
the program:
1 #! /usr/local/bin/perl
2 die "Usage: horizonClip low high in out\n"
3     if $#ARGV != 3;
4 $low = $ARGV[0];
5 $high = $ARGV[1];
6 if ($low > $high)
7 {
8     $tmp = $low;
9     $low = $high;
10     $high = $tmp;
11 }
12 $filein = $ARGV[2];
13 $fileout = $ARGV[3];
14 if ($filein eq "-")
15 {
16     open (IN,"<&STDIN");
17 }
18 else
19 {
20     open (IN,"<$filein")
21         || die "No file $filein $!\n";
22 }
23 if ($fileout eq "-")
24 {
25     open (OUT,">&STDOUT");
26 }
27 else
28 {
29     open (OUT,">$fileout")
30         || die "Cannot make $fileout $!\n";
31 }
32 while (<IN>)
33 {
34     ($line, $trace, $x, $y, $z) = split(' ');
35     if ($z < $low) {$z = $low;}
36     if ($z > $high) {$z = $high;}
37     printf OUT ("%20s %12s %12s %12s %12.2f\n",
38         $line, $trace, $x, $y, $z);
39     $count++;
40 }
41 print STDOUT "Processed $count records\n";
Details:
1 The first line of all perl programs (on Unix
platforms).
2 - 3 Post-fix if. There are four arguments. $#ARGV is
3 because @ARGV indexes start at 0.
4 - 5 We could have as easily said:
($low, $high, $filein, $fileout) = @ARGV;
16, 25 We can assign filehandles to other filehandles
(merge the output of the filehandles) using the
open statement and the &.
20, 29 These are more standard opens.
32 - 40 The while loop continues until <IN> becomes false.
39 $count++ adds one to the value of $count.
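As one possible sketch of the shorter style mentioned earlier, the clipping logic can be pulled into a subroutine. The name &clip and the sample record are invented for illustration and are not part of the horizonClip listing.

```perl
#! /usr/local/bin/perl
# &clip is a hypothetical helper, not part of the horizonClip listing.
sub clip
{
    local ($value, $low, $high) = @_;
    ($low, $high) = ($high, $low) if $low > $high;   # swap without a temporary
    $value = $low  if $value < $low;
    $value = $high if $value > $high;
    $value;
}

# Line, trace, X, Y and amplitude, as in an exported horizon record.
$_ = "  100   250   512345.0  4123456.0   -87.5\n";
($line, $trace, $x, $y, $z) = split(' ');
$z = &clip($z, -50, 50);
printf STDOUT ("%20s %12s %12s %12s %12.2f\n", $line, $trace, $x, $y, $z);
```

Keeping the clipping in one place makes it easier to extend later, for instance to clip to a different range on each line.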
HOW TO GET PERL:
perl is free, it is NOT PUBLIC DOMAIN. Public Domain means
there is no identifiable owner or the public at large is the
owner. perl is owned by Larry Wall. Larry gives everybody a
free LICENSE to use perl. That is not the same as ownership.
If Larry let you use his lawnmower for free, you wouldn't own
it. There is a license file that comes with perl. READ AND
UNDERSTAND THE LICENSE. Read it again before selling any
software based on perl.
Many of the CD-ROMs available have perl on them. In many
cases you can get the perl executable binary so you don't even
have to compile it. These CD-ROMs are advertised in most
Unix trade magazines.
Some Walnut Creek and some Prime Time Freeware CD-ROMs have
perl in SunOS executable form. There is a book available at
BookStop and other bookstores called:
Prime Time Freeware for Unix, $60.00
ISBN 1-881957-04-7
The CD-ROM in the book Unix Power Tools has perl on it, and is
available from several bookstores in the Houston area.
Unix Power Tools, $59.95
ISBN 0-679-79073-X
The best way to get perl is via the Internet. This will get
you the latest version with the latest bug fixes. One place
to get perl on the Internet is:
ftp ftp.uu.net
login: anonymous
password: your-internet-name@your-internet-site
ftp> cd /gnu
ftp> binary ----- DON'T FORGET THIS LINE
ftp> get perl-4xxxx.tar.Z
ftp> bye
When you get it back to your machine, you will need to
uncompress it, un-tar it, and execute the make command.
It is a good idea to get gcc (also free) rather than using the
bundled C compiler on SunOS. gcc will make a much faster
executable of perl.
BOOKS ON PERL
The main reference is
Programming Perl, usually called "The Camel Book",
by Larry Wall and Randal Schwartz.
Published by O'Reilly & Associates,
ISBN 0-937175-64-1.
A more tutorial, but less complete, book is
Learning Perl, usually called "The Llama Book",
by Randal Schwartz.
Published by O'Reilly & Associates,
ISBN 1-56592-042-2.
A book giving examples of perl for systems administration is
supposed to come out soon, but I have been unable to get
details about it.
The Energy Related Unix User's Group (ERUUG) Unix Cookbook has
several example programs related to petroleum. This book is
available to members of ERUUG, and is available to guests. It
is also available on the Internet at this world wide web
location:
http://www.glg.ed.ac.uk/
SUPPORT FOR PERL:
The main source of support for perl is the Internet newsgroup
comp.lang.perl
All the big names in perl follow this newsgroup and many
people on the net will answer questions. I usually get an
answer in a few hours. That is better than any computer
department, vendor help desk, or on-site support
representative I have ever dealt with.
Some big names on the Internet for perl include:
Larry Wall (author of perl)
lwall@netlabs.com
Randal Schwartz
merlyn@stonehenge.com
Tom Christiansen
tchrist@perl.com
(303) 444-3212
Randal Schwartz and Tom Christiansen are consultants.
The Energy Related Unix User's Group (ERUUG) has several
members with at least some perl experience.
There are several consulting services that can install and
support perl. Sometimes MIS Departments insist that all
programs acquired must be paid for and have paid support. Of
course no-one supports most programs in Unix, particularly not
awk or the bundled C compiler, but most MIS Departments are
still learning about Unix and want to run things the way they
did on the VAX or IBM mainframe.
You can usually get around the "MIS shuffle" by buying the CD.
Cygnus Support sells support for many free software packages,
and unlike most software vendors supporting their own
packages, Cygnus Support actually provides support.
APPENDIX I
SOME USEFUL PERL SCRIPTS:
These scripts have been written to illustrate points made in
this paper. They are not always the most efficient, compact,
or best way to write the particular script.
Correct bcm2d histogram
Dumping SEG-Y Headers
Change nulls to spaces in file
Splitting an HPGL file
Program to correct bcm histogram:
The histogram feature of the bcm program has a small but
annoying round off error. It also does not make a visual
histogram. This program reads a bcm listing, selects the
histogram portion, recalculates the percentages and draws a
histogram to the side.
#! /usr/local/bin/perl
if ($#ARGV != 1)
{
    print STDERR "Usage: histofix in.file out.file\n";
    exit;
}
open (IN, "<$ARGV[0]") || die "No file $ARGV[0] $!\n";
open (OUT, ">$ARGV[1]") || die "Cannot make $ARGV[1] $!\n";
# copy everything up to and including the summary heading
while (<IN>)
{
    print OUT;
    last if /\*\*\* *\.STATS *: *Summary/;
}
# copy the two lines that follow the heading
$skip = <IN>; print OUT "$skip\n";
$skip = <IN>; print OUT "$skip\n";
# read the histogram lines until a blank line
while (<IN>)
{
    last if m/^ *$/;
    m/^\s*(\d+\.\d*)\s+(\d+) /;
    ($interval[$i],$count[$i]) = ($1, $2);
    $total += $count[$i];
    $big = $count[$i] if $count[$i] > $big;
    $i++;
}
# rewrite the histogram with new percentages and a bar graph
while ($i > $j)
{
    $pc = ($count[$j] * 100.0) / $total;
    $pct += $pc;
    $graph = "X" x int((($count[$j] / $big) * 20) + 1);
    printf OUT "%12.4f %15.0f %7.2f %7.2f %s\n",
        $interval[$j], $count[$j], $pc, $pct, $graph;
    $j++;
}
# copy the rest of the listing
while (<IN>) {print OUT;}
Dumping SEG-Y Headers:
This program is around 500 lines long and thus too long to
include here in its entirety. The program reads the EBCDIC,
Binary, and first Trace header of a SEG-Y file on disk or
tape.
#! /usr/local/bin/perl
for $i (0..255) {$ebcdic{$i} = "_";}
$ebcdic{ 0} = "~";
$ebcdic{ 64} = " ";
...
$ebcdic{129} = "a";
...
$ebcdic{193} = "A";
...
$ebcdic{249} = "9";
$binHeadTemplate = "l3s25s170";
$traceHeadTemplate = "l7s4l8s2l4s13S2s31f5ss17";
...
sysread (IN,$ebcdicHeader,3200);
sysread (IN,$binaryHeader,400);
sysread (IN,$traceHeader,240);
print STDOUT "--------------EBCDIC---------\n";
for $i (0..3199) {substr($asciiHeader,$i)
= $ebcdic{ord(substr($ebcdicHeader,$i,1))} };
for $i(0..39)
{
$line = substr($asciiHeader,$i*80,80);
print STDOUT "$line\n";
}
( $jobid,
$lineid,
$reel,
...
$vibratoryPolarity
) = unpack($binHeadTemplate,$binaryHeader);
print STDOUT "--------------Binary---------\n";
print STDOUT "jobid $jobid \n";
print STDOUT "lineid $lineid \n";
print STDOUT "reel $reel \n";
...
print STDOUT "vibratory polarity $vibratoryPolarity \n";
( $traceLine,
$traceReel,
...
$overTravelTaper,
) = unpack($traceHeadTemplate,$traceHeader);
print STDOUT "--------------Trace----------\n";
print STDOUT "Trace Line $traceLine \n";
....
print STDOUT "Over Travel Taper $overTravelTaper \n";
exit;
The information here should give anyone who is familiar with
SEG-Y enough information to reconstruct the program. If you
are not familiar with SEG-Y, obtain the Seismic Unix (SU)
package from the Center for Wave Phenomena at the Colorado
School of Mines. This package contains more than enough
information to complete this program.
Program to convert nulls to spaces in bcm output:
The output of bcm2d and bcm3d has some sloppy code that prints
nulls instead of spaces. This is usually okay for vi and
more, but interferes with the correct operation of aXe or
Xless. It also makes it harder to read the file into a
spreadsheet. This program looks in any file for null
characters and changes them to spaces:
#! /usr/local/bin/perl
open (IN,"<$ARGV[0]");
open (OUT,">$ARGV[1]");
while (!eof(IN))
{
$c = getc(IN);
$c = " " if ord($c) == 0;
print OUT $c;
}
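Since perl has a dozen ways to do anything, here is a sketch of another approach that uses the tr (translate) operator on a whole line at a time instead of getc on each character. The subroutine name and the sample string are invented for the example.

```perl
#! /usr/local/bin/perl
# Sketch only: translates a whole string at once instead of a character at a time.
# &nullsToSpaces is our own name for the helper.
sub nullsToSpaces
{
    local ($text) = @_;
    $text =~ tr/\0/ /;      # translate every null byte to a space
    $text;
}

# A made-up line with embedded nulls, standing in for bcm output.
$sample = "bcm\0output\0line\n";
$fixed = &nullsToSpaces($sample);
print $fixed;
```

In a real filter the subroutine would be called on $_ inside a while (<>) loop, so that the files named on the command line are processed in turn.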
Converting a monolithic HPGL file to records:
It is common to find HPGL and other plotter control files
given as one long record with no new-lines. These files are
hard to troubleshoot or transfer between programs. The fold
command can split the file into arbitrary length records, but
what you want is to be able to make sense of the commands.
HPGL files contain plotter commands separated by semicolons.
HPGL ignores embedded new-lines.
A perl program to fix this can be as simple as:
#! /usr/local/bin/perl
while (<>)
{
s/;/;\n/g;
print;
}
It could actually be done as simply as:
perl -pi.bak -e 's/;/;\n/g' hpgl.file
APPENDIX II
PERL 5:
perl 5 was released just as this report was being prepared.
These are a few new features we expect to see in perl 5.
* awk-like BEGIN and END sections.
* Better access to system function calls.
* Pointers and structures
* Object-oriented programming features
* Additional regular expression features.
Tkperl 5:
Tk is a set of libraries and functions to create X-windows
"widgets", picture elements such as scroll bars, pull down
menus, and even "canvas" graphics.
Tkperl 4 used embedded Tcl (Tool Command Language) to use Tk.
Tkperl 5 has native access to Tk.
APPENDIX III
OTHER FREE SOFTWARE YOU SHOULD KNOW ABOUT:
expect:
expect is a program that lets you run the kind of programs
that ask you stupid questions every fifteen minutes or so. An
expect script can read what the program prints and give it an
answer according to your instructions.
A good example is bcm3d. Part way through the program, bcm3d
asks you for a real number. Later on in the program, it asks
if you want to run the program. This makes it hard to put
together a shell script of ten or twenty bcm3d jobs and run
them overnight. expect can anticipate these questions and
answer them for you. You can start the job overnight and go
home to your family.
This is an example:
#! /usr/local/bin/expect -f
# disable timeouts
set timeout -1
# start the bcm3d program using the
# parameter specifying a .pcl file
spawn bcm3d [lindex $argv 0]
# wait for the reel number question
expect "*number :*"
exec sleep 1
send 1\r
# wait for the ready question
expect "*Ready, or A to Abort :*"
exec sleep 1
send r\r
# wait for completion message
expect "*ended normally*"
exec sleep 1
exit
There is a book coming out about expect that you will want to
read.
Exploring Expect
by Don Libes
Published by O'Reilly & Associates,
ISBN: 1-56592-090-2
Tcl/Tk:
Tcl/Tk is a shell-script-like language for writing X-windows
applications. It is not terribly easy, but is much easier
than writing C, C++, Motif and X programs.
There is a new book out about Tcl/Tk.
Tcl and the Tk Toolkit
John Ousterhout
Published by Addison-Wesley
ISBN: 0-201-63337-X
GNUtar:
GNUtar is like regular tar except:
* It can write tapes across the network (no more RFS or dd
to worry about).
* It can compress files as it backs them up.
* It strips the leading slash off the path, so you don't
have absolute paths in your tar.
If you get nothing else, get GNUtar.
gzip / gunzip:
gzip, gunzip, and znew are programs that work basically like
the standard SunOS compress and uncompress, except that they
typically get 40% more compression. znew takes a file
compressed with compress and turns it into a gzip file. Files
compressed with gzip have a .gz extension rather than the .Z
extension of compress.