
The Perl Journal

Volumes 1–6 (1996–2002)


The Perl Journal #7, Fall 1997 (vol 2, num 3)
Will Morse (1997) A Perl in the Oil Patch. The Perl Journal, vol 2(3), issue #7, Fall 1997.

A Perl in the Oil Patch

Of salt and sysread().

Will Morse


During the Jurassic Era, when dinosaurs walked the earth, what is now the Gulf of Mexico was an inland sea much like the present-day Great Salt Lake. The water evaporated out of the sea much faster than it was replaced by water from rain or rivers, and the sea dried up, leaving behind it vast sheets of salt. Over the course of millions of years, dirt was blown and washed over the salt, burying it under thousands of feet of dirt and rock. Because of this protection, the salt didn't dissolve when the Gulf opened again to the sea.

As more and more sediment (mostly from the Mississippi and other Gulf Coast rivers) is deposited on the bottom of the Gulf, it crushes the salt. Much of Louisiana, Texas, and Arkansas actually lies on top of the old salt fields. A funny thing about salt is that under enough pressure, it becomes plastic and starts to move through the sediments to reach equilibrium. The salt is actually moving faster under the Gulf of Mexico than the sea floor is spreading at the boundaries of the tectonic plates.

As the salt moves through the sediment, two things happen: the salt deforms the sediments it moves through, creating "traps" where oil can be caught. It also deforms the sea floor and modifies the way further sediments are deposited, creating pockets of oil and gas.

A major tool used in petroleum exploration is seismic interpretation. A ship tows an array of special microphones called hydrophones and a large air gun. The air gun makes a very loud, sharp bang, so loud that the hydrophones are able to record echoes from rock formations several miles beneath the sea. These echoes are then displayed as a graph called a seismic section showing the ship's track on the X axis and the travel time to each echo on the Y axis. If there are no velocity problems, the travel time is proportional to the depth - but in areas of salt, there are serious velocity problems.
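To make the time-to-depth relationship concrete, here is a minimal sketch that sums the thickness of each layer from the two-way travel time spent in it. The helper name depth_from_times and the velocities (2000 m/s for sediment, 4500 m/s for salt) are illustrative assumptions, not figures from any survey; real velocity models are far more detailed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert two-way travel times to depth, one layer at a time.
# Each layer is [two-way time spent in the layer (s), velocity (m/s)].
# The sound goes down and comes back, so thickness = velocity * time / 2.
sub depth_from_times {
    my @layers = @_;
    my $depth  = 0;
    for my $layer (@layers) {
        my ($twoWayTime, $velocity) = @$layer;
        $depth += $velocity * $twoWayTime / 2;
    }
    return $depth;
}

# Assumed, illustrative velocities: 2000 m/s sediment, 4500 m/s salt.
my $sedimentOnly = depth_from_times([3.0, 2000]);
my $withSalt     = depth_from_times([2.0, 2000], [1.0, 4500]);

printf "3.0 s through sediment only:    %d m\n", $sedimentOnly;
printf "3.0 s with 1.0 s spent in salt: %d m\n", $withSalt;
```

With these assumed numbers, the same three seconds of travel time corresponds to 3000 m in pure sediment but 4250 m when one of those seconds is spent in salt, so a section plotted in time no longer lines up with depth.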

Sound has a much higher velocity through salt than it does through sediment. This causes great distortions in the seismic section. In the past, it was very difficult to search for oil in areas with salt (which includes over half the Gulf of Mexico). Modern computers and the decline in the cost of computing have changed this. The array of computers and other equipment needed to make the calculations to refocus the seismic data would be an interesting story all by itself. Dozens of gigabytes of data are necessary to evaluate just one region surrounding a salt area. In fact, with the two-gigabyte file size limit of many UNIX systems, it's necessary to break these logical files into many smaller physical files and concatenate them in the application.
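Reading a logical file that has been split this way might look like the sketch below. It assumes fixed-length records and that the parts are passed in order; read_logical_records and the part file names are hypothetical, and a record is allowed to straddle the boundary between two parts.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Read fixed-length records from a logical file stored as several
# physical parts. Leftover bytes at the end of one part are carried
# over and completed from the start of the next part.
sub read_logical_records {
    my ($recordLength, $callback, @parts) = @_;
    my $buffer = '';
    my $count  = 0;
    for my $part (@parts) {
        open(my $in, '<', $part) or die "Could not open $part: $!\n";
        binmode($in);
        # The fourth argument makes read() append after the leftover bytes.
        while (read($in, $buffer, $recordLength - length($buffer), length($buffer))) {
            next if length($buffer) < $recordLength;
            $callback->($buffer);
            $count++;
            $buffer = '';
        }
        close($in);
    }
    return $count;    # a trailing partial record, if any, is dropped
}

# Usage: three 4-byte records split unevenly across two parts.
my $dir    = tempdir(CLEANUP => 1);
my @parts  = ("$dir/part1", "$dir/part2");
my @chunks = ("AAAABB", "BBCCCC");
for my $i (0 .. $#parts) {
    open(my $out, '>', $parts[$i]) or die "Could not open $parts[$i]: $!\n";
    binmode($out);
    print $out $chunks[$i];
    close($out);
}
my @records;
my $seen = read_logical_records(4, sub { push @records, $_[0] }, @parts);
print join(",", @records), "\n";
```

The usage section should print AAAA,BBBB,CCCC: the middle record is assembled from the tail of the first part and the head of the second.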

It's worth the trouble. We suspect that there is at least as much oil in the Gulf of Mexico under the salt as above it, if not more. And the Gulf is only the beginning. There are several other major salt "plays" throughout the world. BHP Petroleum is a major player in sub-salt and deep water petroleum exploration in the Gulf of Mexico and other parts of the world. (BHP Petroleum is a division of the Broken Hill Proprietary Company of Melbourne, Australia, which is a giant minerals, copper, steel, and petroleum resources company and the largest industrial company in Australia. We are probably better known in the United States, however, for our Foster's Ale.)

While the real number crunching applications are written in C and C++, a large amount of the data management and preparation is done in Perl. Perl has several advantages for us: it can read arbitrarily large records (up to the limits of swap, which in our case is usually close to two gigabytes), and the pack() and unpack() functions let us read integer and floating point binary numbers. We use it for dozens of applications.
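As a small illustration of what pack() and unpack() do with binary numbers, the sketch below builds an 8-byte record and reads it back. The record layout and the values are invented for the example; "s" and "f" mean a short integer and a single-precision float in the machine's native format, "x2" writes two pad bytes, and "@4" jumps to byte offset 4.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an 8-byte binary record: a 16-bit integer, two pad bytes,
# and a 4-byte float, all in the machine's native format.
my $record = pack("s x2 f", 1500, 0.25);
printf "record is %d bytes long\n", length($record);

# Pull both values back out of the packed string.
my ($count, $value) = unpack("s @4 f", $record);
print "count = $count, value = $value\n";
```

The same "@offset" trick is what lets a script pick individual fields out of a fixed-layout header without unpacking everything in front of them.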

Without Perl, these applications would have to be written in C, or possibly even Fortran. Many of these are "one-off" programs; it would be very expensive and cause many delays if these programs had to be written with traditional MIS style development cycles. We have a small staff of computer-savvy geoscientists and geoscience-savvy sysadmins who can respond very quickly and produce useful scripts in minutes.

The program in Listing 1 clips seismic amplitudes in a tape or disk file stored in a particular format called SEG-Y. It demonstrates a few functions you don't see in many Perl programs: pack(), unpack(), sysread(), syswrite(), and eof().


Will Morse is a Scientific Systems Consultant for BHP Petroleum (Americas). He has been with BHP for twenty years in various scientific computing and systems administration positions. He also teaches Fundamentals of Unix, X Windows and Tcl/Tk, and Perl at the North Harris College Geoscience Technical Training Center. Will is a past Program Director for the Energy Related Unix Users Group (ERUUG) and is currently the ERUUG Unix Cookbook editor. He is also one of the founders and facilitators of the sci.geo.petroleum newsgroup.

listing 1

Program that clips Seismic Amplitudes

#!/usr/bin/perl
if ($#ARGV != 2) {
   print STDERR "Usage: despike cutoff input.sgy output.sgy\n\n";
   print STDERR "If the cutoff is positive, values higher than the\n";
   print STDERR "cutoff will be clipped. If the cutoff is negative,\n";
   print STDERR "values lower than the cutoff will be clipped. The\n";
   print STDERR "output file cannot be the same as the input file.\n";
   # Stop here; without all three arguments there is nothing to do.
   exit 1;
}
if ($ARGV[1] eq $ARGV[2]) { die "Output file can't be same as input file\n" }
$ebcdicLength = 3200; 
$binaryLength = 400; 
$traceHeaderLength = 240;
open (IN,  "<", $ARGV[1]) || die "Could not open $ARGV[1]: $!\n";
open (OUT, ">", $ARGV[2]) || die "Could not open $ARGV[2]: $!\n";
# The data is binary, so turn off any newline translation.
binmode(IN);
binmode(OUT);
$cutoff = $ARGV[0];
# This next line makes $cutoff into a packed binary string 
# containing a single floating point number.
$cutoffp = pack("f", $cutoff);
# These lines just copy the EBCDIC header and then the binary header
# from the input file to the output file. Why are we using sysread()
# and syswrite() rather than <> and print()? Because we might be
# reading from tape instead of disk, in which case the block structure
# must be preserved.
sysread  (IN,  $ebcdic, $ebcdicLength); 
syswrite (OUT, $ebcdic, $ebcdicLength); 
sysread  (IN,  $binary, $binaryLength); 
syswrite (OUT, $binary, $binaryLength);
# These two variables are set to the values of the short integers 
# at byte offsets 20 and 24 in the $binary packed string.
($numberSamples, $sampleFormat) = unpack("@20s@24s", $binary);
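# The clipping loop below reads every sample as a 4-byte float, so
# 2-byte integer data (sample format 3) can't be handled here.
if ($sampleFormat == 3) { die "2-byte samples (format 3) are not supported\n" }
$traceLength = 4 * $numberSamples;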
$traceBufferLength = $traceLength + $traceHeaderLength; 
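# Loop until sysread() comes up short. We don't test eof(IN) here:
# eof() goes through Perl's buffered I/O layer, and mixing it with
# sysread() can silently skip data.
while (1) {
    # We read with sysread() to get the exact number of bytes in a
    # tape block. If the block is short, we either have a bad tape or
    # we're at the end of the tape. Either way, we're done.
    $bytesRead = sysread(IN, $traceBuffer, $traceBufferLength);
    last if !defined($bytesRead) || $bytesRead < $traceBufferLength;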
    $tracesRead++;
    # Announce every hundredth trace 
    if ($tracesRead % 100 == 0) { 
        print STDERR "Processed $tracesRead traces \n"; 
    }
    for $i (1 .. $numberSamples) { 
        $currentSample = $traceHeaderLength + ( ($i - 1) * 4);
        # Take the four bytes beginning at $currentSample; 
        $sample = substr($traceBuffer, $currentSample, 4);
        # ...and unpack() them as a single float 
        $amp = unpack("f", $sample);
        if ($cutoff > 0) { 
            if ($amp > $cutoff) {
                # If it's too high, replace it with the cutoff 
                substr($traceBuffer, $currentSample, 4) = $cutoffp; 
            } 
        } else { 
            if ($amp < $cutoff) {
                # If it's too low, replace it with the cutoff 
                substr($traceBuffer, $currentSample, 4) = $cutoffp; 
            } 
        } 
    } 
    syswrite(OUT, $traceBuffer, $traceBufferLength); 
} 
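
For disk files, the inner loop can also be written by unpacking every sample in a trace at once. This is an alternative sketch, not the listing's method; clip_trace is a hypothetical helper, and like the listing it assumes the samples are 4-byte floats in the machine's native format.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Clip every native 4-byte float sample in a trace buffer, leaving the
# trace header (the first $headerLength bytes) untouched.
sub clip_trace {
    my ($traceBuffer, $headerLength, $cutoff) = @_;
    my $header  = substr($traceBuffer, 0, $headerLength);
    my @samples = unpack("f*", substr($traceBuffer, $headerLength));
    for my $amp (@samples) {    # $amp aliases each element of @samples
        if ($cutoff > 0) { $amp = $cutoff if $amp > $cutoff }
        else             { $amp = $cutoff if $amp < $cutoff }
    }
    return $header . pack("f*", @samples);
}

# Usage with a made-up, all-zero 240-byte header and four samples.
my $trace   = ("\0" x 240) . pack("f*", 0.5, 8.0, -3.0, 2.0);
my $clipped = clip_trace($trace, 240, 2.0);
my @result  = unpack("f*", substr($clipped, 240));
print "@result\n";
```

With a positive cutoff of 2.0, only the 8.0 sample is changed; the negative sample is left alone, matching the behavior described in the usage message.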