2024 π Daylatest newsbuy art
Here we are now at the middle of the fourth large part of this talk.Pepe Deluxeget nowheremore quotes
very clickable
data + munging

The Perl Journal

Volumes 1–6 (1996–2002)

Code tarballs available for issues 1–21.

I reformatted the CD-ROM contents. Some things may still be a little wonky — oh, why hello there <FONT> tag. Syntax highlighting is iffy. Please report any glaring issues.

Terrence Brannon (2000) Beyond Hardcoded Database Applications with DBIx::Recordset. The Perl Journal, vol 5(4), issue #20, Winter 2000.

Beyond Hardcoded Database Applications with DBIx::Recordset

Terrence Brannon


Introduction

Perl has been dubbed "The Duct Tape of the Internet", providing a comfortable interface to an ever-growing number of external technologies. In this article I provide an introduction to one of the most convenient and powerful Perl interfaces to relational databases: DBIx::Recordset.

Look Mom, No SQL!

By and large, when using databases from a programming language, you are creating, reading, updating, or deleting data -- CRUD for short. If you use the DBI module directly, you have to write SQL for these operations. In contrast, DBIx::Recordset supports CRUD through the four simple functions Insert(), Search(), Update(), and Delete().

Under the hood, DBIx::Recordset generates SQL and ensures that it is syntactically correct for whatever database you are using. There are large applications written using DBIx::Recordset that require no code changes to run on MySQL, Microsoft Access, or Oracle. This lets you ship database code off-site knowing that it will run with whichever database they have. It also future-proofs internal applications against unforeseen changes in database systems.

Sample Usage

Though DBIx::Recordset provides a large range of functionality, this article will focus on a real-world example I developed during a recent contract job.

The company had a user registration table that contained basic information about a client such as name, email address, and phone number. However, a new table was developed to store demographic information about a client in addition to these fields. My task: copy all the current information from the original table to the new user demographics table. The basic information would simply be copied over and the demographic information would be generated randomly.

Listing 1 shows uregisternew, the table designed for this purpose. The last four fields of uregisternew are the added demographics fields. Each of these fields contains a non-negative integer indicating the user's status for a particular demographic. The integer is an index into an array.

For example, the demographic for gender might be Male, Female, or unspecified. Once a user indicates their gender, an integer is used to index into this anonymous array:

 $Angryman::User::profile{gender} = ['n/a','Male','Female'];
Thus, a user whose gender is Male will have a 1 for his gender demographic field. There are similar anonymous arrays for age bracket, occupation, and income. Listing 2 shows the actual program which performs my assigned task.

uregisternew is a table with all the fields of uregister plus a few profile fields (such as salary bracket, occupation, and age), each containing an index into the array for that particular profile field. The program shown in Listing 2 copies over the fields and generates a valid array index for the new profile fields. We'll go through the program step by step.

  use Angryman::User;
  use DBIx::Recordset;
  use Math::Random;
  use strict;
  use vars qw(%table %connect %profile);
Here, we load modules. Angryman::User has all the company information (the demographic anonymous arrays) discussed above. DBIx::Recordset is the module we're discussing in this article. Math::Random is an excellent Perl module that allows one to create random numbers of all sorts. And of course, if you want bulletproof code, you turns on strict and explicitly declare variables with the vars pragma.

  $table{in}  = 'uregister';
  $table{out} = 'uregisternew';
In these two lines we define the original table with basic user information and the target table which holds the basic user information as well as demographics information.

  %connect =  (
       '!DataSource' => 'DBI:mysql:test',
       '!Username'   => 'root'
  );
Here we define the information for connection to the database. In the real script, this information was actually sequestered into a system configuration file; the code shown here is for illustration only.

  # connect to database and SELECT * FROM uregister
  *::uregister = DBIx::Recordset->Search ({           
                                  %connect, 
                                  '!Table' => $table{in}
                                       });
This one line of DBIx::Recordset packs quite a punch: we connect to the database and retrieve all records from the original table. The connect hash gives DBIx::Recordset instructions on how to connect. The table directive indicates which table we want to retrieve records from.

Because we bind the results of the Search() call to a typeglob, we actually have three things we can use. A scalar ($::uregister) allows object-oriented access to the database. The @::uregister array allows us to access all the records, and the %::uregister hash allows us to access the fields of the current record.

  *::uregisternew = DBIx::Recordset->Setup({ 
                         %connect,
                         '!Table' => $table{out}
                                       });
While all of the CRUD functions support connection and database processing in one line, we will be inserting many records into the target table. We therefore connect to the database and target table in the above lines and re-use the connection on each iteration of the loop, to avoid having to connect and disconnect more than necessary.

 # iterate through recordsets from old table:
 while (my $record = $::uregister->Next) {
Remember how when we connected to our original table we set up the scalar, the array, and the hash? In the line above we make use of the scalar's Next() method so that we can iterate across the retrieved records. On databases which reliably return the number of rows retrieved we can make use of the array we setup as follows: for my $record (@::uregister) .

     &randomize_user_profile;
This randomizes the demographic fields and stores the results in a hash named %profile .

     # INSERT
     # the old table data into the new table along with
     # the computed hash of profile data
     $::uregisternew->Insert({%$record, %profile});
We take the data in both hashes and insert it into the database. (Note that Insert() takes a hash reference as its argument, not a hash.) DBIx::Recordset automatically takes the key-value pairs of the hash and creates the necessary SQL statements to insert the data. This is a great time-saver. You needn't personally worry about getting your data
in order and you also don't have to manually quote your string fields.

A DBI Version

One of the most important steps in Perl history was the authoring of DBI. Prior to DBI, several database-specific utilities such as oraperl and sybperl existed, and code for one utility wouldn't work on another. With DBI, all basic database tasks such as connection, statement execution, and disconnection can execute on many databases with no changes in Perl code.

DBI has its place. When you really care about performance, DBI's fine-grained process lets you tweak the parts you need. When you need to prepare very complex SQL queries, DBI lets you do it. And when you need to use database-specific features in queries or statements, DBI lets you do that too.

But the bulk of database work doesn't require the programmer to care about the details of DBI. This section will discuss Listing 3, a DBI implementation of the same task, so you can see the benefits of DBIx::Recordset.

As before, we'll go through this program step by step, skipping over the modules and starting with the DBI initialization.

 # connect to database and SELECT * FROM uregister
 my $dbh = DBI->connect('DBI:mysql:test','root')
|| die "$!\n";
 my $sth = $dbh->prepare('SELECT * FROM uregister');
 my $ret = $sth->execute;
Here we connect to the database and retrieve all records from the original table. Notice that this takes three lines, and we still don't have our database data stored in Perl variables for automatic usage. It's possible to reduce the number of lines by one if we use DBI's selectall_arrayref() function to return results. But we will never be able to shorten this to one line, because connection and statement execution occur in separate DBI calls.

  &determine_target_database_field_order;
This line of code may seem rather innocuous, but it reveals the most about the difference between DBI and DBIx::Recordset. The whole idea of this program is to retrieve records from one table and put them in another, right? With DBI, you can retrieve data into an array, an array reference, or into a hash reference. If I wanted to assume that my target table would never change, I could retrieve my data into an array or array reference and then commit it to the target database.

But I wanted my code to work regardless of whether the company moved around (or even deleted) a few fields in either database. So I chose hash references instead for this project. And since hashes are unordered, I needed an array with the key names in correct order. The function &determine_target_database_field_order() does just that. With DBIx::Recordset, such a function is not necessary: it takes the hash references you give it and correctly quotes and orders it for writing to the database.

  # iterate through the recordsets from the old table:
  while (my $record = $sth->fetchrow_hashref) {
We fetch each row of the original table into a hash reference:

    &randomize_user_profile;
We create a hash with the demographics data:

    &fiddle_with_my_data_to_get_it_to_work_with_the_DBI_API($record);
The purpose of this function is to take the data to be inserted and manually order and quote it. This is unnecessary with DBIx::Recordset, as it handles these issues for you automatically. We then finish by inserting the record into the target table:

    # INSERT 
    # the old table data into the new table along with
    # the computed hash of profile data
    my $sql = 
        "INSERT into $table{out}($sql_field_term) 
                               values($insert_term)";
    $dbh->do($sql);
Conclusion

DBIx::Recordset contains a wealth of functionality. It is currently the only CPAN database interface that allows for completely database-independent database use. Common tasks can be coded up quickly and concisely. In addition to the features discussed in this article, DBIx::Recordset supports many other common tasks such as debugging, multi-table processing, and HTML-based navigation of tables. So the next time you want your database code up and running quickly and portably, give DBIx::Recordset a try.

Terrence Brannon (tbone@cpan.org) is an active day trader and roaming Perl consultant who fiends too much on Perl to keep up his skills in Mandarin Chinese. He dreams of creating the ultimate realtime stock trading system in Perl -- but he's too busy trading to write it.

listing 1

The uregisternew Table.
Terrence Brannon (2000) Beyond Hardcoded Database Applications with DBIx::Recordset. The Perl Journal, vol 5(4), issue #20, Winter 2000.
	Field	    Type   
	-----	    ----
	firstname	varchar(50) 
	lastname	varchar(50) 
	userid	    varchar(50) 
	password	varchar(50) 
	address1	varchar(50) 
	city	    varchar(50) 
	state	    char(2)     
	zippostal	varchar(20) 
	email	    varchar(50) 
	phone	    varchar(50) 
	fax	        varchar(50) 
	dob	        date
	occupation	varchar(50) 
	gender	    varchar(20) 
	income	    varchar(25) 
	age	        varchar(255)

listing 2

Using DBIx::Recordset to copy user records from a source table (uregister) to a target table (uregisternew).
Terrence Brannon (2000) Beyond Hardcoded Database Applications with DBIx::Recordset. The Perl Journal, vol 5(4), issue #20, Winter 2000.
 use Angryman::User;
 use DBIx::Recordset;
 use Math::Random;
 use strict;
 use vars qw(%table %connect %profile);
 $table{in}  = 'uregister';
 $table{out} = 'uregisternew';
 %connect =  (
     '!DataSource' => 'DBI:mysql:test',
     '!Username'   => 'root'
 );
 # connect to database and SELECT * FROM uregister
 *::uregister = DBIx::Recordset->Search ({            
                                          %connect, 
                                          '!Table' => $table{in}  
                                      });
 # because we will re-use the target table many times, we separate the 
 # connection and insert steps with this recordset
 *::uregisternew = DBIx::Recordset->Setup({  
     %connect, 
     '!Table' => $table{out} 
 });
 # iterate through the recordsets from the old table:
 while (my $record = $::uregister->Next) {
     &randomize_user_profile;
     # INSERT:
     # the old table data into the new table along with
     # the computed hash of profile data
     $::uregisternew->Insert({%$record, %profile});
 }
 # Angryman::User::Profile is a hash in which each key is a reference 
 # to an array of profile choices. For example:
 # $Angryman::User::Profile{gender} = [ 'male', 'female' ];
 # $Angryman::User::Profile{age} = ['under 14', '14-19', '20-25', ... ];
 # Because we don't have the actual data for the people in uregister,
 # we randomly assign user profile data over a normal distribution.
 # when copying it to uregisternew.
 sub randomize_user_profile {
     for (keys %Angryman::User::Profile) {
         my @tmp = @{$Angryman::User::Profile{$_}};
         $profile{$_} = random_uniform_integer(1, 0, $#tmp);
     }
     $profile{dob}='1969-05-11';
 }

listing 3

DBI implementation of program to copy user records from a source table (uregister) to a target table (uregisternew).
Terrence Brannon (2000) Beyond Hardcoded Database Applications with DBIx::Recordset. The Perl Journal, vol 5(4), issue #20, Winter 2000.
 use Angryman::User;
 use DBI;
 use Math::Random;
 use strict;
 use vars qw(%table $sql_field_term $insert_term %profile @ordered_fields);
 $table{in}  = 'uregister';
 $table{out} = 'uregisternew';
 # connect to database and SELECT * FROM uregister
 my $dbh = DBI->connect('DBI:mysql:test', 'root') || die "$!\n";
 my $sth = $dbh->prepare('SELECT * FROM uregister');
 my $ret = $sth->execute;
 &determine_target_database_field_order;
 # because we will re-use the target table many times, we separate the 
 # connection and insert steps with this recordset
 # iterate through the recordsets from the old table:
 while (my $record = $sth->fetchrow_hashref) {
     &randomize_user_profile;
     &fiddle_with_my_data_to_get_it_to_work_with_the_DBI_API($record);
     # INSERT 
     # the old table data into the new table along with
     # the computed hash of profile data
     my $sql = 
         "INSERT into $table{out}($sql_field_term) values($insert_term)";
     $dbh->do($sql);
 }
 # Angryman::User::Profile is a hash in which each key is a reference 
 # to an array of profile choices. For example:
 # $Angryman::User::Profile{gender} = [ 'male', 'female' ];
 # $Angryman::User::Profile{age} = ['under 14', '14-19', '20-25', ... ];
 # Because we don't have the actual data for the people in uregister,
 # we randomly assign user profile data over a normal distribution.
 # when copying it to uregisternew.
 sub randomize_user_profile {
     for (keys %Angryman::User::Profile) {
         my @tmp = @{$Angryman::User::Profile{$_}};
         $profile{$_} = random_uniform_integer(1, 0, $#tmp);
     }
     $profile{dob} = '1969-05-11';
 }
 # Hmm, I can't just give DBI my data and have it figure out the order
 # of the database fields... So here we go getting the field order 
 # dynamically so this code doesn't break with the least little
 # switch of field position.
 # In DBIx::Recordset, I would just say $handle->Names()
 sub determine_target_database_field_order {
     my $order_sth = $dbh->prepare("SELECT * FROM $table{out} LIMIT 1");
     $order_sth->execute;
     @ordered_fields = @{$order_sth->{NAME}};
     $sql_field_term = join ',',  @{$order_sth->{NAME}};
 }
 # As ubiquitous as hashes are in Perl, the DBI API does not
 # offer a way to directly commit hashes to databases.
 sub fiddle_with_my_data_to_get_it_to_work_with_the_DBI_API {
     my ($record)=@_;
     my @output_data;
     for (@ordered_fields) {
         push @output_data, $dbh->quote
           (
             defined($record->{$_}) 
             ? $record->{$_} 
             : $profile{$_}
           );
     }
     $insert_term = join ',', @output_data;
 }
Martin Krzywinski | contact | Canada's Michael Smith Genome Sciences CentreBC Cancer Research CenterBC CancerPHSA
Google whack “vicissitudinal corporealization”
{ 10.9.234.152 }