Topic(s):   Informix

July 21, 2008

The Informix High Performance Loader - Part 3
Posted by Andrew Ford @ 02:55 PM ET | Jul 21, 2008

Welcome to the third and final installment of the Informix HPL blog series (at least for now.) In parts 1 and 2 we abandoned the GUI and went old school command line with onpladm (that's the way it was and weeeeeeeee liked it!), we unloaded and loaded lots of data to and from a flat file on disk, we moved data from one table to another in parallel using named pipes and we even moved data between servers without using flat files. This toolset should be enough to cover most of what you would ever need to do with HPL. If not, it will at least give you a solid foundataion to start with. In my final attempt to increase HPL awareness I want to take a look at the more tedious, but sometimes necessary, HPL maps and formats objects.

An HPL format object defines the structure of the unload or load file and a map object associates fields in the unload/load file to columns in your table.

In parts 1 and 2 we had no need for these things. We were performing FASTUNLOADJOB and FASTLOADLOB jobs because the strucutre of our unload and load tables were the same and these job types do not use formats and maps.

If you need to move data between tables with different structures you'll need to create LOADJOB and UNLOADJOB jobs (don't worry, they'll still be fast even if they don't say FAST anymore) and the necessary format and maps to describe the flat file and map flat file fields to table columns.

Say we have two tables, unload table and load table with the following schemas:

    create table unload_table (
       field1 char(16),
       field2 char(12),
       field3 char(12),
       field4 integer,
       field5 integer,
       field6 date,
       field7 smallint,
       field8 smallint
    );
    
    create table load_table (
        field2 char(12),
        field6 date
    );
    

And we want to unload the field2 and field6 columns from unload table and load them into the field2 and field6 columns of load table that is on the same server.

To do this we still need the same type of device arrays that can unload and load from a named pipe and we need a query object to select field2 and field6 from our unload table. We need a format object to describe our unload and load data structure. We need 2 maps, the first map will map the unloaded columns to the fields in the unload data and the second map will map the fields in the load data to columns in our load table. Finally we will need two jobs to unload and load the data.

    > cat blog4.spec
    BEGIN OBJECT DEVICEARRAY blog4_unl_devarrary
            BEGIN SEQUENCE
                    TYPE                    PIPE
                    FILE
                    TAPEBLOCKSIZE           0
                    TAPEDEVICESIZE          0
                    PIPECOMMAND             "cat - > /home/informix/blog/blog.p"
            END SEQUENCE
    END OBJECT
    
    BEGIN OBJECT DEVICEARRAY blog4_load_devarrary
            BEGIN SEQUENCE
                    TYPE                    PIPE
                    FILE
                    TAPEBLOCKSIZE           0
                    TAPEDEVICESIZE          0
                    PIPECOMMAND             "cat /home/informix/blog/blog.p"
            END SEQUENCE
    END OBJECT
    
    BEGIN OBJECT PROJECT blog4
    END OBJECT
    
    BEGIN OBJECT QUERY blog4_query
            PROJECT                 blog4
            DATABASE                blog
            SELECTSTATEMENT         "select field2, field6 from unload_table"
    END OBJECT
    
    BEGIN OBJECT DELIMITEDFORMAT blog4_format
            PROJECT                 blog4
            CHARACTERSET            ASCII
            RECORDSTART
            RECORDEND               newline
            FIELDSTART
            FIELDEND
            FIELDSEPARATOR          |
    
            BEGIN SEQUENCE
                    FIELDNAME                field2
                    FIELDTYPE                Chars
            END SEQUENCE
    
            BEGIN SEQUENCE
                    FIELDNAME                field6
                    FIELDTYPE                Chars
            END SEQUENCE
    END OBJECT
    
    BEGIN OBJECT UNLOADMAP blog4_unloadmap
            PROJECT                 blog4
            FORMAT                  blog4_format
            DATABASE                blog
            QUERY                   blog4_query
    
            BEGIN SEQUENCE
                    COLUMNNAME              field2
                    FIELDNAME               field2
                    JUSTIFICATION
                    CASECONVERT
                    DEFAULTVALUE
                    TRANSFERBYTES           0
                    COLUMNOFFSET            0
                    FIELDOFFSET             0
                    FIELDMINIMUM
                    FIELDMAXIMUM
                    FILLCHARACTER
                    PICTURE
                    FUNCTION
                    STORAGECODING
                    BLOBCOLUMN
            END SEQUENCE
    
            BEGIN SEQUENCE
                    COLUMNNAME              field6
                    FIELDNAME               field6
                    JUSTIFICATION
                    CASECONVERT
                    DEFAULTVALUE
                    TRANSFERBYTES           0
                    COLUMNOFFSET            0
                    FIELDOFFSET             0
                    FIELDMINIMUM
                    FIELDMAXIMUM
                    FILLCHARACTER
                    PICTURE
                    FUNCTION
                    STORAGECODING
                    BLOBCOLUMN
            END SEQUENCE
    END OBJECT
    
    BEGIN OBJECT LOADMAP blog4_loadmap
            PROJECT                 blog4
            FORMAT                  blog4_format
            DATABASE                blog
            TABLE                   load_table
    
            BEGIN SEQUENCE
                    COLUMNNAME              field2
                    FIELDNAME               field2
                    JUSTIFICATION
                    CASECONVERT
                    DEFAULTVALUE
                    TRANSFERBYTES           0
                    COLUMNOFFSET            0
                    FIELDOFFSET             0
                    FIELDMINIMUM
                    FIELDMAXIMUM
                    FILLCHARACTER
                    PICTURE
                    FUNCTION
                    STORAGECODING
                    BLOBCOLUMN
            END SEQUENCE
    
            BEGIN SEQUENCE
                    COLUMNNAME              field6
                    FIELDNAME               field6
                    JUSTIFICATION
                    CASECONVERT
                    DEFAULTVALUE
                    TRANSFERBYTES           0
                    COLUMNOFFSET            0
                    FIELDOFFSET             0
                    FIELDMINIMUM
                    FIELDMAXIMUM
                    FILLCHARACTER
                    PICTURE
                    FUNCTION
                    STORAGECODING
                    BLOBCOLUMN
            END SEQUENCE
    END OBJECT
    
    BEGIN OBJECT UNLOADJOB blog4_unload
            PROJECT                 blog4
            DEVICE                  blog4_unl_devarrary
            MAP                     blog4_unloadmap
            FILTER
            SERVER                  cheetah04
            DATABASE                blog
            REJECTFILE              /tmp/blog4_unload.rej
            LOGFILE                 /tmp/blog4_unload.log
            ISOLATIONLEVEL          DR
            MAXERRORS               0
    END OBJECT
    
    BEGIN OBJECT LOADJOB blog4_load
            PROJECT                 blog4
            DEVICE                  blog4_load_devarrary
            MAP                     blog4_loadmap
            FILTER
            SERVER                  cheetah04
            DATABASE                blog
            FLTFILE                 /tmp/blog4_load.flt
            REJECTFILE              /tmp/blog4_load.rej
            LOGFILE                 /tmp/blog4_load.log
            RUNMODE                 E
            GENERATEVIORECS         Y
            TAPES                   0
            NUMRECORDS              0
            STARTRECORD             0
            MAXERRORS               0
    END OBJECT
    
    > onpladm create object -F blog4.spec
    Successfully created object DEVICEARRAY   blog4_unl_devarrary
    Successfully created object DEVICEARRAY   blog4_load_devarrary
    Successfully created object PROJECT   blog4
    Successfully created object QUERY   blog4_query
    Successfully created object DELIMITEDFORMAT   blog4_format
    Successfully created object UNLOADMAP   blog4_unloadmap
    Successfully created object LOADMAP   blog4_loadmap
    Successfully created object UNLOADJOB   blog4_unload
    Successfully created object LOADJOB   blog4_load
    

The blog4_query QUERY object defines what data we are going to unload via a simple select statement.

The blog4_format DELIMITEDFORMAT object defines how our unload/load file will be structured, a standard pipe delimited ASCII file.

The blog4_unloadmap UNLOADMAP object links the field2 and field6 columns we are pulling from blog4_query to the field positions in the blog4_format unload file.

The blog4_loadmap LOADMAP object links the delimited fields in the blog4_format load file to the field2 and field6 columns of our load table.

The blog4_unload UNLOADJOB object defines a job that will unload our data using the blog4_unloadmap object (which is made of up from our query and format objects) and blog4_unldevarray.

And finally, the blog4_load LOADJOB object defines a load job that will load our data using the blog4_loadmap to the blog4_loaddevarray.

Run the unload and load jobs in parallel with the onpload command like we did in part 2 and HPL will pull field2 and field6 from the unload table and load them into field2 and field6 of the load table.

I hope this information has helped shed some light on HPL and it's mysterious voodoo like powers that enable it to quickly load and unload sick amounts of data. The fun doesn't have to end here, everything we've done so far has been with the default HPL configuration. If you want to make things run even faster you can take a look at the IDS docs and fine tune the HPL config paramters stored in the $INFORMIXDIR/etc/pconfig.std and try adding multiple devices to your device array.

Trackback Pings

TrackBack URL for this entry:
http://www.ibmdatabasemag.com/blog/main/archives/2008/07/the_informix_hi_2.html

« DB2 LUW Performance: Statement Analysis Introduction - Hunting Elephants and Mosquitoes | Main | DB2 9 DBA Certification: Create and insert statements, plus dbmtrk »





This is a public forum. CMP Media and its affiliates are not responsible for and do not control what is posted herein. CMP Media makes no warranties or guarantees concerning any advice dispensed by its staff members or readers.

Community standards in this comment area do not permit hate language, excessive profanity, or other patently offensive language. Please be aware that all information posted to this comment area becomes the property of CMP Media LLC and may be edited and republished in print or electronic format as outlined in CMP Media's Terms of Service.

Important Note: This comment area is NOT intended for commercial messages or solicitations of business.



CAREER CENTER
Ready to take that job and shove it?
SEARCH JOBS
RECENT JOB POSTINGS
CAREER NEWS
10 Search Engines You Don't Know About
Go beyond Google and get vertical. These specialized search sites will help you find the business information you need -- fast.

Subscribe to the new digital version of IBM Database Magazine
New Digital Version

Sponsored links:



Subscribe to the IBM Database Magazine Newsletter

Email Address *
First Name
Last Name
HTML Preference
HTML Text
 

Fields with * are required.

 




Visit these other IBM and TechWeb Partner Sites: :
Maximizing ROI Through Business Process Management (BPM) and Service-Oriented Architecture (SOA)
Internet Evolution – The Macrosite for News, Analysis, & Opinion About the Future of the Internet
Business Innovation – Technology Strategies and Solutions for Driving Business Success