Programmers sometimes make two common mistakes: They overuse the IBM i Reclaim Storage command and fail to understand how to efficiently use SQL as a programming language. These youdunits can cause unnecessary system downtime in the first case and limit DB2 performance in the second. Scott Forstie and Kent Milligan use this article to explain how to avoid both problems and thus better harness the power of the i.
by Scott Forstie
The Reclaim Storage (RCLSTG) command is an overused approach for managing the health and correctness of IBM DB2 for i system cross-reference. This article arms you with the knowledge to discern when a cross-reference reclaim operation saves the day versus when it causes an unnecessary interruption in system availability.
What Is the DB2 for i System Cross-Reference?
The system cross-reference embodies a set of physical and logical files that the i maintains as it performs database-related operations. The data stored within the files indicates the database objects that exist on the machine, the attributes and definition details for the database objects, and their relational details. The system cross-reference includes information for both SQL objects and native database objects. The files reside in the QSYS and QSYS2 libraries, and users cannot directly modify them.
You "reference" the data in the files when you execute database operations (i.e., commands, APIs, and SQL statements). The two system jobs nnnnnn/QSYS/QDBSRVXR and nnnnnn/QSYS/QDBSRVXR2 process cross-reference requests and generally take care of the system cross-reference files. Again, the user has no direct interaction with these system cross-reference jobs, but it helps to know why they're active jobs on your machine.
When you use Independent ASPs, you can find the cross-reference files in libraries QSYSxxxxx and QSYS2xxxxx, where xxxxx is the ASP number. You can find the cross-reference jobs in nnnnnn/QSYS/QDBxxxXR and nnnnnn/QSYS/QDBxxxXR2, where xxx is the ASP number.
Because operators cannot directly manipulate the system cross-reference objects or system cross-reference jobs, they use the RCLSTG command with the SELECT(*DBXREF) option to rebuild and validate the system cross-reference information.
Avoid the Hammer
If you're running V5R4 or later, you can use the Reclaim Database Cross-Reference (RCLDBXREF) command to assess the state of the system cross-reference and possibly fix any correctable problems without entering restricted state. RCLDBXREF has several key advantages over RCLSTG:
- You can run RCLDBXREF any time, while RCLSTG requires you to move the system into restricted state.
- RCLDBXREF has a *CHECK facility to tell you whether or not the system cross-reference is in good shape.
- RCLDBXREF has a LIB (library) parameter you can use with the *FIX option to selectively attempt to correct the error condition.
When to Use RCLSTG *DBXREF
You should use RCLSTG SELECT(*DBXREF) if any of these conditions exist:
- RCLDBXREF *CHECK indicates that RCLSTG SELECT(*DBXREF) is the required recovery operation, as previously explained.
- The system cross-reference files (catalogs) contain incorrect data.
- IBM Service or an IBM PTF cover letter recommends RCLSTG SELECT(*DBXREF) as a recovery or circumvention.
- A message in the user's job log indicates that RCLSTG SELECT(*DBXREF) is a recovery for the error.
Do not use RCLSTG SELECT(*ALL) or RCLSTG SELECT(*DBXREF) if CPF32A4 is returned from RCLDBXREF *CHECK. The CPF32A4 message arrives when an IBM failure condition exists within the system cross-references' jobs. Because the cross-reference processing is in an error state, moving into a restricted state and running a reclaim won't solve the problem and could make things worse.
You Do It
Use the techniques described in this article to avoid unnecessary system downtime.
Scott Forstie is a senior software engineer at IBM. He is the IBM i SQL development leader for DB2 in Rochester, Minnesota. Before concentrating on DB2, he worked on Unix enablement for AS/400 and S/390 systems.
by Kent Milligan
IBM i programmers moving to SQL often fail to efficiently use it as a programming language. The natural tendency when first trying SQL is to apply native record-level access techniques and process one record or file at a time. To really harness SQL performance power, programmers should focus instead on how to use a single SQL statement to execute their data processing. To demonstrate this struggle, I include examples that review application programs written by an i customer.
Multiple SQL Statements for Extract and Populate
In the first case, the application contained logic to extract the transaction data for a specified date and write the removed data into a work table. The code in Figure 3 first declares and opens a cursor to read through the source table to find the rows matching the search criteria. For each matching row you discover, you save the relevant data into local host variables (e.g., :var1) and then use those variables on an SQL Insert to write the selected data into the target table.
Single SQL Statement for Extract and Populate
Although these SQL statements perform the processing required, you can use a single SQL statement to implement the requirement much faster. The Insert statement with a nested Select in Figure 4 accomplishes the same task with a single SQL statement. This performs much better because it's a single SQL call to DB2, and you can copy the data from the source to the target table inside the DB2 engine instead of copying the data to and from program variables.
Single SQL Statement for Create, Extract, and Populate
If you need to generate the target table for storing the extracted data, you can use a single SQL statement to produce the table and populate it with data from the source table. In Figure 5, the Create Table statement includes a nested Select statement. DB2 uses the Select statement to not only identify the rows and columns that you need to extract from the source table, but also examine the attributes of the selected columns to build the matching column in the target table.
Multiple SQL Statements for Multiple Objects
As I mentioned earlier, accessing one file at a time is another common mistake made by longtime IBM i programmers using SQL. In Figure 6 the logic associates customer data with order data for a specified order date. You declare and open a cursor to find the customers in the orders table who placed an order on that date. For each one you find, the code executes a second Select statement within the loop to retrieve the name and address for the specified customer ID from the customers table.
Single SQL Statement Processing Multiple Objects
To complete the process much faster, enhance the cursor definition to join the orders and customers tables. Figure 7 hows the enhanced cursor definition. You retrieve the same name and address values from the customers table after joining the two tables using the customer ID column.
You Do It
If you review these common SQL programming mistakes, you can gain a more thorough understanding of how to get better DB2 performance by harnessing the set-processing and advanced capabilities of SQL.
Kent Milligan is a senior DB2 for i specialist on IBM's ISV Enablement team. He spent the first eight years of his IBM career as a member of the DB2 development group in Rochester, Minnesota. He speaks and writes regularly about relational database topics.