Wednesday, January 10, 2007

Embedded SQL with Pro*C

In any application software, accessing a database system is almost inevitable. Some programming languages, such as Java and C++ with custom libraries, do a very good job of providing good programmer interfaces to access the database. What if you want to access a database using C or COBOL or FORTRAN? This is usually the case in firms that use legacy applications--commonly large banks. These firms phase out systems at a much slower pace than do companies in other industries, due to the systems being extremely critical: they perform electronic funds tracking. It's difficult to introduce new technologies while rolling out the older technology just to satisfy the thirst of technology evangelists. Such systems usually rely on old ways of accessing databases, including low-level database libraries such as OCI in case of Oracle, Sybase Open Client Connect Libraries, and Pro*C. It takes plenty of planning to move to newer generation systems; it takes years of parallel and phased development in order to phase out older implementations.

Database access mechanisms such as Pro*C will stay relevant for some time to come. Pro*C is a very programmer- and reviewer-friendly way of writing code to access a database. The pre-compilation stage turns Pro*C code into low-level structures and function calls. This paradigm provides quite a few advantages. First, it shields the programmer from the underlying API calls used to access the database. Second, the actual SQL statements and the host programming language code really mix very well to produce very readable and clear code. It's not perfect, but I appreciate it.

Embedded SQL is a superset of Sybase's T-SQL or Oracle's PL/SQL that lets you place SQL statements in application programs written in languages such as C and COBOL. Pro*C allows the C programmer to write database access code fast and with less of a learning curve. For people who are familiar with both C and SQL, this is a cakewalk. Its worth noting that there are differences between implementations of Pro*C across different database vendors due to the differences between database architectures, datatypes, etc. Each new release of a database may announce certain enhancements or changes to its Embedded SQL pre-compiler. It is best to track changes on this front by referring to the vendor database websites.

This article discusses the ways and means of writing Pro*C code to access and modify a database system. My test databases are mainly Sybase and Oracle, as these are in wide use in the same environments as Pro*C. Finally, it's worth noting that some people refer to Pro*C as Embedded SQL.

Ready? What does it take to connect to a Sybase database using conventional C API calls (such as using Sybase OpenClient Connect Libraries)?

Note: This code, dbCon.c, is the sample Sybase programming code available from Sybase.

/************************************************
* *
* Filename : dbCon.c *
* *
* This is function demonstrates the series of *
* steps it involves in opening a connection *
* to the Sybase database. *
* *
* NOTE: Not all variables used are declared; *
* Such as context, connection etc. as this *
* is just a demo snippet. *
* *
***********************************************/

void dbConnect ()
{
CS_INT rc;
CS_INT *outlen;
CS_INT buf_len;
CS_INT msglimit;
CS_INT netdriver;

/*-----------------------------------------*
* Allocate a connection to the server *
*-----------------------------------------*/
rc = ct_con_alloc (context, &connection);
if (rc != CS_SUCCEED)
{
strncpy (msgstr, "CT_CONALLOC failed", \
msg_size);
no_errors_sw = FALSE ;
error_out (rc);
}

/*-----------------------------------------*
* Alter properties of the *
* connection for user-id *
*-----------------------------------------*/
buf_len = user_size;
rc = ct_con_props (connection, (long)CS_SET,\
(long)CS_USERNAME, username, buf_len,\
outlen);
if (rc != CS_SUCCEED)
{
strncpy (msgstr, "CT_CON_PROPS for \
user-id failed",msg_size);
no_errors_sw = FALSE ;
error_out (rc);
}

/*-----------------------------------------*
* Alter properties of the *
* connection for password *
*-----------------------------------------*/
buf_len = pwd_size;
rc = ct_con_props (connection, (long)CS_SET,\
(long)CS_PASSWORD, pwd, buf_len, outlen);
if (rc != CS_SUCCEED)
{
strncpy (msgstr, "CT_CON_PROPS for \
password failed", msg_size);
no_errors_sw = FALSE ;
error_out (rc);
}

/*-----------------------------------------*
* Alter properties of the *
* connection for transaction *
*-----------------------------------------*/
buf_len = tran_size;
rc = ct_con_props (connection, (long)CS_SET,\
(long)CS_TRANSACTION_NAME, tran,\
buf_len, outlen);
if (rc != CS_SUCCEED)
{
strncpy (msgstr, "CT_CON_PROPS for \
transaction failed", msg_size);
no_errors_sw = FALSE ;
error_out (rc);
}

/*-----------------------------------------*
* Alter properties of the connection *
* for network driver *
*-----------------------------------------*/

/*-----------------------------------------*
* default value for non-recognized *
* driver name *
*-----------------------------------------*/
netdriver = 9999;

/*-----------------------------------------*
* if no netdriver entered, *
* default is LU62 *
*-----------------------------------------*/

if (strncmp(driver," ",9) == 0 ?? \
strncmp(driver,"LU62",4) == 0)
netdriver = CS_LU62;
else if (strncmp(driver,"INTERLINK",8) == 0)
netdriver = CS_INTERLINK;
else if (strncmp(driver,"IBMTCPIP",8) == 0)
netdriver = CS_TCPIP;
else if (strncmp(driver,"CPIC",4) == 0)
netdriver = CS_NCPIC;

rc = ct_con_props (connection, (long)CS_SET,\
(long)CS_NET_DRIVER, (long)netdriver,\
CS_UNUSED, outlen);
if (rc != CS_SUCCEED)
{
strncpy (msgstr, "CT_CON_PROPS for \
network driver failed",msg_size);
no_errors_sw = FALSE ;
error_out (rc);
}

/*-----------------------------------------*
* Setup retrieval of All Messages *
*-----------------------------------------*/
rc = ct_diag (connection, CS_INIT, \
CS_UNUSED, CS_UNUSED, CS_NULL);
if (rc != CS_SUCCEED)
{
strncpy (msgstr, "CT_DIAG CS_INIT \
failed", msg_size);
no_errors_sw = FALSE ;
error_out (rc);
}

/*-----------------------------------------*
* Set the upper limit of number *
* of messages *
*-----------------------------------------*/
msglimit = 5 ;
rc = ct_diag (connection, CS_MSGLIMIT, \
CS_ALLMSG_TYPE, CS_UNUSED, &msglimit);
if (rc != CS_SUCCEED)
{
strncpy (msgstr, "CT_DIAG CS_MSGLIMIT \
failed", msg_size);
no_errors_sw = FALSE ;
error_out (rc);
}

/*-----------------------------------------*
* Open connection to the server *
* or CICS region *
*-----------------------------------------*/
rc = ct_connect (connection, servname, \
server_size);
if (rc != CS_SUCCEED)
{
strncpy (msgstr, "CT_CONNECT failed",\
msg_size);
no_errors_sw = FALSE ;
error_out (rc);
}
}

This code invokes quite a few APIs and performs plenty of error checking. How much simpler Pro*C makes this! Note that Pro*C source files have the extensions .PC or .pc. The Pro*C pre-compiler will produce the actual .c or .cpp files.

The following example Pro*C program shows only the declarations and statements required for opening a connection to the database successfully. I'll explain it later. Just see how much more concise it is:


/*----------------------------------------------*
* This appears in the top of the file in the *
* global area. Not inside any function. *
*----------------------------------------------*/
EXEC SQL INCLUDE SQLCA;
EXEC SQL WHENEVER SQLERROR CONTINUE;
EXEC SQL WHENEVER SQLWARNING CONTINUE;
EXEC SQL WHENEVER NOT FOUND CONTINUE;

/*----------------------------------------------*
* This is a simple C or C++ function that *
* would make use of Pro*C code to connect *
* to the database. *
*----------------------------------------------*/


int database_connect()
{
EXEC SQL BEGIN DECLARE SECTION;
char *usr;
char *pswd;
char *srvr;
char *dbase;
char *cnct;
EXEC SQL END DECLARE SECTION;

EXEC SQL CONNECT :usr identified by \
:pswd AT :cnct using :srvr;
err_check (&sqlca, __FUNCTION__, \
__LINE__);

EXEC SQL AT :cnct USE :dbase;
err_check (&sqlca, __FUNCTION__, \
__LINE__);
}


/*---------------------------------------------*
* Here is the basic error handling *
* function snippet. *
*---------------------------------------------*/

int error_check (SQLCA * sqlca_ptr,
char *func_name, int line_num)
{
SQLCA *p_sqlca;

if (p_sqlca -> sqlcode < 0)
{
printf ("Error at function: %s at \
line %d\n", func_name, line_num);
printf ("\tError Code: %d\n", \
p_sqlca -> sqlerrm.sqlerrml);
printf ("\tError Description: %s\n", \
p_sqlca -> sqlerrm.sqlerrmc);
return p_sqlca -> sqlcode;
}
else if (p_sqlca -> sqlcode == 0)
{
printf ("%s operation successful\n", \
func_name);
return p_sqlca -> sqlcode;
}
return 0;
}

What a difference in the number of lines of code and the clarity of the code, as compared to using API calls directly!

General Concepts

The Pro*C pre-compiler often comes with database management software such as Sybase or Oracle. The Pro*C pre-compiler replaces each of the Pro*C calls with appropriate C/C++ calls. The C/C++ compiler then compiles and links the generated C/C++ program like any other regular program. The linker obviously needs to know the locations of the Sybase- or Oracle-specific libraries. Figure 1 shows the related phases.

flow of compiling a Pro*C program


Figure 1. Flow of compiling a Pro*C program

Every Pro*C statement follows a general syntax pattern: they begin with the keywords EXEC SQL and terminate with a semi-colon (;). The code that follows the keywords EXEC SQL mostly correspond one-to-one with the actual SQL code of the database, of course with the extra ability to pass host language (C/C++) variables to the database and retrieve return values. For example, the SQL statement:

COMMIT tran

translates into Pro*C code as:

EXEC SQL COMMIT tran;

Connecting Host Language and SQL using "Host Variables"

Host variables transfer data between the host language, such as C/C++, and the SQL database calls. To declare the host variables:

EXEC SQL BEGIN DECLARE SECTION;
// Here declare all the host variables.
EXEC SQL END DECLARE SECTION;

Refer to a host variable by prefixing its name with a colon (:). For example:

/*-----------------------------------------*
* For simplicity I have just allocated *
* memory statically. They can even be *
* pointers and allocations can be managed *
* later. *
*-----------------------------------------*/
EXEC SQL BEGIN DECLARE SECTION;
char username[128];
char password[128];
char db_name[128];
char connection_name[128];
EXEC SQL END DECLARE SECTION;

strcpy (username, "casper");
strcpy (password, "knock knock");
strcpy (db_name, "cust_db");
strcpy (connection_name, "Sai");
EXEC SQL CONNECT :username IDENTIFIED BY
:password AT :connection_name
USING :db_name;

Suppose you want to have multiple database connections in your program. Your code will look something like this:

char custdb[256] = "Cust_Db";
char invdb[256] = "Inventory_Db";

EXEC SQL CONNECT :username IDENTIFIED BY
:password AT :custdb USING :db_string1;
EXEC SQL CONNECT :username IDENTIFIED BY
:password AT :invdb USING :db_string2;

Then execute your SQL statements, like this:

EXEC SQL AT "Cust_Db"
/* or */
EXEC SQL AT :custdb
SELECT ...

There are some caveats to these variables, however.

* The datatypes of the host variables allowed are vendor-dependent. Refer to your vendor manual for more information. As far as I know, Pro*C allows most of the basic datatypes. Both the Oracle and Sybase manuals state that you can declare pointers in the EXEC SQL (BEGIN/END) DECLARE section, but you cannot use pointers or pointer expressions as host variables.
* Every host variable is actually a C variable, so they are case-sensitive, unlike database column names.
* Every host variable is actually a C variable, and hence must correspond to a valid address in your program.

The host variable datatype must be compatible with the database column type. For example, the C datatypes char and char[] matches the VARCHAR Oracle type. short/int/long/float/etc. match the NUMBER/NUMBER(P,S) type in the database.
Host Structures and Arrays

You can also declare host variable structures to represent a row in your database table. An array of host structures would then represent a set of rows in the table. Pro*C allows you to fetch or manipulate the array of host structs at once rather than fetching/manipulating data for individual host variables. This improves performance and gives a more logical way to represent the database data. For example:

struct part_cust_rec
{
char cust_name[256];
float discount_rate;
};

// a set of 50 key customers
struct part_cust_rec key_custs[50];

EXEC SQL AT "Cust_Db"
SELECT Customer_Name, Customer_Discount
FROM Customer
INTO :key_custs.custname, \
:key_custs.discount_rate
WHERE Customer_Discount > 25

Indicator Variables

Every host variable can have an indicator variable associated with it. Indicator variables are two-byte integer values that indicate the value of the corresponding host variable. Declare them as part of the host variable declaration in the DECLARE section.

Table one shows the indicator values and their meanings.
Indicator Variable Value Description
0 The operation was successful.
-1 A NULL was returned, inserted, or updated.
-2 Output to a character host variable from a "long" type was truncated, but the original column length cannot be determined.
>0 The result of a SELECT or FETCH into a character host variable was truncated. In this case, if the host variable is a multibyte character variable, the indicator value is the original column length in characters. If the host variable is not a multibyte character variable, then the indicator length is the original column length in bytes.

Indicator values are straightforward to use:

EXEC SQL BEGIN DECLARE SECTION;
int cust_id;
char cust_name[256];
char cust_address[512];
short indicator_addr;
EXEC SQL END DECLARE SECTION;

EXEC SQL AT "Cust_Db"
SELECT Customer_Name, Customer_Address
FROM Customer
INTO :cust_name, :cust_address:indicator_addr

/*-------------------------------------------*
* Alternate Declaration of the INTO clause *
* *
* INTO :cust_name,:cust_address *
* INDICATOR :indicator_addr *
* WHERE Customer_Id = :cust_id; *
*-------------------------------------------*/

if (indicator_addr == -1)
{
/* Address is NULL */
cout << "Customer's Contact \
Address is not registered...";
...
}

Communication Area and Error Handling

Apart from using the host variables to pass data back and forth between database and host language, SQL Communication Area (or SQLCA) can propagate status and runtime information between the database and the host. SQLCA sends runtime information of the database to the host language to take appropriate actions upon certain database events.

Use SQLCA with:

EXEC SQL INCLUDE SQLCA.H; // In Oracle

and

EXEC SQL INCLUDE SQLCA; // In Sybase

This declaration includes a structure called sqlca, which has a SQL status code, an error message, a warning message, etc., for the most recently executed SQL. You can read details directly out of the sqlca structure.

struct sqlca
{
/*-------------------------------------*
* sqlcaid : Holds the hardcoded *
* string "SQLCA" *
*-------------------------------------*/
char sqlcaid[8];

/*-------------------------------------*
* sqlabc : Holds the length of the *
* structure *
*-------------------------------------*/
long sqlabc;

/*-------------------------------------*
* sqlcode : Holds the status code *
* of the most recently executed *
* SQL statement. *
*-------------------------------------*/
long sqlcode;

/*-------------------------------------*
* sqlerrm is a structure to hold *
* error description. *
* *
* sqlerrmc : contains error *
* description of the status code *
* in sqlerrmc (upto 70 chars). *
*-------------------------------------*/
struct
{
unsigned short sqlerrml;
char sqlerrmc[70];
} sqlerrm;

/*-------------------------------------*
* sqlerrp : Un-used *
* *
* The array element sqlerrd[2] : *
* Holds number of Rows *
* processed by the most recent *
* SQL statement *
* *
* The array element sqlerrd[4] : *
* Holds offset of the most recent *
* parse error in SQL. *
* *
* The array elements *
* sqlerrd[0,1,3,5] : Are Un-used *
*-------------------------------------*/
char sqlerrp[8];
long sqlerrd[6];

/*-------------------------------------*
* sqlwarn : Holds warning information *
* sqlext : Un-used. *
*-------------------------------------*/
char sqlwarn[8];
char sqlext[8];
};

Exception Handling

Handle errors by examining the error codes and reasons from the sqlca structure, or use the WHENEVER clause:

Click this Mojo Ad

EXEC SQL WHENEVER

CONDITION can be:

* SQLERROR, indicating that an error occurred while executing the previous SQL command.
* SQLWARNING, indicating that a warning occurred a result of executing the previous SQL command.
* NOT FOUND, indicating that the database found no data as a result of the previous SQL command.

ACTION can be:

* STOP, which will exit() the program, rolling back all uncommitted transactions.
* CONTINUE, which will try to continue executing the program despite the error.
* The DO , where function, an error-handling function, which will call that function.
* GOTO