[Chapter 4] 4.7 Tips for Creating and Using Variables

4.7 Tips for Creating and Using Variables

How can you create, use, and maintain your variables and constants most effectively? This section pulls together a number of tips.

4.7.1 Establish Clear Variable Naming Conventions

Database administrators have long insisted on, and usually enforced, strict naming conventions for tables, columns, and other database objects. The advantage of these conventions is clear: at a single glance anyone who knows the conventions (and even many who do not) will understand the type of data contained in that column or table.

Many programmers, however, tend to get annoyed by such naming conventions. They regard them as just another bureaucracy that prevents them from "getting the job (the program) done." Of course, when it comes to tables and columns, most developers have no choice; they are stuck with what the DBA gives them. But when it comes to their own programs, they are free to name variables, programs, and other identifiers whatever they want. What a rush! Freedom! Room to express oneself!

Or rope with which to hang oneself. While those conventions are a bother to anyone who must follow them, they are a tremendous time- and eye-saver to anyone who must read and understand your code -- even yourself.

If conventions for columns make sense in the database, they also make sense in PL/SQL programs -- and in the development tools, such as Oracle Forms. In general I recommend that you follow the same guidelines for naming variables that you follow for naming columns in your development environment. In many cases, your variables simply represent database values, having been fetched from a table into those variables via a cursor. Beyond these database-sourced variables, however, there are a number of types of variables that you will use again and again in your code.

Generally, I try to come up with a variable name suffix that will help identify the representative data. This convention not only limits the need for additional comments, it can also improve the quality of code, because the variable type often implies rules for its use. If you can identify the variable type by its name, then you are more likely to use the variable properly.

Here is a list of variable types and suggested naming conventions for each.

Module parameter

A parameter has one of three modes: IN, OUT, or IN OUT. Attach the parameter mode as a suffix or prefix to the parameter name so that you can easily identify how that parameter should be used in the program. Here are some examples:

PROCEDURE calc_sales
   (company_id_IN IN NUMBER,
    in_call_type IN VARCHAR2,
    company_nm_INOUT IN OUT VARCHAR2)

See Chapter 15 for a more complete discussion of parameter naming conventions.

Cursor

Append a suffix of _cur or _cursor to the cursor name, as in:

CURSOR company_cur  IS ... ;
CURSOR top_sellers_cursor IS ... ;

I generally stick with the shorter _cur suffix; the extra typing (to spell out "cursor") is not really needed to convey the meaning of the variable.

Record based on table or cursor

These records are defined from the structure of a table or cursor. Unless more variation is needed, the simplest naming convention for a record is the name of the table or cursor with a _rec or _record suffix. For example, if the cursor is company_cur, then a record based on that cursor would be called company_rec. If you have more than one record declared for a single cursor, preface the record name with a word that describes it, such as newest_company_rec and duplicate_company_rec.

FOR loop index

There are two kinds of FOR loops, numeric and cursor, each with a corresponding numeric or record loop index. In a numeric loop you should incorporate the word "index" or "counter" or some similar suffix into the name of the loop index, such as:

FOR year_index IN 1 .. 12

In a cursor loop, the name of the record which serves as a loop index should follow the convention described above for records:

FOR emp_rec IN emp_cur

Named constant

A named constant cannot be changed after it gets its default value at the time of declaration. A common convention for such constants is to prefix the name with a c:

c_last_date CONSTANT DATE := SYSDATE:

This way, a programmer is less likely to try to use the constant in an inappropriate manner.

PL/SQL table TYPE

In PL/SQL Version 2 you can create PL/SQL tables which are similar to one-dimensional arrays. In order to create a PL/SQL table, you must first execute a TYPE declaration to create a table datatype with the right structure. I suggest that you use the suffix _tabtype to indicate that this is not actually a PL/SQL table, but a type of table. Here are some examples:

TYPE emp_names_tabtype IS TABLE OF ...;
TYPE dates_tabtype IS TABLE OF ...;

PL/SQL table

A PL/SQL table is declared based on a table TYPE statement, as indicated above. In most situations, I use the same name as the table type for the table, but I leave off the type part of the suffix. The following examples correspond to the previous table types:

emp_names_tab emp_names_tabtype;
dates_tab dates_tabtype;

You could also use the full table suffix for a high degree of clarity:

emp_names_table emp_names_tabtype;
dates_table dates_tabtype;

If you want to minimize confusion between PL/SQL tables and database tables, it is possible to call these constructs "arrays." In this case, switch your prefix:

emp_names_array emp_names_tabtype;
dates_array dates_tabtype;

Programmer-defined subtype

In PL/SQL Version 2.1 you can define subtypes from base datatypes. I suggest that you use a subtype suffix to make the meaning of the variable clear. Here are some examples:

SUBTYPE primary_key_subtype IS BINARY_INTEGER;
SUBTYPE large_string_subtype IS VARCHAR2;

Programmer-defined TYPE for record

In PL/SQL Version 2 you can create records with a structure you specify (rather than from a table or cursor). To do this, you must declare a type of record which determines the structure (number and types of columns). Use a rectype prefix in the name of the TYPE declaration, as follows:

TYPE sales_rectype IS RECORD ... ;

Programmer-defined record instance

Once you have defined a record type, you can declare actual records with that structure. Now you can drop the type part of the rectype prefix; the naming convention for these programmer-defined records is the same as that for records based on tables and cursors:

sales_rec sales_rectype;

4.7.2 Name Subtypes to Self-Document Code

One of the most compelling reasons for creating your own subtypes is to provide application- or function-specific datatypes that self-document your code. A programmer-defined subtype hides the generic "computer datatype" and replaces it with a datatype that has meaning in your own environment. In naming your subtype, you should consequently avoid references to the underlying datatype and concentrate instead on the business use of the subtype.

Suppose you are building a hotel reservation system. A very useful subtype would be a room number; it is a special kind of NUMBER and is used throughout the application. A room number is always an integer and is always greater than zero. So you could create the subtype as follows:

SUBTYPE room_#_pos_integer IS POSITIVE;
...
-- A declaration using the subtype:
open_room room_#_pos_integer;

The name, room_#_pos_integer, however, provides too much information about the subtype -- this information is not needed by the developer. A much better name for the subtype follows:

SUBTYPE room_number_type IS POSITIVE;
...
-- A declaration using the subtype:
open_room room_number_type;

A word-processing screen might rely heavily on fixed-length, 80-character strings (full lines of text). Which of these subtype declarations would you prefer to use?

SUBTYPE line_fixlen_string IS CHAR;

SUBTYPE full_line_type IS CHAR;

Which of these next two subtypes makes more sense in the variable declarations?

DECLARE
   SUBTYPE use_details_flag_Boolean_subtype IS BOOLEAN;
   use_item_totals use_details_flag_Boolean_subtype;
   SUBTYPE use_details_type IS BOOLEAN;
   use_footers use_details_type;

In both examples, the second example is preferable. You can improve the readability of your subtypes if you include a standard prefix or suffix to all subtypes, explicitly defining the identifier as a type. As shown in the following example, you might use a concise _t or the full _type suffix; in either case, a glance at the code will reveal that identifier's nature:

DECLARE
   longitude coordinate_t;
   avail_room room_number_type;
   found_company t_entity_found;

If you set your standards before developers begin to define their subtypes, you will have a consistent naming convention throughout your application.

4.7.3 Avoid Recycling Variables

Do you hear yourself echoed in any of the following statements? I certainly do.

Each variable and constant I use in your program takes up some amount of memory. I want to optimize memory usage in my program.
It takes time to declare and document each variable and constant. I am under deadline pressures and need to work as efficiently as possible.
As I write my program and run into the need for another variable, I can simply use an already declared but currently unused variable.

The one thing that all of the above statements have in common, besides the fact that they are poor excuses for writing poorly designed code, is that I avoid declaring separate variables and constants to meet the different needs in my program. Although I strongly support the idea of "reuse and recycle," both in my neighborhood and with procedures and functions, this principle produces hard-to-read code when applied to variables and constants.

Each variable and constant I declare should have one purpose and one purpose only. The name for that variable or constant should describe, as clearly as possible, that single-minded purpose.

The only reason to create generically-named variables and constants is to save you, the developer, typing time. That is always a terrible reason for a coding style or change in a programming effort. Reliance on a "time-saver" short-cut should raise a red flag: you are probably doing (or avoiding) something now for which you will pay later.

4.7.4 Use Named Constants to Avoid Hardcoding Values

Every application has its own set of special or "magic" values. These values are often configuration parameters, such as the maximum amount of time allowed before an order must be shipped or the name of an application (to be displayed in report and screen headers). Sometimes these special values will be relevant only for a particular program and not for the application as a whole, such as the character used as a string delimiter.

Whatever the scope and form of these values, programmers seem to believe that they will never change. The data in the database will certainly change and you might even need to add a column or two, but nobody will ever need to modify the naming scheme for reports, right? The value used to identify a closed request is always going to be "C," right? In a seemingly unconscious effort to prove this inviolability, we enter these values directly into the code, wherever it is needed, however often it is needed.

And then we are burned when the assumptions change -- usually as the result of a change in business rules. Not only is our belief system undermined, our weekend is lost to minute changes to code. It's always better to be skeptical, with perhaps a tinge of cynicism. Assume that anything you do will have to be changed a week after you write it. Protect yourself in every way possible -- particularly with your constants.

Follow these simple guidelines regarding literals in all of your applications:

Remove all literals (within reason) from your code. Instead, declare constants which hold those literal values.
Allow the value of that literal (now a constant) to be set in only one place in your code, preferably with a call to a procedure. This procedure can be run on startup of the application, or from the initialization section of a package.
Provide a single way to retrieve the literal value, preferably through a call to a function. Don't let any program reference the literal directly. This way you reserve the right and ability to change at any time the data structures you use to store that literal -- without affecting any of the programs which rely on that constant.

You can, of course, go overboard in your hunt for hardcoded values. Suppose you encounter this very common assignment to increment a counter:

number_of_orders := number_of_orders+ 1;

Should you really create a named constant called single_increment, initialize it with a value of one, and then change the above line to the following:

number_of_orders := number_of_orders+ single_increment;

I don't think so. The use of the literal value 1 in this situation is perfectly appropriate to the task at hand. Almost any other literal value hardcoded into your program should be replaced with a named constant.

4.7.5 Convert Variables into Named Constants

Every programming language provides a set of features to meet a set of requirements and needs. You should always try to use the most appropriate feature for a given situation.

There is a critical difference between a variable and a named constant. A variable can be assigned values; its value can be changed in the code. The value of a named constant is a constant; its value may not be altered after it has been declared. If you find that you have written a program in which a local variable's value does not change, you should first determine if that behavior is correct. If all is as it should be, you should then convert that variable to a constant.

Why should you bother converting "read only" variables to constants (and named ones at that)? Because when you "tell it and use it like it is," the program explains itself more clearly. The declaration of a named identifier as a constant gives you information about how it should be used in the program.

If you do convert a variable to a constant, you should also change its name. This will help to remind anyone reading the code that your identifier refers to a constant and cannot be changed. See the earlier tip on naming conventions for ideas on how to name your constants.

Directly related to improved readability, the declaration of a constant can also help with maintenance and debugging. Suppose you define and use a variable under the assumption that its default value will not change. You do not, however, go to the trouble of declaring that variable as a constant. You are aware of this assumption and use the variable properly. What happens, however, when a month or a year later another programmer has to perform maintenance on the code? This person might change the value of that variable in the course of making changes, and then cause a ripple effect of errors. If the variable was declared initially as a constant, such a change could not be compiled.

4.7.6 Remove Unused Variables from Programs

I have always been amazed at how quickly and easily a program can turn into spaghetti code. Consider this scenario. A program starts out with weak specifications. As a result, users change requirements at the same time that the developer implements the functions to support those requirements. A program evolves (or does it devolve?) rapidly, with whole sections of code moved, removed, or revamped. After many rounds of approximating a solution, the program works to the users' satisfaction. You wipe the sweat from your brow and gratefully move on to the next screen or report, hoping never to have to touch that program again.

Consider another scenario. After you wipe the sweat from your brow, you take a deep breath and dive into clean-up mode. There will never be a better time to review all the steps you took and understand the reasons you took them than immediately upon completion of your program. If you wait, you will find it particularly difficult to remember those parts of the program which were needed at one point, but were rendered unnecessary in the end. You should not view these "dead zones" in your code as harmless backwaters, never traveled so never a bother. In fact, unexecuted portions of code become sources of deep insecurity for maintenance programmers.

You should go through your programs and remove any part of your code that is no longer used. This is a relatively straightforward process for variables and named constants. Simply execute searches for a variable's name in that variable's scope. If you find that the only place it appears is its declaration, delete the declaration and, by doing so, delete one more potential question mark from your code.

4.7.7 Use %TYPE When a Variable Represents a Column

Always, always use the %TYPE attribute to declare variables which are actually PL/SQL representations of database values. When you think about it, this includes a lot of your variables; using %TYPE sometimes takes lots more typing, but it improves your code substantially.

Suppose you have a procedure in Oracle Forms that formats information about a customer. You need to declare a variable for each attribute of the customer: first name, last name, address, Social Security number, etc. "Hardcoded" declarations would look like this:

PROCEDURE format_customer
IS
   first_name  VARCHAR2 (30);
   last_name   VARCHAR2 (30);
   address     VARCHAR2 (60);
   city        VARCHAR2 (60);
   state       VARCHAR2 (2);
   soc_sec#    VARCHAR2 (11);
BEGIN
   ... interact with database customer information ...
END;

There are a few problems associated with this declaration style. As mentioned previously, these declarations are going to work only as long as the structure of the table does not change and the resulting changed data does not violate the above size constraints. Another drawback is that there is no documented relationship between the variables and the table columns. For example, is the city variable the city in which the customer lives or the city in which the sale was made? I need additional documentation here to guide my understanding of the ways these variables will be used.

These problems are all resolved with the %TYPE attribute. The declaration section shown in the preceding example has a very different look when explicit declarations are replaced with the referential %TYPE declarations:

PROCEDURE format_customer
IS
   first_name  customer.cust_fname%TYPE;
   last_name   customer.cust_lname%TYPE;
   address     customer.cust_addr_l1%TYPE;
   city        customer.cust_city%TYPE;
   state       customer.cust_state_abbrev_cd%TYPE;
   soc_sec#    customer.cust_soc_sec_number%TYPE;
BEGIN
   ... interact with database customer information ...
END;

Using the %TYPE attribute ensures that my variables stay synchronized with my database structure. Just as importantly, though, this declaration section is more self-documenting now. The %TYPE attribute provides important information to anyone reviewing the code, stating: "These variables represent my columns in the program. When you see one of these variables, think `database column'." This correlation makes it easier to understand the code, easier to change the code, and easier to recognize when one of those variables is used in an inappropriate manner.

Notice that the variable name does not have to match the column name. The %TYPE attribute guarantees only that the datatype of the variable matches the datatype of the column. While a name matchup is not required, I generally try to name my %TYPE variables the same as my column names. The identical name strongly reinforces the fact that this variable "represents" my column in this program. While name synchronization can be a nuisance (database administrators often insist on somewhat obscure and rigid naming conventions, such as cust_state_abbrev_cd for the two-character abbreviation of a customer's state), you cannot escape a DBA's conventions. Because you must use that table, why not make your programs as consistent as possible with the underlying database?

4.7.8 Use %TYPE to Standardize Nondatabase Declarations

While it is true that many (perhaps even most) of your local PL/SQL variables are directly related to database columns, at least some of your variables are local-only, perhaps calculated values based on database columns. You can also use the %TYPE attribute to infer a variable's datatype from another, previously defined PL/SQL variable, as I'll explain in this section.

The following declarations use this alternative source:

DECLARE
   revenue_data NUMBER(20,2);
   total_revenue revenue_data%TYPE;
   --
   max_available_date DATE := LAST_DAY (ADD_MONTHS (SYSDATE, 3));
   last_ship_date max_available_date%TYPE;

The variable called revenue_data acts as the standard variable for revenue data. Whenever I declare my total_revenue variable (or any other revenue-related variables), I base it on the general revenue_data variable. By doing this, I can guarantee a consistent declaration of revenue variables. Furthermore, if I ever need to change my revenue datatypes again, I only have to change the way that revenue_data is declared and recompile. All variables declared with revenue_data%TYPE will automatically adjust.

Note that while max_available_date has a default value as well, it is not applied to last_ship_date. Everything up to the optional default value assignment (initiated with a DEFAULT keyword or assignment operator) in a declaration is used in the %TYPE declaration, such as NOT NULL and the datatype. The default value, if specified in the source variable declaration, is ignored.

To make it easiest for individual developers to be aware of and make use of standard variable declarations, consider creating a package that contains only standard variable declarations and any code necessary to initialize them, as follows:

PACKAGE std_vartypes
IS
   /* Source for all revenue-related declarations */
   revenue_data NUMBER(20,2);

   /* Source for any Y/N flags -- when you don't use Booleans */
   flag_data CHAR(1);

   /* Standard format for primary key columns */
   primary_key_data NUMBER (10);

END std_vartypes;

4.7.9 Use Variables to Hide Complex Logic

A variable is a chunk of memory that has a name. A variable can hold a simple value. It can also be assigned the value of the outcome of an arbitrarily complicated expression -- either through a default value setting or an assignment. In this way a variable can represent that complex expression and thus be used in place of that expression in your code. The result is a program which is easy to read and maintain.

Consider the following code fragment:

IF (shipdate < ADD_MONTHS (SYSDATE, +3) OR
    order_date >= ADD_MONTHS (SYSDATE, -2)) AND
    cust_priority_type = 'HIGH' AND
    order_status = 'O'
THEN
   ship_order ('EXPRESS');

ELSIF (order_date >= ADD_MONTHS (SYSDATE, -2) OR
       ADD_MONTHS (SYSDATE, 3) > shipdate) AND
      order_status = 'O'
THEN
   ship_order ('GROUND');
END IF;

If I skip past the complicated Boolean expressions and look at the code executed in each IF and ELSIF clause I can "reverse-engineer" my understanding of the code. It looks like the IF statement is used to determine the method by which an order should be shipped. Well, that's good to know. Unfortunately, it would be very difficult to discern this fact from the conditions in the IF statement. Those Boolean expressions with multiple components are, in and of themselves, almost impossible to interpret without drawing a diagram. If there is an error in this logic, no one but the original author would be able to readily untangle the knot.

When you find yourself writing this kind of code (or having to maintain it) and, in the process, stumble through the logic -- either not sure if you got it right or even wondering precisely what "right" is, it is time for a shift in your approach. Perhaps you are trying to understand the implementational details before you understand the business rules.

4.7.9.1 Build from high-level rules

You should, in general, avoid implementing complicated expressions and logic until you have mapped out and verified that logic independent of PL/SQL code. Are you working from specifications? Then for the above code, your specifications might say something like this:

Rule 1: If the order is overdue and the customer priority is high, ship the products ordered using express delivery.
Rule 2: If the order is not overdue or the order is overdue but the customer priority is normal, then use standard ground delivery.

Before you dive into the PL/SQL IF statement, write some pseudocode based on your specifications. Here is an example:

IF order-overdue
THEN
   IF customer-priority-is-high
   THEN
      ship-express;
   ELSE
      ship-ground;
   END IF;
ELSE
   ship-ground;
END IF;

The next step would be to rewrite this nested IF statement as follows:

IF order-overdue AND IF customer-priority-is-high
THEN
   ship-express;
ELSE
   ship-ground;
END IF;

Even before writing a line of code, I have been able to simplify the logic which meets this specification. At this point I don't know what it means for an order to be overdue. I don't know how to tell if a customer's priority is high. I don't really need to know these details yet. My focus is to make sure I understand the logical requirements. Once this is done, I can recast the pseudocode as real PL/SQL.

4.7.9.2 Convert pseudocode to PL/SQL

My first pass in a conversion to PL/SQL would be a more-or-less direct translation:

BEGIN
   IF order_overdue AND high_priority
   THEN
      ship_order ('EXPRESS');
   ELSE
      ship_order ('GROUND');
   END IF;
END;

I don't know how ship_order will behave and, again, at this moment I don't care. I will employ top-down design to "fill in the blanks" and then work out the details later.

My conditional expression:

order_overdue AND high_priority

substitutes named variables for the pseudocode. To figure out exactly how these variables should be assigned their values, I need to look back at my requirements. Here is what I find:

Definition 1: An order is overdue if the shipping date is within the next three months or the order status is open and the order was placed more than two months ago.
Definition 2: A customer has a high priority if its priority type is equal to HIGH.

This last sentence is less a requirement than an implementation instruction. Be that as it may, instead of creating a function for each of these conditions, I can declare Boolean named constants and assign a value to those constants based on the variables representing the order and customer, as shown below:

DECLARE
   /* I hide my business rule behind this variable. */
   order_overdue CONSTANT BOOLEAN
         DEFAULT (shipdate < ADD_MONTHS (SYSDATE, +3) OR
                  order_date >= ADD_MONTHS (SYSDATE, -2)) AND
                  order_status = 'O';

   high_priority CONSTANT BOOLEAN
         DEFAULT cust_priority_type = 'HIGH';

BEGIN
   IF order_overdue AND high_priority
   THEN
      ship_order ('EXPRESS');
   ELSE
      ship_order ('GROUND');
   END IF;
END;

In this final version of the IF statement, I've used the order_overdue constant to abstract out or hide the two-part check against the order and ship dates. Now the IF-THEN code is much easier to read; in fact, it all but explains itself through the names of the constants. This self-documenting capability reduces the need for separate comments in the code.

By consolidating my redundant code, I also make it easier to maintain my application. If the conditions which make an order overdue change, I do not need to hunt through my code for all the places which perform the order_overdue test. I need only change the default value given to the order_overdue constant.

I can also take this approach a step further and place my business rule logic into a Boolean function. I can then call this function from any of my programs and avoid reproducing the logic in that declaration statement.


4.6 Programmer-Defined Subtypes		5. Conditional and Sequential Control