Paul Lea Lujanac
Lujanac Database Consulting
Email: lujanac (at) acm (dot) org
December 17, 1996

YEAR 2000 CONSIDERATIONS IN CLIPPER APPLICATIONS
-- or --
``Murphy is Alive and Well (and Living in Our Programs)''

Clipper is descended from dBase--both languages have date-type variables and fields, and comprehensive date computation capabilities. All dates from January 1, 100 through December 31, 2999 are handled. In theory it is possible to construct a date-intensive Clipper application without having date problems until the year 3000, provided that your applications don't use the lupdate() function. In practice, this almost never happens.

This paper explores some of the reasons for this paradoxical situation, and what we can do over the next few years to avoid total disaster. The reasons can be broadly categorized as external requirements, deficient analysis, and too-clever programming. Most of the examples which follow fall into at least one of these three categories.

1. Using the ctod() function in programs which have (or default to) Set Century Off.

The default century is the twentieth (``19'' in the high-order year digits). This means that a character-format date with year 00 will convert to 1900, not 2000, in date format. (Yes, I know there was never a ``year zero'' and the twentieth century actually ends at midnight on December 31, 2000. I'm just following the xBase century convention here.)

Solution: In applications compiled with version 5.x, insert Set Epoch to 1980 in the initialization code of each program. This will interpret years 00-79 as twenty-first-century dates. (1980 was chosen as roughly the beginning of the PC era--most PC applications don't deal with dates in the range 1900 through 1979. For those that do, a different year can be chosen.)

For applications compiled with Summer of 87 and earlier Clipper versions, the ideal solution would be to compile them in 5.x and use the Set Epoch approach above. If this isn't feasible, the Set Epoch command can be simulated as an S87 user-defined function. This requires that all occurrences of ctod in the source code be changed to the name of the new function, and that the function return the same result that a legitimate ctod call would. It also requires that any Gets to a date variable or field be adjusted similarly. This can be handled in a UDF called from your Valid clause, with the Get variable as a parameter.

(Why must an application be compiled in S87? Well, sometimes recompiling in 5.01 just won't work. dBase-style code converts OK, but if the S87 app uses a data-driven approach, you may be out of luck. In S87, for instance, we might define and size a Public array based on data from a control file, using macro expansion. The S87 construct

     Public &aRrayName[nSize]

just won't compile under 5.01. The only alternative to redesigning the whole application is to stick with S87.)

For programs compiled with earlier versions the job gets messier. If we can't recompile in 5.x or S87, each ctod occurrence will have to be examined individually and replaced with a call to a procedure which performs the same Set Epoch simulation. The calling sequence will also change, at least minimally. Date-type Get variables need to be handled individually.

(Why can't we at least recompile in S87? Here's an example--one user had a 512K 286 which required a humongous TSR for mainframe interface. Our attempt to recompile the Autumn of 86 programs in S87 created an oversized .EXE. Of course the old compiler had long since been trashed. Fortunately we were able to devise a work-around until the new Pentium came in!)
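To make the 5.x fix concrete, here is a minimal sketch, assuming the default American date format; the year() calls show the effect of the epoch on two-digit years:

     // Clipper 5.x: run once in each program's initialization code
     SET EPOCH TO 1980            // two-digit years 80-99 -> 19xx, 00-79 -> 20xx
     ? YEAR( CTOD("12/17/96") )   // 1996
     ? YEAR( CTOD("01/15/05") )   // 2005, not 1905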
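For the S87 simulation, the replacement function might look something like the following sketch. CtodEpoch is my name for it, not a library function; the sketch assumes the American date format and an application that otherwise runs with Set Century Off:

     FUNCTION CtodEpoch
     PARAMETERS cDate                && date string in "mm/dd/yy" form
     PRIVATE cCent, dResult
     * simulate Set Epoch to 1980: years 00-79 become 20xx, 80-99 become 19xx
     cCent = IIF( VAL(SUBSTR(cDate, 7, 2)) < 80, "20", "19" )
     SET CENTURY ON                  && let ctod() accept a four-digit year
     dResult = CTOD( SUBSTR(cDate, 1, 6) + cCent + SUBSTR(cDate, 7, 2) )
     SET CENTURY OFF                 && restore the application default
     RETURN dResult

Every ctod() call in the source then becomes CtodEpoch(), and a similar UDF called from the Valid clause can adjust date Gets after entry.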
2. Two-digit year in character format being used as a key.

This is not always the result of faulty design on the PC side--it is often due to the requirement to interface to a mainframe. Business data processing on mainframe computers began in 1951. Both memory size and external storage size were critical factors, and few people worried about what might happen in 49 years. Some systems were designed with a single year digit; a date in format YDDD could be converted to binary and carried as an unsigned value in under two bytes. While the mainframe situation has eased since 1951, it is still the practice to conserve both memory and file space to the extent practical; hence, two-digit years are still common.

Solution: If the PC application doesn't interface to a mainframe, does not make decisions based on date ranges, and doesn't use the two-digit year in a key, the problem could be ignored. This isn't a very good solution, because printed and on-screen reports and queries may come out in the wrong order. To keep our customers happy we should fix the problem by using a four-digit year. This may require a fairly large number of rather straightforward program changes. For applications which interface to a mainframe, coordination is required. Ideally the mainframe changes will be designed early enough to allow the PC application sufficient time for exhaustive testing.

Here's a real-life example. A Clipper 5.01 eleven-program application (which was originally written in S87) was designed with a key consisting of two-digit year, one-digit quarter, and several other fields. All fields were character. The final quarterly output was a text file destined for a mainframe. For reasons not related to the mainframe, five years' worth of data was to be kept online on the PC. One of the Clipper programs needed to select records within a given range. Database size started out at around 50,000 records, and was expected to grow to around a million to accommodate the five years' worth. Rather than depend on ``Set Filter'', the program set up a beginning and ending key, used Softseek to get to the beginning, then processed while the ending key was not exceeded. If the beginning key starts out ``993'' and the end key is ``002...'', how many records do you think will be processed? Does this involve faulty analysis, clever programming, or just external requirements?

Solution: the mainframe folks have agreed to carry a four-digit year. Seven of the eleven Clipper programs will require change, some of them significant change.
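Here is a minimal sketch of that range scan after the fix, assuming an index built on a four-digit year plus quarter; the alias and field names are hypothetical:

     // assumed controlling index: INDEX ON cYear4 + cQtr TO qtrkey
     // assumes SHIP.DBF is already open with that index in control
     LOCAL cFrom := "1999" + "3"        // start: 3rd quarter 1999
     LOCAL cUpto := "2000" + "2"        // end:   2nd quarter 2000
     SELECT ship
     SET SOFTSEEK ON
     SEEK cFrom                         // lands on first key >= cFrom
     DO WHILE !Eof() .AND. (ship->cYear4 + ship->cQtr) <= cUpto
        // ... select the record for the quarterly extract ...
        SKIP
     ENDDO
     // with the old two-digit key the bounds were "993" and "002"; since
     // "002" collates before "993", the While condition fails on the very
     // first record and nothing at all is processed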
3. Two-digit year in character format being used as a portion of a single-field key.

This one _is_ a design flaw! An example is a project number designed as a single field, where the first two digits are actually the last two digits of the year. The tough part is ferreting out such anomalies. We must have good system-level documentation (ha!), question designers and users intensively, and/or read the source code line by line.

Solution: We really should clean this up by redesigning the key so that year and number are separate fields. However, cleaning up flawed systems is not normally part of Year 2000 conversion. We can patch the system either to use a four-digit year or to do a work-around, whichever is easier for the particular system. An example of a work-around might be allowing alphas in the first character of the year--A=0, B=1, etc. Thus Year 2000 project numbers will start ``A0'', rather than ``00''; ``B0'' is 2010, etc. (A sketch of one such encoding appears at the end of this paper.)

4. Programs using lupdate().

The designers of JPLDIS/dBASE reserved a few characters in the database file header, presumably for future expansion. With our 20/20 hindsight, we can criticize them for allowing only one character for the year portion of the date last updated. This _always_ wraps to zero. It is not affected by Set Epoch, Set Century, or anything else. Even if we edit the header to put 100 (64h) in this byte, the date of last update always has a year of ``19nn''.

If your applications use lupdate() only for display, make sure it's displayed as a two-digit year--no one will know the difference. If the date of last update is important, you'll need to program a work-around. The Clipper directory() function works fine, as does adir().
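For instance, here is a minimal 5.x sketch of such a work-around, reading the file date from the DOS directory instead of the .DBF header; FileLastUpdate is my name, not a library function:

     #include "directry.ch"

     FUNCTION FileLastUpdate( cDbfName )
        // the DOS directory entry carries a full date, so the one-byte
        // header year never comes into play
        LOCAL aFiles := Directory( cDbfName )
        IF Len( aFiles ) == 0
           RETURN CToD("")          // empty date: file not found
        ENDIF
        RETURN aFiles[1][F_DATE]

     // usage:  ? FileLastUpdate("ORDERS.DBF")   instead of  ? LUPDATE()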
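Finally, returning to the project-number work-around promised in item 3: the decade-letter scheme might be implemented with a pair of UDFs along these lines. The names are mine; note that ``A'' collates after ``9'' in ASCII, so existing indexes stay in chronological order:

     FUNCTION YearToKey( nYear )
        // 1996 -> "96"; 2000 -> "A0"; 2010 -> "B0"; 2015 -> "B5"
        IF nYear < 2000
           RETURN Right( Str(nYear, 4), 2 )
        ENDIF
        RETURN Chr( Asc("A") + Int((nYear - 2000) / 10) ) + ;
               Chr( Asc("0") + nYear % 10 )

     FUNCTION KeyToYear( cYY )
        // invert the encoding: a leading digit means 19xx
        IF IsDigit( SubStr(cYY, 1, 1) )
           RETURN 1900 + Val( cYY )
        ENDIF
        RETURN 2000 + ( Asc(SubStr(cYY, 1, 1)) - Asc("A") ) * 10 ;
                    + Val( SubStr(cYY, 2, 1) )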