Postmortem on the Y2K project
("Irish Computer" magazine Feb. 2000, Cutter IT Journal July 2000)
It's good practice to wrap up a project with a postmortem: a review of what happened, what went well that could be a lesson for the future, and what went badly and should be avoided next time. The Y2K project was a particularly good source of lessons that can be applied to any other mass change project with an imposed deadline, such as the changeover to the euro currency in Europe.
What went well? For those who exercised prudence, it was IT's "finest hour". The effort paid off. But the lack of reported problems with Small and Medium Enterprises (SMEs) does not mean there were no problems. I received calls from several companies who had systems dating from the early 1990s that failed to produce output. Fortunately, these were DOS Clipper applications that just needed a one-line fix and were dealt with easily enough. It's easy to see that in cases where the source is missing, or the expertise to maintain it, or the capacity to schedule the work, companies would have problems. But nobody heard about these because nobody is rushing to the press to confess their problems. Only if customers or the public are affected are we likely to hear anything, and that will be a political message. (See the box on "Top 10 things we are likely to hear"). On New Year's Eve, top Pentagon officials withheld news of a major Y2K computer glitch that had cut access to a critical satellite intelligence system, telling reporters only after the big millennial celebrations in Washington and New York had finished. Later still, it emerged the systems were down for three days.
Poll on Y2K Benefits
Members of Peter de Jager's Year 2000 mail list were asked to name the main benefits brought by their Y2K project. (This was not a scientific survey so no numbers are available for how many respondents agreed with each of the following.)
But with success arises the question: was it right to spend money preventing problems rather than waiting to fix on failure, as those companies did? Was the threat of technology failure overstated? The answer, of course, would be obvious if the question were asked about anything else. If your car needed a 100,000 mile service and you spent hundreds of pounds on it, would you wonder whether the money was wasted if nothing went wrong after the service? If your car insurance cost hundreds of pounds and you don't have an accident, do you ask the insurance company for your money back? Most computer experts insist the problem was a widespread and major threat before work began. "There was puffery by vendors and some money was wasted, but these were real problems," said Leon Kappelman, an associate business professor at the University of North Texas in Denton, and co-chairman of the Society for Information Management's Year 2000 Working Group. He points out that "If it is true that total Y2K spending was $100 billion and that $10 billion of that spending was actually 'wasted,' whatever that really means, then Y2K was the best-managed IT project in history. The fact is that about 25% of IT projects totally fail to deliver anything." Michael Granatt, a director of the British government's millennium centre, said it was important to acknowledge that the lack of truly disastrous computer failures was not a fluke. "Things don't go right by accident," he said. "They go right through proper planning."
Cultural issues also arise in how these things are presented. In the litigious USA, SEC disclosure rules meant that companies had to be particularly careful about what they said they were doing, and "nothing" was not an acceptable answer. Where there are regulatory bodies, their proceedings are not public knowledge and their role in pushing for action and providing co-operation will only be appreciated by the insiders in each sector - banking, telecommunications, and energy. In other parts of the world, upgrade and maintenance were handled as business as usual. And because those parts of the world are not English-speaking, such work was not discussed on the Internet to the same extent. Internet mail lists were key to providing a platform for the dissemination both of wild speculation and, in the more sober forums, of invaluable support.
What went badly? It could be argued that the PR could have been better managed. But I don't think we need pay attention to the Monday morning referees, or at least no more than is needed for chit-chat over coffee rather than serious management discussion. And I have no problems with the media reporting in Ireland - the mainstream dailies and Sundays (in particular the Sunday Business Post) had sober, sensible coverage with plenty of level-headed advice. You will notice that much of the "end of civilisation" hype came from the uninvolved; the people who actually did the work know what they did and where the money went. Just as before the event, the most dire predictions came from those with a base in economics, community activism, environmentalism . . . where people are used to seeing problems stemming from inaction but do not have any insight into how software is actually managed.
Perhaps there are some lessons about forecasting events outside our technical realm. I was never one much for predictions, focusing mainly on cause-and-effect, and recommending safe practices and sensible risk management. I was asked to include in my 1999 book on "Managing the euro in information systems" a chapter on the much-discussed Y2K-vs-euro debate. It is a source of some satisfaction that my effort estimates, which were much lower than the alarmists predicted, have been borne out. Nonetheless, I remember telling the organisers of a national management conference that it was certain that if Y2K were not on the agenda in April 1999, it would be in April 2000. It wasn't and it wasn't. Ed Yourdon acquired a reputation as an apocalyptic Y2K prophet, based not only on his "Time Bomb 2000" book but also on his famous February 1998 statement that "Rural China will probably be okay; but in my humble opinion, New York, Chicago, Atlanta and a dozen other cities are going to resemble Beirut in January 2000."
The reports of failures (whether publicly or privately seen) are mainly from those who did nothing and therefore suffered the consequences. You had a choice: do nothing and fix on failure, or check first and then fix what had to be fixed. Common sense and ordinary prudence demand that you check to see if you have a problem first, and then take avoiding action if necessary. One of the companies that asked for help in January stressed that their need was urgent: they needed the fix the next day to calculate their wages! Would any reader seriously regard that as an appropriate way to manage their software assets? Ironically, although they got their fix inside an hour, they were then hit by an industrial relations dispute (totally unconnected to Y2K) that meant that they did not need the wages calculation the next week, so there are worse things than software failure.
There is still an argument over those without internal expertise who relied on their IT supplier, bought new systems, and now wonder whether they really needed to. If they find that they don't want to give back the new, fast computers and feature-rich software, then there was a good business case. But others are going to feel angry and unnecessarily pressured, and some disputes are going to arise. My articles which criticised the "silver bullet" solutions and the poor value clock-fixing solutions aroused some antagonism from the vendors at the time, but those who paid attention saved money. The big lawsuits in the USA (Xerox, Nike, GTE, Unisys, Kmart) are not over excess Y2K remediation costs; they are about recovering those costs from the insurers - see the box on insurance and cost recovery.
Insurance and cost recovery
Steve Davis, a risk management consultant, has an article at http://www.davislogic.com/SueAndLabor.htm . "Sue and labor" clauses allow the insured to recover costs expended to prevent a calamity from occurring, on the theory that acts of prevention are less expensive for the insurance companies in the long run than paying out damage claims.
In July 1999, Xerox Corporation filed suit against one insurance company - American Guarantee and Liability - to reclaim the entire $180 million they spent to fix their Y2K problems.
On Jan 14, 2000 the PRNewswire via COMTEX reported:
"Kmart Corp. (NYSE: KM) and ITT Industries (NYSE: IIN) each filed declaratory judgment actions on Dec. 30 contending that property policies dating to 1996 should cover millions of dollars spent remediating potential Year 2000 computer problems in internal systems. Kmart has estimated its costs at $80 million. ITT Industries, which said it spent $19.5 million, filed the same day in state courts in Indiana and New York. Similar coverage actions have been filed within the past eight months by Xerox ($180M), Nike ($100M), GTE, Unisys, the Port of Seattle and the School District of the City of Royal Oak (Mich.). "
Other problems (such as the Royal Doulton and Hershey reports, see the box) are those caused by the remediation project itself running late and new systems being rushed in untested and with service levels cut to a minimum just to keep things going. Such organisations now have lessons to learn from the prudent ones who started in good time and estimated, planned, and executed their projects properly.
New project "teething troubles"
Hershey Foods Corp. suffered millions of dollars in lost sales because of problems with a $110 million project run by IBM to install SAP, Siebel & Manugistics simultaneously. We may smile at the idea of a shortage of Halloween candy bars being a serious problem, but it was to employees and shareholders of Hershey. The stock dropped from $75 to $50 and competitors snapped up market share because of the missed shipments.
The London Times in an article on 11 November 1999 entitled "Doulton bitten by millennium bug" reported that Royal Doulton, one of Britain's oldest and most famous makers of china, lost orders worth between £10 million and £12 million ($19M) because frustrated customers took their business elsewhere. Although its new £5 million ($8M) computer system recognised dates ending in 00, it did not recognise a set of china. So when orders came in for their best-selling package the computer would say 'none in stock'. It should have picked five pieces and made the sets. When the confession was made, its shares crashed 25 per cent, prompting analysts to forecast more than double the £12 million loss expected earlier.
The international view
Peter de Jager published an article on his web site wondering why he got it wrong about Italy. In my opinion, it's simply the language and cultural gap. Big companies there were just as good as big companies elsewhere; they just don't brag about it on the Internet. The Italian national Y2K programme chairman had commented about how small enterprises in Italy are either not technology dependent or recently computerised, and so would manage in their usual ad hoc manner. I was one of the many experts who contributed to the International Y2K Co-Operation Center based in Washington. The American members were initially very concerned about readiness in the rest of the world, but eventually realised that developing countries are just not so dependent on computers. The director, Bruce McConnell, said on 2nd January: "The world is both more resilient and more connected than we knew. Working together, nations are capable of managing a tough global challenge. The world's information systems have had a complete work-over and they are now passing the physical."
Ed Yourdon made another forecast at a software quality conference in Dublin in November 1999 that is, in my opinion, more likely because it is a continuation of existing practices. He feared that if Y2K were a non-event, people would feel justified in "good enough" software practices that in fact meant slipshod work done at "internet speed" to meet unrealistic demands and deadlines. It appears that software quality advocates are struggling, and have to prefix their course titles with "e-" to attract some interest.
Nicholas Zvegintzov of Software Management Network wrote in "American Programmer" (the antecedent to the Cutter IT Journal) in February 1996, describing "Y2K as Racket and Ruse". (http://www.softwaremanagement.com/References/year_2000.html)
He cited it as an instance of a "bounded storage" problem that Capers Jones also wrote about in his 1998 article "Dangerous Dates for Software Applications". (http://www.comlinks.com/mag/ddates.htm) Zvegintzov wrote, "Conscientious software managers and professionals, whose responsibility it is to understand software and make it work, see the Year 2000 racket as working for them. For once, here is a real world software problem that lay people, particularly higher management, can understand [...] If the Year 2000 racket can attract resources that they need -- tools and training and preventive maintenance -- for more important and difficult software problems, they will take the resources where they can as long as the window remains open."
So now we have consultants saying that it's e-business or out-of-business. (Don't mention boo.com) Maybe we'll get another chance.
The top ten most common things said after Y2K
"We were right all along"
"I thought they had fixed that"
"Who would have thought that could have gone wrong?"
"You'd have thought a big company like that would have known better"
"Never heard of that one before"
"It wasn't that serious, just a glitch"
"Sorry, it's not covered under warranty"
"It'll be fixed real soon now"
"Who's going to pay for this?"
Y2K problems affect systems as dates are handled across 1999 to 2000 for calculation of past due balances, batch and end-of-month operations. Even unfixed programs may eventually come right as they proceed to deal with 00 dates only, with no reference back to 99. In theory, there are faint ripples possible later. The year 2000 has 366 days and 53 (some use 54) week numbers, so some systems could get confused on 31-Dec-2000, like the Tiwai Point aluminium smelter shutdown caused by a leap year miscalculation on 31 Dec 1996. There may be problems in software that infers date formats from the position of the year: it expects the year to be the one number greater than 31, and so cannot tell which number is the year in 03/02/01. But I'm not expecting any great level of incidence there.
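For readers who like to see the mechanics, here is a minimal Python sketch of the three effects just described: two-digit year arithmetic across 1999 to 2000, the 366th day of a leap year, and the ambiguity of a date such as 03/02/01. It is an illustration only. The function names and the pivot value of 50 are my own assumptions and are not taken from any of the systems mentioned in this article; the way Python's %y directive expands two-digit years is that library's own convention.

```python
from datetime import date, datetime


def naive_years_overdue(due_yy, today_yy):
    """Subtract two-digit years directly, as an unremediated program might."""
    return today_yy - due_yy


def windowed_year(yy, pivot=50):
    """Expand a two-digit year using a fixed window (pivot of 50 is an assumption)."""
    return 1900 + yy if yy >= pivot else 2000 + yy


def is_leap(year):
    """Gregorian rule: century years are leap years only if divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def possible_readings(text):
    """Return every calendar date that an nn/nn/nn string could validly mean."""
    formats = {
        "day/month/year": "%d/%m/%y",
        "month/day/year": "%m/%d/%y",
        "year/month/day": "%y/%m/%d",
    }
    readings = {}
    for label, fmt in formats.items():
        try:
            readings[label] = datetime.strptime(text, fmt).date()
        except ValueError:
            # This layout does not produce a valid date, so it is ruled out.
            pass
    return readings


if __name__ == "__main__":
    # A balance due in "99" examined in "00": naive subtraction goes negative.
    print("naive:", naive_years_overdue(99, 0))
    print("windowed:", windowed_year(0) - windowed_year(99))

    # 2000 is a leap year, so 31 December is day number 366.
    print("leap 2000:", is_leap(2000), "leap 1900:", is_leap(1900))
    print("day of year:", date(2000, 12, 31).timetuple().tm_yday)

    # No field exceeds 31, so position alone cannot identify the year.
    for label, value in possible_readings("03/02/01").items():
        print(label, value)
```

A string such as 31/12/99 leaves only one valid reading, which is why software that guesses the layout appeared to work until the year values became small. Once every field is 31 or less, several layouts give a valid date and the program has nothing left to guess from.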
Some people warn of slow hidden degradation in systems and data. However, no evidence has yet emerged to support this, and no convincing mechanism has been proposed. The comments that have appeared have generally been unscientific and poorly worded, indicating an inadequate grasp of the construction of software and systems.
Patrick O'Beirne is a software quality consultant, trainer, speaker, and author. His book+CD "Managing the Euro in Information Systems: Strategies for Successful Changeover" was published by Addison Wesley in August 1999, ISBN 0-201-60482-5. He may be contacted at Systems Modelling Ltd in Ireland; web site http://www.sysmod.com