06-10 Contents: Euro 2007, Spam pays, Visual Data Quality, Bad data bars, Free Excel books, C+-
ISSN 1649-2374 This issue online at http://www.sysmod.com/praxis/prax0610.htm
Systems Modelling Ltd.: Managing reality in Information Systems - strategies for success
IN THIS ISSUE
1) Risk & Security
Euro conversion looms again
Testing software and electronic calculators
Stock spam pays off if they're quick
2) Data Quality
Review of Tufte's 'The Visual Display of Quantitative Information'
European DM+IQ conference, London, Oct 30-Nov 2
Excel 2007 databars misrepresent zero
More rave reviews and more downloads for 'Spreadsheet Check and Control'
Free books on Excel
4) Off Topic
C More or Less
10 Web links in this newsletter
About this newsletter and Archives
Subscribe and Unsubscribe information
This month, I do a reader's digest of Edward Tufte's book on graphics, without using a single graphic. You'll have to buy the book to get them!
It's nearly that time again, for the recently joined members of the EU. First up is Slovenia:
From 1 January 2007 the euro will replace Slovenia's currency, the tolar, at the fixed and irrevocable conversion rate of 239.640 tolars for one euro. The European Commission reported that the practical preparations for the introduction of the euro in Slovenia (population: 2 million) were at an advanced stage, but the final preparations must be speeded up to ensure a smooth changeover and to address citizens' fears about price increases. Slovenia became a member of the EU in May 2004 together with nine other countries. Most of the other countries wish to join the euro area between 2008 and 2010.
Back in 2001, we helped a public agency avoid the potentially expensive error of selecting a low-cost but inaccurate euro calculator for public distribution. My attention was drawn recently to an official web site where I found an incorrect conversion rate - a simple typo, but they just had not checked for the crucial rounding boundaries in their test cases.
A software tester needs to be like a sculptor or diamond cutter looking for the flaw line where a small tap cracks the whole thing open. You can purchase from us very detailed test data sets based on the mathematics of the conversion between the euro and the national currency.
The Systems Modelling euro certification scheme was launched in August 2000. It follows the guidelines of ISO/IEC 12119:1994 - "Information Technology - Software packages - Quality requirements and testing". The test cases cover the effects of changing scales of value, detecting the use of inverse rates, rounding, truncation, accounting with master and detail records, and account and base conversion in the transition scenario. They also assess reliability, error protection, audit trails, user interface (usability) issues, documentation, and support. The product will be certified to have fulfilled specified conditions. This is not a certification of the producer's business, their quality system or their software production process. For further information, contact me.
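The rounding and inverse-rate checks listed above can be illustrated with a short Python sketch. It is not taken from the certification test data: the function names are invented for the example, and it assumes the usual conversion rule of dividing by the fixed rate and rounding the euro result half-up to the nearest cent. The rate is the 239.640 quoted earlier in this item.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

SIT_PER_EUR = Decimal("239.640")   # fixed tolar rate quoted in this item
CENT = Decimal("0.01")

def sit_to_eur(tolars, rounding=ROUND_HALF_UP):
    """Divide by the fixed rate, then round to the cent (assumed rule: half-up)."""
    return (Decimal(tolars) / SIT_PER_EUR).quantize(CENT, rounding=rounding)

def sit_to_eur_via_inverse(tolars):
    """Flawed shortcut: multiply by an inverse rate pre-rounded to 6 decimals."""
    inverse = (Decimal(1) / SIT_PER_EUR).quantize(Decimal("0.000001"))
    return (Decimal(tolars) * inverse).quantize(CENT, rounding=ROUND_HALF_UP)

# Amounts either side of a rounding boundary (1.005 EUR = 240.8382 SIT):
# correct rounding versus a calculator that truncates.
for amount in ("240.83", "240.84"):
    print(amount, sit_to_eur(amount), sit_to_eur(amount, ROUND_DOWN))

# A large amount exposes the error introduced by the inverse rate.
print(sit_to_eur("1000000"), sit_to_eur_via_inverse("1000000"))
```

At 240.84 tolars the half-up result is 1.01 euro while a truncating calculator gives 1.00. On one million tolars the pre-rounded inverse rate overstates the result by seven cents. Small amounts alone would not reveal either flaw.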
http://www.sysmod.com/eurofaq.htm Frequently Asked Questions about Euro conversion
http://www.sysmod.com/eurocalc Euro conversion calculator using the official daily ECB exchange rates
http://www.sysmod.com/eurowork.htm One day workshop on converting IT systems to the euro
http://www.euro.gov.uk/ UK Treasury business factsheets "the euro: it's your business" updated July 2006.
http://ssrn.com/abstract=920553 "Spam Works: Evidence from Stock Touts and Corresponding Market Activity" by Laura Frieder and Jonathan Zittrain Berkman Center Research Publication No 2006-11 (July 2006)
"We suggest that the profitability of spammed stock touting calls for adjustments to securities regulation models that rely principally on the proper labeling of information and disclosure of conflicts of interest in order to protect consumers. Based on a large sample of touted stocks listed on the Pink Sheets quotation system, we find that stocks experience a significantly positive return on days when they are heavily touted via spam, and on the day preceding such touting. Returns in the days following touting are significantly negative. Investors who respond to touting are losing, on average, 5.25% in the two day period following touting."
I attempt to filter such spam in Eudora by applying a filter that moves emails flagged by Mailscanner with SARE_GIF_STOX to Trash. I tried HTML_IMAGE_ONLY_0 but found that it caught too much real mail from people who don't know better.
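For readers who do not use Eudora, the same rule can be sketched in Python. This is an illustration only: the header name in the sample message is hypothetical (the header that carries the rule names depends on how the scanner is configured at your site), so the sketch simply searches every header value for the rule name and ignores the message body.

```python
from email import message_from_string

RULE = "SARE_GIF_STOX"   # rule name mentioned above

def is_stock_spam(raw_message, rule=RULE):
    """Return True if any header value of the message mentions the rule name."""
    msg = message_from_string(raw_message)
    return any(rule in str(value) for _name, value in msg.items())

# Hypothetical sample message; the X-Example-... header name is an assumption.
sample = (
    "From: tout@example.com\n"
    "Subject: Hot stock tip\n"
    "X-Example-MailScanner-SpamCheck: spam, SpamAssassin (SARE_GIF_STOX 1.66)\n"
    "\n"
    "Buy now!\n"
)

if is_stock_spam(sample):
    print("move to Trash")
```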
http://www.sysmod.com/az.php?a=1878109367&b=Frauds+Spies+Lies Frauds, Spies, and Lies: and How to Defeat Them by Fred Cohen, 2005
Amazon ranks Edward Tufte's 'The Visual Display of Quantitative Information' among the 100 best non-fiction books of the 20th century. It's a classic, which usually means that it is more frequently referenced than read. Because established publishers seemed "appalled at the prospect that an author might govern design", Tufte remortgaged his home to pay for self-publishing this volume. He hired a first-rate book designer to ensure the book layout exemplified his own principles, such as eliminating the usual separation of text and image.
Part 1, the first third of the book, deals with graphical practice, especially integrity.
Chapter 1, 'Graphical Excellence', illustrates how graphics reveal patterns hidden in tables of numbers, providing examples down the centuries of the evolution of statistical X-Y charts, cartographic data maps, and time series. His most famous example is the chart by Minard showing the terrible shrinking of Napoleon's army in Russia on dimensions of army size, date, temperature and location. He gives five principles including "Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency."
Chapter 2, 'Graphical Integrity', tackles deceptions such as distorted axes, areas, and non-comparable time periods. His funniest example is the Day Mines 1974 annual report that conceals the zero axis on the profit bar chart so as to make the bars appear tall. He defines a 'Lie Factor' as the ratio of the size effect in the graphic to the size effect in the data, and takes newspapers to task for such misuse. He gives six principles for integrity: proportionality, labelling, variation, standardized units, dimensions, and quoting data in context. Of all the chapters, this one should be required reading not just for creators of charts but also for their readers.
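The Lie Factor is easy to compute. The sketch below is my own illustration, not code from the book: it takes the "size effect" to be the relative change between two values, and the numbers are invented to mimic a bar chart whose axis starts at 90 instead of zero.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return abs(second - first) / abs(first)

def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)

# Invented example: the data rise from 100 to 110 (a 10% increase), but with
# the axis starting at 90 the bars are drawn 10 and 20 units tall (a 100% increase).
factor = lie_factor(10, 20, 100, 110)
print(round(factor, 2))
```

A factor of 1 means the graphic is proportionate to the data. Here the hidden zero exaggerates the change tenfold.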
Chapter 3, 'Sources of Graphical Integrity and Sophistication', attacks the idea that graphics are only for the unsophisticated reader. His analysis of textbooks, newspapers and magazines by their use of relational (scatter) diagrams compared to non-explanatory charts ranks the Frankfurter Allgemeine in a similar position to Pravda.
Part 2 deals with the theory of data graphics. Essentially, he recommends maximising the amount of ink devoted to data and eliminating chart junk.
Chapter 4, 'Data Ink', was put into practice at the Excel User Conference by Andy Pope, who deleted the automatic grey background of an Excel chart every time he created one. He also told me about the Stephen Few book that I'll review at a later date.
Chapter 5, 'Chartjunk', shows the awful cluttered effects of noise, shading, hatching, and even the grid. He lampoons the use of unnecessary graphical items as 'ducks', after a duck-shaped building that places form before function. (Although I think he may be losing his sense of humour there.)
Chapter 6, 'Data-Ink Maximization', begins with a wonderful quote about painting from Ad Reinhardt's statement in an exhibition catalogue: "Clarity .. no noise ... no humbugging .. no mixing things up". Tufte strips familiar presentations such as the box-plot, bar-chart, and scatterplot down to their essentials. His dot-dash-plot is a little minimalist even for my austere tastes.
Chapter 7, 'Multifunctioning Graphical Elements', describes some unusual ways to encode data into pictorial and verbal presentations, such as data-based coordinate lines. Shading and colours should of course be used to convey meaning and not create puzzles for the reader.
Chapter 8, 'Data Density and Small Multiples', defines data density as the ratio of the number of data points to the area of the graphic. He says "The average published graphic is rather thin ... very few statistical graphics achieve the information display rates found in maps ... Graphics can be shrunk way down." True, but the results tend to give me a headache. The smallest graphics - not dealt with in this book - are 'sparklines', which I described last month.
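The data density measure is simple arithmetic, shown here as a small Python sketch of my own with invented figures. It shows why shrinking a graphic raises its density: halving both dimensions quarters the area.

```python
def data_density(num_points, width, height):
    """Data points per unit area of the graphic (units follow width and height)."""
    return num_points / (width * height)

# Invented example: 200 data points on a 20 cm x 10 cm chart,
# then the same chart shrunk to half its width and height.
full_size = data_density(200, 20.0, 10.0)
half_size = data_density(200, 10.0, 5.0)
print(full_size, half_size)
```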
Chapter 9, 'Aesthetics and Techniques', discusses making complexity accessible, combining words, numbers, and pictures, with typography, colour, proportion, and scale. A table compares the attributes of friendly and unfriendly graphics.
The Visual Display of Quantitative Information: Edward R. Tufte. 2nd Ed, 2001.
Show Me the Numbers: Designing Tables and Graphs to Enlighten by Stephen Few
The IRM Data Management and Information Quality Conference will be held from 30th October to 2nd November 2006 in the Victoria Park Plaza Hotel in London. I am presenting on 1st November on 'Minimizing risks in IQ spreadsheets'. You can get in for £100 less than the advertised price by just citing me as the reference.
http://www.sysmod.com/az.php?a=0471253839&b=Data+Information+Quality Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits, by Larry P. English
In the light of Edward Tufte's book, it is amusing to hear of the decision by Microsoft to deliberately distort Excel graphics on the grounds that it is what users want.
http://www.juiceanalytics.com/weblog/index.php?tag=excel Juice Analytics - On misrepresenting data
"[What's wrong with this chart?] The graduated shading makes it hard to see where the bar graphs end. Zero pounds of sprouts were consumed, but the bar shows a value. The brussel sprouts badness is based on Microsoft's implementation of databars in the upcoming version of Excel. To quote the Excel 2007 blog: 'The answer is that when we were doing usability testing of this area in Excel, we found that users preferred not to see blank data bars, so Excel's default was set to a 10% minimum width.' [...] the rest of us can use the in-cell graphing to do everything databars can do and more."
Martin Green of FontStuff.com reviewed my book 'Spreadsheet Check and Control' on Amazon:
'I hate proofreading my work. I know what's supposed to be there so that's what I see. But if a typo slips through the net there's no real harm done and someone else usually spots it and lets me know. But auditing a spreadsheet is a different matter. A simple error could easily go unnoticed and its consequences might cascade through the workbook without ever being discovered. So this book is a Godsend! It explains, clearly and with illustrated examples, how to design and build reliable, error-free spreadsheets and how to use the tools that Excel provides for auditing and error-checking. Each section concludes with a self-test and there is a support website. I thought I knew my way around Excel pretty well, but reading this book I found myself saying "I didn't know you could do that!". If you build spreadsheets you should read this book.'
As this is the first anniversary of my book, I am making available some expanded material in response to requests for more detail:
1) A 303K 11 page PDF: Understanding the recalculation mode, Lookup and Transition Formula Evaluation, Pie charts with negative data, Using Excel Scenarios for test cases, Comparing worksheets.
2) An expanded chapter on Data Validation, 16 pages, 468K PDF.
3) Bonus material outside the scope of the ECDL syllabus. Mainly VBA examples, 16 pages, 320K PDF.
To download, please have the book to hand in order to enter a password from a page and then visit:
http://sysmod.buy.ie/catalog/product_info.php?products_id=188 Our offer - free shipping to EU in August 2006.
http://www.sysmod.com/az.php?a=190540400X&b=Spreadsheet+Check+Control Available worldwide from Amazon.
http://www.sysmod.com/scanxls.htm SCANXLS is my Excel utility to scan directories and create an inventory of spreadsheets. It also builds a cross-reference of their dependencies, and helps assess their quality. Many programs will show the links IN (ie TO) a spreadsheet; SCANXLS is one of the very few tools in the marketplace that inspect entire directories and construct a list of XLS files that are found to have links FROM other files.
http://www.vgupta.com/ makes available this amazing set of free books:
Statistical Analysis With Excel [1.6 MB]
Excel For Beginners [2.4 MB]
Charting In Excel [1.6 MB]
Excel-- Beyond The Basics [1.8 MB]
Managing & Tabulating Data in Excel [1.9 MB]
Financial Analysis Using Excel [1.7 MB]
Simply send your comments to FEEDBACK (at) SYSMOD (dot) COM
Thank you! Patrick O'Beirne, Editor
http://lambda-the-ultimate.org/node/774 New Programming Language C+-
"There's finally a replacement for the commonly used programming language, C++ -- yes, it's C+- (pronounced "C More or Less"). Unlike C++, C+- is a subject-oriented language. Each C+- class instance, known as a subject, holds hidden members, known as prejudices or undeclared preferences, which are impervious to outside messages, as well as public members known as boasts or claims. C+- is a strongly typed language based on stereotyping and self-righteous logic. C+- supports information hiding and, among friend classes only, rumor sharing."
Copyright 2006 Systems Modelling Limited,
Reproduction allowed provided the newsletter is copied in its entirety and with this copyright notice.
We appreciate any feedback or suggestions for improvement. If you have received this newsletter from anybody else, we urge you to sign up for your personal copy by sending a blank email to EuroIS-subscribe (at) yahoogroups (dot) com - it's free!
For those who would like to do more than receive the monthly newsletter, the EuroIS list makes it easy for you to discuss issues raised, to share experiences with the rest of the group, and to contribute files to a common user community pool independent of the sysmod.com web site. I will be moderating posts to the EuroIS list, to screen out inappropriate material.
Patrick O'Beirne, Editor
ABOUT THIS NEWSLETTER
"Praxis" means model or example, from the Greek verb "to do". The name is chosen to reflect our focus on practical solutions to IS problems, avoiding hype. If you like acronyms, think of it as "Patrick's reports and analysis across Information Systems".
Please tell a friend about this newsletter.
We especially appreciate a link to www.sysmod.com from your web site!
To read previous issues of this newsletter please visit our web site at http://www.sysmod.com/praxis.htm
This newsletter is prepared in good faith and the information has been taken from observation and other sources believed to be reliable. Systems Modelling Ltd. (SML) does not represent expressly or by implication the accuracy, truthfulness or reliability of any information provided. It is a condition of use that users accept that SML has no liability for any errors, inaccuracies or omissions. The information is not intended to constitute legal or professional advice. You should consult a professional at Systems Modelling Ltd. directly for advice that is specifically tailored to your particular circumstances.
We guarantee not to sell, trade or give your e-mail address to anyone.
To subscribe to this Newsletter send an email to
EuroIS-subscribe (at) yahoogroups (dot) com
To unsubscribe from this Newsletter send an email to
EuroIS-unsubscribe (at) yahoogroups (dot) com
EuroIS is the distribution list server of the PraxIS newsletter. It also offers a moderated discussion list for readers and a free shared storage area for user-contributed files. The archives of this group are on YahooGroups website http://finance.groups.yahoo.com/group/EuroIS/