In order to pay the rent while I was going to college, I had to do some programming work at a small local company. The company resold an accounting package that came with source. We, meaning I, also did a fair amount of work doing some vertical application add-on work to support individual client needs. Doing such work, I had ample opportunity to learn that when an accountant looks at a decimal point, it does not mean the same thing to him/her as it does to say a chemist, or perhaps many computer scientists. E.g. - measure the amount of liquid in a beaker or silo, approximate the velocity of a rude sprite flying across a virtual reality arena.
It I were hiring somebody to work on a business application that did any accounting,
one of the things I would be sure to ask is to show me some code to calculate
(and print, perhaps) the figures for an invoice.
Given a list (perhaps in a CSV data file?) such as
|
Now, I don't really care which language they express the solution in, what I'm after -- the turnip truck test, if you will -- is this: Do they know enough to quantize the result of each step at some fixed precision, or do they fall into simply (and incorrectly) using some kind of floating point arithmetic?
For the sample program below, I have stripped out most of the problem (such as actually reading and/or parsing input; multiplying out the extended cost), and have set up a situation that breaks simple addition.
Code | Notes | |
---|---|---|
|
Some set up: pull in the names for some utility classes; define the unit cost which was repeated in the data file; create a formatting tool to print the numbers. |
|
|
First we will see a way to screw it up, then a way to get correct answers, the same ones, every time! (but of course TMTOWTDI) |
|
|
OK, let's measure / estimate the amount of dollars we want
to scoop into the grand total bucket.
What's that, you say? I should use double instead of float?
Yes, that would hide the problem longer,
but it would not really solve it.
|
|
|
|
|
|
This time, we are going to count out the mils,
but we will remember where the decimal point goes
as we are using dollars for our units.
This is fixed precision arithmetic,
as opposed to floating point.
|
|
|
|
The root of this common problem in business applications has to do with how binary floating point numbers are composed. Binary numbers are of course composed of the sum of a series of positional values, 0 or 1 times a power of 2. However, the decimal numbers that people are used to seeing (in most cultures) are composed of the digits 0..9 times a power of 10. Of course, if the exponents are negative, then the position in question is part of a rational number (a fraction), rather than an integer.
So, how do you represent 1/10th -- (2 * 5)^-1 in terms of prime factors -- as a sum of negative powers of two? Not very well, it turns out. The table below tries to estimate 1/10th using an increasing number of binary digits.
Binary Digits (as fractions) | Decimal Equivalent | Error (vs target = 0.1) |
---|---|---|
0/2 | 0.0 | -0.1 |
0/2 + 0/4 | 0.0 | -0.1 |
0/2 + 0/4 + 1/8 | 0.125 | +0.025 |
0/2 + 0/4 + 1/8 + 0/16 | 0.125 | +0.025 |
0/2 + 0/4 + 0/8 + 1/16 + 1/32 | 0.09375 | -0.00625 |
0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 1/64 | 0.109375 | +0.009375 (oops! that's worse!) |
0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 | 0.09375 | -0.00625 |
0/2 + 0/4 + 0/8 + 1/16 + 1/32 + 0/64 + 1/128 | 0.1015625 | +0.0015625 |
You can come as close as you like by adding more digits, but you cannot get an exact match for a decimal fraction with binary. Ask a real mathematician: 2 and 5 are both prime numbers, and trying to create the sum of various powers of one out of the sum of various powers of another is just not going to work. Not exactly, anyway.
Notes:
Yes, about the only place you are likely to see mils is at the gas station. This probably made sense when gasoline was thirty cents a gallon, not so much when it is $3. You might also use mils on line items for measuring out cheap, bulk, commodities. A mil is of course 1/1000 of a dollar.
In particularly impoverished environments, in a pinch, you can fake it by storing values such as mils in the biggest integer type you can find, being oh so careful to do the scaling division and multiplication at the right time, but I can't say I recommend this (come on, at least slap a class, or other abstract data type mechansim, around it)
Contact me: |
r |
o |
b |
o |
p |
r |
o |
g |
@ |
yahoo.com |