MathGroup Archive: July 2005 [00034]

converting exact numbers to binary fixed-point representation

To: mathgroup at smc.vnet.net
Subject: [mg58430] converting exact numbers to binary fixed-point representation
From: Torsten Coym <torsten.coym at eas.iis.fraunhofer.de>
Date: Sat, 2 Jul 2005 04:06:34 -0400 (EDT)
Organization: Fraunhofer Gesellschaft (http://www.fraunhofer.de/)
Sender: owner-wri-mathgroup at wolfram.com

Hi group,


What I want to achieve is to represent the exact value of an irrational 
number, say Sin[2*Pi*131/8191], as a binary fixed-point number with 16 
fractional bits plus one sign bit.

First, I thought of converting to a floating-point value and then 
converting that to fixed-point using:


Floor[N[Sin[2*Pi*(131/8191)]]*2^16]
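For illustration, the integer produced by the line above can be expanded into its 16 fractional bits. This is only a sketch: the names x, code and bits are made up here, and it assumes the value lies in [0, 1), since IntegerDigits ignores the sign and the sign bit would need separate handling.

```mathematica
(* Sketch only; x, code and bits are illustrative names, not from the post. *)
(* Assumes 0 <= x < 1, so the scaled value fits into 16 bits. *)
x = Sin[2*Pi*(131/8191)];

(* Scale by 2^16 (a left shift by 16 bits) and truncate, as in the post *)
code = Floor[N[x]*2^16];

(* The 16 fractional bits, most significant first, padded with leading zeros *)
bits = IntegerDigits[code, 2, 16]

(* The value the code word actually represents *)
N[code/2^16]
```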

Now I'm worried about the precision of this conversion. The code above 
truncates all fractional bits that remain after the left shift. Two 
intermediate results such as 1101,000...1 and 1100,111...1 (I changed 
to 4 bits for simplicity here) will end up as two different code words, 
1101 and 1100, respectively.
Though both values might be equally close to the exact value, the second 
would give the wrong result. So how can I ensure that *rounding* the 
exact value to a floating-point number never leads to such a case, 
which would spoil my 16-bit representation?

Is there a standard way to solve this problem?
Is this a problem at all or am I worried too much?

Any explanation is welcome.

Torsten
