flt2dec
)Expand description
Floatingpoint number to decimal conversion routines.
Problem statement
We are given the floatingpoint number v = f * 2^e
with an integer f
,
and its bounds minus
and plus
such that any number between v  minus
and
v + plus
will be rounded to v
. For the simplicity we assume that
this range is exclusive. Then we would like to get the unique decimal
representation V = 0.d[0..n1] * 10^k
such that:

d[0]
is nonzero. 
It’s correctly rounded when parsed back:
v  minus < V < v + plus
. Furthermore it is shortest such one, i.e., there is no representation with less thann
digits that is correctly rounded. 
It’s closest to the original value:
abs(V  v) <= 10^(kn) / 2
. Note that there might be two representations satisfying this uniqueness requirement, in which case some tiebreaking mechanism is used.
We will call this mode of operation as to the shortest mode. This mode is used
when there is no additional constraint, and can be thought as a “natural” mode
as it matches the ordinary intuition (it at least prints 0.1f32
as “0.1”).
We have two more modes of operation closely related to each other. In these modes
we are given either the number of significant digits n
or the lastdigit
limitation limit
(which determines the actual n
), and we would like to get
the representation V = 0.d[0..n1] * 10^k
such that:

d[0]
is nonzero, unlessn
was zero in which case onlyk
is returned. 
It’s closest to the original value:
abs(V  v) <= 10^(kn) / 2
. Again, there might be some tiebreaking mechanism.
When limit
is given but not n
, we set n
such that k  n = limit
so that the last digit d[n1]
is scaled by 10^(kn) = 10^limit
.
If such n
is negative, we clip it to zero so that we will only get k
.
We are also limited by the supplied buffer. This limitation is used to print
the number up to given number of fractional digits without knowing
the correct k
beforehand.
We will call the mode of operation requiring n
as to the exact mode,
and one requiring limit
as to the fixed mode. The exact mode is a subset of
the fixed mode: the sufficiently large lastdigit limitation will eventually fill
the supplied buffer and let the algorithm to return.
Implementation overview
It is easy to get the floating point printing correct but slow (Russ Cox has demonstrated how it’s easy), or incorrect but fast (naïve division and modulo). But it is surprisingly hard to print floating point numbers correctly and efficiently.
There are two classes of algorithms widely known to be correct.

The “Dragon” family of algorithm is first described by Guy L. Steele Jr. and Jon L. White. They rely on the fixedsize big integer for their correctness. A slight improvement was found later, which is posthumously described by Robert G. Burger and R. Kent Dybvig. David Gay’s
dtoa.c
routine is a popular implementation of this strategy. 
The “Grisu” family of algorithm is first described by Florian Loitsch. They use very cheap integeronly procedure to determine the closetocorrect representation which is at least guaranteed to be shortest. The variant, Grisu3, actively detects if the resulting representation is incorrect.
We implement both algorithms with necessary tweaks to suit our requirements.
In particular, published literatures are short of the actual implementation
difficulties like how to avoid arithmetic overflows. Each implementation,
available in strategy::dragon
and strategy::grisu
respectively,
extensively describes all necessary justifications and many proofs for them.
(It is still difficult to follow though. You have been warned.)
Both implementations expose two public functions:

format_shortest(decoded, buf)
, which always needs at leastMAX_SIG_DIGITS
digits of buffer. Implements the shortest mode. 
format_exact(decoded, buf, limit)
, which accepts as small as one digit of buffer. Implements exact and fixed modes.
They try to fill the u8
buffer with digits and returns the number of digits
written and the exponent k
. They are total for all finite f32
and f64
inputs (Grisu internally falls back to Dragon if necessary).
The rendered digits are formatted into the actual string form with four functions:

to_shortest_str
prints the shortest representation, which can be padded by zeroes to make at least given number of fractional digits. 
to_shortest_exp_str
prints the shortest representation, which can be padded by zeroes when its exponent is in the specified ranges, or can be printed in the exponential form such as1.23e45
. 
to_exact_exp_str
prints the exact representation with given number of digits in the exponential form. 
to_exact_fixed_str
prints the fixed representation with exactly given number of fractional digits.
They all return a slice of preallocated Part
array, which corresponds to
the individual part of strings: a fixed string, a part of rendered digits,
a number of zeroes or a small (u16
) number. The caller is expected to
provide a large enough buffer and Part
array, and to assemble the final
string from resulting Part
s itself.
All algorithms and formatting functions are accompanied by extensive tests
in coretests::num::flt2dec
module. It also shows how to use individual
functions.
Modules
Structs
Enums
Constants
Traits
decode
d.Functions
FullDecoded
value
from given floating point number.""
, "+"
or ""
.0.<...buf...> * 10^exp
into the decimal form
with at least given number of fractional digits. The result is stored to
the supplied parts array and a slice of written parts is returned.0.<...buf...> * 10^exp
into the exponential
form with at least the given number of significant digits. When upper
is true
,
the exponent will be prefixed by E
; otherwise that’s e
. The result is
stored to the supplied parts array and a slice of written parts is returned.d
contains decimal digits, increase the last digit and propagate carry.
Returns a next digit when it causes the length to change.upper
is used to determine the case of the exponent prefix (e
or E
).
The first part to be rendered is always a Part::Sign
(which can be
an empty string if no sign is rendered).upper
is currently
unused but left for the future decision to change the case of nonfinite values,
i.e., inf
and nan
. The first part to be rendered is always a Part::Sign
(which can be an empty string if no sign is rendered).upper
is used to determine the case of nonfinite values
(inf
and nan
) or the case of the exponent prefix (e
or E
).
The first part to be rendered is always a Part::Sign
(which can be
an empty string if no sign is rendered).upper
is currently
unused but left for the future decision to change the case of nonfinite values,
i.e., inf
and nan
. The first part to be rendered is always a Part::Sign
(which can be an empty string if no sign is rendered).