2<HTML>
3
4<HEAD>
5<TITLE>Berkeley TestFloat General Documentation</TITLE>
6</HEAD>
7
8<BODY>
9
10<H1>Berkeley TestFloat Release 3e: General Documentation</H1>
11
12<P>
John R. Hauser<BR>
2018 January 20<BR>
15</P>
16
17
18<H2>Contents</H2>
19
20<BLOCKQUOTE>
21<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
22<COL WIDTH=25>
23<COL WIDTH=*>
24<TR><TD COLSPAN=2>1. Introduction</TD></TR>
25<TR><TD COLSPAN=2>2. Limitations</TD></TR>
26<TR><TD COLSPAN=2>3. Acknowledgments and License</TD></TR>
27<TR><TD COLSPAN=2>4. What TestFloat Does</TD></TR>
28<TR><TD COLSPAN=2>5. Executing TestFloat</TD></TR>
29<TR><TD COLSPAN=2>6. Operations Tested by TestFloat</TD></TR>
30<TR><TD></TD><TD>6.1. Conversion Operations</TD></TR>
31<TR><TD></TD><TD>6.2. Basic Arithmetic Operations</TD></TR>
32<TR><TD></TD><TD>6.3. Fused Multiply-Add Operations</TD></TR>
33<TR><TD></TD><TD>6.4. Remainder Operations</TD></TR>
34<TR><TD></TD><TD>6.5. Round-to-Integer Operations</TD></TR>
35<TR><TD></TD><TD>6.6. Comparison Operations</TD></TR>
36<TR><TD COLSPAN=2>7. Interpreting TestFloat Output</TD></TR>
37<TR>
38 <TD COLSPAN=2>8. Variations Allowed by the IEEE Floating-Point Standard</TD>
39</TR>
40<TR><TD></TD><TD>8.1. Underflow</TD></TR>
41<TR><TD></TD><TD>8.2. NaNs</TD></TR>
42<TR><TD></TD><TD>8.3. Conversions to Integer</TD></TR>
43<TR><TD COLSPAN=2>9. Contact Information</TD></TR>
44</TABLE>
45</BLOCKQUOTE>
46
47
48<H2>1. Introduction</H2>
49
50<P>
51Berkeley TestFloat is a small collection of programs for testing that an
52implementation of binary floating-point conforms to the IEEE Standard for
53Floating-Point Arithmetic.
54All operations required by the original 1985 version of the IEEE Floating-Point
55Standard can be tested, except for conversions to and from decimal.
56With the current release, the following binary formats can be tested:
57<NOBR>16-bit</NOBR> half-precision, <NOBR>32-bit</NOBR> single-precision,
58<NOBR>64-bit</NOBR> double-precision, <NOBR>80-bit</NOBR>
59double-extended-precision, and/or <NOBR>128-bit</NOBR> quadruple-precision.
60TestFloat cannot test decimal floating-point.
61</P>
62
63<P>
64Included in the TestFloat package are the <CODE>testsoftfloat</CODE> and
65<CODE>timesoftfloat</CODE> programs for testing the Berkeley SoftFloat software
66implementation of floating-point and for measuring its speed.
67Information about SoftFloat can be found at the SoftFloat Web page,
68<A HREF="http://www.jhauser.us/arithmetic/SoftFloat.html"><NOBR><CODE>http://www.jhauser.us/arithmetic/SoftFloat.html</CODE></NOBR></A>.
69The <CODE>testsoftfloat</CODE> and <CODE>timesoftfloat</CODE> programs are
70expected to be of interest only to people compiling the SoftFloat sources.
71</P>
72
73<P>
74This document explains how to use the TestFloat programs.
75It does not attempt to define or explain much of the IEEE Floating-Point
76Standard.
77Details about the standard are available elsewhere.
78</P>
79
80<P>
81The current version of TestFloat is <NOBR>Release 3e</NOBR>.
82This version differs from earlier releases 3b through 3d in only minor ways.
83Compared to the original <NOBR>Release 3</NOBR>:
84<UL>
85<LI>
86<NOBR>Release 3b</NOBR> added the ability to test the <NOBR>16-bit</NOBR>
87half-precision format.
88<LI>
89<NOBR>Release 3c</NOBR> added the ability to test a rarely used rounding mode,
90<I>round to odd</I>, also known as <I>jamming</I>.
91<LI>
92<NOBR>Release 3d</NOBR> modified the code for testing C arithmetic to
93potentially include testing newer library functions <CODE>sqrtf</CODE>,
94<CODE>sqrtl</CODE>, <CODE>fmaf</CODE>, <CODE>fma</CODE>, and <CODE>fmal</CODE>.
95</UL>
96This release adds a few more small improvements, including modifying the
97expected behavior of rounding mode <CODE>odd</CODE> and fixing a minor bug in
98the all-in-one <CODE>testfloat</CODE> program.
99</P>
100
101<P>
102Compared to Release 2c and earlier, the set of TestFloat programs, as well as
103the programs&rsquo; arguments and behavior, changed somewhat with
104<NOBR>Release 3</NOBR>.
105For more about the evolution of TestFloat releases, see
106<A HREF="TestFloat-history.html"><NOBR><CODE>TestFloat-history.html</CODE></NOBR></A>.
107</P>
108
109
110<H2>2. Limitations</H2>
111
112<P>
113TestFloat output is not always easily interpreted.
114Detailed knowledge of the IEEE Floating-Point Standard and its vagaries is
115needed to use TestFloat responsibly.
116</P>
117
118<P>
119TestFloat performs relatively simple tests designed to check the fundamental
120soundness of the floating-point under test.
121TestFloat may also at times manage to find rarer and more subtle bugs, but it
122will probably only find such bugs by chance.
123Software that purposefully seeks out various kinds of subtle floating-point
124bugs can be found through links posted on the TestFloat Web page,
125<A HREF="http://www.jhauser.us/arithmetic/TestFloat.html"><NOBR><CODE>http://www.jhauser.us/arithmetic/TestFloat.html</CODE></NOBR></A>.
126</P>
127
128
129<H2>3. Acknowledgments and License</H2>
130
131<P>
132The TestFloat package was written by me, <NOBR>John R.</NOBR> Hauser.
133<NOBR>Release 3</NOBR> of TestFloat was a completely new implementation
134supplanting earlier releases.
135The project to create <NOBR>Release 3</NOBR> (now <NOBR>through 3e</NOBR>) was
136done in the employ of the University of California, Berkeley, within the
137Department of Electrical Engineering and Computer Sciences, first for the
138Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab.
139The work was officially overseen by Prof. Krste Asanovic, with funding provided
140by these sources:
141<BLOCKQUOTE>
142<TABLE>
143<COL>
144<COL WIDTH=10>
145<COL>
146<TR>
147<TD VALIGN=TOP><NOBR>Par Lab:</NOBR></TD>
148<TD></TD>
149<TD>
150Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
151(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
152NVIDIA, Oracle, and Samsung.
153</TD>
154</TR>
155<TR>
156<TD VALIGN=TOP><NOBR>ASPIRE Lab:</NOBR></TD>
157<TD></TD>
158<TD>
159DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
160ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
161Oracle, and Samsung.
162</TD>
163</TR>
164</TABLE>
165</BLOCKQUOTE>
166</P>
167
168<P>
169The following applies to the whole of TestFloat <NOBR>Release 3e</NOBR> as well
170as to each source file individually.
171</P>
172
173<P>
174Copyright 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 The Regents of the
175University of California.
176All rights reserved.
177</P>
178
179<P>
180Redistribution and use in source and binary forms, with or without
181modification, are permitted provided that the following conditions are met:
182<OL>
183
184<LI>
185<P>
186Redistributions of source code must retain the above copyright notice, this
187list of conditions, and the following disclaimer.
188</P>
189
190<LI>
191<P>
192Redistributions in binary form must reproduce the above copyright notice, this
193list of conditions, and the following disclaimer in the documentation and/or
194other materials provided with the distribution.
195</P>
196
197<LI>
198<P>
199Neither the name of the University nor the names of its contributors may be
200used to endorse or promote products derived from this software without specific
201prior written permission.
202</P>
203
204</OL>
205</P>
206
207<P>
208THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS &ldquo;AS IS&rdquo;,
209AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
210IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE
211DISCLAIMED.
212IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
213INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
214BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
215DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
216LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
217OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
218ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
219</P>
220
221
222<H2>4. What TestFloat Does</H2>
223
224<P>
225TestFloat is designed to test a floating-point implementation by comparing its
226behavior with that of TestFloat&rsquo;s own internal floating-point implemented
227in software.
228For each operation to be tested, the TestFloat programs can generate a large
229number of test cases, made up of simple pattern tests intermixed with weighted
230random inputs.
231The cases generated should be adequate for testing carry chain propagations,
232and the rounding of addition, subtraction, multiplication, and simple
233operations like conversions.
234TestFloat makes a point of checking all boundary cases of the arithmetic,
235including underflows, overflows, invalid operations, subnormal inputs, zeros
236(positive and negative), infinities, and NaNs.
237For the interesting operations like addition and multiplication, millions of
238test cases may be checked.
239</P>
240
241<P>
242TestFloat is not remarkably good at testing difficult rounding cases for
243division and square root.
244It also makes no attempt to find bugs specific to SRT division and the like
245(such as the infamous Pentium division bug).
246Software that tests for such failures can be found through links on the
247TestFloat Web page,
248<A HREF="http://www.jhauser.us/arithmetic/TestFloat.html"><NOBR><CODE>http://www.jhauser.us/arithmetic/TestFloat.html</CODE></NOBR></A>.
249</P>
250
251<P>
252NOTE!<BR>
253It is the responsibility of the user to verify that the discrepancies TestFloat
254finds actually represent faults in the implementation being tested.
255Advice to help with this task is provided later in this document.
256Furthermore, even if TestFloat finds no fault with a floating-point
257implementation, that in no way guarantees that the implementation is bug-free.
258</P>
259
260<P>
261For each operation, TestFloat can test all five rounding modes defined by the
262IEEE Floating-Point Standard, plus possibly a sixth mode, <I>round to odd</I>
263(depending on the options selected when TestFloat was built).
264TestFloat verifies not only that the numeric results of an operation are
265correct, but also that the proper floating-point exception flags are raised.
266All five exception flags are tested, including the <I>inexact</I> flag.
267TestFloat does not attempt to verify that the floating-point exception flags
268are actually implemented as sticky flags.
269</P>
270
271<P>
272For the <NOBR>80-bit</NOBR> double-extended-precision format, TestFloat can
273test the addition, subtraction, multiplication, division, and square root
274operations at all three of the standard rounding precisions.
275The rounding precision can be set to <NOBR>32 bits</NOBR>, equivalent to
276single-precision, to <NOBR>64 bits</NOBR>, equivalent to double-precision, or
277to the full <NOBR>80 bits</NOBR> of the double-extended-precision.
278Rounding precision control can be applied only to the double-extended-precision
279format and only for the five basic arithmetic operations: addition,
280subtraction, multiplication, division, and square root.
281Other operations can be tested only at full precision.
282</P>
283
284<P>
285As a rule, TestFloat is not particular about the bit patterns of NaNs that
286appear as operation results.
287Any NaN is considered as good a result as another.
288This laxness can be overridden so that TestFloat checks for particular bit
289patterns within NaN results.
290See <NOBR>section 8</NOBR> below, <I>Variations Allowed by the IEEE
291Floating-Point Standard</I>, plus the <CODE>-checkNaNs</CODE> and
292<CODE>-checkInvInts</CODE> options documented for programs
293<CODE>testfloat_ver</CODE> and <CODE>testfloat</CODE>.
294</P>
295
296<P>
297TestFloat normally compares an implementation of floating-point against the
298Berkeley SoftFloat software implementation of floating-point, also created by
299me.
300The SoftFloat functions are linked into each TestFloat program&rsquo;s
301executable.
302Information about SoftFloat can be found at the Web page
303<A HREF="http://www.jhauser.us/arithmetic/SoftFloat.html"><NOBR><CODE>http://www.jhauser.us/arithmetic/SoftFloat.html</CODE></NOBR></A>.
304</P>
305
306<P>
307For testing SoftFloat itself, the TestFloat package includes a
308<CODE>testsoftfloat</CODE> program that compares SoftFloat&rsquo;s
309floating-point against <EM>another</EM> software floating-point implementation.
310The second software floating-point is simpler and slower than SoftFloat, and is
311completely independent of SoftFloat.
312Although the second software floating-point cannot be guaranteed to be
313bug-free, the chance that it would mimic any of SoftFloat&rsquo;s bugs is low.
314Consequently, an error in one or the other floating-point version should appear
315as an unexpected difference between the two implementations.
316Note that testing SoftFloat should be necessary only when compiling a new
317TestFloat executable or when compiling SoftFloat for some other reason.
318</P>
319
320
321<H2>5. Executing TestFloat</H2>
322
323<P>
324The TestFloat package consists of five programs, all intended to be executed
325from a command-line interpreter:
326<BLOCKQUOTE>
327<TABLE>
328<TR>
329<TD>
330<A HREF="testfloat_gen.html"><CODE>testfloat_gen</CODE></A><CODE>&nbsp;&nbsp;&nbsp;</CODE>
331</TD>
332<TD>
333Generates test cases for a specific floating-point operation.
334</TD>
335</TR>
336<TR>
337<TD>
338<A HREF="testfloat_ver.html"><CODE>testfloat_ver</CODE></A>
339</TD>
340<TD>
341Verifies whether the results from executing a floating-point operation are as
342expected.
343</TD>
344</TR>
345<TR>
346<TD>
347<A HREF="testfloat.html"><CODE>testfloat</CODE></A>
348</TD>
349<TD>
350An all-in-one program that generates test cases, executes floating-point
351operations, and verifies whether the results match expectations.
352</TD>
353</TR>
354<TR>
355<TD>
356<A HREF="testsoftfloat.html"><CODE>testsoftfloat</CODE></A><CODE>&nbsp;&nbsp;&nbsp;</CODE>
357</TD>
358<TD>
359Like <CODE>testfloat</CODE>, but for testing SoftFloat.
360</TD>
361</TR>
362<TR>
363<TD>
364<A HREF="timesoftfloat.html"><CODE>timesoftfloat</CODE></A><CODE>&nbsp;&nbsp;&nbsp;</CODE>
365</TD>
366<TD>
367A program for measuring the speed of SoftFloat (included in the TestFloat
368package for convenience).
369</TD>
370</TR>
371</TABLE>
372</BLOCKQUOTE>
373Each program has its own page of documentation that can be opened through the
374links in the table above.
375</P>
376
377<P>
378To test a floating-point implementation other than SoftFloat, one of three
379different methods can be used.
380The first method pipes output from <CODE>testfloat_gen</CODE> to a program
381that:
382<NOBR>(a) reads</NOBR> the incoming test cases, <NOBR>(b) invokes</NOBR> the
383floating-point operation being tested, and <NOBR>(c) writes</NOBR> the
384operation results to output.
385These results can then be piped to <CODE>testfloat_ver</CODE> to be checked for
386correctness.
387Assuming a vertical bar (<CODE>|</CODE>) indicates a pipe between programs, the
388complete process could be written as a single command like so:
389<BLOCKQUOTE>
390<PRE>
testfloat_gen ... &lt;<I>type</I>&gt; | &lt;<I>program-that-invokes-op</I>&gt; | testfloat_ver ... &lt;<I>function</I>&gt;
392</PRE>
393</BLOCKQUOTE>
394The program in the middle is not supplied by TestFloat but must be created
395independently.
396If for some reason this program cannot take command-line arguments, the
397<CODE>-prefix</CODE> option of <CODE>testfloat_gen</CODE> can communicate
398parameters through the pipe.
399</P>
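<P>
As a rough illustration of the program in the middle of the pipe, the following Python sketch reads test cases, calls an operation, and writes results.
Everything specific here is an assumption rather than something stated in this document: that each incoming line holds the operands as space-separated hexadecimal bit patterns, that the output should echo the operands followed by the result and the exception flags in hexadecimal, and the names <CODE>run_filter</CODE> and <CODE>dummy_op</CODE>, which are hypothetical.
The actual line formats are described in the documentation for <CODE>testfloat_gen</CODE> and <CODE>testfloat_ver</CODE>.
</P>

```python
def run_filter(lines, op, result_digits):
    """Yield one output line per incoming test case.

    `op` is a hypothetical callback: it receives the operands as integers
    (raw bit patterns) and returns (result_bits, flag_bits).
    The input/output layout is an assumption; check testfloat_gen.html and
    testfloat_ver.html for the real format.
    """
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        operands = [int(field, 16) for field in fields]
        result, flags = op(*operands)
        # Echo the operands unchanged, then append the result and the flags.
        yield "{} {:0{w}X} {:02X}".format(
            " ".join(fields), result, flags, w=result_digits
        )


def dummy_op(a, b):
    # Placeholder only: a real program would invoke the floating-point
    # operation being tested and collect its exception flags.
    return (a ^ b) & 0xFFFFFFFF, 0


# Typical wiring to the pipe (not executed here):
#     import sys
#     for out in run_filter(sys.stdin, dummy_op, result_digits=8):
#         print(out)
```

<P>
Keeping the operation behind a callback means the same loop could serve any operation; only <CODE>op</CODE> and the result width would change.
</P>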
400
401<P>
402A second method for running TestFloat is similar but has
403<CODE>testfloat_gen</CODE> supply not only the test inputs but also the
404expected results for each case.
405With this additional information, the job done by <CODE>testfloat_ver</CODE>
406can be folded into the invoking program to give the following command:
407<BLOCKQUOTE>
408<PRE>
testfloat_gen ... &lt;<I>function</I>&gt; | &lt;<I>program-that-invokes-op-and-compares-results</I>&gt;
410</PRE>
411</BLOCKQUOTE>
412Again, the program that actually invokes the floating-point operation is not
413supplied by TestFloat but must be created independently.
414Depending on circumstance, it may be preferable either to let
415<CODE>testfloat_ver</CODE> check and report suspected errors (first method) or
416to include this step in the invoking program (second method).
417</P>
418
419<P>
420The third way to use TestFloat is the all-in-one <CODE>testfloat</CODE>
421program.
422This program can perform all the steps of creating test cases, invoking the
423floating-point operation, checking the results, and reporting suspected errors.
424However, for this to be possible, <CODE>testfloat</CODE> must be compiled to
425contain the method for invoking the floating-point operations to test.
426Each build of <CODE>testfloat</CODE> is therefore capable of testing
427<EM>only</EM> the floating-point implementation it was built to invoke.
428To test a new implementation of floating-point, a new <CODE>testfloat</CODE>
429must be created, linked to that specific implementation.
430By comparison, the <CODE>testfloat_gen</CODE> and <CODE>testfloat_ver</CODE>
431programs are entirely generic;
432one instance is usable for testing any floating-point implementation, because
433implementation-specific details are segregated in the custom program that
434follows <CODE>testfloat_gen</CODE>.
435</P>
436
437<P>
438Program <CODE>testsoftfloat</CODE> is another all-in-one program specifically
439for testing SoftFloat.
440</P>
441
442<P>
443Programs <CODE>testfloat_ver</CODE>, <CODE>testfloat</CODE>, and
444<CODE>testsoftfloat</CODE> all report status and error information in a common
445way.
446As it executes, each of these programs writes status information to the
447standard error output, which should be the screen by default.
448In order for this status to be displayed properly, the standard error stream
449should not be redirected to a file.
450Any discrepancies that are found are written to the standard output stream,
451which is easily redirected to a file if desired.
452Unless redirected, reported errors will appear intermixed with the ongoing
453status information in the output.
454</P>
455
456
457<H2>6. Operations Tested by TestFloat</H2>
458
459<P>
460TestFloat can test all operations required by the original 1985 IEEE
461Floating-Point Standard except for conversions to and from decimal.
462These operations are:
463<UL>
464<LI>
465conversions among the supported floating-point formats, and also between
466integers (<NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR>, signed and unsigned) and
467any of the floating-point formats;
468<LI>
469for each floating-point format, the usual addition, subtraction,
470multiplication, division, and square root operations;
471<LI>
472for each format, the floating-point remainder operation defined by the IEEE
473Standard;
474<LI>
475for each format, a &ldquo;round to integer&rdquo; operation that rounds to the
476nearest integer value in the same format; and
477<LI>
478comparisons between two values in the same floating-point format.
479</UL>
480In addition, TestFloat can also test
481<UL>
482<LI>
483for each floating-point format except <NOBR>80-bit</NOBR>
484double-extended-precision, the fused multiply-add operation defined by the 2008
485IEEE Standard.
486</UL>
487</P>
488
489<P>
490More information about all these operations is given below.
491In the operation names used by TestFloat, <NOBR>16-bit</NOBR> half-precision is
492called <CODE>f16</CODE>, <NOBR>32-bit</NOBR> single-precision is
493<CODE>f32</CODE>, <NOBR>64-bit</NOBR> double-precision is <CODE>f64</CODE>,
494<NOBR>80-bit</NOBR> double-extended-precision is <CODE>extF80</CODE>, and
495<NOBR>128-bit</NOBR> quadruple-precision is <CODE>f128</CODE>.
496TestFloat generally uses the same names for operations as Berkeley SoftFloat,
497except that TestFloat&rsquo;s names never include the <CODE>M</CODE> that
498SoftFloat uses to indicate that values are passed through pointers.
499</P>
500
501<H3>6.1. Conversion Operations</H3>
502
503<P>
504All conversions among the floating-point formats and all conversions between a
505floating-point format and <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR> integers
506can be tested.
507The conversion operations are:
508<BLOCKQUOTE>
509<PRE>
ui32_to_f16      ui64_to_f16      i32_to_f16       i64_to_f16
ui32_to_f32      ui64_to_f32      i32_to_f32       i64_to_f32
ui32_to_f64      ui64_to_f64      i32_to_f64       i64_to_f64
ui32_to_extF80   ui64_to_extF80   i32_to_extF80    i64_to_extF80
ui32_to_f128     ui64_to_f128     i32_to_f128      i64_to_f128

f16_to_ui32      f32_to_ui32      f64_to_ui32      extF80_to_ui32    f128_to_ui32
f16_to_ui64      f32_to_ui64      f64_to_ui64      extF80_to_ui64    f128_to_ui64
f16_to_i32       f32_to_i32       f64_to_i32       extF80_to_i32     f128_to_i32
f16_to_i64       f32_to_i64       f64_to_i64       extF80_to_i64     f128_to_i64

f16_to_f32       f32_to_f16       f64_to_f16       extF80_to_f16     f128_to_f16
f16_to_f64       f32_to_f64       f64_to_f32       extF80_to_f32     f128_to_f32
f16_to_extF80    f32_to_extF80    f64_to_extF80    extF80_to_f64     f128_to_f64
f16_to_f128      f32_to_f128      f64_to_f128      extF80_to_f128    f128_to_extF80
525</PRE>
526</BLOCKQUOTE>
527Abbreviations <CODE>ui32</CODE> and <CODE>ui64</CODE> indicate
528<NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR> unsigned integer types, while
529<CODE>i32</CODE> and <CODE>i64</CODE> indicate their signed counterparts.
530These conversions all round according to the current rounding mode as relevant.
531Conversions from a smaller to a larger floating-point format are always exact
532and so require no rounding.
533Likewise, conversions from <NOBR>32-bit</NOBR> integers to <NOBR>64-bit</NOBR>
534double-precision or to any larger floating-point format are also exact, as are
535conversions from <NOBR>64-bit</NOBR> integers to <NOBR>80-bit</NOBR>
536double-extended-precision and <NOBR>128-bit</NOBR> quadruple-precision.
537</P>
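<P>
The exactness claims for integer conversions follow from significand width: an integer type converts exactly whenever all of its bits fit in the destination significand.
The Python sketch below encodes that rule.
The table of precisions (counting the leading significand bit) is the standard one for these formats, and the function name is hypothetical.
The rule is only meant for the <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR> integer types discussed here, where exponent range is never the limiting factor once the significand is wide enough.
</P>

```python
# Significand precision in bits, including the leading bit.
PRECISION = {"f16": 11, "f32": 24, "f64": 53, "extF80": 64, "f128": 113}


def int_conversion_always_exact(int_bits, fmt):
    """True if every integer of the given width fits in the significand."""
    return int_bits <= PRECISION[fmt]


# Python's float is 64-bit double-precision on typical platforms, so the
# rule can be observed directly (assuming round-to-nearest-even):
largest_u32 = 2**32 - 1
first_inexact = 2**53 + 1  # needs 54 significand bits
u32_is_exact = float(largest_u32) == largest_u32
big_is_exact = int(float(first_inexact)) == first_inexact
```

<P>
Under these assumptions, <CODE>u32_is_exact</CODE> is true and <CODE>big_is_exact</CODE> is false, matching the statement that <NOBR>64-bit</NOBR> integers are exact only for the two largest formats.
</P>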
538
539<P>
540For the all-in-one <CODE>testfloat</CODE> program, this list of conversion
541operations requires amendment.
542For <CODE>testfloat</CODE> only, conversions to an integer type have names that
543explicitly specify the rounding mode and treatment of inexactness.
544Thus, instead of
545<BLOCKQUOTE>
546<PRE>
&lt;<I>float</I>&gt;_to_&lt;<I>int</I>&gt;
548</PRE>
549</BLOCKQUOTE>
550as listed above, operations converting to integer type have names of these
551forms:
552<BLOCKQUOTE>
553<PRE>
&lt;<I>float</I>&gt;_to_&lt;<I>int</I>&gt;_r_&lt;<I>round</I>&gt;
&lt;<I>float</I>&gt;_to_&lt;<I>int</I>&gt;_rx_&lt;<I>round</I>&gt;
556</PRE>
557</BLOCKQUOTE>
558The <CODE>&lt;<I>round</I>&gt;</CODE> component is one of
559&lsquo;<CODE>near_even</CODE>&rsquo;, &lsquo;<CODE>near_maxMag</CODE>&rsquo;,
560&lsquo;<CODE>minMag</CODE>&rsquo;, &lsquo;<CODE>min</CODE>&rsquo;, or
561&lsquo;<CODE>max</CODE>&rsquo;, choosing the rounding mode.
562Any other indication of rounding mode is ignored.
563The operations with &lsquo;<CODE>_r_</CODE>&rsquo; in their names never raise
564the <I>inexact</I> exception, while those with &lsquo;<CODE>_rx_</CODE>&rsquo;
565raise the <I>inexact</I> exception whenever the result is not exact.
566</P>
567
568<P>
569TestFloat assumes that conversions from floating-point to an integer type
570should raise the <I>invalid</I> exception if the input cannot be rounded to an
571integer representable in the result format.
572In such a circumstance:
573<UL>
574
575<LI>
576<P>
577If the result type is an unsigned integer, TestFloat normally expects the
578result of the operation to be the type&rsquo;s largest integer value.
579In the case that the input is a negative number (not a NaN), a zero result may
580also be accepted.
581</P>
582
583<LI>
584<P>
585If the result type is a signed integer and the input is a number (not a NaN),
586TestFloat expects the result to be the largest-magnitude integer with the same
587sign as the input.
588When a NaN is converted to a signed integer type, TestFloat allows either the
589largest positive or largest-magnitude negative integer to be returned.
590</P>
591
592</UL>
593Conversions to integer types are expected never to raise the <I>overflow</I>
594exception.
595</P>
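<P>
The expectations for invalid conversions can be summarized as a small lookup.
The Python sketch below returns the set of integer results that the rules above would accept; the function name and the <CODE>input_kind</CODE> labels are hypothetical, chosen only for illustration.
</P>

```python
def accepted_invalid_results(int_type, input_kind):
    """Results accepted when a float-to-integer conversion is invalid.

    int_type:   'ui32', 'ui64', 'i32', or 'i64'
    input_kind: 'positive' or 'negative' (an out-of-range number,
                infinities included) or 'nan'
    """
    bits = int(int_type[-2:])
    if int_type.startswith("u"):
        accepted = {2**bits - 1}      # the type's largest value
        if input_kind == "negative":
            accepted.add(0)           # zero may also be accepted
        return accepted
    largest_positive = 2 ** (bits - 1) - 1
    largest_negative = -(2 ** (bits - 1))
    if input_kind == "positive":
        return {largest_positive}
    if input_kind == "negative":
        return {largest_negative}
    return {largest_positive, largest_negative}  # NaN: either is allowed
```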
596
597<H3>6.2. Basic Arithmetic Operations</H3>
598
599<P>
600The following standard arithmetic operations can be tested:
601<BLOCKQUOTE>
602<PRE>
f16_add      f16_sub      f16_mul      f16_div      f16_sqrt
f32_add      f32_sub      f32_mul      f32_div      f32_sqrt
f64_add      f64_sub      f64_mul      f64_div      f64_sqrt
extF80_add   extF80_sub   extF80_mul   extF80_div   extF80_sqrt
f128_add     f128_sub     f128_mul     f128_div     f128_sqrt
608</PRE>
609</BLOCKQUOTE>
610The double-extended-precision (<CODE>extF80</CODE>) operations can be rounded
611to reduced precision under rounding precision control.
612</P>
613
614<H3>6.3. Fused Multiply-Add Operations</H3>
615
616<P>
617For all floating-point formats except <NOBR>80-bit</NOBR>
618double-extended-precision, TestFloat can test the fused multiply-add operation
619defined by the 2008 IEEE Floating-Point Standard.
620The fused multiply-add operations are:
621<BLOCKQUOTE>
622<PRE>
f16_mulAdd
f32_mulAdd
f64_mulAdd
f128_mulAdd
627</PRE>
628</BLOCKQUOTE>
629</P>
630
631<P>
632If one of the multiplication operands is infinite and the other is zero,
633TestFloat expects the fused multiply-add operation to raise the <I>invalid</I>
634exception even if the third operand is a quiet NaN.
635</P>
636
637<H3>6.4. Remainder Operations</H3>
638
639<P>
640For each format, TestFloat can test the IEEE Standard&rsquo;s remainder
641operation.
642These operations are:
643<BLOCKQUOTE>
644<PRE>
f16_rem
f32_rem
f64_rem
extF80_rem
f128_rem
650</PRE>
651</BLOCKQUOTE>
652The remainder operations are always exact and so require no rounding.
653</P>
654
655<H3>6.5. Round-to-Integer Operations</H3>
656
657<P>
658For each format, TestFloat can test the IEEE Standard&rsquo;s round-to-integer
659operation.
660For most TestFloat programs, these operations are:
661<BLOCKQUOTE>
662<PRE>
f16_roundToInt
f32_roundToInt
f64_roundToInt
extF80_roundToInt
f128_roundToInt
668</PRE>
669</BLOCKQUOTE>
670</P>
671
672<P>
673Just as for conversions to integer types (<NOBR>section 6.1</NOBR> above), the
674all-in-one <CODE>testfloat</CODE> program is again an exception.
675For <CODE>testfloat</CODE> only, the round-to-integer operations have names of
676these forms:
677<BLOCKQUOTE>
678<PRE>
&lt;<I>float</I>&gt;_roundToInt_r_&lt;<I>round</I>&gt;
&lt;<I>float</I>&gt;_roundToInt_x
681</PRE>
682</BLOCKQUOTE>
683For the &lsquo;<CODE>_r_</CODE>&rsquo; versions, the <I>inexact</I> exception
684is never raised, and the <CODE>&lt;<I>round</I>&gt;</CODE> component specifies
685the rounding mode as one of &lsquo;<CODE>near_even</CODE>&rsquo;,
686&lsquo;<CODE>near_maxMag</CODE>&rsquo;, &lsquo;<CODE>minMag</CODE>&rsquo;,
687&lsquo;<CODE>min</CODE>&rsquo;, or &lsquo;<CODE>max</CODE>&rsquo;.
688The usual indication of rounding mode is ignored.
689In contrast, the &lsquo;<CODE>_x</CODE>&rsquo; versions accept the usual
690indication of rounding mode and raise the <I>inexact</I> exception whenever the
691result is not exact.
692This irregular system follows the IEEE Standard&rsquo;s particular
693specification for the round-to-integer operations.
694</P>
695
696<H3>6.6. Comparison Operations</H3>
697
698<P>
699The following floating-point comparison operations can be tested:
700<BLOCKQUOTE>
701<PRE>
f16_eq      f16_le      f16_lt
f32_eq      f32_le      f32_lt
f64_eq      f64_le      f64_lt
extF80_eq   extF80_le   extF80_lt
f128_eq     f128_le     f128_lt
707</PRE>
708</BLOCKQUOTE>
709The abbreviation <CODE>eq</CODE> stands for &ldquo;equal&rdquo; (=),
710<CODE>le</CODE> stands for &ldquo;less than or equal&rdquo; (&le;), and
711<CODE>lt</CODE> stands for &ldquo;less than&rdquo; (&lt;).
712</P>
713
714<P>
715The IEEE Standard specifies that, by default, the less-than-or-equal and
716less-than comparisons raise the <I>invalid</I> exception if either input is any
717kind of NaN.
718The equality comparisons, on the other hand, are defined by default to raise
719the <I>invalid</I> exception only for signaling NaNs, not for quiet NaNs.
720For completeness, the following additional operations can be tested if
721supported:
722<BLOCKQUOTE>
723<PRE>
f16_eq_signaling      f16_le_quiet      f16_lt_quiet
f32_eq_signaling      f32_le_quiet      f32_lt_quiet
f64_eq_signaling      f64_le_quiet      f64_lt_quiet
extF80_eq_signaling   extF80_le_quiet   extF80_lt_quiet
f128_eq_signaling     f128_le_quiet     f128_lt_quiet
729</PRE>
730</BLOCKQUOTE>
731The <CODE>signaling</CODE> equality comparisons are identical to the standard
732operations except that the <I>invalid</I> exception should be raised for any
733NaN input.
734Similarly, the <CODE>quiet</CODE> comparison operations should be identical to
735their counterparts except that the <I>invalid</I> exception is not raised for
736quiet NaNs.
737</P>
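<P>
The rules for when a comparison raises the <I>invalid</I> exception can be condensed into one predicate.
This Python sketch is only a restatement of the paragraph above; the function name and its arguments are hypothetical.
</P>

```python
def comparison_raises_invalid(op, any_signaling_nan, any_quiet_nan):
    """op is the part of the name after the format prefix,
    e.g. 'eq', 'le', 'lt', 'eq_signaling', 'le_quiet', 'lt_quiet'."""
    if any_signaling_nan:
        return True  # a signaling NaN input raises invalid for every variant
    if op in ("le", "lt", "eq_signaling"):
        return any_quiet_nan  # these raise invalid for any kind of NaN
    if op in ("eq", "le_quiet", "lt_quiet"):
        return False  # quiet NaNs alone do not raise invalid
    raise ValueError("unknown comparison: " + op)
```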
738
739<P>
740Obviously, no comparison operations ever require rounding.
741Any rounding mode is ignored.
742</P>
743
744
745<H2>7. Interpreting TestFloat Output</H2>
746
747<P>
748The &ldquo;errors&rdquo; reported by TestFloat programs may or may not really
749represent errors in the system being tested.
750For each test case tried, the results from the floating-point implementation
751being tested could differ from the expected results for several reasons:
752<UL>
753<LI>
754The IEEE Floating-Point Standard allows for some variation in how conforming
755floating-point behaves.
756Two implementations can sometimes give different results without either being
757incorrect.
758<LI>
759The trusted floating-point emulation could be faulty.
760This could be because there is a bug in the way the emulation is coded, or
761because a mistake was made when the code was compiled for the current system.
762<LI>
763The TestFloat program may not work properly, reporting differences that do not
764exist.
765<LI>
766Lastly, the floating-point being tested could actually be faulty.
767</UL>
768It is the responsibility of the user to determine the causes for the
769discrepancies that are reported.
770Making this determination can require detailed knowledge about the IEEE
771Standard.
772Assuming TestFloat is working properly, any differences found will be due to
773either the first or last of the reasons above.
774Variations in the IEEE Standard that could lead to false error reports are
775discussed in <NOBR>section 8</NOBR>, <I>Variations Allowed by the IEEE
776Floating-Point Standard</I>.
777</P>
778
779<P>
780For each reported error (or apparent error), a line of text is written to the
781default output.
782If a line would be longer than 79 characters, it is divided.
783The first part of each error line begins in the leftmost column, and any
784subsequent &ldquo;continuation&rdquo; lines are indented with a tab.
785</P>
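<P>
When error output is post-processed by a script, divided lines may first need to be rejoined.
The Python sketch below does so under two assumptions not stated in this document: that the input contains only error lines and their continuations, and that a single space is an adequate separator when rejoining.
The function name is hypothetical.
</P>

```python
def join_continuations(lines):
    """Merge tab-indented continuation lines into the preceding line."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("\t") and records:
            # Continuation: append to the error line it belongs to.
            records[-1] += " " + line.strip()
        elif line.strip():
            # A new error line begins in the leftmost column.
            records.append(line)
    return records
```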
786
787<P>
788Each error reported is of the form:
789<BLOCKQUOTE>
790<PRE>
&lt;<I>inputs</I>&gt; => &lt;<I>observed-output</I>&gt; expected: &lt;<I>expected-output</I>&gt;
792</PRE>
793</BLOCKQUOTE>
794The <CODE>&lt;<I>inputs</I>&gt;</CODE> are the inputs to the operation.
795Each output (observed or expected) is shown as a pair: the result value first,
796followed by the exception flags.
797</P>
798
799<P>
800For example, two typical error lines could be
801<BLOCKQUOTE>
802<PRE>
-00.7FFF00 -7F.000100 => +01.000000 ...ux expected: +01.000000 ....x
+81.000004 +00.1FFFFF => +01.000000 ...ux expected: +01.000000 ....x
805</PRE>
806</BLOCKQUOTE>
807In the first line, the inputs are <CODE>-00.7FFF00</CODE> and
808<CODE>-7F.000100</CODE>, and the observed result is <CODE>+01.000000</CODE>
809with flags <CODE>...ux</CODE>.
810The trusted emulation result is the same but with different flags,
811<CODE>....x</CODE>.
812Items such as <CODE>-00.7FFF00</CODE> composed of a sign character
813<NOBR>(<CODE>+</CODE>/<CODE>-</CODE>)</NOBR>, hexadecimal digits, and a single
814period represent floating-point values (here <NOBR>32-bit</NOBR>
815single-precision).
816The two instances above were reported as errors because the exception flag
817results differ.
818</P>
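<P>
As a rough illustration, an error line of the form shown above can be split
into its fields with a few lines of Python.
The function name <CODE>parse_error_line</CODE> and the dictionary keys are
assumptions made for this sketch and are not part of TestFloat; the sketch also
assumes that any continuation lines have already been joined into one line.
</P>

```python
def parse_error_line(line):
    """Split '<inputs> => <observed> expected: <expected>' into its fields."""
    left, _, expected = line.partition("expected:")
    inputs, _, observed = left.partition("=>")
    # Each output is a pair: the result value, then the exception flags.
    obs_value, obs_flags = observed.split()
    exp_value, exp_flags = expected.split()
    return {
        "inputs": inputs.split(),
        "observed": (obs_value, obs_flags),
        "expected": (exp_value, exp_flags),
    }


report = parse_error_line(
    "-00.7FFF00 -7F.000100 => +01.000000 ...ux expected: +01.000000 ....x"
)
# The values agree here; only the flags differ.
values_agree = report["observed"][0] == report["expected"][0]
flags_agree = report["observed"][1] == report["expected"][1]
```

<P>
Comparing the value and the flags separately, as in the last two lines, makes
it easy to see which half of an output pair caused the report.
</P>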

<P>
Aside from the exception flags, there are ten data types that may be
represented.
Five are floating-point types: <NOBR>16-bit</NOBR> half-precision,
<NOBR>32-bit</NOBR> single-precision, <NOBR>64-bit</NOBR> double-precision,
<NOBR>80-bit</NOBR> double-extended-precision, and <NOBR>128-bit</NOBR>
quadruple-precision.
The remaining five types are <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR>
unsigned integers, <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR>
two&rsquo;s-complement signed integers, and Boolean values (the results of
comparison operations).
Boolean values are represented as a single character, either a <CODE>0</CODE>
(false) or a <CODE>1</CODE> (true).
A <NOBR>32-bit</NOBR> integer is represented as 8 hexadecimal digits.
Thus, for a signed <NOBR>32-bit</NOBR> integer, <CODE>FFFFFFFF</CODE> is
&minus;1, and <CODE>7FFFFFFF</CODE> is the largest positive value.
<NOBR>64-bit</NOBR> integers are the same except with 16 hexadecimal digits.
</P>
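<P>
The integer notation can be decoded as in the following minimal sketch.
The helper name <CODE>decode_int</CODE> and its <CODE>signed</CODE> parameter
are assumptions of this example; the caller must know from the operation being
tested whether the field is signed, since the digits alone do not say.
</P>

```python
def decode_int(hex_digits, signed):
    """Interpret 8 or 16 hexadecimal digits as a 32-bit or 64-bit integer."""
    bits = 4 * len(hex_digits)          # 8 digits -> 32 bits, 16 -> 64 bits
    value = int(hex_digits, 16)
    # Two's complement: patterns with the top bit set are negative.
    if signed and value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


print(decode_int("FFFFFFFF", signed=True))    # -1
print(decode_int("7FFFFFFF", signed=True))    # 2147483647
print(decode_int("FFFFFFFF", signed=False))   # 4294967295
```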

<P>
Floating-point values are written decomposed into their sign, encoded exponent,
and encoded significand.
First is the sign character <NOBR>(<CODE>+</CODE> or <CODE>-</CODE>),</NOBR>
followed by the encoded exponent in hexadecimal, then a period
(<CODE>.</CODE>), and lastly the encoded significand in hexadecimal.
</P>

<P>
For <NOBR>16-bit</NOBR> half-precision, notable values include:
<BLOCKQUOTE>
<TABLE CELLSPACING=0 CELLPADDING=0>
<TR><TD><CODE>+00.000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD><TD>+0</TD></TR>
<TR><TD><CODE>+0F.000</CODE></TD><TD>&nbsp;1</TD></TR>
<TR><TD><CODE>+10.000</CODE></TD><TD>&nbsp;2</TD></TR>
<TR><TD><CODE>+1E.3FF</CODE></TD><TD>maximum finite value</TD></TR>
<TR><TD><CODE>+1F.000</CODE></TD><TD>+infinity</TD></TR>
<TR><TD>&nbsp;</TD></TR>
<TR><TD><CODE>-00.000</CODE></TD><TD>&minus;0</TD></TR>
<TR><TD><CODE>-0F.000</CODE></TD><TD>&minus;1</TD></TR>
<TR><TD><CODE>-10.000</CODE></TD><TD>&minus;2</TD></TR>
<TR>
 <TD><CODE>-1E.3FF</CODE></TD>
 <TD>minimum finite value (largest magnitude, but negative)</TD>
</TR>
<TR><TD><CODE>-1F.000</CODE></TD><TD>&minus;infinity</TD></TR>
</TABLE>
</BLOCKQUOTE>
Certain categories are easily distinguished (assuming the <CODE>x</CODE>s are
not all 0):
<BLOCKQUOTE>
<TABLE CELLSPACING=0 CELLPADDING=0>
<TR>
 <TD><CODE>+00.xxx&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
 <TD>positive subnormal numbers</TD>
</TR>
<TR><TD><CODE>+1F.xxx</CODE></TD><TD>positive NaNs</TD></TR>
<TR><TD><CODE>-00.xxx</CODE></TD><TD>negative subnormal numbers</TD></TR>
<TR><TD><CODE>-1F.xxx</CODE></TD><TD>negative NaNs</TD></TR>
</TABLE>
</BLOCKQUOTE>
</P>
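<P>
The half-precision entries above can be checked numerically with a small
decoder.
This is only a sketch: <CODE>decode_half</CODE> is a hypothetical helper, and
the constants it uses (a 5-bit exponent with bias 15 and a 10-bit fraction
field) are those of the IEEE <NOBR>16-bit</NOBR> binary format.
</P>

```python
def decode_half(text):
    """Decode notation such as '+0F.000' (16-bit half-precision) to a float."""
    sign = -1.0 if text[0] == "-" else 1.0
    exp_hex, sig_hex = text[1:].split(".")
    exp = int(exp_hex, 16)     # 5-bit encoded exponent, bias 15
    frac = int(sig_hex, 16)    # 10-bit encoded significand (fraction field)
    if exp == 0x1F:
        # All-ones exponent: infinity if the fraction is zero, else a NaN.
        return sign * float("inf") if frac == 0 else float("nan")
    if exp == 0:
        # Zero or subnormal: no implicit leading 1, fixed scale of 2**-24.
        return sign * frac * 2.0 ** -24
    return sign * (1 + frac / 1024) * 2.0 ** (exp - 15)


print(decode_half("+0F.000"))   # 1.0
print(decode_half("-10.000"))   # -2.0
print(decode_half("+1E.3FF"))   # 65504.0
```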

<P>
Likewise for other formats:
<BLOCKQUOTE>
<TABLE CELLSPACING=0 CELLPADDING=0>
<TR><TD>32-bit single</TD><TD>64-bit double</TD><TD>128-bit quadruple</TD></TR>
<TR><TD>&nbsp;</TD></TR>
<TR>
<TD><CODE>+00.000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
<TD><CODE>+000.0000000000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
<TD><CODE>+0000.0000000000000000000000000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
<TD>+0</TD>
</TR>
<TR>
<TD><CODE>+7F.000000</CODE></TD>
<TD><CODE>+3FF.0000000000000</CODE></TD>
<TD><CODE>+3FFF.0000000000000000000000000000</CODE></TD>
<TD>&nbsp;1</TD>
</TR>
<TR>
<TD><CODE>+80.000000</CODE></TD>
<TD><CODE>+400.0000000000000</CODE></TD>
<TD><CODE>+4000.0000000000000000000000000000</CODE></TD>
<TD>&nbsp;2</TD>
</TR>
<TR>
<TD><CODE>+FE.7FFFFF</CODE></TD>
<TD><CODE>+7FE.FFFFFFFFFFFFF</CODE></TD>
<TD><CODE>+7FFE.FFFFFFFFFFFFFFFFFFFFFFFFFFFF</CODE></TD>
<TD>maximum finite value</TD>
</TR>
<TR>
<TD><CODE>+FF.000000</CODE></TD>
<TD><CODE>+7FF.0000000000000</CODE></TD>
<TD><CODE>+7FFF.0000000000000000000000000000</CODE></TD>
<TD>+infinity</TD>
</TR>
<TR><TD>&nbsp;</TD></TR>
<TR>
<TD><CODE>-00.000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
<TD><CODE>-000.0000000000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
<TD><CODE>-0000.0000000000000000000000000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
<TD>&minus;0</TD>
</TR>
<TR>
<TD><CODE>-7F.000000</CODE></TD>
<TD><CODE>-3FF.0000000000000</CODE></TD>
<TD><CODE>-3FFF.0000000000000000000000000000</CODE></TD>
<TD>&minus;1</TD>
</TR>
<TR>
<TD><CODE>-80.000000</CODE></TD>
<TD><CODE>-400.0000000000000</CODE></TD>
<TD><CODE>-4000.0000000000000000000000000000</CODE></TD>
<TD>&minus;2</TD>
</TR>
<TR>
<TD><CODE>-FE.7FFFFF</CODE></TD>
<TD><CODE>-7FE.FFFFFFFFFFFFF</CODE></TD>
<TD><CODE>-7FFE.FFFFFFFFFFFFFFFFFFFFFFFFFFFF</CODE></TD>
<TD>minimum finite value</TD>
</TR>
<TR>
<TD><CODE>-FF.000000</CODE></TD>
<TD><CODE>-7FF.0000000000000</CODE></TD>
<TD><CODE>-7FFF.0000000000000000000000000000</CODE></TD>
<TD>&minus;infinity</TD>
</TR>
<TR><TD>&nbsp;</TD></TR>
<TR>
<TD><CODE>+00.xxxxxx</CODE></TD>
<TD><CODE>+000.xxxxxxxxxxxxx</CODE></TD>
<TD><CODE>+0000.xxxxxxxxxxxxxxxxxxxxxxxxxxxx</CODE></TD>
<TD>positive subnormals</TD>
</TR>
<TR>
<TD><CODE>+FF.xxxxxx</CODE></TD>
<TD><CODE>+7FF.xxxxxxxxxxxxx</CODE></TD>
<TD><CODE>+7FFF.xxxxxxxxxxxxxxxxxxxxxxxxxxxx</CODE></TD>
<TD>positive NaNs</TD>
</TR>
<TR>
<TD><CODE>-00.xxxxxx</CODE></TD>
<TD><CODE>-000.xxxxxxxxxxxxx</CODE></TD>
<TD><CODE>-0000.xxxxxxxxxxxxxxxxxxxxxxxxxxxx</CODE></TD>
<TD>negative subnormals</TD>
</TR>
<TR>
<TD><CODE>-FF.xxxxxx</CODE></TD>
<TD><CODE>-7FF.xxxxxxxxxxxxx</CODE></TD>
<TD><CODE>-7FFF.xxxxxxxxxxxxxxxxxxxxxxxxxxxx</CODE></TD>
<TD>negative NaNs</TD>
</TR>
</TABLE>
</BLOCKQUOTE>
</P>
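<P>
For the <NOBR>32-bit</NOBR> and <NOBR>64-bit</NOBR> formats, the written fields
can be reassembled into the raw bit pattern and handed to the host for
interpretation.
The sketch below does this with Python&rsquo;s standard <CODE>struct</CODE>
module.
The <CODE>FORMATS</CODE> table and the function <CODE>to_float</CODE> are
assumptions of this example, and the <NOBR>128-bit</NOBR> format is left out
because Python has no native type for it.
</P>

```python
import struct

# Per width: (fraction bits, struct code for the raw integer, code for the float).
FORMATS = {32: (23, ">I", ">f"), 64: (52, ">Q", ">d")}


def to_float(text, width):
    """Reassemble sign, exponent, and fraction fields into a host float."""
    frac_bits, int_code, float_code = FORMATS[width]
    sign = 1 if text[0] == "-" else 0
    exp_hex, sig_hex = text[1:].split(".")
    raw = (sign << (width - 1)) | (int(exp_hex, 16) << frac_bits) | int(sig_hex, 16)
    # Pack the raw bits and reinterpret the same bytes as a floating-point value.
    return struct.unpack(float_code, struct.pack(int_code, raw))[0]


print(to_float("+7F.000000", 32))           # 1.0
print(to_float("-400.0000000000000", 64))   # -2.0
```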

<P>
The <NOBR>80-bit</NOBR> double-extended-precision values are a little unusual
in that the leading bit of precision is not hidden as with other formats.
When canonically encoded, the leading significand bit of an <NOBR>80-bit</NOBR>
double-extended-precision value will be 0 if the value is zero or subnormal,
and will be 1 otherwise.
Hence, the same values listed above appear in <NOBR>80-bit</NOBR>
double-extended-precision as follows (note the leading <CODE>8</CODE> digit in
the significands):
<BLOCKQUOTE>
<TABLE CELLSPACING=0 CELLPADDING=0>
<TR>
 <TD><CODE>+0000.0000000000000000&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
 <TD>+0</TD>
</TR>
<TR><TD><CODE>+3FFF.8000000000000000</CODE></TD><TD>&nbsp;1</TD></TR>
<TR><TD><CODE>+4000.8000000000000000</CODE></TD><TD>&nbsp;2</TD></TR>
<TR>
 <TD><CODE>+7FFE.FFFFFFFFFFFFFFFF</CODE></TD>
 <TD>maximum finite value</TD>
</TR>
<TR><TD><CODE>+7FFF.8000000000000000</CODE></TD><TD>+infinity</TD></TR>
<TR><TD>&nbsp;</TD></TR>
<TR><TD><CODE>-0000.0000000000000000</CODE></TD><TD>&minus;0</TD></TR>
<TR><TD><CODE>-3FFF.8000000000000000</CODE></TD><TD>&minus;1</TD></TR>
<TR><TD><CODE>-4000.8000000000000000</CODE></TD><TD>&minus;2</TD></TR>
<TR>
 <TD><CODE>-7FFE.FFFFFFFFFFFFFFFF</CODE></TD>
 <TD>minimum finite value</TD>
</TR>
<TR><TD><CODE>-7FFF.8000000000000000</CODE></TD><TD>&minus;infinity</TD></TR>
</TABLE>
</BLOCKQUOTE>
</P>
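<P>
The rule for the explicit leading bit can be expressed as a small classifier.
The function <CODE>classify_extended</CODE> and its category labels, including
<CODE>"noncanonical"</CODE>, are inventions of this sketch.
How TestFloat itself treats encodings that break the rule is not covered here.
</P>

```python
def classify_extended(text):
    """Classify notation such as '+3FFF.8000000000000000' (80-bit format)."""
    exp_hex, sig_hex = text[1:].split(".")
    exp = int(exp_hex, 16)              # 15-bit encoded exponent
    sig = int(sig_hex, 16)              # 64-bit significand, leading bit explicit
    leading = sig >> 63
    fraction = sig & ((1 << 63) - 1)
    # Canonical encodings: leading bit 0 for zero/subnormal, 1 otherwise.
    if leading != (0 if exp == 0 else 1):
        return "noncanonical"
    if exp == 0:
        return "zero" if fraction == 0 else "subnormal"
    if exp == 0x7FFF:
        return "infinity" if fraction == 0 else "NaN"
    return "normal"


print(classify_extended("+3FFF.8000000000000000"))   # normal
print(classify_extended("-7FFF.8000000000000000"))   # infinity
print(classify_extended("+3FFF.0000000000000000"))   # noncanonical
```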

<P>
Lastly, exception flag values are represented by five characters, one character
per flag.
Each flag is written as either a letter or a period (<CODE>.</CODE>) according
to whether the flag was set or not by the operation.
A period indicates the flag was not set.
The letter used to indicate a set flag depends on the flag:
<BLOCKQUOTE>
<TABLE CELLSPACING=0 CELLPADDING=0>
<TR>
 <TD><CODE>v&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
 <TD><I>invalid</I> exception</TD>
</TR>
<TR>
 <TD><CODE>i</CODE></TD>
 <TD><I>infinite</I> exception (&ldquo;divide by zero&rdquo;)</TD>
</TR>
<TR><TD><CODE>o</CODE></TD><TD><I>overflow</I> exception</TD></TR>
<TR><TD><CODE>u</CODE></TD><TD><I>underflow</I> exception</TD></TR>
<TR><TD><CODE>x</CODE></TD><TD><I>inexact</I> exception</TD></TR>
</TABLE>
</BLOCKQUOTE>
For example, the notation <CODE>...ux</CODE> indicates that the
<I>underflow</I> and <I>inexact</I> exception flags were set and that the other
three flags (<I>invalid</I>, <I>infinite</I>, and <I>overflow</I>) were not
set.
The exception flags are always written following the value returned as the
result of the operation.
</P>
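<P>
A five-character flag field can be turned into a set of flag names as sketched
below.
The helper <CODE>decode_flags</CODE> is hypothetical, and it assumes the five
positions follow the order of the table above (<CODE>v</CODE>,
<CODE>i</CODE>, <CODE>o</CODE>, <CODE>u</CODE>, <CODE>x</CODE>), which is
consistent with the example <CODE>...ux</CODE>.
</P>

```python
FLAG_LETTERS = "vioux"
FLAG_NAMES = ("invalid", "infinite", "overflow", "underflow", "inexact")


def decode_flags(text):
    """Return the set of exception names raised in a field such as '...ux'."""
    if len(text) != len(FLAG_LETTERS):
        raise ValueError("expected exactly five flag characters")
    raised = set()
    for char, letter, name in zip(text, FLAG_LETTERS, FLAG_NAMES):
        if char == letter:
            raised.add(name)
        elif char != ".":
            raise ValueError("unexpected flag character: " + char)
    return raised


print(sorted(decode_flags("...ux")))   # ['inexact', 'underflow']
```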


<H2>8. Variations Allowed by the IEEE Floating-Point Standard</H2>

<P>
The IEEE Floating-Point Standard admits some variation among conforming
implementations.
Because TestFloat expects the two implementations being compared to deliver
bit-for-bit identical results under most circumstances, this leeway in the
standard can result in false errors being reported if the two implementations
do not make the same choices everywhere the standard provides an option.
</P>

<H3>8.1. Underflow</H3>

<P>
The standard specifies that the <I>underflow</I> exception flag is to be raised
when two conditions are met simultaneously:
<NOBR>(1) <I>tininess</I></NOBR> and <NOBR>(2) <I>loss of accuracy</I></NOBR>.
</P>

<P>
A result is tiny when its magnitude is nonzero yet smaller than any normalized
floating-point number.
The standard allows tininess to be determined either before or after a result
is rounded to the destination precision.
If tininess is detected before rounding, some borderline cases will be flagged
as underflows even though the result after rounding actually lies within the
normal floating-point range.
By detecting tininess after rounding, a system can avoid some unnecessary
signaling of underflow.
All the TestFloat programs support options <CODE>-tininessbefore</CODE> and
<CODE>-tininessafter</CODE> to control whether TestFloat expects tininess on
underflow to be detected before or after rounding.
One or the other is selected as the default when TestFloat is compiled, but
these command options allow the default to be overridden.
</P>
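<P>
A borderline case of this kind can be worked through with exact rational
arithmetic.
The sketch below is an illustration only, not TestFloat&rsquo;s own logic: the
function <CODE>tininess</CODE> and its parameters are assumptions, the defaults
describe <NOBR>32-bit</NOBR> single-precision (24 bits of precision, smallest
normal number 2<SUP>&minus;126</SUP>), rounding is to nearest-even, and only
positive results are handled.
</P>

```python
from fractions import Fraction


def tininess(exact, precision=24, min_exp=-126):
    """Return (tiny_before, tiny_after) for a positive exact result."""
    min_normal = Fraction(2) ** min_exp
    # Find the largest e with 2**e <= exact.
    e = exact.numerator.bit_length() - exact.denominator.bit_length()
    if Fraction(2) ** e > exact:
        e -= 1
    # Round to `precision` bits as if the exponent range were unbounded.
    ulp = Fraction(2) ** (e - (precision - 1))
    rounded = round(exact / ulp) * ulp   # round() on a Fraction is half-to-even
    return exact < min_normal, rounded < min_normal


# Just below the smallest normal number, but rounds up to it.
borderline = Fraction(2) ** -126 - Fraction(2) ** -151
print(tininess(borderline))   # (True, False)
```

<P>
For this input the rounded result is inexact, so an implementation detecting
tininess before rounding would raise <I>underflow</I> while one detecting it
after rounding would not.
A mismatch of that kind shows up as <CODE>...ux</CODE> against
<CODE>....x</CODE>.
</P>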

<P>
Loss of accuracy occurs when the subnormal format is not sufficient to
represent an underflowed result accurately.
The original 1985 version of the IEEE Standard allowed loss of accuracy to be
detected either as an <I>inexact result</I> or as a
<I>denormalization loss</I>;
however, few if any systems ever chose the latter.
The latest standard requires that loss of accuracy be detected as an inexact
result, and TestFloat can test only for this case.
</P>

<H3>8.2. NaNs</H3>

<P>
The IEEE Standard gives the floating-point formats a large number of NaN
encodings and specifies that NaNs are to be returned as results under certain
conditions.
However, the standard allows an implementation almost complete freedom over
<EM>which</EM> NaN to return in each situation.
</P>

<P>
By default, TestFloat does not check the bit patterns of NaN results.
When the result of an operation should be a NaN, any NaN is considered as good
as another.
This laxness can be overridden with the <CODE>-checkNaNs</CODE> option of
programs <CODE>testfloat_ver</CODE> and <CODE>testfloat</CODE>.
In order for this option to be sensible, TestFloat must have been compiled so
that its internal floating-point implementation (SoftFloat) generates the
proper NaN results for the system being tested.
</P>
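<P>
The difference between the two comparison modes can be sketched for
<NOBR>32-bit</NOBR> single-precision bit patterns.
The functions below are hypothetical and do not reproduce TestFloat&rsquo;s
source; the <CODE>check_nans</CODE> parameter merely plays the role that
<CODE>-checkNaNs</CODE> plays on the command line.
</P>

```python
def is_nan32(bits):
    """True if a 32-bit pattern encodes a NaN (exponent all 1s, fraction nonzero)."""
    return ((bits >> 23) & 0xFF) == 0xFF and (bits & 0x7FFFFF) != 0


def results_match(expected_bits, observed_bits, check_nans=False):
    """Compare two results, treating all NaNs as equal unless check_nans is set."""
    if not check_nans and is_nan32(expected_bits) and is_nan32(observed_bits):
        return True
    return expected_bits == observed_bits


# Two different NaN encodings: equal in the lax mode, different in the strict one.
print(results_match(0x7FC00000, 0xFFC00001))                    # True
print(results_match(0x7FC00000, 0xFFC00001, check_nans=True))   # False
```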

<H3>8.3. Conversions to Integer</H3>

<P>
Conversion of a floating-point value to an integer format will fail if the
source value is a NaN or if it is too large.
The IEEE Standard does not specify what value should be returned as the integer
result in these cases.
Moreover, according to the standard, the <I>invalid</I> exception can be raised
or an unspecified alternative mechanism may be used to signal such cases.
</P>

<P>
TestFloat assumes that conversions to integer will raise the <I>invalid</I>
exception if the source value cannot be rounded to a representable integer.
In such cases, TestFloat expects the result value to be the largest-magnitude
positive or negative integer or zero, as detailed earlier in
<NOBR>section 6.1</NOBR>, <I>Conversion Operations</I>.
If option <CODE>-checkInvInts</CODE> is selected with programs
<CODE>testfloat_ver</CODE> and <CODE>testfloat</CODE>, integer results of
invalid operations are checked for an exact match.
In order for this option to be sensible, TestFloat must have been compiled so
that its internal floating-point implementation (SoftFloat) generates the
proper integer results for the system being tested.
</P>


<H2>9. Contact Information</H2>

<P>
At the time of this writing, the most up-to-date information about TestFloat
and the latest release can be found at the Web page
<A HREF="http://www.jhauser.us/arithmetic/TestFloat.html"><NOBR><CODE>http://www.jhauser.us/arithmetic/TestFloat.html</CODE></NOBR></A>.
</P>


</BODY>
