floating-point

(programming, mathematics)
A number representation consisting of a mantissa, M, an exponent, E, and a radix (or "base"), R. The number represented is M*R^E.
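
As a rough illustration of M*R^E with a radix of two, the sketch below uses Python's standard `math.frexp`, which returns a mantissa in the range [0.5, 1) together with an integer exponent. The helper name `decompose` and the constant `RADIX` are hypothetical, chosen only for this example, and the [0.5, 1) mantissa range is Python's convention rather than part of the definition.

```python
import math

RADIX = 2  # assumption for this sketch: binary radix


def decompose(x: float) -> tuple[float, int]:
    """Split x into (mantissa, exponent) such that x == mantissa * 2**exponent.

    math.frexp normalises the mantissa so that 0.5 <= |mantissa| < 1
    for any non-zero x.
    """
    return math.frexp(x)


m, e = decompose(93_000_000.0)

# Rebuild the number from its parts: M * R^E.
reconstructed = m * RADIX ** e

print(m, e, reconstructed)
```

Scaling by a power of two only changes the exponent, so the reconstruction here is exact.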

In science and engineering, exponential notation or scientific notation uses a radix of ten so, for example, the number 93,000,000 might be written 9.3 x 10^7 (where ^7 is superscript 7).
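
The same decomposition with a radix of ten can be sketched with Python's standard `decimal` module. The function name `to_scientific` is hypothetical; it returns a mantissa with one digit before the decimal point, which is the usual convention for scientific notation.

```python
from decimal import Decimal


def to_scientific(n: int) -> tuple[Decimal, int]:
    """Return (mantissa, exponent) with radix 10 so that n == mantissa * 10**exponent."""
    d = Decimal(n)
    # adjusted() gives the exponent of the most significant digit.
    exponent = d.adjusted()
    # Shift the decimal point, then drop trailing zeros from the mantissa.
    mantissa = d.scaleb(-exponent).normalize()
    return mantissa, exponent


mantissa, exponent = to_scientific(93_000_000)
print(f"{mantissa} x 10^{exponent}")  # prints "9.3 x 10^7"
```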

In computer hardware, floating-point numbers are usually represented with a radix of two, since the mantissa and exponent are stored in binary, though many different representations could be used. IEEE 754 specifies a standard representation which is used by many hardware floating-point systems. Non-zero numbers are normalised so that the binary point is immediately after the most significant bit of the mantissa. Since the number is non-zero, this bit must be a one, so it need not be stored. A fixed "bias" is added to the exponent so that positive and negative exponents can be represented without a sign bit. Finally, the extreme values of the exponent field (all zeros and all ones) are reserved for special values: all zeros encodes zero and very small unnormalised ("subnormal") numbers, and all ones encodes positive and negative infinity and NaN ("not a number").
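
The layout described above can be inspected with a minimal sketch, assuming the IEEE 754 32-bit single-precision format (1 sign bit, 8 exponent bits with a bias of 127, and 23 stored mantissa bits, with the implicit leading one sitting just left of the binary point). The function names `decode_float32` and `value_of` are hypothetical; only the standard `struct` module is used.

```python
import struct

BIAS = 127           # exponent bias for single precision
FRACTION_BITS = 23   # stored mantissa bits (the leading one is implicit)


def decode_float32(x: float) -> tuple[int, int, int]:
    """Return the (sign, biased exponent, stored fraction) fields of x as a float32."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> FRACTION_BITS) & 0xFF
    fraction = bits & ((1 << FRACTION_BITS) - 1)
    return sign, exponent, fraction


def value_of(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild the numeric value from the three fields."""
    s = -1.0 if sign else 1.0
    if exponent == 0xFF:
        # All ones: infinity if the fraction is zero, otherwise NaN.
        return s * float("inf") if fraction == 0 else float("nan")
    if exponent == 0:
        # All zeros: zero or a subnormal number (no implicit leading one).
        return s * fraction * 2.0 ** (1 - BIAS - FRACTION_BITS)
    # Normal number: restore the implicit leading one and remove the bias.
    return s * (1 + fraction / 2 ** FRACTION_BITS) * 2.0 ** (exponent - BIAS)


for x in (1.0, -6.5, 0.0, float("inf")):
    print(x, decode_float32(x))
```

For example, -6.5 is 1.625 * 2^2, so its stored exponent is 2 + 127 = 129 and its stored fraction holds only the .625 part.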

See also floating-point accelerator, floating-point unit.

Opposite: fixed-point.