CPFloat: Custom-Precision Floating-Point numbers

CPFloat is a C library for simulating low-precision floating-point arithmetics. CPFloat provides efficient routines for rounding, performing arithmetic operations, evaluating mathematical functions, and querying properties of the simulated low-precision format. Internally, numbers are stored in float or double arrays. The low-precision format (target format) follows an extension of the formats defined in the IEEE 754 standard [5] and is entirely specified by four parameters:

a positive integer p, which represents the number of digits of precision;
an integer e_min, which represents the minimum supported exponent;
a positive integer e_max, which represents the maximum supported exponent; and
a Boolean variable σ, set to true if subnormal numbers are supported and to false otherwise.

Valid choices of p, e_min, and e_max depend on the format in which the converted numbers are to be stored (storage format). A more extensive description of the characteristics of the low-precision formats that can be used, together with more details on admissible values for p, e_min, e_max, and σ can be found in [1].
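
To make the role of the four parameters concrete, the following sketch computes a few characteristic values of a target format. It is an illustration only: the type format_params and the three functions are hypothetical names, not part of the CPFloat API, and the formulas assume a radix-2, IEEE 754-style format in which p counts the implicit leading bit.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical container for the four parameters; not CPFloat's own type. */
typedef struct {
    int p;          /* digits of precision, including the implicit bit */
    int emin;       /* minimum supported exponent */
    int emax;       /* maximum supported exponent */
    bool subnormal; /* sigma: true if subnormal numbers are supported */
} format_params;

/* Largest finite value: 2^emax * (2 - 2^(1-p)). */
double largest_finite(format_params f) {
    assert(f.p >= 1);
    return ldexp(2.0 - ldexp(1.0, 1 - f.p), f.emax);
}

/* Smallest positive normal value: 2^emin. */
double smallest_normal(format_params f) {
    return ldexp(1.0, f.emin);
}

/* Smallest positive representable value: 2^(emin-p+1) if subnormals
   are supported, and 2^emin otherwise. */
double smallest_positive(format_params f) {
    assert(f.p >= 1);
    return f.subnormal ? ldexp(1.0, f.emin - f.p + 1)
                       : ldexp(1.0, f.emin);
}
```

For binary16 (p = 11, e_min = -14, e_max = 15, σ = true) these functions give 65504, 2^-14, and 2^-24, respectively.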

The library was originally intended as a faster version of the MATLAB function chop [2], which is available on GitHub. The latest versions of the library have a variety of subtle differences compared with chop.

Since 14 June 2022, chop supports specifying the function for generating random numbers. The MEX interface of CPFloat does not offer this capability, as the pseudo-random numbers used are generated in C and not in MATLAB.
Since v0.6.0, CPFloat allows users to specify e_min and e_max separately. In earlier versions, users could only specify e_max, and e_min was set to 1 – e_max.
Since v0.6.0, the default 8-bit format E4M3 has e_max = 8 and e_min = –6, which is consistent with the homonymous format in the December 2023 revision of the OCP 8-bit Floating Point Specification (OFP8) [3]. In chop, e_max = 7 and e_min = –6.
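
The arithmetic behind the last two points can be checked with a short sketch. The function names below are hypothetical and are used for illustration only. Under the earlier convention e_min = 1 – e_max, the chop choice e_max = 7 yields e_min = –6, whereas e_max = 8 would force e_min = –7, so the E4M3 pair (e_min, e_max) = (–6, 8) requires the two exponents to be set independently.

```c
#include <assert.h>

/* Convention used before v0.6.0: e_min is tied to e_max. */
int legacy_emin(int emax) {
    assert(emax >= 1);
    return 1 - emax;
}

/* Returns 1 if the pair (emin, emax) satisfies the old convention,
   and 0 if it needs e_min and e_max to be specified separately. */
int expressible_with_legacy_rule(int emin, int emax) {
    return emin == legacy_emin(emax);
}
```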

The code to reproduce the results of the tests in [1] is available on GitHub.

Dependencies

The only (optional) dependency of CPFloat is the C implementation of the PCG Library, which provides a variety of high-quality pseudo-random number generators. For an in-depth discussion of the algorithms underlying the PCG Library, we recommend the paper by Melissa O’Neill [4]. If the header file include/pcg-c/include/pcg_variants.h is not included at compile time with the --include option, then CPFloat falls back on the default C pseudo-random number generator.

The PCG Library is free software (see the Licensing information below), and its generators are more efficient, reliable, and flexible than any combination of the functions srand, rand, and rand_r from the C standard library. A warning is issued at compile time if the location of pcg_variants.h is not specified correctly.
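
The fallback mechanism can be pictured with the sketch below, which selects a generator at compile time. This is not CPFloat's source code: all names starting with sketch_ are hypothetical, and the code assumes that pcg_variants.h defines the include guard PCG_VARIANTS_H_INCLUDED and provides pcg32_random_t, pcg32_srandom_r, and pcg32_random_r.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#ifdef PCG_VARIANTS_H_INCLUDED
/* pcg_variants.h was included on the command line: use a PCG generator. */
typedef pcg32_random_t sketch_rng;
static void sketch_seed(sketch_rng *rng, uint64_t seed) {
    pcg32_srandom_r(rng, seed, 1u);
}
static uint32_t sketch_next(sketch_rng *rng) {
    return pcg32_random_r(rng);
}
#define SKETCH_RNG_MAX UINT32_MAX
#else
/* Fallback: the generator of the C standard library (global state). */
typedef unsigned int sketch_rng;
static void sketch_seed(sketch_rng *rng, uint64_t seed) {
    *rng = (unsigned int)seed;
    srand(*rng);
}
static uint32_t sketch_next(sketch_rng *rng) {
    (void)rng; /* rand() keeps its state internally */
    return (uint32_t)rand();
}
#define SKETCH_RNG_MAX RAND_MAX
#endif

/* Uniform sample in [0, 1), whichever generator was selected. */
double sketch_uniform01(sketch_rng *rng) {
    assert(rng != NULL);
    return sketch_next(rng) / ((double)SKETCH_RNG_MAX + 1.0);
}
```

Compiled without the --include option, this translation unit uses srand and rand; when the PCG header is included first, the same calling code would use the PCG generator instead.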

Compiling the MEX interface requires a reasonably recent version of MATLAB or Octave.

Developer dependencies

Testing the MEX interface requires the function float_params, which is available on GitHub. The unit tests for the C implementation in test/cpfloat_test.ts require the check unit testing framework for C, including the checkmk script, and the subunit protocol.

Installation

No installation is needed in order to use CPFloat as a header-only library. The shared and static libraries can be built with

make lib

If the compilation is successful, the header and library files of CPFloat will be located in the build/include and build/lib directories, respectively. By default, make lib does not run autotuning.

To rebuild the libraries after running autotuning explicitly, use

make lib-autotuned

The library can be installed in <path> with

make install --prefix=<path>

which copies the header and library files to <path>/include and <path>/lib, respectively. The default value of <path>, which is used if the --prefix option is not supplied, is /usr/local.

MEX interface

The MEX interface can be compiled automatically with either

make mexmat # Compile MEX interface for MATLAB.

or

make mexoct # Compile MEX interface for Octave.

These two commands compile and autotune the MEX interface in MATLAB and Octave, respectively, by using the functions mex/cpfloat_compile.m and mex/cpfloat_autotune.m. To use the interface, the bin/ folder must be in MATLAB’s search path.

On a system where the make build automation tool is not available, we recommend building the MEX interface by running the script cpfloat_compile_nomake.m in the mex/ folder. The script attempts to compile and autotune the MEX interface using the default C compiler. The following code will download the repository as a ZIP file, inflate it, and try to compile it:

zip_url = 'https://codeload.github.com/north-numerical-computing/cpfloat/zip/refs/heads/main';
unzip(zip_url);
movefile('cpfloat-main', 'cpfloat')
cd('cpfloat/mex');
cpfloat_compile_nomake

A different compiler can be used by setting the value of the variable compilerpath appropriately. If the chosen compiler does not support OpenMP, only the sequential version of the algorithm will be produced and no autotuning will take place.

On Windows, we have not been able to compile the PCG Library using the C compiler recommended by MATLAB. Therefore, the script uses the pseudo-random number generator in the C standard library by default.

Autotuning

CPFloat provides a sequential and a parallel implementation of the rounding functions. OpenMP introduces some overhead, and using a single thread is typically faster for arrays with few elements. Therefore, the library provides a facility to switch between the single-threaded and the multi-threaded variants automatically, depending on the size of the input. The threshold is machine-dependent, and the best value for a given system can be found by invoking

make autotune

which compiles the file src/cpfloat_autotune.c, runs it, and updates the files src/cpfloat_threshold_binary32.h and src/cpfloat_threshold_binary64.h.
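
The size-based switch described above can be sketched as follows. The code is an illustration under stated assumptions and not the library's implementation: round_array, truncate_to_precision, and the threshold argument are hypothetical names, and a simple truncation kernel (round toward zero to p significant bits, ignoring the exponent range of the target format) stands in for CPFloat's rounding routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Truncate a binary64 value to p significant bits by clearing the
   trailing 53 - p bits of its significand (round toward zero). */
static double truncate_to_precision(double x, int p) {
    assert(p >= 1 && p <= 53);
    if (x != x) return x; /* leave NaN untouched */
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits &= ~((UINT64_C(1) << (53 - p)) - 1);
    memcpy(&x, &bits, sizeof x);
    return x;
}

static void round_sequential(double *y, const double *x, size_t n, int p) {
    for (size_t i = 0; i < n; i++)
        y[i] = truncate_to_precision(x[i], p);
}

static void round_parallel(double *y, const double *x, size_t n, int p) {
    /* Without OpenMP support the pragma is ignored and the loop is serial. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++)
        y[i] = truncate_to_precision(x[i], p);
}

/* Use a single thread for short arrays and several threads otherwise.
   The threshold plays the role of the value found by autotuning. */
void round_array(double *y, const double *x, size_t n, int p,
                 size_t threshold) {
    if (n < threshold)
        round_sequential(y, x, n, p);
    else
        round_parallel(y, x, n, p);
}
```

With GCC, the parallel branch is only multi-threaded when the code is compiled with -fopenmp. Both branches produce the same output, so the threshold affects performance but not the results.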

Documentation

The documentation of CPFloat can be generated with the command

make docs

which relies on Doxygen to format the Javadoc-style comments in the source files, and on Sphinx, with the Breathe and Exhale extensions, to generate the HTML version of the documentation that can be found in the docs/html/ directory.

Using CPFloat

CPFloat can be used as a header-only, shared, or static library. Examples for these three scenarios can be found in the Makefile (cf. targets $(BINDIR)cpfloat_test, $(BINDIR)libcpfloat_shared_test, and $(BINDIR)libcpfloat_static_test, respectively). Here we provide a brief summary.

Header-only library. The only requirement is that the files in the src/ directory be in the include path of the compiler. In order to use the PCG Library, one can either:
- specify the path of the file pcg_variants.h using the preprocessor option --include (see the variable CFLAGS in the Makefile for an example); or
- make sure that pcg_variants.h is in the include path and uncomment the preprocessor instruction on line 34 of src/cpfloat_definitions.h, that is, /* #include "pcg_variants.h" */.

In either case, it is necessary to link the executable against the pcg-random library, which can be done by passing the option -lpcg-random to the linker. The library libpcg-random.a must be in the load path.

Shared library. The five header files in the build/include directory must be in the include path of the compiler. The options -lcpfloat and -lm must be passed to the linker, and the libraries libcpfloat.so and libm.so must be in the load path.
Static library. The static library uses the same five header files as the shared library, which are located in the build/include directory and must be in the include path of the compiler. Executables must be linked with the -static and -lcpfloat options, and the library file libcpfloat.a must be in the load path. Linking against the math library is not needed in this case.

Code validation

The test/ directory contains two sets of tests, one for the C library and one for the MEX interface. The unit tests for the C implementation require the check library, and can be run with

make ctest

for the header-only library or with

make libtest

for the shared and static libraries. The two commands use the same batch of unit tests, which is generated from the file test/cpfloat_test.ts using the checkmk script.

The MEX interface can be tested by using either

make mtest # Test MEX interface using MATLAB.

or

make otest # Test MEX interface using Octave.

These two commands run, in MATLAB and Octave respectively, the function test/cpfloat_test.m. This set of tests is based on the MATLAB script test_chop.m, available on GitHub: some changes were necessary in order to make it compatible with Octave.

References

[1] Massimiliano Fasi and Mantas Mikaitis. CPFloat: A C library for simulating low-precision arithmetic. ACM Trans. Math. Softw., 49(2), Article No.: 18, June 2023.

[2] Nicholas J. Higham and Srikara Pranesh. Simulating Low Precision Floating-Point Arithmetic. SIAM J. Sci. Comput., 41, C585–C602, 2019.

[3] Paulius Micikevicius, Stuart Oberman, Pradeep Dubey, Marius Cornea, Andres Rodriguez, Ian Bratt, Richard Grisenthwaite, Norm Jouppi, Chiachen Chou, Amber Huffman, Michael Schulte, Ralph Wittig, Dharmesh Jani, Summer Deng. OCP 8-bit Floating Point Specification (OFP8), pp. 1–16, Revision 1.0, Open Compute Project, June 2023. Revised December 2023.

[4] Melissa E. O’Neill. PCG: A family of simple fast space-efficient statistically good algorithms for random number generation. Technical report HMC-CS-2014-0905, Harvey Mudd College, Claremont, CA, September 2014.

[5] 754-2019 IEEE Standard for Floating-Point Arithmetic, pp. 1–84, Institute of Electrical and Electronics Engineers, July 2019. Revision of IEEE Std 754-2008.

Acknowledgements

The library was written by Massimiliano Fasi and Mantas Mikaitis. We thank Nicolas Louvet, Theo Mary, Ian McInerney, and Siegfried Rump for reporting bugs and suggesting improvements.

Licensing information

CPFloat is distributed under the GNU Lesser General Public License, Version 2.1 or later (see LICENSE.md). Please contact us if you would like to use CPFloat in an open source project distributed under the terms of a license that is incompatible with the GNU LGPL. We might be able to relicense the software for you.

The PCG Library is distributed under the terms of either the Apache License, Version 2.0 or the Expat License, at the option of the user.

The MATLAB function float_params is distributed under the terms of the BSD 2-Clause “Simplified” License.

The MATLAB function chop is distributed under the terms of the BSD 2-Clause “Simplified” License.