# JSON's Numeric Boundaries: The Lesser-Known Reality of Inaccurate Figures

## Exploring the Complexities of IEEE 754 Floating Point Arithmetic in JSON Numbers

## Introduction

**JSON** (JavaScript Object Notation) has become a cornerstone in the world of data exchange, particularly in web applications. Its appeal lies in its simplicity and readability, making it the preferred choice for developers worldwide. However, when it comes to dealing with numbers, **JSON's** approach is somewhat less straightforward than it initially appears.

Guess what happens when you run this **JavaScript** code?

```
const x = 9223372036854775807
console.log(x)
```

You might be in for an amazing surprise!

### JSON Specification

The **JSON** standard, as defined in RFC 8259, describes **Numbers** as follows:

A number is represented in base 10 using decimal digits. It contains an integer component that may be prefixed with an optional minus sign, which may be followed by a fraction part and/or an exponent part. Leading zeros are not allowed. .... This specification allows implementations to set limits on the range and precision of numbers accepted.

This simplicity can lead to **inconsistencies** when exchanging data across different programming languages. Each language interprets numbers slightly differently, which can affect both the **size** and *precision* of these values. It's like translating the same phrase into multiple languages - the core idea remains, but the details might vary slightly from one language to another.

For instance, a number that is perfectly valid and accurately represented in one language could be misinterpreted or result in **precision errors** in another. This is particularly relevant when dealing with large numbers or numbers that require high precision, such as **financial calculations**.

### The pitfall

A classic example is the interaction between a **Go** backend and **JavaScript** frontend using **JSON** as the data interchange format.

In **Go**, an `int64` type is often used to handle large integers, accommodating a wide range of values safely. However, when this `int64` number is passed to a **JavaScript** frontend via **JSON**, the waters get murky. **JavaScript** represents numbers using the **IEEE 754 double-precision floating-point format**. This format cannot represent every integer exactly once values exceed `Number.MAX_SAFE_INTEGER`, which is `2^53 - 1`.

To illustrate the potential pitfalls of **JSON's** numeric handling, let's consider a concrete example involving a large number. Imagine you have a Go backend that needs to send a large integer value to a **JavaScript** frontend. The **JSON** representation of this data might look like this:

```
{ "bigNumber": 9223372036854775807 }
```

This number is the maximum integer that can be represented by an `int64` in **Go**. When this **JSON** is processed by a **JavaScript** frontend, issues can arise. Let's look at what happens when this **JSON** is parsed in **JavaScript**:

```
const jsonData = '{"bigNumber": 9223372036854775807}';
const parsedData = JSON.parse(jsonData);
// 9223372036854776000
console.log(parsedData.bigNumber);
```

In **JavaScript**, due to its handling of numbers as **IEEE 754 double-precision floating-point values**, the `bigNumber` did not retain its original precision. The `9223372036854775807` sent by the backend is parsed imprecisely and printed as `9223372036854776000`.
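A quick way to detect this kind of silent loss is the standard `Number.isSafeInteger` check. The sketch below reuses the payload from above; the variable names are illustrative only.

```javascript
// Same payload as above: the max int64 value, as a hypothetical Go backend would emit it.
const payload = '{"bigNumber": 9223372036854775807}';
const parsed = JSON.parse(payload);

// Number.isSafeInteger reports whether an integer lies within +/-(2^53 - 1),
// the range in which every integer has an exact double representation.
const isSafe = Number.isSafeInteger(parsed.bigNumber);
console.log(isSafe); // false

// The value actually stored is the nearest double, 2^63,
// which is one more than the int64 maximum that was sent.
const equalsTwoTo63 = parsed.bigNumber === 2 ** 63;
console.log(equalsTwoTo63); // true
```

A check like this only tells you that precision may have been lost; it cannot recover the original digits.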

### Mechanics of IEEE 754

**IEEE 754** is the technical standard that defines floating-point formats and arithmetic.

In the standard for double precision (64-bit format), floating-point numbers are represented in three parts:

- **S - Sign Bit (1 bit):** Indicates whether the number is positive (0) or negative (1).
- **E - Exponent (11 bits):** Determines the range of the number, with a **bias of 1023**. This **bias is subtracted** from the stored exponent to obtain the **actual exponent** value.
- **M - Mantissa (52 bits):** Represents the precision of the number. There's an implied leading 1 in the mantissa for normalized numbers, which is not stored explicitly. Also known as the significand.

The formula to represent a number in this format is:

$$(-1)^S \times 1.M \times 2^{E-1023}$$
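To make the three fields concrete, here is a minimal sketch that pulls them out of a JavaScript number and then rebuilds the value with the formula above. `decodeDouble` and `encodeFromFields` are hypothetical helper names, not built-ins; `DataView` and `BigInt` are standard.

```javascript
// Split a JavaScript number into its IEEE 754 double-precision fields.
function decodeDouble(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value);          // big-endian by default
  const bits = view.getBigUint64(0);  // the raw 64 bits as a BigInt

  const sign = Number(bits >> 63n);                // S: 1 bit
  const exponent = Number((bits >> 52n) & 0x7ffn); // E: 11 bits, still biased
  const mantissa = bits & 0xfffffffffffffn;        // M: 52 bits, kept as a BigInt

  return { sign, exponent, mantissa };
}

// Rebuild the value using (-1)^S * 1.M * 2^(E - 1023).
// Valid for normalized numbers only (E between 1 and 2046).
function encodeFromFields({ sign, exponent, mantissa }) {
  const significand = 1 + Number(mantissa) / 2 ** 52; // implied leading 1 plus the fraction
  return (-1) ** sign * significand * 2 ** (exponent - 1023);
}

const fields = decodeDouble(-6.5); // -6.5 = -1.625 * 2^2
console.log(fields.sign);              // 1
console.log(fields.exponent);          // 1025
console.log(encodeFromFields(fields)); // -6.5
```

The mantissa is kept as a `BigInt` so that none of its 52 bits are touched before the final conversion.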

The Exponent in a floating-point number is somewhat like a slider that enables the representation of either extremely large or incredibly small numbers. With the 11-bit exponent in the IEEE 754 double-precision format, you can cover a vast range of magnitudes, from exceptionally large to minuscule numbers.

However, there's a catch: precision. **Precision** is akin to the level of detail a number can possess. The mantissa always has 52 bits, so a double carries a fixed number of significant digits (roughly 15 to 17 in decimal). As the exponent grows to represent a larger number, each step of the mantissa covers a wider gap, so the absolute detail available for that number decreases.

The same limit applies to extremely small numbers. The IEEE 754 format can represent numbers incredibly close to zero, but they still carry only the same 52 bits of mantissa, so any digits beyond that are lost. For the very smallest (subnormal) values, below about 2.2e-308, even fewer significant bits are available.
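One way to see this trade-off is to measure the gap between a number and the next representable double at different magnitudes. The sketch below does so by incrementing the raw bit pattern; `gapAbove` is a hypothetical helper and assumes a positive, finite input.

```javascript
// Distance from x to the next larger double, found by bumping the raw bit pattern by one.
// Assumes x is positive and finite.
function gapAbove(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);
  view.setBigUint64(0, view.getBigUint64(0) + 1n); // next bit pattern = next double
  return view.getFloat64(0) - x;
}

console.log(gapAbove(1) === Number.EPSILON);    // true: the gap at 1 is 2^-52
console.log(gapAbove(2 ** 52) === 1);           // true: only whole numbers from here on
console.log(gapAbove(2 ** 53) === 2);           // true: only even numbers from here on
console.log(gapAbove(2 ** 63) === 2048);        // true: int64-sized values are 2048 apart
console.log(gapAbove(2 ** -100) === 2 ** -152); // true: tiny numbers have tiny gaps
```

In each case the gap is the number's power of two scaled by `2^-52`, which is another way of saying that the relative precision stays fixed while the absolute precision changes with magnitude.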

For example, here is the value `1` in IEEE 754 format:

```
// For brevity, only the first and last 4 bits of the mantissa are shown
| sign | exponent | mantissa |
0 01111111111 0000...0000
= 1
```

The sign bit `0` indicates that it is a positive number. The exponent bits `01111111111` (1023 in decimal) represent an actual exponent of `0` after subtracting the bias of 1023. The mantissa `0000...0000` represents a value of `1.0` once the implied leading 1 is added.

Let's take a look at the `Number.MAX_SAFE_INTEGER` value in IEEE 754 format:

```
| sign | exponent | mantissa |
0 10000110011 1111...1111
= 9007199254740991 (Number.MAX_SAFE_INTEGER)
```

The exponent value `10000110011` represents an actual exponent of `52` (1075 - 1023). The mantissa `1111...1111` (all bits set) represents the binary value `1.1111111111111111111111111111111111111111111111111111`.

Using the formula presented earlier, we can decode the IEEE 754 representation and calculate the actual value of `Number.MAX_SAFE_INTEGER`:

$$1.1111111111111111111111111111111111111111111111111111 \times 2^{52}$$
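To see why this product is exactly `Number.MAX_SAFE_INTEGER`, it helps to note that a binary `1.` followed by 52 ones is just short of 2 by one unit in the last place, that is $2 - 2^{-52}$. A short worked derivation:

$$1.\underbrace{11\ldots1}_{52\ \text{ones}} \times 2^{52} = \left(2 - 2^{-52}\right) \times 2^{52} = 2^{53} - 1 = 9007199254740991$$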

Notice that all 52 bits of the mantissa are set. So what happens if we want to represent the value `Number.MAX_SAFE_INTEGER + 1` in IEEE 754 format?

Let's add `1` to the previous value of `Number.MAX_SAFE_INTEGER` in IEEE 754 format. All we do is add 1 to the mantissa, and we arrive at the following:

```
| sign | exponent | mantissa |
0 10000110100 0000...0000
= 9007199254740992 (Number.MAX_SAFE_INTEGER + 1)
```

Observe how the bit is carried forward and the exponent becomes `10000110100`, with an actual exponent value of `53` (1076 - 1023). Calculating `2^53` is straightforward, and we arrive at the value of `9007199254740992` since the mantissa is `1.0`.

The next representable number in **IEEE 754** format is `9007199254740994`, with the binary format as follows:

```
| sign | exponent | mantissa |
0 10000110100 0000...0001
= 9007199254740994 (Number.MAX_SAFE_INTEGER + 3)
```

Now you might ask, what has changed? Look closely at the exponent: it has increased from 52 to 53, which doubles the weight of every mantissa bit. The lowest mantissa bit is now worth 2 instead of 1, so consecutive representable values are 2 apart, and we've lost the ability to represent `Number.MAX_SAFE_INTEGER + 2` (`9007199254740993`).
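The same conclusion can be written out with the formula from earlier. Here $m$ stands for the 52 mantissa bits read as an unsigned integer, a symbol introduced only for this sketch:

$$\left(1 + \frac{m}{2^{52}}\right) \times 2^{53} = 2^{53} + 2m$$

With $m = 0$ this gives `9007199254740992`, and with $m = 1$ it gives `9007199254740994`. No choice of $m$ produces an odd result, so every representable value at this exponent is even.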

We can verify this by running the following code in **JavaScript**:

```
// 9007199254740991
console.log(Number.MAX_SAFE_INTEGER);
// 9007199254740992
console.log(Number.MAX_SAFE_INTEGER + 1);
// 9007199254740992 (not 9007199254740993)
console.log(Number.MAX_SAFE_INTEGER + 2);
// 9007199254740994
console.log(Number.MAX_SAFE_INTEGER + 3);
```
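The bit layouts shown in the tables above can also be reproduced directly. This is a small sketch using the standard `DataView` API; `toBits` is a hypothetical helper name.

```javascript
// Format a number as "sign exponent mantissa" bit strings (IEEE 754 double precision).
function toBits(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value);
  // Read the raw 64 bits and left-pad so leading zeros are kept.
  const bits = view.getBigUint64(0).toString(2).padStart(64, "0");
  return `${bits.slice(0, 1)} ${bits.slice(1, 12)} ${bits.slice(12)}`;
}

console.log(toBits(1));                           // exponent 01111111111, mantissa all zeros
console.log(toBits(Number.MAX_SAFE_INTEGER));     // exponent 10000110011, mantissa all ones
console.log(toBits(Number.MAX_SAFE_INTEGER + 1)); // exponent 10000110100, mantissa all zeros
console.log(toBits(Number.MAX_SAFE_INTEGER + 3)); // exponent 10000110100, mantissa ends in 1
```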

I've found IEEE-754 Analysis by Dr. Christopher to be useful in visualising the effect of the exponent on precision.

### Conclusion

While **JSON** simplifies data exchange with its straightforward number representation, it also brings challenges, particularly in cross-language data handling. The key issue arises from different programming languages interpreting numeric values differently, affecting their size and precision. This is evident in interactions between Go and JavaScript, where **Go's** `int64` can lose precision in **JavaScript** due to its **IEEE 754 double-precision floating-point format**.

Understanding the limitations of **IEEE 754** is crucial, especially for **high-precision needs like financial computations**. Developers must be aware of these differences and consider alternative strategies, such as **string representations for large numbers** or specialized libraries, to ensure data integrity.
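As a rough illustration of the string approach, the sketch below assumes the backend sends the value as a JSON string instead of a JSON number (in Go, the `string` option of the `encoding/json` struct tag is one way to do that for integer fields). The payload and variable names are hypothetical.

```javascript
// Assumption: the backend encodes the int64 as a JSON string, not a JSON number.
const safePayload = '{"bigNumber": "9223372036854775807"}';
const data = JSON.parse(safePayload);

// BigInt holds arbitrarily large integers exactly, so no digits are lost.
const big = BigInt(data.bigNumber);
console.log(big.toString());        // 9223372036854775807
console.log((big - 1n).toString()); // 9223372036854775806
```

Note that `JSON.stringify` throws a `TypeError` on `BigInt` values, so a value like this needs to be converted back with `toString()` before it is serialized again.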

Ultimately, this highlights the importance of a nuanced understanding of the tools and languages we use, emphasizing precision and accuracy in software engineering.