Numberphile videos often "nerd snipe" me, and it's happened again. If I had tagging on my posts I'd be able to provide a link here to all my Numberphile-related posts. Maybe someday. I tried to make this Numberphile diversion a super quick one.
The Numberphile video in question here is a few years old. It's titled "2.920050977316", and it introduces the "Buenos Aires Constant," which is a decimal number that generates the sequence of primes.
This constant was first described by four Numberphile viewers in a (paywalled) formal paper. The paper was "Received Sep 16th 2017, Accepted May 29th 2018, Published online: Jan 30th 2019" and the Numberphile video was released Nov. 26th 2020, so it took quite a while to surface for public discussion.
The video demonstrated using the constant to generate primes, and how to calculate the constant given the primes. What they didn't discuss, however, is how many primes can be generated for some number of decimal places. I wanted to find that out, and also to calculate the constant for myself, so I fired up the text editor and wrote some Python code.
To use the Buenos Aires Constant to calculate the series of primes:
For example, we start the sequence with:
...and find the next prime with:
...and find the following prime with:
...and so on.
In the Numberphile video, Dr. Grime demonstrated the method for using the series of primes to calculate a value for this constant. I instead wanted to use a simple algorithm to find it. And I wanted to write the initial version myself, without using AI. (I guess we're now at the point where programming projects might be assumed to involve AI unless the programmer says otherwise.)
For my algorithm, I figured something like a binary search would work, where we repeatedly halve the search space between two decimal values. Based on how many primes the high, middle, and low values each produce, we can select either the high and middle or the middle and low as our new "high" and "low" values, and repeat this process to add decimal digits of precision and approach the value for the constant.
Python has arbitrary precision integers, which is great, but it does not have arbitrary precision "decimal" numbers (real numbers in math lingo). To solve this, we multiply our decimal values by some large mult factor to create integers, perform integer math, and finally divide our calculation results by that factor to end up with a decimal value. When we calculate the middle values for the binary search, we may require an additional digit of precision, so mult can be increased at that point.
I initially was using a lot of extra digits of precision, put Claude Opus 4.8 clarified for me that we only need to add a digit of precision when the "high" and "low" values differ by one. I eventually figured out why that might make sense: for the binary math behind Python's integers, two odd numbers or two even numbers will result in the least significant bit being 0 when they're added. When that least significant bit is 0, we can divide the sum by two (shifting right by one bit) without any loss of precision. We only need to increase the precision when the least significant bit of the sum is 1. That allows us to perform integer division by two to find the midpoint.
This binary search algorithm is a recursive algorithm, but I knew we'd be invoking the recursion many (millions?) times. So I structured the script as a simple loop with just a Python list to track the stack state. When we split the "high" and "low" bounds into two new pairs of bounds, we can append both to the stack, and pop the stack at the top of the loop to continue the algorithm.
For the first version of the script I wrote, I didn't discard any stack frames. Often, the high, middle, and low points would all calculate the same number of primes, so it's impossible to tell which one (the high-middle or the middle-low) contains the true value of the constant. I was able to calculate a few digits of the constant with this approach, but obviously the stack got huge and the script never really got anywhere.
After a quick consultation with Claude Opus 4.8, I added two lines to the script. We can discard a stack frame (the high-middle and/or the middle-low) if both ends of the range find fewer primes than the best known value. That one change allowed the script to instantly run much much faster.
I've published the script on GitHub.
I used the Python script to output the constant at a few points. Notice the final digits in the constant values shown here are incorrect.
The first 5 primes can be calculated with 3 decimal places:
10 primes with 8 decimal places:
25 primes with 34 decimal places:
50 primes with 89 decimal places:
100 primes with 217 decimal places:
...and so on.
Let's plot a few things to see how these grow with the number of primes:
I was interested in whether the Buenos Aires Constant is somehow a more "compressed" representation of the sequence of primes. For example, we can use "1,2,2,4,2,4,2,4,6" (a list of differences between primes) to recreate the prime sequence "2,3,5,7,11,13,17,19,23,29". That's 17 characters of information to recreate 25. Using the constant, these first 10 primes can be computed with only 8 (or 10 depending on what you're counting) digits/characters: 2.92005098.
8 or 10 characters is fewer than 17! So is the constant really able to generate the sequence of primes using less information than is contained in the primes themselves? No. While this is true for the first 10 primes, if we look at longer sequences of primes the constant soon overtakes the differences list in required length. (The constant does require fewer digits than the primes themselves, however, which is kind of interesting.)
If we look at the plot above, for 89 primes or fewer, they are very close but the Buenos Aires Constant does require fewer digits than the differences list. So that's pretty cool. For more than 89 primes however, the constant's digit count grows at a faster rate. From 500 primes to 750 primes, the constant digit count grows at rate of 3.66 digits/chars per prime, while the "differences list with separator" grows at a rate of 2.37 digits/chars per prime.
Yes, we have another unrelated Numberphile video, released 4 years later in 2024, called "The Prime Constant".
This separate "Prime Constant" is a bit more straightforward. It's just a base 10 representation of the binary number created by setting the Nth binary digit to 0 if the Nth integer is composite, or to 1 if the Nth integer is prime.
Here are some examples:
Encoding the first 5 primes as 1s, in binary and decimal:
the first 10 primes:
the first 25 primes:
the first 50 primes:
the first 100 primes:
As you can see in the above plot, the "compression" rate of this "Prime Constant" number is negative. It requires more digits than the primes themselves. From 500 primes to 750 primes, the "Prime Constant" grows at a rate of 8.49 digits per prime. This is much higher than the Buenos Aires Constant at 3.66 digits per prime, and the "differences list with separator" rate of 2.37 digits/chars per prime.
In fact, the "Prime Constant" uses a number of digits equal to the largest prime represented in the number. In other words, when recording the primes 2 and 3, it uses 3 digits. When recording 10 primes, up to and including 29, it requires 29 digits to be written out. Up to and including the 750th prime, 5693, the Prime Constant requires 5,693 digits, while those primes themselves, including a separator character (such as a comma) require 3,552 digits/characters.
Another strange quirk is that a binary value less than one like "0.0110101000101000101" will always have the same number of digits as the base 10 version! That binary number is "0.4146823883056640625" in decimal, and is also 19-digits long.
None. Just recreational math and programming — it was fun to figure out an algorithm to calculate the constant.