Benne de Weger's math page

personal   research interests   research stories (abc, 3n+1, MD5)
theses   books   papers   reports   data sets
Erdős number   PhD students   educational articles   miscellaneous

personaltop

retired associate professor
Department of Mathematics and Computer Science,
Eindhoven University of Technology, Eindhoven, The Netherlands
email: benne(at)deweger(dot)net
(note: b.m.m.d.weger(at)tue(dot)nl also still works)
private website: deweger.net

research intereststop

research storiestop

my story on the abc-conjecturetop

the conjecture: In september 1985 the abc-conjecture was formulated by David Masser and Joseph Oesterlé at a conference in the UK. The conjecture states that for every ε > 0 there are only finitely many triples of positive coprime integers a, b, c with a + b = c such that c > N1+ε, where N (the radical of (a, b, c)) is the product of all prime factors of abc. The quality of a triple is defined as q(a, b, c) = (log c)/(log N).
initial record: My supervisor professor Robert Tijdeman participated in this conference. Immediately after his return to Leiden he told me about the conjecture, as at the time I was working (for my PhD thesis) on computationally solving x + y = z in S-integers (for a given set of primes S, an S-integer is an integer composed of primes from S only). So to compute a small list of good examples, I just had to add a few lines of code to the program I already had, and to run it again. This immediately produced the record example at the time:
112 + 32 56 73 = 221 23,
with quality 1.62599…. One day later, on September 20, 1985, I went to professor Tijdeman to show him the fresh record. His reaction was to grab his pocket calculator and to type in the numbers, speaking the words: "I want to see the miracle happen." This small list was published in my 1987 Journal of Number Theory paper (Section 5E) and in my PhD Thesis (Chapter 6, p. 130).
beaten: In 1987 Eric Reyssat from Caen found a better example, communicated to me by Michel Waldschmidt. His method was to look at convergents from continued fractions of higher roots of integers (maybe even rationals), in particular he found the fifth root of 109 being extremely close to 23/9, leading to the wonderful example
2 + 310 109 = 235,
of quality 1.62991…, only slightly bigger than mine. This method basically is a simple instance of the methods I used, but as this example has primes above 13 in two places, I had missed it, as I had only looked for examples with primes above 13 in at most one place. My example today is still in second place. For up to date lists of abc-examples see Bart de Smit's tables.
algebraic numbers: In 1999 Niklas Broberg came with good examples for the uniform algebraic number field case. From his work it was immediately apparent to me that many of his examples have to do with exceptional solutions to Diophantine equations and/or exceptional values in recurrence sequences. And that also immediately brought to mind the possibility that the well known exceptional zero in the Berstel ternary recurrence was a good candidate for being a very good abc-example. This turned out to be the case, giving me again a record example, as can be read in my web note A new extreme abc-example, 'published' only on the web in 1999. Then history repeated itself, when a few years later Tim Dokchitser found a better example, leaving my example again at the second place, where it still is. See my web note A newer extreme abc-example due to Tim Dokchitser, 2003. Dutchmen call this being "eternal second" the Zoetemelk-syndroom.
big Ш's: At a conference in Edinburgh in 1996 I attended a talk by Dorian Goldfeld about elliptic curves, in particular Frey-Hellegouarch curves, with large Tate-Shafarevich group Ш (the proper spelling for "Shafarevich" is "Шафаревич", hence the symbol for this group). As Frey-Hellegouarch curves are linked to abc-triples, it turns out that good abc-triples may lead to exceptionally large Ш's. In the coffee break after Dorian's lecture I started up my brand new fancy laptop computer (running Apecs on Windows 95), and in a few minutes I had a record size Ш. I wrote a paper, titled "A+B=C and big Ш's", which yielded a theoretical conjecture on a sharp bound for the size of Ш's, linked to other conjectures, as well as interesting examples. My wife still remembers me sitting at the kitchen table with my laptop and getting excited each time a new large Ш popped up. Needless to say that shortly after this paper appeared, others, notably Abderrahmane Nitaj, found better examples and took over the record. David Broadhurst has found new records in 2021, see https://arxiv.org/abs/2111.07794.
big fudge: In 2018, on a conference in Leiden, I met Hector Pasten, who knew my work on big Ш's, mainly because of a small lemma on the sizes of Tamagawa products, a.k.a. fudge factors. A few years later he had a paper that gave me the idea that good abc-triples also may lead to exceptionally large Tamagawa products. So I did some experiments, and contacted David Broadhurst, who greatly improved my ideas and experiments. I still have to finish the paper, in the meantime, see the paper and tables, 2023.
xyz: Another interesting development in this area is the emergence of the xyz-conjecture, in a 2009 paper by Lagarias and Soundararajan. For abc-triples they introduced the new quality function Q*(a,b,c) = 3/2 (loglog c) / (log P), where P is the largest prime factor of abc. The xyz-conjecture then states that lim sup Q* = 1. I tested the "ABC@home" database for this new quality function. This yielded the new record
1 + 4374 = 4375 with Q* = 1.63904.
Impressive, isn't it? I challenge you to find a better example, and throw me back to second place again. I wrote a note "Numerical data related to the Lagarias-Soundararajan xyz-conjecture", 2012-2016, with my further findings, see the paper and tables.
abcd…: When introduced to the abc-conjecture it's only natural to wonder about the abcd, abcde, …-conjectures. These are taken together as the n-conjecture. My BSc student Coen Ramaekers did some initial investigations, see Coen's bachelor thesis and some tables.
approximation gain: Ongoing work: Karsten Müller introduced the approximation gain of an abc-triple, and inspired me to do computational experiments and to generalise the concept to the p-adic case. See the tables, soon to be accompanied by a paper.

my story on the 3n+1-conjecturetop

the conjecture: Consider the 3n+1-function T: ℕ → ℕ defined by
T(n) = n / 2 if n is even,
T(n) = (3n+1) / 2 if n is odd.
The conjecture states that for any natural number n the sequence n, T(n), T(T(n)), T(T(T(n))), … will eventually reach the number 1, and then enters the cycle 1 → 2 → 1.
advice: If you think you have found a proof of this conjecture, my advice is to think again. If you then still think you have a proof, my advice still is the same.
failed proof attempt: My unpublished note Comments on Opfer's alleged proof of the 3n + 1 Conjecture from 2011 criticizes a proof attempt by a serious mathematician.
cycles: With John Simons (Groningen) we investigated m-cycles. A cycle is an m-cycle if it contains m local maxima (and thus also m local minima). It has been known for a long time that 1 → 2 → 1 is the only 1-cycle (Ray Steiner, 1977), and John Simons proved in 2004 that there are no 2-cycles. Then he contacted me (via Rob Tijdeman), to see if we could prove more. We could, in our Acta Arithmetica paper from 2005 we proved that there are no m-cycles with 2 ≤ m ≤ 68, and in the update from 2010 we reached 2 ≤ m ≤ 75. We now have 2 ≤ m ≤ 77 (unpublished), and in 2023 Christian Hercher even reached 2 ≤ m ≤ 91. We have this still under investigation.
modular Collatz graphs: With Thijs Laarhoven we investigated the graphs of the modular 3n+1-function, that is, for a given modulus N the directed graph on 0, 1, 2, …, N - 1 with edge ij whenever there is an ni (mod N) such that T(n) ≡ j (mod N). In 2009 Thijs wrote an excellent BSc thesis on the case of N a power of 2, that grew out into our joint 2013 paper.
With Achilleas Karras we have work in progress on the determinants of modular Collatz graphs, explaining the wild behaviour of these determinants.
I also have work in progress on the spectra of modular Collatz graphs.
overview papers and talks:
miscellaneous: The 2004 joint paper with John Simons (in Dutch) on a connection with Mersenne Numbers describes John's work winning him the prize in a contest on the occasion of the 225th anniversary of the Wiskundig Genootschap (Dutch Mathematical Society).

my story on applications of MD5 collisionstop

Here's a personal account of the history of the application of MD5 collisions to X.509 certificates. This story complements the `official' hashclash website.
part I - harmless colliding certificates: This line of work started shortly after Xiaoyun Wang took the crypto community by surprise at the Crypto Rump Session in August 2004, by presenting an MD5 collision. That led Arjen Lenstra and me to wondering what nasty things could be done with such collisions. We saw possibilities for interesting applications to X.509 certificates. Applications were already emerging in other areas (software integrity), using the exact collisions Xiaoyun Wang had presented at Crypto, that were using the standard initial IHV.
We had two main problems to solve: we had to neatly hide a random looking data block inside the highly formatted data structure that an X.509 certificate is, and we had to obtain collisions for different initial IHVs than Xiaoyun Wang's original collisions. The first problem was solved by observing that an X.509 certificate already contains a random looking block of data, namely the RSA modulus of the public key. It turned out to be easy to hide the collision inside this modulus, even while retaining the number theoretic structure of the modulus. The second problem turned out to be even easier: one e-mail to Xiaoyun Wang with our specific initial IHV was sufficient to receive the desired collision overnight.
See the old Colliding Certificates website containing a lot of technical stuff. This work led to our ACISP paper, and caused some turbulence in the PKI world.
One reaction that I like is the following, by Jeffrey Kay:
"I just finished reading Colliding X.509 Certificates (...) and I now have chills running up my spine (...) this is a pretty scary paper (...)"
Great. That's why we were doing this.
part II - more interesting colliding certificates: The type of certificates we were able to make using Xiaoyun Wang's type of collision violates fundamental PKI principles, but is actually completely harmless. This is mainly due to the fact that the owner's identity as present in the certificates must be identical in both certificates forming a colliding pair. This is because Xiaoyun Wang's method required the initial IHVs for the two colliding data blocks to be equal. We now call this type of collisions identical prefix collisions.
At EuroCrypt 2005, where Xiaoyun Wang presented her method for constructiong MD5 collisions, it became clear to us that this was the main obstacle for more interesting colliding certificates. In particular, we wanted colliding certificates with different owner identities. For that we needed chosen prefix collisions, where the two colliding data blocks have arbitrary chosen initial IHVs.
Then Marc Stevens, a master student in Eindhoven, came to me looking for a topic for his master thesis project, to be done within the Dutch National Communications Security Agency NBV. We gave him a choice from a few topics, and he chose hash collisions. After a few months he had considerably improved and accelerated Xiaoyun Wang's method for identical prefix collisions, and had ideas on how to construct chosen prefix collisions. This required massive computational efforts, that luckily can easily be parallelized and distributed. So Marc started the HashClash project, using the BOINC framework, in which volunteers donate their spare cycles to scientific advancement. Soon Marc had a few thousand PCs all over the world running for him, and after some 6 months he finally had a chosen prefix collision. This was crafted such that it could be built in a pair of colliding certificates, this time with different owner identities. See the Chosen-prefix Collision website and the Colliding X.509 Certificates for Different Identities website. Marc presented his work at EuroCrypt 2007, in a paper that was selected by the program committee as one of the three notable papers of the conference.
In June 2007 Marc graduated in Eindhoven as MSc, cum laude, with the highest possible grade for his master thesis. He also received the Eindhoven University Best Master Thesis Award 2007.
This result achieved some attention from the Dutch press.
intermezzo - predicting the 2008 US presidential elections: After graduating, before he went to work at CWI as a PhD student of Ronald Cramer, Marc visited Arjen Lenstra in Lausanne for two months. Arjen has a cluster of over 200 PlayStation3 game consoles available for cryptanalytic experiments. Marc developed code for this platform and considerably improved his methods for constructing chosen-prefix collisions for MD5. We now took a less serious but equally interesting and challenging application that was much more appealing to a broad audience: predicting who would become the next US president. Note: this was in the fall of 2007, more than a year before the elections in November 2008. Using only one PlayStation3 Marc produced a chosen-prefix multi-collision for MD5, consisting of twelve perfectly normal pdf files, all having identical MD5 hash values. See the Nostradamus website.
Moreover, as a somewhat more serious (to some) application we constructed a pair of Win32 executables having identical MD5 hash values. Next time you download some software and verify its integrity by an "MD5 checksum", do you trust that there is no other file with the same "checksum" value? See the Colliding Executables website.
These two websites were released on November 30, 2007, and attracted a lot of attention worldwide.
part III - really dangerous colliding certificates: In July 2008 Alex Sotirov, Jake Appelbaum and David Molnar contacted us, as they had ideas on how to mount an attack on a real Certification Authority, based on chosen-prefix collisions, and they needed help on constructing these collisions. This led Marc to making new improvements in his methods, and the full strength of Arjen's PlayStation3 cluster was used to produce, in October 2008, the collision that was needed. The result was a potentially very dangerous attack possibility on the SSL Public Key Infrastructure, in that we had constructed a rogue Certification Authority whose certificate appears to be signed by a commercial CA trusted by the major browsers. This story has been extensively described on the rogue CA website. Alex, Jake and Marc presented this work on December 30, 2008 at the 25th Annual Chaos Communication Congress in Berlin.
It has attracted major media attention. In one blog I found a quote about our work that I like:
"their paper describing the attack is a fascinating read filled with enough gory details to make any security practitioner salivate".
Keep tissues at hand when you click here.
The US Computer Emergency Readiness Team issued a Vulnerability Note, and there were public reactions by companies such as Microsoft, Verisign and Cisco.
The new developments on MD5 collision construction are described in detail in a paper that was presented at Crypto 2009. This paper also contains details of speed improvements to the identical prefix collision construction, and of a construction method for single block chosen prefix collisions. The Crypto 2009 program committee selected our paper for the best paper award.
Our work made it to number 1 of the Top Ten Web Hacking Techniques of 2009. If that's something to be proud of I don't know.
part IV: - really really really dangerous colliding certificates in the wild On June 4, 2012, only weeks before Marc's PhD defense in Leiden (June 19, with Ronald Cramer and Arjen Lenstra as "promotores", Xiaoyun Wang as one of the "opponent"s, and me as "co-promotor"), I saw on a mailing list a message about the Flame malware being authenticated by a forged certificate appearing to have been issued by Microsoft. The message contained a few certificates, but only Microsoft CA certificates, most of which had already been revoked. The forged certificate was not yet publicly available.
I alerted Marc to this immediately. Marc had, in his time at CWI, developed a very nice "forensic tool", that on input of one message only, checks whether this message has a twin message such that these twins form a hash collision constructed by a method such as Wang's/Marc's. I asked Marc to run his tool on the available Microsoft CA certificates (you never know what might come out), which he did, with negative results.
One day later, Microsoft announced in a blog that indeed the certificate used for authentication of Flame has been constructed by a collision for MD5. In that blog they explicitly refer to our previous work. The forged certificate also became available that day, and Marc ran it through his forensic tool, this time with positive results. Moreover, analyzing the results, Marc discovered that a novel chosen-prefix collision attack had been used. This is very interesting news, both from a scientific viewpoint, and because it gives some insight in the capabilities of the designers of Flame. On the other hand, the data from the certificate also suggest that the attack followed to a substantial extent the path we had laid out in our Rogue CA work; it gives the impression that the Flame designers carefully have studied our work, but gave it their own creative twist.
Later it became clear that the Flame malware was merely a predecessor of Stuxnet, the real deal. In Stuxnet also certificates play a role, but hash collisions do not. Apparently it was easier to compromise a real life Certification Authority. See the excellent book "Countdown to Zero", by Kim Zetter, for the Stuxnet story in full detail. The book devotes an entire footnote to our work.
In short: from a scientific viewpoint it's very interesting to see novel attack ideas appear. But I would have preferred them to have come from the open academic community or from the open security researchers community. Above all, I'm not very happy to see our work being abused in malware.
the aftermath: Marc went on with attacking SHA1, and succeeded to find collisions for it too. See his PhD thesis and further work, available from https://marc-stevens.nl/research/.

thesestop

bookstop

paperstop

reportstop

data setstop

Erdős numbertop

2, via Florian Luca
Let's say my hope of ever improving my Erdős number is ε. If you dig into the details of how Florian Luca got a joint paper with Erdős, you will find an argument showing that ε still may be strictly positive. The case of Steve Butler (who reached Erdős number 1 in 2015) supports this.

PhD studentstop

educational articlestop

miscellaneoustop

latest change of this pagetop

February 22, 2025