Whoops: Linux's strcmp() For The m68k Has Always Been Broken

Written by Michael Larabel in Linux Kernel on 21 December 2022 at 01:41 PM EST.
It turns out the hand-written assembly code providing an optimized string comparison function, strcmp(), for the Motorola 68000 (m68k) processor architecture has "always been broken", and this was only uncovered at the end of 2022.

While open-source enthusiasts and others like to think that bugs in open-source code are easily spotted, that isn't always the case, particularly in the massive Linux kernel codebase and for code touching aging or niche hardware. For the decades-old Motorola 68000 processors, it was only recently discovered that the optimized strcmp() implementation is subtly broken, and the breakage has now become more pronounced following other kernel changes.

Linus Torvalds dealt with the issue and commented that it has "always been broken". Pulling up the m68k strcmp() code from the early Linux 2.6 days, when the kernel source was initially imported into Git, shows the function was broken at least that far back, if not all the way back to the original Linux/m68k port.


As for how this broken strcmp() implementation went unnoticed in the kernel for so many years, Torvalds explained that the bug is subtle: the function is broken only for the overflow case, and most strcmp() users in the kernel don't care about non-US-ASCII orderings. Many callers also only care whether strcmp() reports that the two strings match and otherwise ignore the returned value.
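The overflow case can be illustrated with a small C sketch. This is not the m68k assembly or any kernel code; it only models the arithmetic, and the function names (diff_8bit_signed, diff_wide, show_unstable_ordering) are hypothetical. It assumes 8-bit bytes and shows how a difference computed in 8 bits can give three byte values an inconsistent ordering, while the same subtraction done in int does not.

```c
#include <assert.h>

/* Hypothetical model of the buggy approach: subtract in 8 bits,
 * then interpret the 8-bit result as a signed value. */
int diff_8bit_signed(unsigned char a, unsigned char b)
{
    unsigned char d = (unsigned char)(a - b);   /* wraps modulo 256 */
    return d >= 128 ? (int)d - 256 : (int)d;    /* sign-extend by hand */
}

/* Subtraction done in a wider type: the sign is always meaningful. */
int diff_wide(unsigned char a, unsigned char b)
{
    return (int)a - (int)b;
}

/* Three byte values whose 8-bit "ordering" is inconsistent. */
void show_unstable_ordering(void)
{
    unsigned char a = 0x01, b = 0x64, c = 0xC8;   /* 1, 100, 200 */

    assert(diff_8bit_signed(a, b) < 0);   /* a sorts before b       */
    assert(diff_8bit_signed(b, c) < 0);   /* b sorts before c       */
    assert(diff_8bit_signed(a, c) > 0);   /* ...yet a sorts after c */

    assert(diff_wide(a, b) < 0);
    assert(diff_wide(b, c) < 0);
    assert(diff_wide(a, c) < 0);          /* wide subtraction stays consistent */
}
```

With pure US-ASCII input every byte is below 128, so the 8-bit difference never wraps, which fits the explanation of why the problem stayed hidden for so long.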

The m68k strcmp() issue only became more pronounced with the Linux 6.2 kernel development code, where the -funsigned-char compiler flag is now set so that 'char' is unsigned on all architectures, a change that exposes code wrongly depending on the signedness of 'char'.

Linus Torvalds summed up the issue in the commit fixing the problem, the message of which is reproduced below for easy reading. The fix is simply to delete the (broken) optimized implementation, so that those still running Motorola 68000 series hardware fall back to the generic strcmp() implementation within the kernel.
The m68 hand-written assembler version of strcmp() has always been broken: it returns the difference between the first non-matching byte done as a 8-bit subtraction.

That is _almost_ right, but is broken for the overflow case. The strcmp() function should indeed return the sign of the difference between the first byte that differs, but the subtraction needs to be done in a wider type than 'char'. Otherwise the ordering isn't actually stable.

This went unnoticed for basically forever, because nobody ever cares about non-US-ASCII orderings in the kernel (in fact, most users only care about "exact match or not"), so overflows don't really happen in practice, even if it was very very wrong.

But that mostly unnoticeable bug becomes very noticeable by the recent change to make 'char' be unsigned in the kernel across all architectures (commit 3bc753c06dd0: "kbuild: treat char as always unsigned"). Because the code not only did the subtraction in the wrong type width, it also used 'char' to then make the compiler expand the result from an 8-bit difference to the 'int' return value.

So now with an unsigned char that incorrect arithmetic width was then not even sign-expanded, and always returned just a positive integer.

We could re-instate the old broken code by just turning the 'char' into 'signed char' as has been done elsewhere where people depended on the signedness of 'char', but since the whole function was broken to begin with, and we have a non-broken default fallback implementation, let's just remove this broken function entirely.
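The effect of the char signedness change described in the commit message can be sketched in C as follows. This is only a model under stated assumptions (8-bit bytes, ASCII character values), and the function names are hypothetical. The last function is one common way to write a comparison that avoids both problems by comparing bytes as unsigned char and returning only a sign; it is an illustrative sketch, not a copy of the kernel's generic fallback.

```c
#include <assert.h>

/* 8-bit difference widened as if 'char' were signed (sign extension). */
int widen_as_signed_char(unsigned char a, unsigned char b)
{
    unsigned char d = (unsigned char)(a - b);
    return d >= 128 ? (int)d - 256 : (int)d;
}

/* 8-bit difference widened as if 'char' were unsigned (zero extension). */
int widen_as_unsigned_char(unsigned char a, unsigned char b)
{
    return (unsigned char)(a - b);   /* always in 0..255, never negative */
}

/* Comparison sketch: compare bytes as unsigned char, return only a sign. */
int strcmp_sketch(const char *s, const char *t)
{
    for (;; s++, t++) {
        unsigned char c1 = (unsigned char)*s;
        unsigned char c2 = (unsigned char)*t;

        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
        if (c1 == '\0')
            return 0;
    }
}

void show_widening(void)
{
    /* 'a' - 'b' is -1, which is 255 once truncated to 8 bits. */
    assert(widen_as_signed_char('a', 'b') == -1);    /* "a" before "b" */
    assert(widen_as_unsigned_char('a', 'b') == 255); /* wrongly positive */

    assert(strcmp_sketch("a", "b") < 0);
    assert(strcmp_sketch("b", "a") > 0);
    assert(strcmp_sketch("abc", "abc") == 0);
}
```

Under the unsigned model, even a plain ASCII comparison such as "a" against "b" yields a positive result, which is why the bug went from mostly invisible to very noticeable.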

The removal of the broken m68k strcmp() implementation was merged today for Linux 6.2.
