Note: The opinions expressed here are my own and have no relationship with the opinions or official viewpoints of any organization with which I am associated.
I wrote this shortly after the report, but after careful examination of the evidence. I then continued my examinations and discovered that I was wrong.
I am now convinced that the code in question was, indeed, derived from UNIX System V.4, and not an earlier version of UNIX, as some other people have claimed. This does not mean that it was stolen. Nevertheless, this page may still be of interest, so I'm not deleting it. Here's the current analysis.
On 20 August 2003, Heise Verlag in Germany published an update to a report with the (translated) unemotional and objective title “SCO threatens to kill Open Source”. It's the only thing I've seen so far that refers to a presentation of textual similarities between UnixWare and Linux. I don't have time to translate the whole thing, but here's the important bit:
“Assisted by his vice president, Chris Sontag, McBride showed examples from the code of Linux 2.5 and 2.6 which should prove that source code has been taken out of Unix without change–an example shown by SCO shows code commentaries ... Identical typing mistakes in the commentaries and unusual formulations had left traitorous traces, claimed Sontag. To prove this, McBride had hired a team for pattern recognition to hunt through tens of thousands of lines of code. The few code sequences near the comments were made illegible to protect SCO's copyrights.”
This is stupid. There's no code in common, just a comment which, admittedly, looks to be the same. I also don't see any “typing mistakes”. But where does it come from? On the SCO side, it includes another line (“The swap map unit is 512 bytes”). Maybe this is not correct for Linux. But people who copy comments so literally don't remove things just because they're wrong; they haven't fixed the broken indentation, for example, assuming that this is really broken indentation in the code, and not a badly prepared slide; there's every possibility that it's the latter.
But if two comments are the same except for an addition, which is the original? The one without the addition, obviously. I see this as an indication that the code was copied from Linux to UnixWare.
In addition, the alleged code sequences near the comments which were made illegible to protect SCO's copyrights are really additional commentary. You don't have to be a C programmer to recognize that comments start with /* and end with */, and that people frequently put a single * in multiline comments for stylistic reasons, something that the person who put together this slide obviously didn't consider important.
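To show how mechanical this recognition is, here is a minimal sketch in C that counts the lines of a snippet that lie inside a block comment. The function name count_comment_lines is my own invention for illustration, and the scanner is deliberately naive: it assumes the snippet contains no string literals or // comments, which a real tool would have to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Count the lines of src that are wholly or partly inside a C block
 * comment.  Naive sketch: string literals and line comments are ignored. */
int count_comment_lines(const char *src)
{
    assert(src != NULL);

    int in_comment = 0;        /* are we between an opener and a closer? */
    int line_has_comment = 0;  /* has the current line touched a comment? */
    int count = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        if (!in_comment && src[i] == '/' && src[i + 1] == '*') {
            in_comment = 1;
            line_has_comment = 1;
            i++;               /* skip the '*' of the opener */
        } else if (in_comment && src[i] == '*' && src[i + 1] == '/') {
            in_comment = 0;
            i++;               /* skip the '/' of the closer */
        } else if (src[i] == '\n') {
            count += line_has_comment;
            line_has_comment = in_comment;  /* carry state to next line */
        }
    }
    return count + line_has_comment;  /* last line may lack a newline */
}
```

Fed a snippet laid out like the slide, with an opener, some lines starting with a single *, and a closer, every one of those lines is counted as commentary and none as code.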
The comment is in English written in approximate Greek letters and reads:
As part of the kernel evolution toward modular naming, the functions malloc and free are being renamed to rmalloc and rfree. Compatibility will be maintained by the following assembler code: (also see free/rfree below).
This comment is completely irrelevant to the Linux code to which the first half of the comment has been applied. The Linux version of both of these examples comes from the file arch/ia64/sn/io/ate_utils.c. There are a number of interesting things to note about this file:
/* $Id: ate_utils.c,v 1.1 2002/02/28 17:31:25 marcelo Exp $
 *
 * This file is subject to the terms and conditions of the GNU General Public
 * License. See the file "COPYING" in the main directory of this archive
 * for more details.
 *
 * Copyright (C) 1992 - 1997, 2000-2002 Silicon Graphics, Inc. All rights reserved.
 */
This is not new code, but the RCS identifier (the $Id$ keyword string at the start of the comment) shows that it was incorporated by marcelo on 28 February 2002. marcelo is Marcelo W. Tosatti, the main person active in developing the Linux virtual memory system.
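For anybody who wants to pull the committer and date out of such an identifier mechanically, here is a minimal sketch in C. It assumes the usual expanded form of the keyword (file name, revision, date, time, author, state, separated by spaces); the struct rcs_id and the function parse_rcs_id are names I made up for illustration, not part of RCS or CVS.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of an expanded RCS/CVS Id keyword (hypothetical layout). */
struct rcs_id {
    char file[64];      /* RCS file name, e.g. "ate_utils.c,v" */
    char revision[16];  /* revision number, e.g. "1.1" */
    char date[16];      /* check-in date, e.g. "2002/02/28" */
    char time[16];      /* check-in time, e.g. "17:31:25" */
    char author[32];    /* login name of whoever checked it in */
    char state[16];     /* state, usually "Exp" */
};

/* Find an expanded Id keyword in text and split it into its fields.
 * Returns 1 if all six fields were found, otherwise 0. */
int parse_rcs_id(const char *text, struct rcs_id *out)
{
    assert(text != NULL && out != NULL);

    const char *start = strstr(text, "$Id: ");
    if (start == NULL)
        return 0;

    /* Field widths are one less than the buffer sizes above. */
    int n = sscanf(start, "$Id: %63s %15s %15s %15s %31s %15s",
                   out->file, out->revision, out->date, out->time,
                   out->author, out->state);
    return n == 6;
}
```

Applied to the first line of the comment above, this yields the author marcelo, the date 2002/02/28 and the revision 1.1.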
Further down in the Heise report, you can read:
“In total, SCO's testers claim to have found more than 800,000 lines of duplicate code–an example from SCO”
OK, let's look at this example. In fact, it's a continuation of the previous example, the function atealloc in arch/ia64/sn/io/ate_utils.c. There are a number of things to note about it:
My books about UnixWare SMP locking suggest that in this kind of situation, the lock call would be LOCK, not mutex_spinlock. LOCK takes two parameters and returns a value of type pl_t, whereas mutex_spinlock returns a value of type int. This information has been available for years in Vahalia, UNIX Internals: The New Frontiers (Prentice-Hall, 1996), page 213. If SCO now has functions like mutex_spinlock in the kernel, it would appear that they have thrown out their own SMP implementation and incorporated the Linux version.