From fsb-return-1288-Bernard.Lang=inria.fr@crynwr.com Mon Nov 2 23:12:36 1998
Message-ID: <363A0C15.9DCDF7A5@netscape.com>
Date: Fri, 30 Oct 1998 13:57:25 -0500
From: hecker@netscape.com (Frank Hecker)
Organization: Netscape Communications Corp.
X-Mailer: Mozilla 4.5 [en] (Win95; U)
X-Accept-Language: en
MIME-Version: 1.0
To: fsb@crynwr.com
Subject: Estimating investments in Linux and other libre software
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Here is an interesting article containing estimates of the programmer time put into developing the Linux kernel and the surrounding GNU, etc., utilities.

http://www.theregister.co.uk/981021-000001.html

As it happens, Mike Shaver of Netscape and I engaged in some internal correspondence regarding this subject a while back. Mike estimated (based on the amount of code added to the kernel in a year, divided by a nominal 10 lines of completed code per day per developer) that the amount of programmer time put into Linux kernel development was equivalent to approximately 120 full-time developers; compare this to the estimate in the article of 200 part-time kernel developers totalling 500 man-years over 5 years, or the equivalent of 100 full-time people per year. Pretty good agreement, and one which gives some confidence that these are reasonably accurate numbers. (Although I wish that the article had included more background information on the methodology used to create the estimates.)

This subject is inherently interesting to me for a number of reasons. First, it enables one to compare the programmer resources being put into Linux and other libre software projects vs. the resources being put into proprietary products like NT, and this in turn gives some clues as to the long-term viability of Linux, etc., vs.
NT, clues which will be useful to those considering investing in or participating in the general Linux market (meaning, Linux plus stuff running on top of it plus stuff needed to make it run). In this sense it is a useful complement to the estimates of the total Linux user base that have been published by Red Hat and others.

For libre software development in general it is also important to know things like the size of the total pool of available programming talent vs. the amount of that talent being spent on various projects. As libre software projects become more sophisticated and more tied into commercial enterprises (e.g., FSBs), I think those who initiate and manage such projects are going to have to get more sophisticated about recruiting and retaining developers; for one thing, they're going to be competing against other projects trying to recruit from the same developer pool.

One aspect of being more sophisticated is having better metrics about actual and potential developer resources; then you can begin to think about implementing formal strategies for recruitment and retention and being able to evaluate their relative success or failure. This is analogous to what more sophisticated non-profit organizations do in terms of measuring the success of direct-mail campaigns or volunteer initiatives. (In fact I would contend that the analogy is actually fairly exact, given that in both cases you're leveraging unpaid volunteer resources and have to deal with all the issues that entails: managing the role of volunteers vs. that of in-house paid staff, paying attention to the ideals that motivate volunteers, etc.)

This would be a great area for someone in academia to do a study and come up with some publishable data on past and present investments in libre software both in toto and by project.
Most if not all the information you'd need for such a study is publicly available; for example, you could use old tarballs or CVS snapshots to do source line counts and counts of the number of unique contributors for particular projects over time.

Frank
-- 
Frank Hecker          Pre-sales support, Netscape government sales
hecker@netscape.com   http://people.netscape.com/hecker/


From fsb-return-1291-Bernard.Lang=inria.fr@crynwr.com Tue Nov 3 16:26:44 1998
Message-ID: <19981103152620.20704.qmail@comton.airs.com>
From: Ian Lance Taylor
Date: 3 Nov 1998 10:26:20 -0500
To: hecker@netscape.com
CC: fsb@crynwr.com
In-reply-to: <363E2456.A2530C85@netscape.com> (hecker@netscape.com)
Subject: Re: Estimating investments in Linux and other libre software

   Date: Mon, 02 Nov 1998 16:29:58 -0500
   From: hecker@netscape.com (Frank Hecker)

   Mike estimated (based on the amount of code added to the kernel in a year, divided by a nominal 10 lines of completed code per day per developer) that the amount of programmer time put into Linux kernel development was equivalent to approximately 120 full-time developers....

I'm not comfortable with this sort of calculation.
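
The arithmetic behind the two estimates quoted in this thread can be written out as a minimal sketch. The thread gives only the nominal rate (10 lines per developer per day) and the results (about 120 and 100 full-time developers); the 250 working days per year and the 300,000 lines-per-year input below are assumptions chosen for illustration, not figures from the messages, and the function names are hypothetical.

```python
# Assumption: working days per year is not stated anywhere in the thread.
WORKDAYS_PER_YEAR = 250


def full_time_equivalents(lines_added_per_year,
                          lines_per_dev_day=10,
                          workdays_per_year=WORKDAYS_PER_YEAR):
    """Lines added in a year divided by one developer's nominal yearly output."""
    return lines_added_per_year / (lines_per_dev_day * workdays_per_year)


def ftes_from_person_years(person_years, calendar_years):
    """Average number of full-time people implied by a person-year total."""
    return person_years / calendar_years


if __name__ == "__main__":
    # Hypothetical input: 300,000 lines/year is NOT from the thread; it is the
    # value that reproduces "approximately 120" under the assumptions above.
    print(full_time_equivalents(300_000))
    # The article's figure as quoted: 500 man-years over 5 years.
    print(ftes_from_person_years(500, 5))
    # The result scales inversely with the assumed daily rate.
    print(full_time_equivalents(300_000, lines_per_dev_day=30))
```

Under these assumptions the first call gives 120 and the second 100; tripling the assumed daily rate cuts the first figure to a third, which shows how strongly the headline number depends on the nominal rate.
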
For whatever reason, computer programming is a discipline with very wide variation in productivity between good programmers and mediocre programmers. I believe I've seen estimates of a ratio of 30 to 1 in amount of code produced.

My experience with free software is that most of the unpaid work is done by highly productive programmers. After all, they're the ones with the skill to contribute on a part-time basis.

Therefore, I think that in any effort to compare programmer time between free software projects and funded software projects, you have to consider that the odds are that the programmers on the free software projects are significantly more productive.

That makes me feel that any comparison based on the amount of code developed in a particular period of time, such as the above which uses amount of code added to the kernel in one year, has a real risk of talking about different ideas which sound similar but are really incommensurable. The ``nominal 10 lines of completed code per day per developer'' may simply have nothing to do with the actual code and the actual developers in question.

To put it another way, while one can reasonably try to make guesses like ``equivalent to approximately 120 full-time developers,'' to go beyond that into such things as speculation about resources invested in Linux vs. NT with implications for viability, or into considerations such as the size of the free software programming talent pool, seems meaningless.

For better or worse, programmers are not commodities. Free software development driven by volunteers is not the same as funded software development driven by paid employees.

   One aspect of being more sophisticated is having better metrics about actual and potential developer resources; then you can begin to think about implementing formal strategies for recruitment and retention and being able to evaluate their relative success or failure.
   This is analogous to what more sophisticated non-profit organizations do in terms of measuring the success of direct-mail campaigns or volunteer initiatives. (In fact I would contend that the analogy is actually fairly exact, given that in both cases you're leveraging unpaid volunteer resources and have to deal with all the issues that entails: managing the role of volunteers vs. that of in-house paid staff, paying attention to the ideals that motivate volunteers, etc.)

In principle I agree that this sort of thing would be a good idea. In practice I think the free software development pool is relatively small and relatively idiosyncratic compared with the pool of people who contribute to non-profit organizations.

For example, the GNU autoconf package languished for a couple of years until it was recently picked up by Ben Elliston. This was not due to a lack of available free programming talent, nor to a feeling that it was unimportant. If it had been a charity, I'm sure people would have contributed. However, as a programming project, it had to wait until one individual had the time, interest, ability, and opportunity.

My point is that a theoretical argument about potential developer resources can easily founder on the reality of the actual set of people interested and available to do the work.

That doesn't make it a bad idea, of course. But I think it would be inadvisable for an FSB to make plans on this basis.

   This would be a great area for someone in academia to do a study and come up with some publishable data on past and present investments in libre software both in toto and by project. Most if not all the information you'd need for such a study is publicly available; for example, you could use old tarballs or CVS snapshots to do source line counts and counts of the number of unique contributors for particular projects over time.

I would certainly find this interesting.
For the GNU project, you could get a rough approximation simply by examining the ChangeLog files.

Ian
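
The ChangeLog suggestion above can be sketched roughly as follows. This assumes entries begin with a header line in one of two common GNU layouts (an ISO date, or an older ctime-style date, followed by the author's name and an address in angle brackets or parentheses); real ChangeLog files vary, so the patterns are illustrative and not exhaustive. The function name and the sample entries are hypothetical.

```python
import re
from collections import defaultdict

# Newer layout: "1998-11-03  Jane Hacker  <jane@example.org>"
ISO_HEADER = re.compile(r"^(\d{4})-\d{2}-\d{2}\s+(.+?)\s+<[^>]*>\s*$")
# Older layout: "Mon Dec  1 09:15:00 1997  Jane Hacker  <jane@example.org>"
OLD_HEADER = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} (\d{4})"
    r"\s+(.+?)\s+[<(][^>)]*[>)]\s*$"
)


def contributors_by_year(changelog_text):
    """Map each year to the set of author names seen in entry headers."""
    by_year = defaultdict(set)
    for line in changelog_text.splitlines():
        match = ISO_HEADER.match(line) or OLD_HEADER.match(line)
        if match:
            year, name = match.groups()
            by_year[int(year)].add(name.strip())
    return dict(by_year)


# Invented sample data, for illustration only.
SAMPLE = """\
1998-11-03  Jane Hacker  <jane@example.org>

\t* configure.in: Check for strerror.

1998-06-12  Joe Coder  <joe@example.org>

\t* Makefile.in (clean): Remove stale objects.

Mon Dec  1 09:15:00 1997  Jane Hacker  <jane@example.org>

\t* README: Update.
"""

if __name__ == "__main__":
    for year, names in sorted(contributors_by_year(SAMPLE).items()):
        print(year, len(names), sorted(names))
```

Counting header names only approximates the number of contributors: the same person may appear under several spellings, and patches committed on someone else's behalf are credited to whoever wrote the entry.
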