Windows vs. Linux Performance

Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 49823 - Posted: 20 Dec 2007, 12:22:12 UTC
Last modified: 20 Dec 2007, 12:25:24 UTC

I'm starting to collect data to examine this. To make comparisons, I need to find wus of the same job type. I'm not 100% sure that what is depicted below are all of the same job type; I wish I knew more about how they're named, but I've done my best to find some comparable work. I have 6 systems, 5 of which are "dual boot" with both 32b WinXP and 64b Mandriva Linux. The systems compared are an AMD Athlon64 2800 Clawhammer, an AMD Athlon64 3700 San Diego, and an AMD Athlon64 X2 4800.

I set my "run time" preference to two hours and my cache of work to 1 day, and have been rebooting into each OS every 12 hours, hoping to get a good chance of having comparable work on both OSes at the same time. I might switch to a 1-hour run time and a 0.25-day cache to improve those odds. Anyhow, after 5 or so days I took a look and tried to find "comparable" wus. What follows is a spreadsheet with what I think are comparable wus.

G = Granted credit
C = Claimed credit
H = Hour
Sec = Seconds (cpu seconds)
Blue text is from Linux
Black text is from Windows

[spreadsheet image: comparable wus with claimed/granted credit and CPU seconds for each OS]

Now, I understand, and can show many examples of, variation in CPU seconds within a "comparable" job type within either OS independently, so I shouldn't assume that just because ONE example shows one OS/application to be faster, it necessarily is. However, from what I have so far (I'll keep collecting more examples), every instance shows the Linux application to be faster. From what I understand, Rosetta doesn't have a "special" 64b Linux app and is sending the 32b Linux app to 64b systems, so I'm "assuming" my results would be the same with a 32b Linux OS.

Anyone seeing anything different? Anyone finding flaws in my data? Am I comparing apples and oranges on some/all of these? Is it just a coincidence, with the standard deviation of run time within a job type accounting for all of it?
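
For anyone who wants to run the same sanity check on their own results, here is a rough Python sketch of the comparison being described. The file name and the column names (wu_type, os, cpu_seconds) are placeholders for however the exported spreadsheet is actually laid out:

import csv
from collections import defaultdict
from statistics import mean, stdev

# Group CPU seconds by (job type, OS) so the within-OS spread can be
# compared against the Windows-vs-Linux gap for the same job type.
groups = defaultdict(list)
with open("results.csv", newline="") as f:      # hypothetical export file
    for row in csv.DictReader(f):
        groups[(row["wu_type"], row["os"])].append(float(row["cpu_seconds"]))

for (wu_type, os_name), secs in sorted(groups.items()):
    sd = stdev(secs) if len(secs) > 1 else 0.0
    print(f"{wu_type:40s} {os_name:8s} n={len(secs):3d} "
          f"mean={mean(secs):9.1f}s sd={sd:7.1f}s")
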
Luuklag

Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 49838 - Posted: 20 Dec 2007, 21:22:32 UTC - in response to Message 49823.  

Well, this once again kind of confirms what I've been saying: Windows uses more CPU and memory to keep itself running than Linux does. Linux does as well as or even better than Windows, so either Windows uses more resources or Linux uses them better.
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 49843 - Posted: 20 Dec 2007, 22:37:50 UTC - in response to Message 49838.  
Last modified: 20 Dec 2007, 22:46:17 UTC

Well, this once again kind of confirms what I've been saying: Windows uses more CPU and memory to keep itself running than Linux does. Linux does as well as or even better than Windows, so either Windows uses more resources or Linux uses them better.

Hi, I don't know what it is. It could be a single cause, such as the compiler used to build the program (ICC vs. MSVC vs. GCC, or whatever was used), or a combination of several things. Heck, it could even be that my results just happen to be a fluke. Reasons could include those you listed, and others, such as me picking non-comparable wus. I can see where a couple of the ones I selected might not be comparable, but others certainly appear to be.

In any event, this is just a start. I've added my AMD64 X2 5200 and 6000 to the mix. I've also set my CPU run time to 1 hour and reduced my cache to 0.5 days in hopes of getting a higher percentage of comparable work across platforms.

[edit] For example, to get those few for the AMD64 X2 4800, I merged 927 Windows wus with 99 Linux ones, then sorted.
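
That merge-and-sort step can be scripted. A rough sketch, assuming each OS's results have been exported to a CSV with a "name" column; the file names and the way the trailing seed/result fields are stripped are assumptions, not anything the project documents:

import csv

def load(path, os_label):
    # Tag each exported row with the OS it came from so the merged list
    # can be sorted by wu name and scanned for cross-OS matches.
    with open(path, newline="") as f:
        return [dict(row, os=os_label) for row in csv.DictReader(f)]

merged = load("windows.csv", "windows") + load("linux.csv", "linux")
merged.sort(key=lambda r: r["name"])

def prefix(name):
    return name.rsplit("_", 2)[0]   # assumes names end in _<seed>_<n>

# Adjacent rows from different OSes that share everything but the
# trailing fields are candidates for a comparable pair.
for a, b in zip(merged, merged[1:]):
    if a["os"] != b["os"] and prefix(a["name"]) == prefix(b["name"]):
        print(prefix(a["name"]), "->", a["os"], "vs", b["os"])
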
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 49844 - Posted: 20 Dec 2007, 22:50:14 UTC

I see you often only get one model per task done in the timeframe of your tests... might I suggest an alternative approach to trying to get similar work, and yet increase the number of models you are comparing? Try a 12 or 24hr runtime preference, and grab a 1 or 2 day cache of work on BOTH environments. Then crunch through it. Since you are requesting the work within the same 15 minute period of time, it should tend to be similar tasks. And since you have a longer crunch time, it should tend to give you more models done to get a better average on time and credit per model.

You could ALSO request 3 or 4 days of work, then suspend network activity, and suspend any odd WUs and crunch the ones that look similar first in each environment. Then crunch the rest later.

I think you really just need more data. And, in the end, you want to compare credit per hour of the two environments, or at least credit per hour within the subset of comparable tasks.

...and I'm not positive about it either, but I think you need to compare tasks from the same batch number. So, for example, towards the bottom you have line 41 from batch 2452 and line 42 from batch 2455; I don't think those make the best comparison.
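
Assuming the batch number is the third numeric field from the end of the task name (e.g. the 2452 in a name ending ..._2452_<seed>_0 — a guess at the naming scheme, not a documented rule), a small sketch like this could key tasks by batch for pairing:

import re

def batch_key(wu_name):
    # Assumed convention: the batch number is the third numeric field
    # from the end of the wu name. This is a guess, not a documented rule.
    nums = re.findall(r"\d+", wu_name)
    return nums[-3] if len(nums) >= 3 else None

# Usage example with a made-up seed in place of the real one:
print(batch_key("CNTRL_01ABRELAX_SAVE_ALL_OUT_-1ubi_-_filters_1782_0000001_0"))  # -> 1782
# Only pair results whose batch keys agree.
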
Rosetta Moderator: Mod.Sense
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 49863 - Posted: 21 Dec 2007, 11:58:39 UTC
Last modified: 21 Dec 2007, 12:16:59 UTC

If I drop the ones that don't match on the third number from the end, I'm left with only ONE match for the AMD64 2800 and 3700, and two matches for the X2 4800. So I only found about one match per core for the 4 days invested; I don't know how long it'd take to get a "sizable" sample. LOL. Here's the chart with the extras removed (remember, blue is Linux, black is Windows):

[chart image: remaining matched wus after removing non-matching batches]

I pooled the data and determined the "average claimed credit/hour" and "average granted credit/hour" for all 5 dual-boot hosts. I only used data collected since Nov 26th, to avoid BOINC claimed-credit differences from client versions that were built with MSVC 2003. Looking at "granted credit" certainly shows Linux with roughly a 20% better yield in credit (except on the AMD64 2800). This is on a "per core", not "per machine", basis.
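
The pooling itself is simple arithmetic; a rough Python sketch, with placeholder column names (host, os, cpu_seconds, claimed, granted) standing in for whatever the export actually uses:

import csv
from collections import defaultdict

# Sum credit and CPU time per (host, OS), then report credit per hour.
totals = defaultdict(lambda: [0.0, 0.0, 0.0])   # [cpu_seconds, claimed, granted]
with open("results.csv", newline="") as f:      # hypothetical export file
    for row in csv.DictReader(f):
        t = totals[(row["host"], row["os"])]
        t[0] += float(row["cpu_seconds"])
        t[1] += float(row["claimed"])
        t[2] += float(row["granted"])

for (host, os_name), (cpu_s, claimed, granted) in sorted(totals.items()):
    hours = cpu_s / 3600.0
    if hours > 0:
        print(f"{host:12s} {os_name:8s} "
              f"claimed/hr={claimed / hours:6.2f} granted/hr={granted / hours:6.2f}")
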

DJStarfox

Joined: 19 Jul 07
Posts: 145
Credit: 1,250,162
RAC: 0
Message 49866 - Posted: 21 Dec 2007, 13:19:38 UTC - in response to Message 49863.  

I agree with Mod.Sense. You need more decoys per task, and you should compare tasks from the same batch number if possible. The real performance measure is seconds/decoy, not granted credit. (If the Linux client is claiming less/more credit, that is a separate issue from raw performance.) I also agree that increasing the CPU runtime will help you get more decoys per task. I would try an 8-hour runtime preference... that's enough to get 3 decoys in every task that Rosetta may throw your way.
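
For the record, seconds/decoy is trivial to compute from the exported results; a sketch with assumed column names (name, os, cpu_seconds, decoys):

import csv

# Raw performance as CPU seconds per decoy, independent of credit.
# Column names are assumptions about how the results were exported.
with open("results.csv", newline="") as f:
    for row in csv.DictReader(f):
        decoys = int(row["decoys"])
        if decoys > 0:
            print(row["name"], row["os"],
                  f"{float(row['cpu_seconds']) / decoys:7.1f} s/decoy")
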
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 49868 - Posted: 21 Dec 2007, 13:43:06 UTC
Last modified: 21 Dec 2007, 13:49:03 UTC

I agree that more samples would be nice; however, the time to acquire them makes it difficult. Also, as seen in the example below, there's as much as a 20% variation in "cpu seconds/decoy" on comparable wus returned by the same host (these also happen to have been returned on the same day by MSO's host 629701, an X5355 processor).

[table image: cpu seconds/decoy variation on MSO's host 629701]

Since credit is granted based upon the "average claimed credit/decoy" for all returned decoys, "granted credit/decoy" is relatively stable, and since one hour is always 3600 seconds, using "avg granted credit/hour" should be a reliable method for gauging performance. At least as long as a large enough (whatever that is) sample is used in the average.
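
To put numbers to that reasoning (made-up values, purely for illustration):

# Worked example of the "granted credit per hour" arithmetic.
granted_per_decoy = 9.5      # roughly stable, since it is averaged over all hosts
seconds_per_decoy = 350.0    # the part that actually differs between OS/host
credit_per_hour = granted_per_decoy * 3600.0 / seconds_per_decoy
print(f"{credit_per_hour:.1f} credits/hour")   # about 97.7 with these numbers
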
DJStarfox

Joined: 19 Jul 07
Posts: 145
Credit: 1,250,162
RAC: 0
Message 49870 - Posted: 21 Dec 2007, 13:52:38 UTC - in response to Message 49868.  

I agree that more samples would be nice; however, the time to acquire them makes it difficult. Also, as seen in the example below, there's as much as a 20% variation in "cpu seconds/decoy" on comparable wus returned by the same host (these also happen to have been returned on the same day by MSO's host 629701, an X5355 processor).

Since credit is granted based upon the "average claimed credit/decoy" for all returned decoys, "granted credit/decoy" is relatively stable, and since one hour is always 3600 seconds, using "avg granted credit/hour" should be a reliable method for gauging performance. At least as long as a large enough (whatever that is) sample is used in the average.


Here I was thinking Linux was slower, but you found that not to be the case? Perhaps it's because there is no graphics code in the Linux application.
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 49872 - Posted: 21 Dec 2007, 13:59:57 UTC - in response to Message 49870.  
Last modified: 21 Dec 2007, 14:05:49 UTC


Here I was thinking Linux was slower, but you found that not to be the case? Perhaps it's because there is no graphics code in the Linux application.

Using "seconds/decoy" is even less accurate. Looking at his CNTRL_01ABRELAX_SAVE_ALL_OUT_-1ubi_-_filters_1782_???????_0 wus: he did 149 of them; the high/low on seconds/decoy runs from 437.58 down to 297.36, with an average of 350.34; total CPU time reported ranges from 10,400 to 11,002 seconds; and decoys completed range from 25 to 36.

I don't have many Linux samples for the 5200 and 6000, but the current improvement percentage is in line with that of the X2 4800, for which I have enough samples.

I'll be collecting many more samples (about 100 days' worth), so things should get clearer and more definitive.
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 49873 - Posted: 21 Dec 2007, 14:27:19 UTC

Another idea... 1hr runtime preference, grab 3 days of work for each environment, at the "same" time. Suspend all but those that match, up the runtime to 24hrs, update to project, complete the matches. Reduce back to 1hr and complete the rest of the work.
Rosetta Moderator: Mod.Sense
Astro
Joined: 2 Oct 05
Posts: 987
Credit: 500,253
RAC: 0
Message 49975 - Posted: 23 Dec 2007, 17:02:08 UTC
Last modified: 23 Dec 2007, 17:36:36 UTC

I tried that for the most part and got some good numbers. It's very large, so I cut it down to two pics for the summary sheet; the last images represent ONE worksheet cut into pages and show ALL the work compared and which machines were used.

Here's the summary: on all but 2 of the 75 comparable runs, Linux was faster than Windows, and the overall average is 18.95% faster. The next two pics just show the wu type and the percentage by which Linux was faster.

[summary images: wu type and percentage by which Linux was faster]

And here's the data for each individual host. Remember, blue text is Linux, black is Windows. These are in order on the one huge sheet (AMD64 2800, AMD64 3700, AMD64 X2 4800, AMD64 X2 5200, and AMD64 X2 6000).

[per-host data images: AMD64 2800, AMD64 3700, AMD64 X2 4800, AMD64 X2 5200, AMD64 X2 6000]

All hosts dual boot from Windows into 64b Mandriva Linux and use 64b BOINC there. The AMD64 2800, AMD64 3700, and AMD64 X2 4800 run 32b WinXP with 32b BOINC. The AMD64 X2 5200 and 6000 run WinXP Pro 64 with 64b BOINC for Windows.
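
For reference, the per-wu percentages and the overall average could be reproduced with something like the sketch below. The pairing of Windows/Linux rows is assumed to have been done already, the sample numbers are made up, and "faster" here is taken to mean the reduction in CPU time relative to the Windows run:

from statistics import mean

def pct_faster(win_seconds, lin_seconds):
    # Percent by which the Linux run beat the Windows run on the same wu.
    return (win_seconds - lin_seconds) / win_seconds * 100.0

# Each pair is (Windows CPU seconds, Linux CPU seconds) for one comparable wu;
# the values below are illustrative only.
pairs = [(7200.0, 5900.0), (3600.0, 3010.0), (10800.0, 8650.0)]
speedups = [pct_faster(w, l) for w, l in pairs]
print(f"overall average: {mean(speedups):.2f}% faster on Linux")
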
