Message boards : Number crunching : Intel i7 CPU
Author | Message |
---|---|
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
|
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,673,616 RAC: 11,118 |
according to boincstats there are only 2 i7 cpus running Rosetta so far (that i could find): Intel(R) Core(tm) i7 CPU 940 @ 2.93GHz http://boincstats.com/stats/host_cpu_stats.php?pr=rosetta&teamid=&st=1400&or= I'm not sure how to find one of those in the rosetta stats though... Incidentally, I also spotted a transmeta CPU in the list! |
Kelowna Insta-Print Send message Joined: 18 Dec 05 Posts: 15 Credit: 66,449,833 RAC: 346 |
|
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,673,616 RAC: 11,118 |
hmmm.... from a v small sample size it looks like nehalem is currently slower than kentsfield! I assume because of its smaller L2 cache? From one page of results from that machine, it gets on average 20.5 credits per core per hour. My Q6600 gets 18.2 per core-hour, but the i7 is 3.2GHz against my 2.4GHz, so the i7 would get 20.5/3.2*2.4 = 15.4 credits per core-hour at 2.4GHz... If I use my Q6600's RAC (1570) instead, that falls to 1570/24/4 = 16.4 credits per core-hour, but that's still a bit higher than the i7. Looks like Penryn will still be top-dog here, although it might vary with different tasks, or if the bakerlab guys can implement some kind of optimisations for SSE etc. |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
"hmmm.... from a v small sample size it looks like nehalem is currently slower than kentsfield! I assume because of its smaller L2 cache?" I thought the Nehalems would still have butt-loads of L2 cache even though they fixed the ridiculous CPU-Northbridge-CPU "interCore" communication, which reduces the delay of communication between cores. Oh well. We'll have to see a larger sample. |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,673,616 RAC: 11,118 |
"hmmm.... from a v small sample size it looks like nehalem is currently slower than kentsfield! I assume because of its smaller L2 cache?" 256KB per core L2 - the L3 is much higher latency (although obviously a lot better than going to RAM). Same setup as Barca/Deneb I believe... |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,673,616 RAC: 11,118 |
"hmmm.... from a v small sample size it looks like nehalem is currently slower than kentsfield! I assume because of its smaller L2 cache?" I thought BOINC only ran one task per physical CPU now? If not, we should be telling people to do that for better throughput... |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
"hmmm.... from a v small sample size it looks like nehalem is currently slower than kentsfield! I assume because of its smaller L2 cache?" I don't think the OS is capable of distinguishing between a virtual and physical core. When HT is enabled, it tells the OS that it has twice the available cores (2 virtual cores per 1 physical core). |
Michael G.R. Send message Joined: 11 Nov 05 Posts: 264 Credit: 11,247,510 RAC: 0 |
It takes a while for RAC to stabilize. Could that be the reason why the i7 seems slower? |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
|
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,673,616 RAC: 11,118 |
"It takes a while for RAC to stabilize. Could that be the reason why I7 seems slower?" no - i wasn't looking at RAC, i just dumped a page of its submitted results into excel and worked out the average granted credit per hour. There's variation in the credit depending on a few factors but it gives a ball-park figure. A stable RAC is the best test though. Maybe Nehalem will perform better if all the cores are running tasks that depend on the same files so the L3 is used more efficiently... I've worked out why the credit from my E2180 was low - it was throttling down to 1.2GHz - it's now at 2.6GHz and i've dropped the voltage to 1.1V :) Strangely, CPU-Z identifies it as an E4400... |
The_Bad_Penguin Send message Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
i'm gonna hafta have rosie wait awhile, i broke down and purchased ($210!) an i7 920. now, i have to save some $$$ for the mobo and ddr3 ram, and build the darn thing. think it'll eventually be worth the effort. curious to see what 8 virtual cores can do 24/7 for rosie... |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
"i'm gonna hafta have rosie wait awhile, i broke down and purchased ($210!) an i7 920. now, i have to save some $$$ for the mobo and ddr3 ram, and build the darn thing. think it'll eventually be worth the effort." what do u do for a living now? or did u just win the lotto? |
The_Bad_Penguin Send message Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
would love to win the lotto!!! am just a lowly law school graduate, trying to eke out a living... $210 for the i7 was a super-sale price, regular price is $300 + $25 tax. |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
|
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,673,616 RAC: 11,118 |
"i'm gonna hafta have rosie wait awhile, i broke down and purchased ($210!) an i7 920. now, i have to save some $$$ for the mobo and ddr3 ram, and build the darn thing. think it'll eventually be worth the effort." i'd be interested to see how good the i7 is on power consumption with Rosie running... should be very good. |
GT82 [HWU] Send message Joined: 26 Aug 07 Posts: 15 Credit: 154,103 RAC: 0 |
From these numbers it seems that Nehalem is not performing well with Rosetta. But we must consider that it has 8 logical cores; in fact the computer page on BOINC says "Number of CPUs: 8". Only long-term RAC can tell us more... |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,673,616 RAC: 11,118 |
"From these numbers it seems that Nehalem is not performing well with Rosetta. But we must consider that it has 8 logical cores; in fact the computer page on BOINC says 'Number of CPUs: 8'." Previously hyperthreading hasn't helped - I'd think it would be the same for Nehalem, as the L2 cache will have to swap info for the two threads. |
FoldingSolutions Send message Joined: 2 Apr 06 Posts: 129 Credit: 3,506,690 RAC: 0 |
The i7 processor has an inclusive L3 cache. For those who don't know what that means: a small portion (1MB) of the 8MB L3 cache is reserved for the contents of all four 256KB L2 caches, so that if core 1 wants something from core 3 it can simply go to the L3 rather than diving straight into core 3's L2 cache, which prevents a core from "stalling". Hyper-threading will not improve the credits per hour of individual cores, but it certainly will for the CPU as a whole. Just think of it as 2/3 + 2/3 = 4/3: even if each "hyperthreaded" core is only 2/3 as powerful as a full core, the two together add up to more than one full core. It is also true that the small amount of L2 in the i7 is a concern for applications which "cache thrash", such as Rosetta. But this effect will be minimised by the large L3 cache and the new QuickPath interconnect, which much reduces latency to the main memory, so the 8 cores will be waiting less for data. The large L2 caches we have been seeing on some of the latest Core 2 CPUs are there to make up for the extreme latency problems of the ancient & inadequate front side bus. The TDP of all released Core i7 CPUs is 130W. Fortunately this isn't what Rosetta will use, as this figure assumes the CPU is fed perfectly with data to process and that all instruction units (MMX, SSE, SSE2 etc.) are being used, so I would expect the CPU itself to be knocking out more like ~90W. We also have to remember that, unfortunately, Intel doesn't design processors for people like us who see them as tools for science, but rather for large corporations who don't like to wait for access to massive databases of who's been taking too many days off etc. To this end we may have to wait a while for Rosetta to benefit significantly from any major changes in CPU architecture. Also, the advent of green computing could mean fewer major improvements too :( |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,673,616 RAC: 11,118 |
"Hyper-threading will not improve the credits per hour of individual cores, but certainly will for the CPU as a whole." Not sure that will be the case - if you're running two hyperthreads then each can have half of the L2 cache. That's 128KB each. If it wants more then it has to go to L3, which I think is over 40 cycles rather than the 12 or so for L2. Running one thread would give double the L2 cache per thread and therefore reduce the waits for L3. As previously, HT might help other threads keep out of the way of Rosetta, but for Rosetta throughput the critical factor will be whether the increased use of the L3 (with its latency penalty) will be compensated for by the interleaving of the threads under HT. From what I've read the L3 will prove useful for apps that share data between cores, but not for discrete threads. I've seen a flaw in my previous logic though - my original calc was per core, but of course there are 8 logical cores here. Each is getting slightly less credit than my Q6600 per cycle, but as there are two threads per physical core (it looks like that machine is running 8 threads from the sum of the time on all tasks), that's quite impressive. I'd be interested to see one with and without HT for comparison, as it might be even quicker with only four Rosetta threads running. |
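
The per-clock and per-physical-core comparisons worked through in this thread can be reproduced with a few lines of Python. This is only a sketch using the thread's own figures (20.5 and 18.2 credits per core-hour, 3.2GHz against 2.4GHz, a RAC of 1570 on four cores). The function names are made up for illustration, the linear scaling with clock speed is the same rough assumption the posts make rather than a measured property of Rosetta, and the count of two busy threads per physical i7 core is taken from the observation that the machine appears to run 8 tasks at once.

```python
def scale_to_clock(credit_rate, measured_ghz, target_ghz):
    """Rescale a credit rate to another clock speed, assuming linear scaling."""
    return credit_rate / measured_ghz * target_ghz


def rac_to_core_hour(rac, cores):
    """Convert RAC (credits per day for the whole host) to credits per core-hour."""
    return rac / 24 / cores


# Figures quoted in the thread (credits per logical-core-hour).
I7_PER_THREAD = 20.5      # i7 at 3.2GHz, per logical core
Q6600_PER_CORE = 18.2     # Q6600 at 2.4GHz, per core (no hyper-threading)
THREADS_PER_I7_CORE = 2   # assumption: HT on, all 8 logical cores busy

# Per-thread comparison at equal clocks: roughly 15.4 vs 18.2.
i7_per_thread_at_2_4 = scale_to_clock(I7_PER_THREAD, 3.2, 2.4)

# Cross-check of the Q6600 figure from its RAC: roughly 16.35.
q6600_from_rac = rac_to_core_hour(1570, 4)

# Per-physical-core comparison: each i7 core runs two threads at once.
i7_per_physical_core = I7_PER_THREAD * THREADS_PER_I7_CORE          # 41.0
i7_physical_at_2_4 = scale_to_clock(i7_per_physical_core, 3.2, 2.4)  # 30.75

if __name__ == "__main__":
    print(f"i7 per thread at 2.4GHz:        {i7_per_thread_at_2_4:.2f}")
    print(f"Q6600 per core (from RAC):      {q6600_from_rac:.2f}")
    print(f"i7 per physical core at 2.4GHz: {i7_physical_at_2_4:.2f}")
    print(f"Q6600 per core (from results):  {Q6600_PER_CORE:.2f}")
```

Counted per logical core the i7 looks slower per clock than the Q6600, but counted per physical core it comes out well ahead, which is the correction made in the last post. Whether four threads without HT would beat eight threads with HT cannot be answered from these numbers alone.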
©2024 University of Washington
https://www.bakerlab.org