Message boards : Number crunching : Current issues with 7+ boinc client
Author | Message |
---|---|
microchip Send message Joined: 10 Nov 10 Posts: 10 Credit: 2,255,420 RAC: 2,219 |
Works for me here on Linux, though I still use BOINC 6. Is this problem only related to BOINC 7 + GPU? My specs are: AMD Phenom II X6 1090T @ stock speed; nVIDIA GeForce GTX 560 @ stock speed; openSUSE 12.1 64-bit. Team Belgium |
tanstaafl9999 Send message Joined: 8 Mar 12 Posts: 2 Credit: 1,688,827 RAC: 0 |
I don't think this is just an NVIDIA GPU problem. I'm using an AMD 5870, and had so many errors here that I stopped getting new tasks a while ago. Right now I just check in here occasionally to see if there are any signs of a fix for this. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Works for me here on Linux, though I still use BOINC 6. Is this problem only related to BOINC 7 + GPU? Yes, BOINC v7 is always part of the problem scenario. Machines that were working fine with whatever hardware they had upgrade to BOINC 7 and start getting validation errors on their reported tasks. It would be interesting to see a machine (perhaps with network suspended) that has completed a few tasks that have not been reported yet, has a few partially done, and has a few it has not yet started on, upgrade from BOINC 6 to 7. I'm curious which of the above would be flagged with errors. If the ones that were completed before the upgrade are tagged as errors, then would that not surely point to a BOINC problem? Because you know they would have reported fine under BOINC 6. If those are OK, but the tasks in progress are flagged in error, would that not surely point to a BOINC problem? Because you know they would have reported fine under BOINC 6. And if those are OK, but only the tasks that had not been started yet were flagged as errors, would that not surely point to a BOINC problem? I mean the Rosetta version and program is identical between BOINC versions. The validator is the same regardless of client version. The only thing that ALL machines I've seen reported as having problems have in common is the upgrade to BOINC 7. The only reason I can see that people keep feeling this is a Rosetta problem is because they have other projects that do not encounter the problem. But to my thinking, that does not rule out the thought that there are subtle problems of some kind in BOINC 7. Or perhaps with BOINC 7 clients and older BOINC server version code. Rosetta Moderator: Mod.Sense |
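A minimal sketch of how the upgrade experiment described above could be tallied, assuming each task is noted down by hand with the state it was in when the client was upgraded and the outcome the server later assigned. The `Task` record, the bucket names, and the sample rows are hypothetical and are not real results; nothing here talks to BOINC.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical names for the three task states in the proposed experiment.
BUCKETS = ("completed", "in_progress", "not_started")


@dataclass(frozen=True)
class Task:
    name: str              # any label for the work unit
    state_at_upgrade: str  # one of BUCKETS
    outcome: str           # "valid" or "error", as later shown by the server


def tally(tasks):
    """Count outcomes separately for each pre-upgrade state."""
    counts = {bucket: Counter() for bucket in BUCKETS}
    for task in tasks:
        if task.state_at_upgrade not in counts:
            raise ValueError(f"unknown state: {task.state_at_upgrade}")
        counts[task.state_at_upgrade][task.outcome] += 1
    return counts


def buckets_with_errors(tasks):
    """Return the pre-upgrade states in which at least one task errored."""
    counts = tally(tasks)
    return [bucket for bucket in BUCKETS if counts[bucket]["error"] > 0]


# Made-up sample rows, only to show the shape of the input.
SAMPLE = [
    Task("wu_a", "completed", "valid"),
    Task("wu_b", "in_progress", "error"),
    Task("wu_c", "not_started", "error"),
]

print(buckets_with_errors(SAMPLE))
```

Keeping the three buckets separate is the point of the exercise: which buckets show errors is what the questions in the post turn on.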
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,162,382 RAC: 4,112 |
Works for me here on Linux, though I still use BOINC 6. Is this problem only related to BOINC 7 + GPU? I think the last line is the key...when version 7 came out they said it was a BIG change from earlier versions. IF Rosetta is STILL using OLDER Server coding then that is a very likely cause of the problem. Your test would confirm that, I think. But even that does not mean Rosetta would do anything. The FIRST post in THIS thread was from 15 Oct 2012 while "David E K" has NOT EVEN POSTED since Message 74269 - Posted 13 Nov 2012 22:57:59 UTC!!! It has been almost A MONTH and he has done NOTHING except start a thread and say "I am very busy but will look into it at some point"! That is like calling a company and asking where your check is and they say "the check is in the mail"!!! In about a week, or less, I have to move my 30+ CPUs someplace other than where they are currently crunching. Rosetta would be my FIRST choice! BUT it is NOT possible due to the problems here, and DOWNGRADING is not an option!! In fact I am now using BOINC version 7.0.40 on ALL of my machines!!! |
Alez Send message Joined: 3 Apr 12 Posts: 13 Credit: 3,534,368 RAC: 0 |
I have 4 different Intel machines crunching, all running BOINC 7, all running Windows 7 etc. I crunch for 23 different projects and the only project I have errors on is Rosetta, and only on my i7 with 3 different NVIDIA cards fitted. The tasks do not even run on the i7; they error out after only a few seconds. I have not managed to successfully crunch a single unit on the i7. Every one fails almost instantly. None of the other 22 projects have this issue. |
Alez Send message Joined: 3 Apr 12 Posts: 13 Credit: 3,534,368 RAC: 0 |
I have 4 different Intel machines crunching, all running BOINC 7, all running Windows 7 etc. I crunch for 23 different projects and the only project I have errors on is Rosetta, and only on my i7 with 3 different NVIDIA cards fitted. Sorry, edited to say that all units have run their full course but ended with the invalid error when completed. I was thinking of something else. I have upgraded to BOINC 7.40 and will try a unit with that and see if it makes any difference. |
Mad_Max Send message Joined: 31 Dec 09 Posts: 209 Credit: 25,935,543 RAC: 12,792 |
Good work! Can you confirm what drivers they're using and if they're not the latest, then whether updating the drivers makes any difference? No, reinstalling the drivers does not make any difference. Currently this version of the drivers is installed: Version: 306.97 WHQL - Release Date: Wed Oct 10, 2012. With an nVidia GTX 670, all R@H WUs fail at the validation stage. The same computer, with the same driver version but a different video card (nVidia GTX 560 Ti), works fine. |
Mad_Max Send message Joined: 31 Dec 09 Posts: 209 Credit: 25,935,543 RAC: 12,792 |
One more machine with the bug: https://boinc.bakerlab.org/rosetta/result.php?resultid=548993345 It has an NV Kepler GPU as well (ASUS GTX670-DC2OG-2GD5). |
Mad_Max Send message Joined: 31 Dec 09 Posts: 209 Credit: 25,935,543 RAC: 12,792 |
I mean the Rosetta version and program is identical between BOINC versions. The validator is the same regardless of client version. The only thing that ALL machines I've seen reported as having problems have in common is the upgrade to BOINC 7. The only reason I can see that people keep feeling this is a Rosetta problem is because they have other projects that do not encounter the problem. But to my thinking, that does not rule out the thought that there are subtle problems of some kind in BOINC 7. Or perhaps with BOINC 7 clients and older BOINC server version code. One MAJOR reason why people keep feeling this is a Rosetta problem is that we have TRIED MANY TIMES (on different computers) to downgrade back to BOINC v6 and it does NOT fix the problem! If the problem were due to a bug in BOINC 7.x, then using the old version should solve it. But it does not. So if it is still a BOINC bug, then it is present in both 7.x and 6.x. Also, removing the video card from the computer FIXES the problem (regardless of which version of BOINC is used), and of course the absence of problems in other DC projects is a reason too. |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
|
Alez Send message Joined: 3 Apr 12 Posts: 13 Credit: 3,534,368 RAC: 0 |
BOINC 7.40 made no difference. The unit ran all the way to conclusion, with the same invalid result. |
Mad_Max Send message Joined: 31 Dec 09 Posts: 209 Credit: 25,935,543 RAC: 12,792 |
NV GTX 670 + BOINC 6.12.34 + Rosetta@home = 100% Error rate (same as with 7.x BOINC): https://boinc.bakerlab.org/rosetta/results.php?hostid=1584545&offset=20 |
Rush Send message Joined: 10 Mar 07 Posts: 1 Credit: 23,293,152 RAC: 2,884 |
For what it's worth, I was encountering the client errors on all Rosetta work units using BOINC 7.0.27; I switched to 6.12.34 and the first three units completed and reported successfully. System specs from event log:
Wed 12 Dec 2012 02:18:33 PM EST | | Starting BOINC client version 6.12.34 for x86_64-pc-linux-gnu
Wed 12 Dec 2012 02:18:33 PM EST | | log flags: file_xfer, sched_ops, task
Wed 12 Dec 2012 02:18:33 PM EST | | Libraries: libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
Wed 12 Dec 2012 02:18:33 PM EST | | Data directory: /home/rush/BOINC
Wed 12 Dec 2012 02:18:33 PM EST | | Processor: 6 AuthenticAMD AMD Phenom(tm) II X6 1045T Processor [Family 16 Model 10 Stepping 0]
Wed 12 Dec 2012 02:18:33 PM EST | | Processor: 512.00 KB cache
Wed 12 Dec 2012 02:18:33 PM EST | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni monitor
Wed 12 Dec 2012 02:18:33 PM EST | | OS: Linux: 3.2.0-23-generic
Wed 12 Dec 2012 02:18:33 PM EST | | Memory: 7.80 GB physical, 1.75 GB virtual
Wed 12 Dec 2012 02:18:33 PM EST | | Disk: 217.22 GB total, 172.16 GB free
Wed 12 Dec 2012 02:18:33 PM EST | | Local time is UTC -5 hours
Wed 12 Dec 2012 02:18:33 PM EST | | NVIDIA GPU 0: GeForce 8400 GS (driver version unknown, CUDA version 4020, compute capability 1.1, 512MB, 22 GFLOPS peak)
Wed 12 Dec 2012 02:18:33 PM EST | | No general preferences found - using BOINC defaults
Wed 12 Dec 2012 02:18:33 PM EST | | Preferences:
Wed 12 Dec 2012 02:18:33 PM EST | | max memory usage when active: 3993.34MB
Wed 12 Dec 2012 02:18:33 PM EST | | max memory usage when idle: 7188.00MB
Wed 12 Dec 2012 02:18:33 PM EST | | max disk usage: 10.00GB
Wed 12 Dec 2012 02:18:33 PM EST | | don't use GPU while active
Wed 12 Dec 2012 02:18:33 PM EST | | suspend work if non-BOINC CPU load exceeds 25 %
I have Windows boxes with various configurations running V7 without any errors on Rosetta work units. Rush |
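Since several posts in this thread paste startup logs, here is a rough Python sketch for pulling out the two details that keep coming up: the client version and whether a GPU was detected. It assumes the `timestamp | project | message` layout seen in the pasted logs and matches only message wordings that appear in this thread (`Starting BOINC client version`, `NVIDIA GPU n:`, `No usable GPUs found`); other clients or GPU vendors may word these lines differently. The function names are made up for illustration.

```python
import re

# Patterns based only on message wordings seen in the logs pasted in this thread.
VERSION_RE = re.compile(r"Starting BOINC client version (\d+\.\d+\.\d+)")
NVIDIA_RE = re.compile(r"NVIDIA GPU \d+: ([^(]+)")


def split_log_line(line):
    """Split 'timestamp | project | message' into three stripped fields."""
    parts = [part.strip() for part in line.split("|", 2)]
    return tuple(parts) if len(parts) == 3 else None


def summarize(lines):
    """Collect the client version and GPU detection result from startup lines."""
    summary = {"client_version": None, "gpus": [], "no_usable_gpu": False}
    for line in lines:
        fields = split_log_line(line)
        if fields is None:
            continue  # not a log line in the expected layout
        message = fields[2]
        version = VERSION_RE.search(message)
        if version:
            summary["client_version"] = version.group(1)
        gpu = NVIDIA_RE.match(message)
        if gpu:
            summary["gpus"].append(gpu.group(1).strip())
        if message.startswith("No usable GPUs found"):
            summary["no_usable_gpu"] = True
    return summary


# Three lines copied from the log above.
LOG = [
    "Wed 12 Dec 2012 02:18:33 PM EST | | Starting BOINC client version 6.12.34 for x86_64-pc-linux-gnu",
    "Wed 12 Dec 2012 02:18:33 PM EST | | OS: Linux: 3.2.0-23-generic",
    "Wed 12 Dec 2012 02:18:33 PM EST | | NVIDIA GPU 0: GeForce 8400 GS (driver version unknown, CUDA version 4020, compute capability 1.1, 512MB, 22 GFLOPS peak)",
]

print(summarize(LOG))
```

Splitting on the first two `|` characters only keeps any later pipe inside the message text intact.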
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
I guess I'll add mine to the ones that DON'T have a problem with running Rosetta. This is a fairly new toy I built; it's been running here with a slight O/C of 3.9 GHz running 8 threads, and I don't use GPUs on it, can't be bothered with them.
------------------------------------
Thu 13 Dec 2012 07:17:00 EST | | Starting BOINC client version 7.0.27 for x86_64-pc-linux-gnu
Thu 13 Dec 2012 07:17:00 EST | | log flags: file_xfer, sched_ops, task
Thu 13 Dec 2012 07:17:00 EST | | Libraries: libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
Thu 13 Dec 2012 07:17:00 EST | | Data directory: /var/lib/boinc-client
Thu 13 Dec 2012 07:17:00 EST | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-2700K CPU @ 3.50GHz [Family 6 Model 42 Stepping 7]
Thu 13 Dec 2012 07:17:00 EST | | Processor: 8.00 MB cache
Thu 13 Dec 2012 07:17:00 EST | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
Thu 13 Dec 2012 07:17:00 EST | | OS: Linux: 3.2.0-34-generic
Thu 13 Dec 2012 07:17:00 EST | | Memory: 7.76 GB physical, 7.96 GB virtual
Thu 13 Dec 2012 07:17:00 EST | | Disk: 65.43 GB total, 51.41 GB free
Thu 13 Dec 2012 07:17:00 EST | | Local time is UTC +11 hours
Thu 13 Dec 2012 07:17:00 EST | | No usable GPUs found
Thu 13 Dec 2012 07:17:00 EST | | Config: GUI RPC allowed from:
Thu 13 Dec 2012 07:17:00 EST | | A new version of BOINC is available. <a href=http://boinc.berkeley.edu/download.php>Download it.</a>
Thu 13 Dec 2012 07:17:00 EST | World Community Grid | URL http://www.worldcommunitygrid.org/; Computer ID 2133997; resource share 100
Thu 13 Dec 2012 07:17:00 EST | rosetta@home | URL https://boinc.bakerlab.org/rosetta/; Computer ID 1557494; resource share 100
Thu 13 Dec 2012 07:17:00 EST | World Community Grid | General prefs: from World Community Grid (last modified 14-Nov-2012 15:29:20)
Thu 13 Dec 2012 07:17:00 EST | World Community Grid | Computer location: school
Thu 13 Dec 2012 07:17:00 EST | | General prefs: using separate prefs for school
Thu 13 Dec 2012 07:17:00 EST | | Preferences:
Thu 13 Dec 2012 07:17:00 EST | | max memory usage when active: 6359.16MB
Thu 13 Dec 2012 07:17:00 EST | | max memory usage when idle: 7154.05MB
Thu 13 Dec 2012 07:17:00 EST | | max disk usage: 10.00GB
Thu 13 Dec 2012 07:17:00 EST | | don't use GPU while active
Thu 13 Dec 2012 07:17:00 EST | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
Thu 13 Dec 2012 07:17:00 EST | | Not using a proxy |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,162,382 RAC: 4,112 |
I guess i'll add mine to the one's that DON'T have a problem with running Is there a chance I can convince you GPUs are a GOOD thing? I am including my signature for you to see what GPUs can do for both your points BUT MORE IMPORTANTLY the Projects!! The first 6 are GPU-only numbers; the next few, down to Seti, are a mixture of GPU and CPU numbers. Seti on down is CPU ONLY!! You can contribute A LOT more to a project through a GPU than you can with a CPU! Now NOT all projects can use a GPU yet, Rosetta being a prime example of that, but for those that can, your contribution to the cause can be exponential compared to a CPU! I too used to crunch with CPUs only, but in the last few years GPUs have taken off, and even the newer 'Sandy Bridge' i7s have built-in GPUs that the new version of BOINC, 7.0.40, can detect and use to crunch with. |
mmstick Send message Joined: 4 Dec 12 Posts: 8 Credit: 606,792 RAC: 0 |
I had no idea this was a problem. I've been crunching with my Radeon HD 7950 in World Community Grid and POEM@Home while doing Rosetta@home tasks and never had a single problem with invalidated or errored work units. I'm using BOINC v7 as well. |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
I guess i'll add mine to the one's that DON'T have a problem with running My i7 is giving around +4K of RAC, which is pretty impressive. It's around 75 GFLOPS fast (or so I've read). Yet my NVIDIA card (which is running GPUGRID) is currently doing 100K of RAC and still climbing. It's around 850 GFLOPS fast. This is why I'd rather crunch WCG (which uses the Rosetta software) with my CPU instead of R@H. Now, if R@H would fix this bug, I'd gladly join back. |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,162,382 RAC: 4,112 |
I had no idea this was a problem. I've been crunching with my Radeon HD 7950 in World Community Grid and POEM@Home while doing Rosetta@home tasks and never had a single problem with invalidated or errored work units; Using BOINC v7 as well. Umm, not exactly:
549205051 | 499214705 | 9 Dec 2012 22:07:32 UTC | 11 Dec 2012 7:06:39 UTC | Over | Client error | Compute error | 11,468.76 | 79.64 | ---
549203504 | 499213284 | 9 Dec 2012 21:58:59 UTC | 9 Dec 2012 23:27:03 UTC | Over | Validate error | Done | 580.03 | --- | ---
549209311 | 499218156 | 9 Dec 2012 22:45:08 UTC | 14 Dec 2012 5:32:01 UTC | Over | Validate error | Done | 215.59 | --- | ---
And then a ton of units 'aborted by user'. I went as far back as the stats I can see and you only had one valid unit that you got credits for. You may have had nothing but success prior to what I can see, I have no idea, but you did have some problems too. I still think the problem is based around the GPU and its drivers. Chilean has two things that are contradictory there...his list says: Thu 13 Dec 2012 07:17:00 EST | | No usable GPUs found but further down he says "Yet my NVIDIA card (which is running GPUGRID)", so either they are not from the same PC or there IS a problem someplace! |
Student Send message Joined: 24 Oct 06 Posts: 3 Credit: 57,404 RAC: 0 |
I've just tried turning off the graphics card in my laptop (Core i5-3210M, GT640M, 8GB RAM) in the BIOS (from switchable to integrated) and it works. Units got credit immediately! What a surprise :-). All WUs were granted successfully (except for one I didn't manage to download; I don't know why and I don't care). Before I bought this computer I had been crunching only Rosetta. But with this new computer all WUs got client errors, so I changed project. Sorry :-(. Downgrading from BM 7 to BM 6 didn't help. Now with the GT640M deactivated and with BM 7.0.28 it's working well. My conclusion is that the problem is not only the BOINC manager, because all other projects work well, except for Rosetta. My question is, why is just the presence of the graphics card causing the trouble, when all other projects run well??? I really didn't buy this laptop to have the graphics card off :-). |
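A rough Python sketch that lines up the reports posted so far and checks whether either factor, the GPU or the BOINC major version, accounts for the outcome on its own. The rows are transcribed from posts in this thread; reducing each post to one tuple is a simplification, and the `Report` type and function names are made up for illustration. Reports that do not state both the GPU situation and the BOINC version are left out.

```python
from collections import defaultdict, namedtuple

# Hypothetical record type; gpu=None means no discrete GPU visible to BOINC.
Report = namedtuple("Report", "poster gpu boinc_major rosetta_ok")

# Transcribed from posts in this thread (one row per reported configuration).
REPORTS = [
    Report("Mad_Max", "GTX 670", 7, False),
    Report("Mad_Max", "GTX 670", 6, False),
    Report("Rush", "GeForce 8400 GS", 7, False),
    Report("Rush", "GeForce 8400 GS", 6, True),
    Report("P.P.L.", None, 7, True),
    Report("Student", "GT640M", 7, False),
    Report("Student", "GT640M", 6, False),
    Report("Student", None, 7, True),
]


def outcomes_by(field, reports=REPORTS):
    """Map each value of one factor to the set of outcomes seen with it."""
    seen = defaultdict(set)
    for report in reports:
        seen[getattr(report, field)].add(report.rosetta_ok)
    return dict(seen)


def explains_alone(field, reports=REPORTS):
    """True if every value of the factor always goes with a single outcome."""
    return all(len(outcomes) == 1 for outcomes in outcomes_by(field, reports).values())


for field in ("gpu", "boinc_major"):
    print(field, outcomes_by(field), explains_alone(field))
```

With these rows neither factor separates working from failing machines by itself, which matches the disagreement in the thread; the only value that never appears with a failure is "no GPU".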
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,162,382 RAC: 4,112 |
I've just tried turning off the graphics card in my laptop (Core i5-3210M, GT640M, 8GB RAM) in the BIOS (from switchable to integrated) and it works. Units got credit immediately! What a surprise :-). All WUs were granted successfully (except for one I didn't manage to download; I don't know why and I don't care). I think that since it works everywhere else, i.e. in other projects, it must be the Server side of the equation that is causing the problems. The unit crunches just fine; it is when we send them back that the problem occurs. Somehow the Server is looking for something and getting something else and CRASH goes the unit. The beta project for Rosetta works just fine with no problems, so someone made a change to the Rosetta Server and obviously had no clue what they were doing or how it would affect other things! They are probably STILL walking around clueless, otherwise one would think they would FIX IT!! |
©2024 University of Washington
https://www.bakerlab.org