Message boards : Number crunching : Client errors
Author | Message |
---|---|
Ananas Send message Joined: 1 Jan 06 Posts: 232 Credit: 752,471 RAC: 0 |
This definitely is a core client problem. It usually occurs when the cache is either really stuffed (high values for the first two options in "Network usage") or the other project has collected a really high "long term debt". Unfortunately the BOINC GUI has no feature to reset those debts, the command line thingie can do it though, e.g.: boinccmd.exe --host <YourComputerName> --set_debts http://boinc.fzk.de/poem/ 0 0 or boinccmd.exe --host <YourComputerName> --set_debts https://boinc.bakerlab.org/rosetta/ 0 100000 The second value is the one that modifies the long term debt, but the command needs the short term value as well, that's why you have to put both values. |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,208,737 RAC: 3,249 |
Okay but ALL pc's say that as one project I am trying to get to a goal on has no units, so NO project is the highest priority, except the one with the goal, but since it has no units why wouldn't Rosie fill in with some? THAT is the problem, I had NO cpu units on the pc and Rosie refused to get any despite having plenty available!! On that pc I only had 3 projects selected, 1 was a gpu project, Poem, I had it set to NOT get cpu units, and ABC which has no units and Rosie. Obviously I want to crunch Rosie when there are no ABC units. Poem got plenty of gpu units, while my cpu cores were starving. It is a 6 core pc and I only run 1 gpu unit at a time on that pc, so Poem using 1 cpu core made no difference. BUT I am off to Poem on that pc now and it is okay. With the size of the credits here my goal is a LONG way off, 1 pc here or there for me is immaterial in the long run. Thanks for your help! |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
Use <work_fetch_debug>1</work_fetch_debug> in cc_config.xml to see BOINC's work fetch policy decisions in the log. . |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,208,737 RAC: 3,249 |
Use <work_fetch_debug>1</work_fetch_debug> in cc_config.xml to see BOINC's work fetch policy decisions in the log. THANKS! |
Pushkin Send message Joined: 10 Mar 07 Posts: 14 Credit: 7,068,050 RAC: 0 |
Hi, I did some tests, which surprised me a little. First I tried to run Rosetta in a separate session without X running - both tasks ended up with a client error (see results no. 560429506 and 560429572). Then I ran Rosetta in a virtual machine (VirtualBox 4.1.18, BOINC 7.0.27 x86_64) with Windows installed. This task succeeded (task no. 560619860). There is a lot of emulated hardware, so it did not surprise me too much. But my third try was running Rosetta in WINE with BOINC 7.0.27 x86 - this task (no. 560653185) ended up with success again. There is no emulated hardware, just a few libraries, so I am quite surprised. Or is WINE such a different environment for Rosetta compared to native Linux? Tomorrow I will try to replace the nVidia drivers with open-source drivers and I'll let you know about the result. Greetings, Pushkin |
Pushkin Send message Joined: 10 Mar 07 Posts: 14 Credit: 7,068,050 RAC: 0 |
Hi, today I installed the Nouveau driver instead of the proprietary nVidia drivers. The result is - success. Task no. 560822374 ended without a client error. It seems that the problem is really caused by the nVidia drivers, not the hardware itself. Greetings, Pushkin |
[VENETO] boboviz Send message Joined: 1 Dec 05 Posts: 1997 Credit: 9,747,451 RAC: 10,562 |
I'll ask people here to submit more test jobs to Ralph. By the way, Ralph code of 3.45 sucks.... No screensaver, no checkpoint, a LOT of errors. Please, fix it |
Polian Send message Joined: 21 Sep 05 Posts: 152 Credit: 10,141,266 RAC: 0 |
Nice test! To add, on one box, I am using kmod vice nouveau without any problems. It is, however, a 1st-gen i7-950 with a GTX460, not an Ivy Bridge CPU nor the latest model video card. It seems to me that most (all?) folks having this particular problem are using nVidia cards with Ivy Bridge CPUs - but this could be wrong. Can anyone verify? |
Chilean Send message Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0 |
Ivy Bridge here. Bug here. |
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
I'll ask people here to submit more test jobs to Ralph. Any specific examples would help. 3.45 is what is running on R@h so hopefully there isn't a general issue that we need to address other than the ones we are already aware of. Also, there may be a lot of errors because some lab members in our group are testing new jobs on Ralph (the main purpose of Ralph) so they may fail. |
JAMES DORISIO Send message Joined: 25 Dec 05 Posts: 15 Credit: 201,474,191 RAC: 28,532 |
This computer also has this problem & is not Ivy Bridge. Intel(R) Pentium(R) 4 CPU 3.00GHz, Nvidia gts450. Ubuntu linux 12.04 amd64, nvidia driver 310.14, Boinc 7.0.27. It was ok until upgrading from Ubuntu 10.04 & the new drivers & boinc that came with it. https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1485068 I have been checking Ralph but it never shows tasks available, I will try to set up a computer there anyway as soon as I get a chance. Jim |
Polian Send message Joined: 21 Sep 05 Posts: 152 Credit: 10,141,266 RAC: 0 |
This computer also has this problem & is not Ivy Bridge. Ugh. There goes that theory, heh. I was initially wondering about the integrated graphics controller in ivy bridge CPUs with NVIDIA cards. |
Mad_Max Send message Joined: 31 Dec 09 Posts: 209 Credit: 26,262,530 RAC: 19,111 |
Yes, Ivy Bridge is not an essential part of the bug. This bug has been seen on many CPUs, from Pentium 4 through 4 generations of "Core" CPUs (not sure about AMD processors). An Nvidia GPU is the essential part. It seems to be not the hardware but an active (not just installed) nv driver. Or not even the driver itself, but something driver related. There are no clearly "good" or clearly "bad" versions. But the bug is most often seen after installing / updating drivers, and it often disappears after changing to a different version. |
Pushkin Send message Joined: 10 Mar 07 Posts: 14 Credit: 7,068,050 RAC: 0 |
Hi guys, something strange happened. After all those playing with drivers I went back to proprietary drivers and since then I receive successful tasks - 560862664, 561025484, 561025532, 561026069 and 561026753. Did anything change in Rosetta code or should I start to believe in miracles? :-O Greetings, Pushkin |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,208,737 RAC: 3,249 |
Yes Ivy Bridge is not essential part of the bug. This bug seen on many CPUs from Pentium 4 through 4 generations of "Core" CPUs. (not sure about AMD processors) I have an Ivy Bridge cpu in this laptop and it is working just fine. I have an Intel i7-3612QM cpu and an Intel HD Graphics 4000 gpu. Rosetta is working just fine, knock on wood! NO gpu crunching though! |
JuhaM Send message Joined: 2 Nov 07 Posts: 3 Credit: 2,740,103 RAC: 396 |
Tasks 560643811, 560643810, 560643809 and 560643787 all validated as invalid, although I got credit for them. If I remember right, the same has happened for about the last six months: all tasks validate as invalid, but still grant credit. It's really confusing to crunch Rosetta when the outcome is this!? I crunch other projects at the same time, mainly POEM GPU tasks and WCG CPU tasks. Hardware: BOINC version 7.0.27; CPU: Hardware Class: cpu, Arch: X86-64, Vendor: "AuthenticAMD", Model: 21.1.2 "AMD FX(tm)-6100 Six-Core Processor"; GPU: NVidia GTX 460, driver 304.51 (from Ubuntu repository); Distributor ID: Ubuntu, Description: Ubuntu 12.10, Release: 12.10, Codename: quantal; Kernel: Linux 3.5.0-23-generic #35-Ubuntu SMP x86_64; RAM: 16 GB |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
Tasks 560643811, 560643810, 560643809 and 560643787 all validated as invalid. Although I got credit from them. Here you also get credits for invalid results. They are doing that AFAIK because of the rather high percentage of bad WUs. (Although IMHO credits should be only awarded if the WU errors out for both wingmen, with the current way many people for sure ignore issues with their computers.) . |
JAMES DORISIO Send message Joined: 25 Dec 05 Posts: 15 Credit: 201,474,191 RAC: 28,532 |
Successful tasks completed on ralph@home. I managed to pick up some tasks on ralph. Computer ralph http://ralph.bakerlab.org/show_host_detail.php?hostid=29722 Tasks for computer ralph (all success) http://ralph.bakerlab.org/results.php?hostid=29722 Computer rosetta https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1579123 Tasks for computer rosetta (all client error) https://boinc.bakerlab.org/rosetta/results.php?hostid=1579123 Intel I7-3770, Ubuntu linux 12.04 amd64, nvidia driver 310.14, Boinc 7.0.27. There were no changes to this computer, same exact setup, it actually ran some ralph and rosetta tasks at the same time. To David: I hope this confirms that ralph does not have this issue. If you have any questions please post them or PM me. Thanks Jim |
Alun Send message Joined: 27 Feb 10 Posts: 5 Credit: 69,418 RAC: 0 |
Is anyone actually actively investigating the issues with nvidia drivers & cards at the moment, or (as it seems from the forums) is it falling to the community to find the problem in the UoW's Rosetta applications? Question / point: If it was purely a driver issue wouldn't we be seeing errors on other GPU projects running on the same box? GPUGrid, Milkyway, Einstein and SETI are all fine - only Rosetta gets borked by updated drivers... |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,208,737 RAC: 3,249 |
Is anyone actually actively investigating the issues with nvidia drivers & cards at the moment, or (as it seems from the forums) is it falling to the community to find the problem in the UoW's Rosetta applications? AND as James Dorisio points out the project's beta site works just fine!! |
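
For anyone who wants to try Link's suggestion from earlier in the thread, here is a rough sketch of how the work_fetch_debug flag could be added to cc_config.xml without losing options that are already in the file. It assumes the usual layout, where log flags sit inside a <log_flags> element under the <cc_config> root. The function name enable_log_flag is made up for this example, and the script only builds the XML text; it does not touch the BOINC data directory.

```python
import xml.etree.ElementTree as ET


def enable_log_flag(config_xml: str, flag: str = "work_fetch_debug") -> str:
    """Return cc_config.xml text with <flag>1</flag> set under <log_flags>.

    enable_log_flag is a hypothetical helper for this post, not part of BOINC.
    Pass an empty string to build a fresh config.
    """
    # Start from the existing config if there is one, otherwise from scratch.
    if config_xml.strip():
        root = ET.fromstring(config_xml)
    else:
        root = ET.Element("cc_config")
    if root.tag != "cc_config":
        raise ValueError("expected <cc_config> as the root element")

    # Assumption: log flags live in <log_flags> directly under <cc_config>.
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")

    # Create the flag if it is missing, then switch it on.
    entry = log_flags.find(flag)
    if entry is None:
        entry = ET.SubElement(log_flags, flag)
    entry.text = "1"

    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Build a minimal config containing only the debug flag.
    print(enable_log_flag(""))
```

Save the printed text as cc_config.xml in the BOINC data directory and have the client re-read its config files (or restart it). The work fetch decisions should then show up in the event log.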
©2024 University of Washington
https://www.bakerlab.org