Message boards : Number crunching : Rosetta stops crunching
Author | Message |
---|---|
fjpod Send message Joined: 9 Nov 07 Posts: 17 Credit: 2,201,029 RAC: 0 |
Is it just me?? For the past week, one of my computers (dual core) would spontaneously stop crunching Rosetta on one core only. The timer keeps going way up, but the work isn't getting done. When you look in Task Manager, you can see that 50% of the cores are idle. At first I thought something was going wrong with my computer/CPU, but now it is also happening on one of my other computers (Q6700). Is there something wrong with the WUs? |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,208,737 RAC: 3,249 |
fjpod wrote: "Is it just me?? For the past week, one of my computers (dual core) would spontaneously stop crunching Rosetta on one core only. The timer keeps going way up, but the work isn't getting done. When you look in Task Manager, you can see that 50% of the cores are idle. At first I thought something was going wrong with my computer/CPU, but now it is also happening on one of my other computers (Q6700)." No, it is more likely a setting in the Boinc Client. I see you are using the 6.12.?? version on at least one of your PCs. Go into the Boinc Manager, down by the clock, open Tools, then Computing preferences, click on the processor usage tab, and see if there is a number other than zero on the line that says "While processor usage is less than [_] percent (0 means no restriction)". If there is a number, change it to zero; the default is 25. This will tell your PC to keep running Boinc at the low priority setting no matter what else is running. With the default of 25 in the box, it means that when your CPU usage hits 25% remaining, or unused, Boinc stops crunching until it returns above that. This setting has been in Boinc for quite a while now, so maybe your PC is just busier doing other things lately. REMEMBER to click OK at the bottom to actually accept and make the change. This is a PC-by-PC change; it is not a global one. |
fjpod Send message Joined: 9 Nov 07 Posts: 17 Credit: 2,201,029 RAC: 0 |
I had raised this number to 50 or 60 when I went on 6.12, because at 25 the CPUs were stopping frequently. I'll give zero a try...thanks. |
fjpod Send message Joined: 9 Nov 07 Posts: 17 Credit: 2,201,029 RAC: 0 |
OOPS...I was already running on zero restriction, so that can't be it...and usually if CPUs stop due to this restriction, they all stop, but in my case, only one or two are stopping. The only way to get them going again is to shut down Boinc and restart it. In the case of busy CPUs, Boinc Manager notifies you that the CPUs are suspended...but not in my case. The countdown clock keeps going as if nothing is wrong. Anybody else seeing this? |
svincent Send message Joined: 30 Dec 05 Posts: 219 Credit: 12,120,035 RAC: 0 |
This problem has been mentioned before (in the Mini Rosetta 3.14 thread I believe). It seems to happen to me only on W7 and only on tasks with names starting with T followed by a digit number. A workaround is to quit and restart BOINC. |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,208,737 RAC: 3,249 |
fjpod wrote: "I had raised this number to 50 or 60 when I went on 6.12, because at 25 the CPUs were stopping frequently. I'll give zero a try...thanks." 50 or 60 would have been going the wrong way; you would want 10 or 15 to have FEWER stoppages. It is kind of reverse thinking...when the PC hits that percentage of unused CPU it stops Boinc, so at 25%, 75% of the CPU can be doing other things, but at 50% only 50% of the CPU can be doing other things before Boinc gets stopped. But I am sorry that is not the problem!! |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,821,902 RAC: 15,180 |
fjpod wrote: "I had raised this number to 50 or 60 when I went on 6.12, because at 25 the CPUs were stopping frequently. I'll give zero a try...thanks." I don't think that's right, Mikey. BOINC is allowed to run when other CPU usage is less than that value, so a higher value means BOINC can run while more other stuff is running. I am assuming it does what it says though; I haven't tested it! |
mikey Send message Joined: 5 Jan 06 Posts: 1895 Credit: 9,208,737 RAC: 3,249 |
fjpod wrote: "I had raised this number to 50 or 60 when I went on 6.12, because at 25 the CPUs were stopping frequently. I'll give zero a try...thanks." You could be right; I always put mine at zero and just let it run. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Yes there is still some quirk with the Boinc manager where it thinks a task is active, but no CPU time is allocated to it. Yes, a complete exit of BOINC and restart seems to generally be the best resolution. Rosetta Moderator: Mod.Sense |
fjpod Send message Joined: 9 Nov 07 Posts: 17 Credit: 2,201,029 RAC: 0 |
Good to know that I am not the only one to notice this. I was beginning to think something was wrong with my hardware. I think the WUs are defective, because once a batch gets processed, the next batch will be OK. ...and the right way to allow more CPU use is to raise the number from 10 to 20 to...80, and finally 00. The 00 really should be 100 (%) usage. The 00 is really a misnomer and counter-intuitive...but hey, who's complaining. |
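
The disagreement above about the "While processor usage is less than [_] percent" setting can be sketched in a few lines. This is only a minimal model of the rule as dcdc reads it from the label, not BOINC's actual code. The function name `boinc_may_compute` and its parameters are hypothetical, and the sketch assumes that "processor usage" means CPU used by programs other than BOINC.

```python
def boinc_may_compute(other_cpu_percent: float, threshold_percent: float) -> bool:
    """Hypothetical model of the preference, NOT taken from BOINC source.

    other_cpu_percent: CPU load from non-BOINC programs (assumed meaning).
    threshold_percent: the number typed into the preference box.
    """
    # A value of 0 is documented in the dialog as "no restriction".
    if threshold_percent == 0:
        return True
    # Otherwise computing is allowed only while other usage is below the value.
    return other_cpu_percent < threshold_percent


if __name__ == "__main__":
    # Same 40% background load, three different settings.
    for threshold in (25, 60, 0):
        allowed = boinc_may_compute(40.0, threshold)
        print(f"threshold={threshold:>2}: BOINC may compute -> {allowed}")
```

Under this reading, raising the number lets BOINC keep running under heavier background load, and 0 switches the check off entirely. As fjpod points out, this restriction usually suspends all tasks at once, so it would not explain a single core going idle.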
©2024 University of Washington
https://www.bakerlab.org