Waiting to Run

Message boards : Number crunching : Waiting to Run

E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 74054 - Posted: 19 Oct 2012, 18:24:47 UTC

I'm running BOINC on an Ubuntu 12 system, and about 6-8 weeks ago it began to develop a problem (no new software/hardware changes). It frequently gets stuck with one job in the "Waiting to Run" state. If I manually abort that work unit, it begins to run the next job normally. The pattern is inconsistent: sometimes it will process 2-4 work units just fine, other times it will hang on 2-3 in a row. Any thoughts?
Polian
Joined: 21 Sep 05
Posts: 152
Credit: 10,141,266
RAC: 0
Message 74056 - Posted: 20 Oct 2012, 2:43:53 UTC

Sounds like a BOINC issue... what core client version are you using?
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 74057 - Posted: 20 Oct 2012, 2:46:43 UTC

Are you running other BOINC projects as well? Or are Rosetta tasks the only ones your machine has? BOINC does not necessarily allocate resources the way a person would. It balances a number of objectives to determine which tasks would be best to run at any given time. When you delete a task from your machine, it changes the information the BOINC Manager is using to decide what to run next. In some cases, that might cause it to run another Rosetta task immediately; in other cases, perhaps not.

Does your BOINC configuration allow more active CPUs than you are seeing remain active? Or is your concern about which tasks the active CPUs are working on?
Rosetta Moderator: Mod.Sense
E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 74074 - Posted: 22 Oct 2012, 13:28:49 UTC - in response to Message 74057.  

Are you running other BOINC projects as well? Or are Rosetta tasks the only ones your machine has? BOINC does not necessarily allocate resources the way a person would. It balances a number of objectives to determine which tasks would be best to run at any given time. When you delete a task from your machine, it changes the information the BOINC Manager is using to decide what to run next. In some cases, that might cause it to run another Rosetta task immediately; in other cases, perhaps not.

Does your BOINC configuration allow more active CPUs than you are seeing remain active? Or is your concern about which tasks the active CPUs are working on?


My settings allow more than one CPU active. The problem is that a job will start and then quickly move to "waiting to run" which holds up all the other jobs and basically BOINC stops (I'm only running Rosetta@home). It stays that way until I "abort" the waiting job which then breaks the log jam and starts the next job in line.
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 74077 - Posted: 22 Oct 2012, 18:39:42 UTC - in response to Message 74074.  

Since your computers are hidden, we can only guess, but maybe you don't have enough RAM?

How many CPU cores do you have/use and how much RAM does the machine have? How much of it do you allow BOINC to use?
E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 74081 - Posted: 23 Oct 2012, 12:39:16 UTC - in response to Message 74077.  

Since your computers are hidden, we can only guess, but maybe you don't have enough RAM?

How many CPU cores do you have/use and how much RAM does the machine have? How much of it do you allow BOINC to use?


RAM could be an issue, it's an old machine. My current settings are 75% of memory when in use and 90% when not in use.
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 74083 - Posted: 23 Oct 2012, 13:53:19 UTC - in response to Message 74081.  
Last modified: 23 Oct 2012, 13:54:24 UTC

And how many CPU cores do you have, how many of them is BOINC allowed to use, and how much RAM does the machine have? For Rosetta you need at least 512MB (better 1GB) per core (available to BOINC) + 1-2GB for the OS. So, for example, a dual-core machine should have 4GB, a quad-core 6GB, and so on. On 32-bit machines more than 4GB of course doesn't make sense, but it might be difficult to run 4 Rosetta WUs at once on such machines, especially if you also do something else on them.
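As a rough worked example of the rule of thumb above, the arithmetic can be sketched like this. The function name and its defaults (1 GB per core, 2 GB for the OS, i.e. the upper ends of the ranges quoted) are assumptions made for illustration, not anything BOINC or Rosetta provides.

```python
def recommended_ram_gb(cores, per_core_gb=1.0, os_gb=2.0):
    """Rule-of-thumb RAM estimate: memory per core plus a fixed OS reserve."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * per_core_gb + os_gb


if __name__ == "__main__":
    # Dual core and quad core match the 4GB and 6GB figures in the post.
    for cores in (1, 2, 4):
        print(f"{cores} core(s): about {recommended_ram_gb(cores):g} GB")
```

Passing `per_core_gb=0.5, os_gb=1.0` gives the bare-minimum end of the same rule.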
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 74096 - Posted: 25 Oct 2012, 2:02:05 UTC

If BOINC hit the configured memory limit, it would put a task on hold as you describe and it may not decide to start other tasks when it has already started one that didn't have enough memory to run.

For a machine that is short of memory, I'd suggest adding a project that requires less memory to the mix. That way the machine can run a Rosetta task (it sounds like at least one has enough memory to keep running), along with one or more tasks from the other project.

Many of the subprojects of WCG have much lower memory requirements. The BOINC Manager monitors memory usage and as it detects memory usage that exceeds the configured value, it suspends a task. If other tasks are available that have lower memory requirements, it will run them and take care of getting enough new work to keep the machine busy.
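To make the idea concrete, here is a deliberately simplified model of a memory budget and of which queued tasks fit inside it. It is only a sketch: the function names and the task sizes are made up, and BOINC's real scheduler weighs many more factors than this.

```python
def memory_budget_mb(total_ram_mb, allowed_percent):
    """Memory available under a 'use at most N% of memory' preference."""
    return total_ram_mb * allowed_percent / 100.0


def tasks_that_fit(task_sizes_mb, budget_mb):
    """Admit tasks in queue order while they fit in the remaining budget.

    Returns (running, waiting) lists of task sizes. This is a toy model,
    not BOINC's actual scheduling logic.
    """
    running, waiting, used = [], [], 0.0
    for size in task_sizes_mb:
        if used + size <= budget_mb:
            running.append(size)
            used += size
        else:
            waiting.append(size)
    return running, waiting


if __name__ == "__main__":
    budget = memory_budget_mb(1024, 75)  # 768 MB on a 1 GB machine
    # Hypothetical queue: two large tasks and one small one.
    running, waiting = tasks_that_fit([500, 500, 150], budget)
    print("running:", running, "waiting:", waiting)
```

In this toy queue the second large task has to wait, while the small task from a lighter project still fits alongside the first one.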
Rosetta Moderator: Mod.Sense
E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 74100 - Posted: 25 Oct 2012, 20:50:52 UTC - in response to Message 74096.  

If BOINC hit the configured memory limit, it would put a task on hold as you describe and it may not decide to start other tasks when it has already started one that didn't have enough memory to run.

For a machine that is short of memory, I'd suggest adding a project that requires less memory to the mix. That way the machine can run a Rosetta task (it sounds like at least one has enough memory to keep running), along with one or more tasks from the other project.

Many of the subprojects of WCG have much lower memory requirements. The BOINC Manager monitors memory usage and as it detects memory usage that exceeds the configured value, it suspends a task. If other tasks are available that have lower memory requirements, it will run them and take care of getting enough new work to keep the machine busy.


Frankly, I'm not certain how many cores I have. I don't have a lot of memory on the machine (512MB?). I don't mind it holding, but it appears to hang and not start any other jobs. I can't find a way to set preferences for this machine only, since the Ubuntu version doesn't appear to have that option.
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 74102 - Posted: 26 Oct 2012, 0:37:09 UTC

Have a look at the global preferences file mentioned here. Specifically the <max_cpus> tag would be one way to specify to only use 1 CPU.

Also, as you start BOINC Manager, the message log will record how many CPUs it finds on the machine. You can display the message log using the command line interface described here. Specifically the --get_messages operation.
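A small sketch of that check, in case it helps. It runs `boinccmd --get_messages` and keeps only the lines that mention the processor; the keyword filter is an assumption about how the startup message is worded, and `boinccmd` may need to be run from the BOINC data directory (or with suitable permissions) to authenticate. The OS-level count from `os.cpu_count()` is printed for comparison.

```python
import os
import subprocess


def processor_lines(log_text):
    """Return the log lines that mention the processor (case-insensitive)."""
    return [line for line in log_text.splitlines() if "processor" in line.lower()]


def fetch_boinc_messages():
    """Run `boinccmd --get_messages` and return its output as text."""
    result = subprocess.run(
        ["boinccmd", "--get_messages"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("Logical CPUs seen by the OS:", os.cpu_count())
    try:
        for line in processor_lines(fetch_boinc_messages()):
            print(line)
    except (OSError, subprocess.CalledProcessError) as exc:
        # boinccmd missing, client not running, or authentication failed.
        print("Could not query the BOINC client:", exc)
```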
Rosetta Moderator: Mod.Sense
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 74120 - Posted: 28 Oct 2012, 11:38:54 UTC - in response to Message 74100.  
Last modified: 28 Oct 2012, 11:39:32 UTC

Frankly, I'm not certain how many cores I have. I don't have a lot of memory on the machine (512MB?).

I'd suggest that you unhide your computers in your Rosetta@home preferences; then we can tell you that and we won't need to guess so much.
E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 74131 - Posted: 30 Oct 2012, 12:22:04 UTC - in response to Message 74120.  

Frankly, I'm not certain how many cores I have. I don't have a lot of memory on the machine (512MB?).

I'd suggest that you unhide your computers in your Rosetta@home preferences; then we can tell you that and we won't need to guess so much.


My computers are now viewable.
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 74132 - Posted: 30 Oct 2012, 14:30:02 UTC - in response to Message 74131.  

According to your first post, the problem occurs on a Linux system, that would be this one.

So yes, it doesn't have enough RAM: 1GB for 2 cores is not enough for Rosetta most of the time, especially if you are doing something else with this machine too. Since it's a Pentium 4 it has just one physical core, so it won't hurt much if you limit BOINC to 50% of the CPUs for this machine.

Alternatively, you can allow BOINC to use more RAM, but that might lead to quite excessive page file usage. I'm actually surprised that you don't experience similar problems on your Xeon machine, which also has just 1GB for 2 cores.
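The 50% figure works out to a single CPU here because the machine reports two CPUs on one physical core. A rough sketch of that arithmetic follows; the function name and the "round down, but never below one" behaviour are assumptions made for illustration, not a statement of how the BOINC client computes it.

```python
def usable_cpus(logical_cpus, max_percent):
    """CPUs left after a 'use at most N% of the processors' limit.

    Rounds down but never returns less than one (an assumption of this sketch).
    """
    return max(1, int(logical_cpus * max_percent / 100))


if __name__ == "__main__":
    # Two reported CPUs limited to 50% leaves one task running at a time.
    print(usable_cpus(2, 50))
```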
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,677,569
RAC: 10,479
Message 74134 - Posted: 30 Oct 2012, 17:39:15 UTC - in response to Message 74132.  

Since it's a Pentium 4 it has just one physical core, so it won't hurt much if you limit BOINC to 50% of the CPUs for this machine.

I agree: limiting BOINC to 50% of (virtual) CPUs is the sensible way to go.
E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 74151 - Posted: 1 Nov 2012, 12:28:01 UTC - in response to Message 74134.  

Since it's a Pentium 4 it has just one physical core, so it won't hurt much if you limit BOINC to 50% of the CPUs for this machine.

I agree: limiting BOINC to 50% of (virtual) CPUs is the sensible way to go.


How do I do that for this machine since the LINUX management interface does not have a machine specific preference section like the Windows client?
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 74154 - Posted: 1 Nov 2012, 18:57:45 UTC - in response to Message 74151.  
Last modified: 1 Nov 2012, 19:18:44 UTC

How do I do that for this machine since the LINUX management interface does not have a machine specific preference section like the Windows client?

You have two possibilities:
1. Configure remote access from another host, see Controlling BOINC remotely.
2. Assign this host to a new venue (home/school/work) and set it there.

I'd go with the first one; that way you can always easily check what that computer is doing from any other Windows machine and adjust some settings if needed.
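For the first option, a command-line route is also possible alongside the Manager. This is a minimal sketch, assuming the Linux client has already been configured to accept remote GUI RPC connections and that `boinccmd` is installed on the controlling machine. The host name and password are placeholders, and the helper name is made up for illustration.

```python
import subprocess


def remote_boinccmd_args(host, password, operation="--get_state"):
    """Build the argument list for querying a remote BOINC client."""
    return ["boinccmd", "--host", host, "--passwd", password, operation]


if __name__ == "__main__":
    # Placeholder values; substitute the real host and GUI RPC password.
    args = remote_boinccmd_args("ubuntu-box.example", "example-password")
    try:
        completed = subprocess.run(args, capture_output=True, text=True, check=True)
        print(completed.stdout)
    except (OSError, subprocess.CalledProcessError) as exc:
        # boinccmd missing, host unreachable, or authentication refused.
        print("Remote query failed:", exc)
```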
dcdc
Joined: 3 Nov 05
Posts: 1832
Credit: 119,677,569
RAC: 10,479
Message 74158 - Posted: 2 Nov 2012, 11:23:22 UTC
Last modified: 2 Nov 2012, 11:23:41 UTC

I think you can just create/edit your cc_config.xml to include:

<ncpus>1</ncpus>

from here: http://boinc.berkeley.edu/wiki/Client_configuration
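For reference, in the format described on that wiki page the `<ncpus>` element sits inside `<options>` within a top-level `<cc_config>` element. The sketch below renders such a file and checks that it is well-formed XML; the helper name is made up, and where the file has to be saved (the BOINC data directory) depends on the installation, so no path is assumed here. The client typically has to re-read its config files or be restarted before the change takes effect.

```python
import xml.etree.ElementTree as ET

# Minimal cc_config.xml layout; only the ncpus option is set.
CC_CONFIG_TEMPLATE = """<cc_config>
  <options>
    <ncpus>{ncpus}</ncpus>
  </options>
</cc_config>
"""


def render_cc_config(ncpus):
    """Return cc_config.xml text telling the client to act as if it has `ncpus` CPUs."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    text = CC_CONFIG_TEMPLATE.format(ncpus=int(ncpus))
    ET.fromstring(text)  # raises ParseError if the XML is malformed
    return text


if __name__ == "__main__":
    print(render_cc_config(1))
```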
E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 74187 - Posted: 6 Nov 2012, 14:29:43 UTC - in response to Message 74154.  

How do I do that for this machine since the LINUX management interface does not have a machine specific preference section like the Windows client?

You have two possibilities:
1. Configure remote access from another host, see Controlling BOINC remotely.
2. Assign this host to a new venue (home/school/work) and set it there.

I'd go with the first one; that way you can always easily check what that computer is doing from any other Windows machine and adjust some settings if needed.


I put this computer in a new group and told it to use only one CPU and 50% of memory. I'll report back on whether this does anything.
Link
Joined: 4 May 07
Posts: 356
Credit: 382,349
RAC: 0
Message 74189 - Posted: 6 Nov 2012, 20:04:34 UTC - in response to Message 74187.  
Last modified: 6 Nov 2012, 20:06:04 UTC

I put this computer in a new group and told it to use only one CPU and 50% of memory. I'll report back on whether this does anything.

50% of 1GB, i.e. 512MB, might not be enough for some Rosetta tasks; better to set it back to 75% (also for when idle).
E the P
Joined: 5 Jun 06
Posts: 36
Credit: 28,333,251
RAC: 0
Message 74194 - Posted: 7 Nov 2012, 14:00:28 UTC - in response to Message 74189.  

I put this computer in a new group and told it to use only one CPU and 50% of memory. I'll report back on whether this does anything.

50% of 1GB, i.e. 512MB, might not be enough for some Rosetta tasks; better to set it back to 75% (also for when idle).


Ok, I'll give that a shot.
©2024 University of Washington
https://www.bakerlab.org