Message boards : Number crunching : Waiting to Run
Author | Message |
---|---|
E the P Send message Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
I'm running BOINC on an Ubuntu 12 system and about 6-8 weeks ago it began to develop a problem (no new software/hardware changes). It will frequently get stuck with one job at the "Waiting to Run" state. If I manually abort that work unit it will begin to run the next job normally. The pattern is inconsistent. Sometimes it will process 2-4 work units just fine, other times it will hang on 2-3 in a row. Any thoughts? |
Polian Send message Joined: 21 Sep 05 Posts: 152 Credit: 10,141,266 RAC: 0 |
|
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Are you running other BOINC projects as well? Or is Rosetta the only project your machine has tasks from? BOINC does not necessarily run the resource allocations the way a person would. It balances a number of objectives to determine which tasks would be best to run at any given time. When you delete a task from your machine, it changes the information the BOINC Manager is using to decide what to run next. In some cases, that might cause it to run another Rosetta task immediately; other times, perhaps not. Does your BOINC configuration allow more active CPUs than you are seeing remain active? Or is your concern about which tasks the active CPUs are working on? Rosetta Moderator: Mod.Sense |
E the P Send message Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
Are you running other BOINC projects as well? Or is Rosetta the only project your machine has tasks from? BOINC does not necessarily run the resource allocations the way a person would. It balances a number of objectives to determine which tasks would be best to run at any given time. When you delete a task from your machine, it changes the information the BOINC Manager is using to decide what to run next. In some cases, that might cause it to run another Rosetta task immediately; other times, perhaps not. My settings allow more than one CPU active. The problem is that a job will start and then quickly move to "Waiting to Run", which holds up all the other jobs, and basically BOINC stops (I'm only running Rosetta@home). It stays that way until I abort the waiting job, which then breaks the logjam and starts the next job in line. |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
Since your computers are hidden, we can only guess, but maybe you don't have enough RAM? How many CPU cores do you have/use and how much RAM does the machine have? How much of it do you allow BOINC to use? . |
E the P Send message Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
Since your computers are hidden, we can only guess, but maybe you don't have enough RAM? RAM could be an issue; it's an old machine. My current settings are 75% of memory when in use and 90% when not in use. |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
And how many CPU cores do you have, how many of them is BOINC allowed to use and how much RAM does the machine have? For Rosetta you need at least 512MB (better 1GB) per core (available to BOINC) + 1-2GB for the OS. So for example a dual core machine should have 4GB, a quad core 6GB and so on. For 32-bit machines more than 4GB of course does not make any sense, but it might be difficult to run 4 Rosetta WUs at once on such machines, especially if you also do something else on them. . |
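The rule of thumb in the post above can be written down as a small calculation. This is only a sketch in Python: the function names and the default values (1 GB per core and 2 GB for the OS for the recommended figure, 512 MB per core and 1 GB for the OS for the bare minimum) are illustrative assumptions taken from the ranges quoted, not anything BOINC or Rosetta provides.

```python
def recommended_ram_gb(cores: int, per_core_gb: float = 1.0, os_gb: float = 2.0) -> float:
    """Estimate total RAM for running one Rosetta task per core.

    Hypothetical helper: per_core_gb and os_gb default to the upper
    ends of the ranges given in the thread (1 GB per core, 2 GB for the OS).
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * per_core_gb + os_gb


def minimum_ram_gb(cores: int) -> float:
    """Same estimate using the lower ends: 512 MB per core, 1 GB for the OS."""
    return recommended_ram_gb(cores, per_core_gb=0.5, os_gb=1.0)


if __name__ == "__main__":
    for cores in (1, 2, 4):
        print(cores, "core(s):",
              "minimum", minimum_ram_gb(cores), "GB,",
              "recommended", recommended_ram_gb(cores), "GB")
```

With these defaults the function reproduces the examples in the post (4 GB for a dual core, 6 GB for a quad core), and even the minimum estimate for 2 cores is above 1 GB.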
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
If BOINC hit the configured memory limit, it would put a task on hold as you describe and it may not decide to start other tasks when it has already started one that didn't have enough memory to run. For a machine that is short of memory, I'd suggest adding a project that requires less memory to the mix. That way the machine can run a Rosetta task (it sounds like at least one has enough memory to keep running), along with one or more tasks from the other project. Many of the subprojects of WCG have much lower memory requirements. The BOINC Manager monitors memory usage and, when it detects memory usage that exceeds the configured value, it suspends a task. If other tasks are available that have lower memory requirements, it will run them and take care of getting enough new work to keep the machine busy. Rosetta Moderator: Mod.Sense |
E the P Send message Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
If BOINC hit the configured memory limit, it would put a task on hold as you describe and it may not decide to start other tasks when it has already started one that didn't have enough memory to run. Frankly, I'm not certain how many cores I have. I don't have a lot of memory on the machine 512MB? I don't mind it holding, but it appears to hang and not start any other jobs. I can't find a way to set preferences for this machine only, since the Ubuntu version doesn't appear to have that option. |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Have a look at the global preferences file mentioned here. Specifically the <max_cpus> tag would be one way to specify to only use 1 CPU. Also, as you start BOINC Manager, the message log will record how many CPUs it finds on the machine. You can display the message log using the command line interface described here. Specifically the --get_messages operation. Rosetta Moderator: Mod.Sense |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
Frankly, I'm not certain how many cores I have. I don't have a lot of memory on the machine 512MB? I'd suggest that you unhide your computers in your Rosetta@home preferences; then we can tell you that and we won't need to guess so much. . |
E the P Send message Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
Frankly, I'm not certain how many cores I have. I don't have a lot of memory on the machine 512MB? My computers are now viewable. |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
According to your first post, the problem occurs on a Linux system, that would be this one. So yes, it does not have enough RAM; 1GB for 2 cores is not enough for Rosetta most of the time, especially if you are doing something else with this machine too. Since it's a Pentium 4 it has just one physical core, so it won't hurt much if you limit BOINC to 50% of the CPUs for this machine. Alternatively you can allow BOINC to use more RAM, but that might lead to quite excessive page file usage. I'm actually surprised that you don't experience similar problems on your Xeon machine, which also has just 1GB for 2 cores. . |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,677,569 RAC: 10,479 |
Since it's a Pentium 4 it has just one physical core, so it won't hurt much if you limit BOINC to 50% of the CPUs for this machine. I agree: limiting BOINC to 50% of (virtual) CPUs is the sensible way to go. |
E the P Send message Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
Since it's a Pentium 4 it has just one physical core, so it won't hurt much if you limit BOINC to 50% of the CPUs for this machine. How do I do that for this machine since the LINUX management interface does not have a machine specific preference section like the Windows client? |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
How do I do that for this machine since the LINUX management interface does not have a machine specific preference section like the Windows client? You have two possibilities: 1. Configure remote access from another host, see Controlling BOINC remotely. 2. Assign this host to a new venue (home/school/work) and set it there. I'd go with the first one; that way you can always easily check what that computer is doing from any other Windows machine and adjust some settings if needed. . |
dcdc Send message Joined: 3 Nov 05 Posts: 1832 Credit: 119,677,569 RAC: 10,479 |
I think you can just create/edit your cc_config.xml to include: <ncpus>1</ncpus> from here: http://boinc.berkeley.edu/wiki/Client_configuration |
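As a rough illustration of the suggestion above, the following Python sketch builds a minimal cc_config.xml containing the `<ncpus>` setting. It assumes the layout described on the linked Client configuration page, where `<ncpus>` sits inside an `<options>` section under the `<cc_config>` root; the function name `build_cc_config` is hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ncpus: int) -> str:
    """Return the text of a minimal cc_config.xml that sets <ncpus>.

    Assumed layout: <cc_config><options><ncpus>N</ncpus></options></cc_config>.
    """
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    # encoding="unicode" makes tostring() return a str instead of bytes
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the file contents; saving them is left to the user because the
    # location of the BOINC data directory differs between installations.
    print(build_cc_config(1))
```

The output would need to be saved as cc_config.xml in the BOINC data directory, and the client typically has to be restarted or told to re-read its configuration before the setting takes effect; check the linked page for the details that apply to your installation.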
E the P Send message Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
How do I do that for this machine since the LINUX management interface does not have a machine specific preference section like the Windows client? I put this computer in a new group and told it to use only one CPU and 50% of memory. I'll report back to see if this does anything. |
Link Send message Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0 |
I put this computer in a new group and told it to use only one CPU and 50% of memory. I'll report back to see if this does anything. 50% of 1GB, i.e. 512MB, might not be enough for some Rosetta tasks; better set it back to 75% (also for when idle). . |
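The arithmetic behind that advice is simple enough to check in a few lines. This is a sketch only: the helper name `boinc_memory_budget_mb` is made up for illustration, and it assumes the memory preference is a plain percentage of total RAM, with 1 GB taken as 1024 MB.

```python
def boinc_memory_budget_mb(total_ram_mb: float, percent: float) -> float:
    """Memory BOINC may use, given total RAM and the 'use at most N%' preference."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return total_ram_mb * percent / 100.0


if __name__ == "__main__":
    total = 1024  # the 1 GB machine discussed in the thread
    for pct in (50, 75, 90):
        print(f"{pct}% of {total} MB = {boinc_memory_budget_mb(total, pct)} MB")
```

At 50% the budget is 512 MB, the bare minimum quoted earlier in the thread for a single Rosetta task, which leaves no headroom; at 75% it is 768 MB.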
E the P Send message Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0 |
I put this computer in a new group and told it to use only one CPU and 50% of memory. I'll report back to see if this does anything. Ok, I'll give that a shot. |
©2024 University of Washington
https://www.bakerlab.org