Waiting to Run

Author	Message
dcdc Joined: 3 Nov 05 Posts: 1834 Credit: 124,413,051 RAC: 9,834	Message 74609 - Posted: 28 Nov 2012, 14:48:18 UTC
Does it just switch to "Waiting to Run" when you move the mouse? With the current setting I believe it will drop back to the last checkpoint (probably 0%) when interrupted, because when you move the mouse or press a key the limit switches from 90% of RAM available to 50% available, possibly causing the switch to "Waiting to Run" and back to 0% complete.
Are you able to leave the work in memory (paged to disk) when the task is suspended? That way it won't have to drop back to the last checkpoint and can continue processing from where it was up to, i.e. change this to 1:
    <leave_apps_in_memory>0</leave_apps_in_memory>

Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0	Message 74611 - Posted: 28 Nov 2012, 16:07:15 UTC - in response to Message 74608. Last modified: 28 Nov 2012, 16:15:40 UTC
Quote:
    Actually, shouldn't these be the other way around:
        <max_ncpus_pct>100.000000</max_ncpus_pct>
        <cpu_usage_limit>50.000000</cpu_usage_limit>
    I think max_ncpus_pct is the number of processors (so 50% to use one physical processor) and cpu_usage_limit is the proportion of run-time to pause-time while running. I'd recommend swapping those values and getting BOINC to re-read the file.
    Yep. Haven't seen that. Well, I was a bit hopeful. With the following changes (see my current settings below) it ran for about 1 1/2 days and then one job hit "waiting to run".
You might have missed my first post:
    <leave_apps_in_memory>1</leave_apps_in_memory>
    <ram_max_used_busy_pct>65.000000</ram_max_used_busy_pct>
If it still goes into "waiting to run" with that, I'd try:
    <ram_max_used_busy_pct>70.000000</ram_max_used_busy_pct>
    <ram_max_used_idle_pct>70.000000</ram_max_used_idle_pct>
And then set this in your client_state.xml (exit BOINC first):
    <user_run_request>1</user_run_request>
You probably have a 2 there right now.

E the P Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0	Message 74627 - Posted: 30 Nov 2012, 13:42:48 UTC - in response to Message 74611.
Ok, I've made the two suggested changes, changing leave_apps_in_memory to "1" and upping the busy memory to 65%. I'll report back on how it works.

E the P Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0	Message 74662 - Posted: 4 Dec 2012, 13:02:42 UTC - in response to Message 74627.
Sadly, same results. It runs for about a day to a day and a half and then hangs.
Here is my latest override file:
---------------------------------------------------------
<global_preferences>
    <run_on_batteries>0</run_on_batteries>
    <run_if_user_active>1</run_if_user_active>
    <run_gpu_if_user_active>0</run_gpu_if_user_active>
    <idle_time_to_run>0.000000</idle_time_to_run>
    <start_hour>0.000000</start_hour>
    <end_hour>0.000000</end_hour>
    <net_start_hour>0.000000</net_start_hour>
    <net_end_hour>0.000000</net_end_hour>
    <leave_apps_in_memory>1</leave_apps_in_memory>
    <confirm_before_connecting>0</confirm_before_connecting>
    <hangup_if_dialed>0</hangup_if_dialed>
    <dont_verify_images>0</dont_verify_images>
    <work_buf_min_days>0.100000</work_buf_min_days>
    <work_buf_additional_days>0.250000</work_buf_additional_days>
    <max_ncpus_pct>50.000000</max_ncpus_pct>
    <cpu_scheduling_period_minutes>60.000000</cpu_scheduling_period_minutes>
    <disk_interval>60.000000</disk_interval>
    <disk_max_used_gb>100.000000</disk_max_used_gb>
    <disk_max_used_pct>50.000000</disk_max_used_pct>
    <disk_min_free_gb>0.000000</disk_min_free_gb>
    <vm_max_used_pct>75.000000</vm_max_used_pct>
    <ram_max_used_busy_pct>65.000000</ram_max_used_busy_pct>
    <ram_max_used_idle_pct>90.000000</ram_max_used_idle_pct>
    <max_bytes_sec_up>0.000000</max_bytes_sec_up>
    <max_bytes_sec_down>0.000000</max_bytes_sec_down>
    <cpu_usage_limit>100.000000</cpu_usage_limit>
    <suspend_cpu_usage>0.000000</suspend_cpu_usage>
</global_preferences>

Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0	Message 74667 - Posted: 4 Dec 2012, 19:35:47 UTC - in response to Message 74662.
Have you tried this part?
Quote:
    If it still goes into "waiting to run" with that, I'd try:
        <ram_max_used_busy_pct>70.000000</ram_max_used_busy_pct>
        <ram_max_used_idle_pct>70.000000</ram_max_used_idle_pct>
    And then set this in your client_state.xml (exit BOINC first):
        <user_run_request>1</user_run_request>
    You probably have a 2 there right now.
Also post the log from when BOINC suspends the task. It might also help to enable <cpu_sched>, <cpu_sched_debug> and <mem_usage_debug> in cc_config.xml, so we can better see in the log what's going on. What is the size of the pagefile/partition (whatever that is called in Linux)?
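
As a sketch of what that could look like (not taken from anyone's machine in this thread): the three log flags named above go inside the <log_flags> section of cc_config.xml in the BOINC data directory. The file below assumes no other options are already set there; if the file already exists, merge the flags into it instead of replacing it.

```xml
<cc_config>
  <log_flags>
    <!-- Assumption: a fresh cc_config.xml with nothing else configured -->
    <!-- Log when tasks are started, preempted or resumed -->
    <cpu_sched>1</cpu_sched>
    <!-- Log the scheduler's reasoning for those decisions -->
    <cpu_sched_debug>1</cpu_sched_debug>
    <!-- Log memory usage of running tasks against the RAM limits -->
    <mem_usage_debug>1</mem_usage_debug>
  </log_flags>
</cc_config>
```

After saving, restart the client or have it re-read its config files; the extra scheduler and memory lines should then appear in the event log around the time a task drops to "Waiting to run".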

E the P Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0	Message 74704 - Posted: 10 Dec 2012, 14:11:15 UTC - in response to Message 74667.
So far it's gone the entire weekend with no hangups. I'll keep monitoring and apply your suggestions if it hangs.

E the P Joined: 5 Jun 06 Posts: 36 Credit: 28,333,251 RAC: 0	Message 74966 - Posted: 24 Jan 2013, 15:27:42 UTC - in response to Message 74704.
Well, as a final wrap-up, I have been processing for about 4-5 days without a termination. At this point I can live with that. I want to thank everyone on this list for your suggestions and help.

Kong Kandal den 1. Joined: 28 Apr 06 Posts: 1 Credit: 9,024,376 RAC: 0	Message 75450 - Posted: 24 Apr 2013, 14:41:19 UTC - in response to Message 74966.
Hello, I am experiencing the same problem and have not found any solution. I have tried all the tricks in this thread, but nothing seems to help. Any advice will be appreciated. Thank you.

Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0	Message 75547 - Posted: 30 Apr 2013, 9:36:37 UTC - in response to Message 75450.
For any advice you need to unhide your computers or post a link to the details page of the affected machine. Also post the contents of your global_prefs_override.xml or, if that file is not present on your computer (in your BOINC data directory), of global_prefs.xml.

Xenus Joined: 14 May 09 Posts: 2 Credit: 664,972 RAC: 0	Message 75561 - Posted: 4 May 2013, 16:11:30 UTC - in response to Message 74054.
Quote:
    I'm running BOINC on an Ubuntu 12 system and about 6-8 weeks ago it began to develop a problem (no new software/hardware changes). It will frequently get stuck with one job at the "Waiting to Run" state. If I manually abort that work unit it will begin to run the next job normally. The pattern is inconsistent. Sometimes it will process 2-4 work units just fine, other times it will hang on 2-3 in a row. Any thoughts?
Exactly the same problem in Ubuntu 12.04 and 12.10 with BOINC 7.0.27 and Rosetta tasks. I also get the next task stuck on "Waiting to Run" for no good reason. Aborting that task then gets the "Ready to Start" tasks running.

Xenus Joined: 14 May 09 Posts: 2 Credit: 664,972 RAC: 0	Message 75562 - Posted: 4 May 2013, 16:21:57 UTC - in response to Message 75561.
Looks like the max memory issue. Increasing the percentage of memory usable gets the process running again. Seems like the Rosetta jobs have large and/or different memory requirements. Ideally there should be a log message to indicate that a job can't run without more memory, or it should simply abort itself to allow another job to run.

Link Joined: 4 May 07 Posts: 356 Credit: 382,349 RAC: 0	Message 75662 - Posted: 25 May 2013, 16:39:03 UTC - in response to Message 75562.
You need to allow at least 500 MB per Rosetta task, better 1 GB since some tasks need that much. Check all the posts in this thread if the issue comes back; all the relevant settings have been posted above.
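
To make that rule of thumb concrete, here is a rough budget for a hypothetical machine (the 4 GB of RAM and the two concurrent tasks are assumptions for illustration, not figures from this thread). At 50% BOINC may use about 2 GB, which leaves no headroom for two tasks that can each reach 1 GB; at 70% it may use about 2.8 GB, which covers both. A global_prefs_override.xml fragment along those lines, using only settings already discussed in this thread:

```xml
<global_preferences>
  <!-- Assumed machine: 4 GB RAM, 2 Rosetta tasks of up to ~1 GB each -->
  <!-- Keep suspended tasks in memory so they need not restart from a checkpoint -->
  <leave_apps_in_memory>1</leave_apps_in_memory>
  <!-- 70% of 4 GB is about 2.8 GB, enough for 2 tasks at 1 GB -->
  <ram_max_used_busy_pct>70.000000</ram_max_used_busy_pct>
  <ram_max_used_idle_pct>70.000000</ram_max_used_idle_pct>
</global_preferences>
```

Scale the percentages to your own RAM and task count: the memory BOINC is allowed to use (total RAM times the percentage) divided by the number of running tasks should come out at 500 MB at the very least, and preferably 1 GB. Treat this as a fragment to merge into an existing override file rather than a complete replacement.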