aborting work units

Stephen
Joined: 26 Apr 08
Posts: 32
Credit: 429,286
RAC: 0
Message 57917 - Posted: 16 Dec 2008, 6:02:23 UTC

Is it safe to abort work units on the older applications in order to focus time on the newer application versions?

I mean, does aborting work units in bulk, just to ensure we're using the most up-to-date version, affect your results negatively?
ID: 57917 · Rating: 0
Mike Tyka
Joined: 20 Oct 05
Posts: 96
Credit: 2,190
RAC: 0
Message 57918 - Posted: 16 Dec 2008, 6:17:47 UTC - in response to Message 57917.  
Last modified: 16 Dec 2008, 6:19:31 UTC

Is it safe to abort work units on the older applications in order to focus time on the newer application versions?

I mean, does aborting work units in bulk, just to ensure we're using the most up-to-date version, affect your results negatively?


Yes! Definitely, if you're running stuff with 1.46 or older please abort it. It'll just get sent out again anyway.

Mike
http://beautifulproteins.blogspot.com/
http://www.miketyka.com/
ID: 57918 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 57932 - Posted: 16 Dec 2008, 13:06:56 UTC

There are presently 4 versions of Rosetta that are "current". So, you want to check the list and make sure you don't abort a list of tasks just to download more of the same.
Rosetta Moderator: Mod.Sense
ID: 57932 · Rating: 0
Speedy
Joined: 25 Sep 05
Posts: 163
Credit: 808,337
RAC: 3
Message 60197 - Posted: 17 Mar 2009, 20:35:18 UTC

Is there a quick way to abort and resend tasks? My host was detached from this project while I had tasks still crunching. I apologize for my mistake.
Have a crunching good day!!
ID: 60197 · Rating: 0
Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 60200 - Posted: 17 Mar 2009, 22:03:50 UTC - in response to Message 60197.  

Is there a quick way to abort and resend tasks?



  • Switch to the advanced view in your BOINC Manager and click on the tasks tab.
  • Select a task and click the "abort" button.
  • You will abandon work on the task but remain connected to the project.


ID: 60200 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 60201 - Posted: 17 Mar 2009, 22:42:00 UTC

No, Speedy. When you detached, the tasks were aborted. And even if that abort didn't come through, the project servers won't resend the tasks to you. You'll get new ones, and the old ones will be reissued, either because the abort was received or because the deadline passed with no result.
Rosetta Moderator: Mod.Sense
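
The reissue rule described above can be sketched roughly as follows. This is only an illustration of the two conditions mentioned in the post (abort received, or deadline reached with no result); the `Task` class and `needs_reissue` function are hypothetical names, not part of the actual BOINC server code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Task:
    """Hypothetical record of one task sent to one host."""
    sent: datetime
    deadline: datetime
    aborted: bool = False                 # True once the server has received an abort
    reported: Optional[datetime] = None   # when a result came back, if ever


def needs_reissue(task: Task, now: datetime) -> bool:
    """Return True if the work should go back in the queue for another host."""
    if task.reported is not None:
        return False            # a result arrived, nothing to redo
    if task.aborted:
        return True             # the abort was received
    return now > task.deadline  # no abort seen, but the deadline passed with no result


now = datetime(2009, 3, 17, 22, 42, tzinfo=timezone.utc)
detached = Task(sent=now - timedelta(days=2), deadline=now + timedelta(days=8), aborted=True)
silent = Task(sent=now - timedelta(days=11), deadline=now - timedelta(days=1))
running = Task(sent=now - timedelta(days=1), deadline=now + timedelta(days=9))

for name, task in [("detached", detached), ("silent", silent), ("running", running)]:
    print(name, needs_reissue(task, now))
```

Either way the work ends up with another host, which is why nothing needs to be resent to the detached machine.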
ID: 60201 · Rating: 0
Speedy
Joined: 25 Sep 05
Posts: 163
Credit: 808,337
RAC: 3
Message 60208 - Posted: 18 Mar 2009, 4:23:21 UTC

Thank you for your responses, Murasaki & Mod.Sense. I thought this was the case; I just wanted to make sure.
Have a crunching good day!!
ID: 60208 · Rating: 0
Cesium_133*
Joined: 1 Dec 08
Posts: 28
Credit: 225,332
RAC: 0
Message 61441 - Posted: 29 May 2009, 6:33:21 UTC
Last modified: 29 May 2009, 6:35:15 UTC

A couple of questions that have not been addressed anywhere in my research, to which answers would be helpful:

1A. I am able to get a lot of WU's by suspending the other projects I run (I am running 4... Rosetta, Climate Prediction, Hydrogen, and AI at 85%, 7.5%, 3.75%, and 3.75% devoted time respectively). I do this sometimes for each project in order; they'll suspend, and the one left running phones home for new WU's. Is this encouragable, acceptable, tolerable, neutral, or malevolent conduct? My intentions are good... it's just that...

1B. I just found I would run short of time on some Rosetta WU's, so I aborted them before they began running or went past due during a computation. What happens to those WU's? Please tell me they don't get 86'ed and/or cause the project to suffer... are they "recycled", re-distributed, recomputed, what?

1C. If I let the scheduler send me tasks, rather than baiting it, can I be confident that I will always have work if there exists work to be done?

2. Is the scheduler programming, the code that delivers and parcels out WU's, hopelessly obsolete or actually functional? Seems to be some disagreement and nescience on that...

3. I don't merit or want credit for aborted tasks... I trust I don't get any?

4. Why, exactly, are there hard and fast deadlines for WU's? Is there some inflexible point past which BOINC will not allow a WU to remain in the wild?

Thanks for your help. I request an expert, or his designee, help me out :) Best, John :D
The lovely lady you see isn't I, but Hayley Westenra, a classical crossover singer from Christchurch, NZ. There is no known voice as hers. Check her out- she's seraphic.

ID: 61441 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,208,737
RAC: 3,249
Message 61442 - Posted: 29 May 2009, 9:57:29 UTC - in response to Message 61441.  
Last modified: 29 May 2009, 9:59:25 UTC

A couple of questions that have not been addressed anywhere in my research, to which answers would be helpful:

1A. I am able to get a lot of WU's by suspending the other projects I run (I am running 4... Rosetta, Climate Prediction, Hydrogen, and AI at 85%, 7.5%, 3.75%, and 3.75% devoted time respectively). I do this sometimes for each project in order; they'll suspend, and the one left running phones home for new WU's. Is this encouragable, acceptable, tolerable, neutral, or malevolent conduct? My intentions are good... it's just that...

You are becoming a micro-manager of BOINC. It will work out, but it can also cause more problems, like missed deadlines. BOINC is designed to be left alone.

1B. I just found I would run short of time on some Rosetta WU's, so I aborted them before they began running or went past due during a computation. What happens to those WU's? Please tell me they don't get 86'ed and/or cause the project to suffer... are they "recycled", re-distributed, recomputed, what?

They all get recycled to other people, and you just lose some of the units you can get from that project. When you abort some, the total number of units you can get from that project goes down for a short time; then, as you return units on time, it goes back up again.

1C. If I let the scheduler send me tasks, rather than baiting it, can I be confident that I will always have work if there exists work to be done?

For the most part, yes. It is designed to just do its thing with little to no intervention on our part. It doesn't always work that way, but it is supposed to. One thing that will happen is that your 85% etc. settings will be followed over time, not in the short term. One project may have work while another does not at the exact moment BOINC asks for it, so the first project will give you a bit more. Over time it will even out.

2. Is the scheduler programming, the code that delivers and parcels out WU's, hopelessly obsolete or actually functional? Seems to be some disagreement and nescience on that...

No, it works okay for the most part. Most people never touch it and it works just fine.

3. I don't merit or want credit for aborted tasks... I trust I don't get any?

No, you do not.

4. Why, exactly, are there hard and fast deadlines for WU's? Is there some inflexible point past which BOINC will not allow a WU to remain in the wild?

Because each project handles its own data in its own way, some feel the need for shorter deadlines and some are okay with longer ones. It kind of depends on what their contract with whoever is paying them to do the work says. Some, like Malaria, try to use the data faster so they can get the drugs out quicker; some, like SETI, are not as concerned because the data has taken millions of years to get here anyway!

Thanks for your help. I request an expert, or his designee, help me out :) Best, John :D

I am neither, just a cruncher like yourself.
ID: 61442 · Rating: 0
Mod.Sense
Volunteer moderator
Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 61445 - Posted: 29 May 2009, 13:30:23 UTC - in response to Message 61441.  
Last modified: 29 May 2009, 13:34:19 UTC

A couple of questions that have not been addressed anywhere in my research, to which answers would be helpful:

1A. I am able to get a lot of WU's by suspending the other projects I run (I am running 4... Rosetta, Climate Prediction, Hydrogen, and AI at 85%, 7.5%, 3.75%, and 3.75% devoted time respectively). I do this sometimes for each project in order; they'll suspend, and the one left running phones home for new WU's. Is this encouragable, acceptable, tolerable, neutral, or malevolent conduct? My intentions are good... it's just that...


If you would state your intentions it might help people comment on alternative approaches to achieve your goals. If the net result is that your machine contacts the project fewer times during the week, it is a net good thing for the project servers. There is less overhead supporting your machine. On the other hand, you are one of tens of thousands of users. Don't sweat the small stuff.


1B. I just found I would run short of time on some Rosetta WU's, so I aborted them before they began running or went past due during a computation. What happens to those WU's? Please tell me they don't get 86'ed and/or cause the project to suffer... are they "recycled", re-distributed, recomputed, what?


Perhaps running close to deadlines is due to point 1A above. They do get reissued when you abort them or when they pass their deadline. Remember, the name of the game here is models completed. The specific models that will be done in any specific task are not special in any way; they just add to the total. I mean, there is no gap in the data. If your task with models 1, 2 and 3 is not completed, the project might create a new task which will work on models 4, 5 and 6, and so long as 3 are completed, it all works out about the same.

1C. If I let the scheduler send me tasks, rather than baiting it, can I be confident that I will always have work if there exists work to be done?


You will always have work. But not always from all of your projects. Eventually BOINC is going to figure out that you are getting behind on your climate model, and it is going to devote the time it takes to try and complete that before its deadline. And so it may plan to run climate for several days (or until the estimated completion time is sufficiently reduced to convince the BOINC Manager that it will be completed in time).

2. Is the scheduler programming, the code that delivers and parcels out WU's, hopelessly obsolete or actually functional? Seems to be some disagreement and nescience on that...


The scheduler works all right. The recent problems are due to the code that runs on the home computers. It doesn't always request work when it should, and it sometimes requests much more work than it needs (it misestimates how much work to request).

3. I don't merit or want credit for aborted tasks... I trust I don't get any?


Correct.

4. Why, exactly, are there hard and fast deadlines for WU's? Is there some inflexible point past which BOINC will not allow a WU to remain in the wild?


The scientific method is playing itself out on many simultaneous projects within BakerLab. Picture a single graduate student writing their thesis. They start with a hypothesis... "if we modify the energy function in this way, it should better direct the program to solving proteins with zinc" for example. Then you devise experiments to try and confirm or refute your hypothesis... a batch or a number of batches of work units. The next step is key to your question. Analyze the results of your experiments. When will you start this phase of your research?

The deadlines give a way to help your BOINC Manager schedule the work from various projects. And if you graphed it out, you would find that most tasks that go past 10 days without being reported back are never actually completed. The host has stopped running BOINC. The host has lost the work in progress. The host has lost its internet connection. Whatever it is, the odds of seeing them back are very low. So the deadlines basically give the researchers a stake in the sand where they can feel confident they've got about all the data they are going to get for this go-around, and they can begin their detailed analysis of the results.

Now picture the above application of scientific method being done by several dozen scientists at the same time. This is the overview of the project you are helping with. Many subteams of researchers, exploring various ideas, and techniques. Studying what works well and what does not. Over and over again.

Thanks for your help. I request an expert, or his designee, help me out :) Best, John :D


mikey's comment applies to me as well:
I am neither, just a cruncher like yourself. ...but I've been around a while.
Rosetta Moderator: Mod.Sense
ID: 61445 · Rating: 0
Cesium_133*
Joined: 1 Dec 08
Posts: 28
Credit: 225,332
RAC: 0
Message 61570 - Posted: 4 Jun 2009, 4:32:47 UTC - in response to Message 61445.  

If you would state your intentions it might help people comment on alternative approaches to achieve your goals.


My intentions are to get as many WU's assigned to my machine as it can process by the required deadline of each WU. Also, I want such WU's to be compliant in quantity and required total crunching time over a given, long period (>= a matter of a few months, at least) with my set %ages for Rosetta et al. Nothing fishy or abusive; just good use of my resources. I know I would finish more discrete WU's with Hydrogen than Rosetta, but I want my PC to run Rosetta more, and it's doing so... thus, good there...

If the net result is that your machine contacts the project fewer times during the week, it is a net good thing for the project servers.


Granted, though I have yet to figure out the exact algorithm BOINC uses for contacting home -for credit-. Every WU is allowing BOINC to upload its finished data forthwith on completion, as it should, but if a WU finishes closer than a certain amount of time before the deadline, BOINC will report it alone for credit. WU's can be running high-priority before this unknown time, though, finish up and report, and queue up on my machine to be either reported manually or called in per the 24-hour default time.

Eventually BOINC is going to figure out that you are getting behind on your climate model, and it is going to devote the time it takes to try and complete that before its deadline.


As it should, yes... I have 3 WU's on climate... one due April 2010, the others November 2011. Must be big ones. Over time, I would expect to be assigned WU's finishable by their respective deadlines per my allocated resource time.

...(t)he next step is key to your question. Analyze the results of your experiments. When will you start this phase of your research?


Understood. What I'm getting at is the hour/minute/second precision of the deadlines. Why not just have a given day at 11:59:59 PM all the time? Do the individual projects work off a formula which relies on when they're compiled, or sent into the wild? It just seems illogical to have some WU's with a deadline 2 minutes later than some others, unless it's all automated. Even then, why not use the Easy Button and go to the end of the day? Maybe it's too good an idea :)

I sit around thinking of this stuff, btw... I could use a gf... lol... such as the one I have in my pic and sig... :D
The lovely lady you see isn't I, but Hayley Westenra, a classical crossover singer from Christchurch, NZ. There is no known voice as hers. Check her out- she's seraphic.

ID: 61570 · Rating: 0
Murasaki
Joined: 20 Apr 06
Posts: 303
Credit: 511,418
RAC: 0
Message 61578 - Posted: 4 Jun 2009, 19:51:20 UTC - in response to Message 61570.  

Understood. What I'm getting at is the hour/minute/second precision of the deadlines. Why not just have a given day at 11:59:59 PM all the time?


There are a number of reasons that I can think of. I have listed a few of them below.

If all WUs are due back at a certain time, then all incomplete WUs will re-enter the job queue at the same time. On a good day the servers should be able to cope with the sudden jump in workload, but on a bad day (perhaps after the issuing of a bugged Rosetta version) a large number of incomplete tasks may need to be handled at once.

Most reissued tasks would be sent out soon after the deadline, meaning that crunchers operating in that timezone are more likely to pick up a re-issued task than users in other timezones. If there is a batch of WUs where an error is causing them to run slowly, then you will be issuing a higher proportion of bad WUs to the crunchers operating in that timezone. Hardly an encouragement for them to continue crunching.

If a batch of WUs is not running properly due to an error, the project team may not be able to see a pattern until a few WUs have timed out. Under the current system, the project team can monitor the situation in real time and cancel or amend a batch if a particular group is timing out. Under the fixed timescale method the project team won't know about the problem until the day after the failed batch passes the deadline (assuming that they set the deadline as midnight in their timezone as in your example). That means most if not all of the failed batch will have been reissued before the project team has a chance to pull the plug.
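
To put a rough number on the first point, here is a small sketch comparing a rolling deadline (send time plus a fixed window) with a deadline snapped to 23:59:59 on the due date. The 10-day window, the send times, and all function names are assumptions made up for the illustration; they are not taken from the project's server configuration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=10)  # assumed deadline window, for illustration only


def rolling_deadline(sent):
    """Deadline is simply the send time plus the window."""
    return sent + WINDOW


def end_of_day_deadline(sent):
    """Deadline is snapped to 23:59:59 on the day the window ends."""
    due = sent + WINDOW
    return due.replace(hour=23, minute=59, second=59, microsecond=0)


def peak_expiries(deadlines):
    """Largest number of tasks that share one exact deadline."""
    return max(Counter(deadlines).values())


# 200 tasks sent 7 minutes apart, all within a single UTC day.
start = datetime(2009, 6, 4, 0, 0, 0, tzinfo=timezone.utc)
sent_times = [start + timedelta(minutes=7 * i) for i in range(200)]

rolling_peak = peak_expiries([rolling_deadline(s) for s in sent_times])
snapped_peak = peak_expiries([end_of_day_deadline(s) for s in sent_times])

print("rolling deadlines, worst instant:", rolling_peak)
print("end-of-day deadlines, worst instant:", snapped_peak)
```

With rolling deadlines each task times out at its own instant, so incomplete work trickles back into the queue. With end-of-day deadlines every task sent that day times out in the same second.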

ID: 61578 · Rating: 0
Paul D. Buck
Joined: 17 Sep 05
Posts: 815
Credit: 1,812,737
RAC: 0
Message 61580 - Posted: 4 Jun 2009, 20:40:42 UTC

BOINC "phones home" for credit at least once every 24 hours if tasks have been completed. More often if it needs work, at which point it will also report completed work. So, it can SEEM random ...
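
As a very simplified model of that rule (the real client logic is more involved, and `should_contact` and its parameters are hypothetical names used only for this sketch):

```python
from datetime import datetime, timedelta, timezone

REPORT_INTERVAL = timedelta(hours=24)  # the "once every 24 hours" figure from the post


def should_contact(now, oldest_unreported_finish, needs_work):
    """Decide whether the client would contact the project right now.

    oldest_unreported_finish: when the oldest finished-but-unreported task
    completed, or None if nothing is waiting to be reported.
    """
    if needs_work:
        # A work request goes out immediately and carries any finished results with it.
        return True
    if oldest_unreported_finish is None:
        return False
    # Otherwise finished results wait until the reporting interval has elapsed.
    return now - oldest_unreported_finish >= REPORT_INTERVAL


now = datetime(2009, 6, 4, 20, 40, tzinfo=timezone.utc)
print(should_contact(now, now - timedelta(hours=3), needs_work=False))   # finished recently, queue is full
print(should_contact(now, now - timedelta(hours=3), needs_work=True))    # needs work, so reports as well
print(should_contact(now, now - timedelta(hours=25), needs_work=False))  # waited past the interval
```

Since the moment a host runs low on work depends on what it happens to be crunching, the reporting times look irregular from the outside.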

The randomness of the deadlines is a good thing in that it means that we don't all try to report at the same time. Traffic jam does not begin to describe the situation.

You are correct, though, in the thought that sometimes BOINC is a little too anal in its handling of deadlines, and the client can panic and do inappropriate things in many more cases than we would like to imagine ... I have tried to get the developers to rethink some of the rules to avoid these situations, but to no avail.

Interestingly enough, had they done so, the new problems seen at SaH recently would not be issues ... restarting suspended tasks not happening would not be an issue if BOINC almost never suspended tasks ... :)
ID: 61580 · Rating: 0
dgnuff
Joined: 1 Nov 05
Posts: 350
Credit: 24,773,605
RAC: 0
Message 61591 - Posted: 6 Jun 2009, 10:04:32 UTC - in response to Message 61570.  

If you would state your intentions it might help people comment on alternative approaches to achieve your goals.


My intentions are to get as many WU's assigned to my machine as it can process by the required deadline of each WU. Also, I want such WU's to be compliant in quantity and required total crunching time over a given, long period (>= a matter of a few months, at least) with my set %ages for Rosetta et al. Nothing fishy or abusive; just good use of my resources. I know I would finish more discrete WU's with Hydrogen than Rosetta, but I want my PC to run Rosetta more, and it's doing so... thus, good there...


The best thing to do is to determine your long-term project percentages, set them up by adjusting the resource share of your chosen projects, set BOINC to keep about three to four days' work ahead, and just leave it alone.

The BOINC scheduler is a fairly complex beast. It has to be, because of the various projects it deals with: different work unit durations, different deadlines, things like CPDN with work units that run for months at a time, etc.

Keeping this in mind, it's really designed to be run in "set and forget" mode. It'll take a week or two to sort everything out, but if you leave it alone, it will respect your resource share choices, and more importantly, leaving it alone and not micromanaging reduces the risk of missed deadlines.

Also, unless you have a very strong reason to do so (an intermittent connection, e.g. dial-up), there is no benefit whatsoever in maintaining a large work buffer. I run with a three-day buffer, and simply don't have any problems.
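
For what it's worth, resource shares are relative weights, so the long-term percentage for a project is its share divided by the sum of all shares. A quick sketch of the arithmetic, where the share values are hypothetical numbers picked only to reproduce the 85 / 7.5 / 3.75 / 3.75 split mentioned earlier in the thread:

```python
def share_fractions(resource_shares):
    """Convert relative resource-share weights into fractions of total time."""
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {project: share / total for project, share in resource_shares.items()}


# Hypothetical weights; only the ratios between them matter.
shares = {"Rosetta": 340, "Climate Prediction": 30, "Hydrogen": 15, "AI": 15}
fractions = share_fractions(shares)

for project, fraction in fractions.items():
    print(f"{project}: {fraction:.2%}")
```

These are long-run targets, so any single day can look quite different from the configured split.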
ID: 61591 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org