Not getting work

Message boards : Number crunching : Not getting work


dex3703

Joined: 30 Dec 19
Posts: 2
Credit: 452,429
RAC: 0
Message 105321 - Posted: 4 Mar 2022, 0:21:19 UTC

Hello,

I have a couple of computers set up for this project. One received a handful of units, which it finished, but it hasn't gotten any more. The other has never gotten any units. Is there something I need to do?

Thanks,
D
Number Cruncher

Joined: 22 Dec 21
Posts: 4
Credit: 599,895
RAC: 0
Message 105323 - Posted: 4 Mar 2022, 4:41:14 UTC - in response to Message 105321.  

Rosetta 4.20 tasks are not always available. They are only sent out for a few days at a time. This may be the problem. If you install VirtualBox, you will receive some Python tasks, and those are always available.
rjs5

Joined: 22 Nov 10
Posts: 273
Credit: 23,054,272
RAC: 7,218
Message 105325 - Posted: 4 Mar 2022, 17:28:34 UTC - in response to Message 105323.  

Even if you install the VirtualBox version of BOINC, you still have to "ALLOW" that computer to accept the vbox work units. I fell into that trap. I just installed VirtualBox BOINC and nothing happened. I had to ALLOW each computer to accept WU.

Rosetta added an ALLOW/SKIP option to each COMPUTER profile. You have to explicitly set the ALLOW option. The Rosetta people failed to add a "WARNING" or any information that would help a user find this failure.

I am still getting a number of failures and hung Rosetta WU where they just keep running. This is happening on a machine with plenty of memory, disk and all enabled to run BOINC WU.
dex3703

Joined: 30 Dec 19
Posts: 2
Credit: 452,429
RAC: 0
Message 105326 - Posted: 4 Mar 2022, 21:08:30 UTC - in response to Message 105325.  

Thanks for the explanations. My stats change, although I never see any work on that computer, so I guess it's getting them at night. I'm running Linux so don't know if there's VirtualBox for that setup. I'll just leave it alone.
rjs5

Joined: 22 Nov 10
Posts: 273
Credit: 23,054,272
RAC: 7,218
Message 105327 - Posted: 4 Mar 2022, 22:06:20 UTC - in response to Message 105326.  

I am running a Fedora Linux box. I had installed BOINC, but there were no BOINC+VirtualBox packages, so I just installed the VirtualBox packages in addition. It seemed to work.

I am seeing mainly Rosetta Python WU being sent down. They take a huge amount of memory and I am seeing a few hung jobs. There seem to be many jobs available so you should see the machine running them.

I am running 18 CPUs on an 18C/36T machine with 64 GB of memory. The 18 WU will cause Linux to consume all 64 GB of memory and a good chunk of the swap space.
Jim1348

Joined: 19 Jan 06
Posts: 881
Credit: 52,257,545
RAC: 0
Message 105328 - Posted: 4 Mar 2022, 22:26:51 UTC - in response to Message 105327.  

The pythons reserve almost 3 GB each to download, but require less than 1 GB to run. It allows for some creative thinking. You can, for example, use two (or more) BOINC instances to download more.
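
As a rough sketch of that arithmetic: the figures below (about 3 GB reserved per Python task, under 1 GB actually used, a 64 GB machine) are taken from this thread, while the function name and the simple floor-division model are only illustrative assumptions and not BOINC's actual scheduling logic.

```python
def tasks_that_fit(ram_gb: float, per_task_gb: float) -> int:
    """Return how many tasks fit in RAM if each one counts as per_task_gb."""
    if per_task_gb <= 0:
        raise ValueError("per_task_gb must be positive")
    return int(ram_gb // per_task_gb)


RAM_GB = 64        # machine described earlier in the thread
RESERVED_GB = 3.0  # approximate reservation per Python task (from this post)
ACTUAL_GB = 1.0    # upper bound on real usage per task (from this post)

# Limit if every task is counted at its reservation.
by_reservation = tasks_that_fit(RAM_GB, RESERVED_GB)
# Limit if every task is counted at what it really uses.
by_actual_use = tasks_that_fit(RAM_GB, ACTUAL_GB)

print(by_reservation, by_actual_use)
```

The gap between the two numbers is the headroom that a second BOINC instance would be trying to use.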

The "hangs" are of two types. If you see very low CPU usage (the "0 CPU" jobs) after about 5 minutes, you can just abort them. They occur on both Windows and Linux.

If you see "Vm job unmanageable", that is mainly on Linux, and you can't get rid of them. It is due to the VBox wrapper; there are discussions on it.
But on Windows, you can pretty much eliminate it by running VirtualBox 5.2.44 rather than 6.1.x. I am converting over to Win10 now.

And I cited the write rate for you on your Win11 discussion. They could kill your SSD if you are not careful. I use very large write caches.
doug

Joined: 28 Mar 20
Posts: 8
Credit: 1,601,417
RAC: 1,629
Message 105343 - Posted: 6 Mar 2022, 17:50:43 UTC

Hi,

I'm also all of a sudden not getting any Rosetta tasks. This just started a few days ago, maybe a week. Previously, I was running Rosetta tasks fine and getting new tasks pretty much continuously. The Server Status page currently shows around 5000 Python tasks available. Until this recent drought, I was running the Python tasks fine. In fact, there have been times recently when I only got Python tasks. Now, nothing.

Windows 10, with all latest updates (21H2, build 19044.1526). 16G RAM, plenty of disk space. Intel Core i5-3470. Nvidia GeForce GTX 1060.

There are no Rosetta errors in the BOINC Event Log, other than some "Project Communication failed" from early this morning (Sunday, EST).

Anyone have any ideas or suggestions?

Thanks.

Doug
MJH333

Joined: 29 Jan 21
Posts: 18
Credit: 6,290,467
RAC: 13,756
Message 105344 - Posted: 6 Mar 2022, 18:45:00 UTC - in response to Message 105343.  

Doug,

It sounds to me as if one of the Pythons running on your machine has errored out. This causes the server to stop sending you Python tasks. (And there are currently no Rosetta 4.20 tasks available, so you are not getting any tasks at all.)

To check, go to your Rosetta account homepage, click on "View" next to "Computers on this account", then click on "Details".

At the bottom, there is a toggle switch headed "VirtualBox VM jobs". If it says "Allow", click it, and you will start getting Pythons again once your Boinc Manager communicates with the server.

If you click "Allow" and then "Return to host page", you will see that the toggle now says "Skip" (and clicking it would then stop you getting Pythons).

If the toggle switch says "Skip" when you look at it, then there is something else afoot!

Cheers,
Mark
doug

Joined: 28 Mar 20
Posts: 8
Credit: 1,601,417
RAC: 1,629
Message 105345 - Posted: 6 Mar 2022, 19:30:45 UTC - in response to Message 105344.  

Mark,

Thanks so much for that clear and concise answer! That was exactly the problem! I have now gotten 2 Python tasks. Happy Days are here again!

Doug
MJH333

Joined: 29 Jan 21
Posts: 18
Credit: 6,290,467
RAC: 13,756
Message 105346 - Posted: 6 Mar 2022, 20:24:36 UTC - in response to Message 105345.  

Thanks, Doug, glad to be able to help.

This has happened to me several times now, so I've had a fair bit of practice at clicking the "Allow" switch!

Cheers,
Mark
keputnam

Joined: 18 Sep 05
Posts: 24
Credit: 2,088,785
RAC: 0
Message 105706 - Posted: 26 Mar 2022, 3:02:52 UTC
Last modified: 26 Mar 2022, 3:03:39 UTC

I've verified that "allow" is selected for my one computer

I have RAM available

Server Status says ~5K jobs available (and has for about 3 days)

When I click update, I see this in the log

2022-03-25 7:57:36 PM | Rosetta@home | update requested by user
2022-03-25 7:57:39 PM | Rosetta@home | Sending scheduler request: Requested by user.
2022-03-25 7:57:39 PM | Rosetta@home | Requesting new tasks for CPU
2022-03-25 7:57:41 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
2022-03-25 7:57:41 PM | Rosetta@home | No tasks sent
2022-03-25 7:57:41 PM | Rosetta@home | Project requested delay of 31 seconds
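
A log excerpt like the one above can be summarized with a short script. This is only a sketch: it assumes the `time | project | message` layout and the exact message wording shown in this post, and the `summarize` helper is a hypothetical name, not part of BOINC.

```python
import re

LOG = """\
2022-03-25 7:57:36 PM | Rosetta@home | update requested by user
2022-03-25 7:57:39 PM | Rosetta@home | Sending scheduler request: Requested by user.
2022-03-25 7:57:39 PM | Rosetta@home | Requesting new tasks for CPU
2022-03-25 7:57:41 PM | Rosetta@home | Scheduler request completed: got 0 new tasks
2022-03-25 7:57:41 PM | Rosetta@home | No tasks sent
2022-03-25 7:57:41 PM | Rosetta@home | Project requested delay of 31 seconds
"""


def summarize(log_text: str, project: str = "Rosetta@home") -> dict:
    """Pull the task count and requested delay for one project from log text."""
    new_tasks = None
    delay_seconds = None
    for line in log_text.splitlines():
        # Each line is expected to look like "time | project | message".
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3 or parts[1] != project:
            continue
        message = parts[2]
        match = re.search(r"got (\d+) new tasks", message)
        if match:
            new_tasks = int(match.group(1))
        match = re.search(r"delay of (\d+) seconds", message)
        if match:
            delay_seconds = int(match.group(1))
    return {"new_tasks": new_tasks, "delay_seconds": delay_seconds}


summary = summarize(LOG)
print(summary)
```

The log only shows that the scheduler sent nothing; it does not say why, so the result is a starting point for the checks discussed in the replies.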

three projects with equal or lower resource share are all getting WUs on a regular basis

Anyone have any suggestions?
Greg_BE

Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105711 - Posted: 26 Mar 2022, 9:48:13 UTC

Is your job queue filled with other projects?
Sometimes this message appears when the queue is full, in place of the "don't need new tasks" or "computer has reached its limit" messages.

If you have work from other projects, set everything to "No new tasks", let your queue empty, then allow new tasks on Rosetta first and see what message you get if no work downloads.
MJH333

Joined: 29 Jan 21
Posts: 18
Credit: 6,290,467
RAC: 13,756
Message 105712 - Posted: 26 Mar 2022, 10:57:23 UTC - in response to Message 105706.  

If the Allow switch is showing, then your PC is set not to receive Vbox tasks. You need to click it to allow Vbox tasks. It will then change to Skip (and if you click that, you will stop getting Vbox tasks).

It appears that, if one of the Python tasks errors out, the system automatically turns off Python tasks on the computer that had the error.
keputnam

Joined: 18 Sep 05
Posts: 24
Credit: 2,088,785
RAC: 0
Message 105713 - Posted: 26 Mar 2022, 15:23:14 UTC - in response to Message 105712.  

Thanks

Why oh why does ROSETTA have to do everything in their own non-intuitive way?
Greg_BE

Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 105717 - Posted: 27 Mar 2022, 0:22:15 UTC - in response to Message 105713.  
Last modified: 27 Mar 2022, 0:24:07 UTC

It kind of makes sense. It's like an on/off button: you see the next status available.
So VBox jobs are enabled if the Skip button is showing, and disabled if the Allow button is showing.
It's confusing at first glance, but logically it makes sense, at least from a computer standpoint.

The placement of this button is similar. It's in a location related to your computer, not the project. I went looking for it in the project settings the first time.
Bryn Mawr

Joined: 26 Dec 18
Posts: 393
Credit: 12,110,248
RAC: 4,952
Message 105720 - Posted: 27 Mar 2022, 11:12:24 UTC - in response to Message 105713.  

The button is showing what will happen if you press it, not the current status.
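
Put as a tiny sketch (the function name is made up for illustration, and the two labels are the ones described in this thread), the mapping from the label you see to the current setting is:

```python
def vbox_jobs_enabled(button_label: str) -> bool:
    """Translate the label shown on the host page into the current setting.

    The button shows the action a click would perform, so the current
    state is the opposite of what the label says.
    """
    label = button_label.strip().lower()
    if label == "skip":
        return True   # clicking would skip VBox jobs, so they are allowed now
    if label == "allow":
        return False  # clicking would allow VBox jobs, so they are skipped now
    raise ValueError(f"unexpected label: {button_label!r}")


print(vbox_jobs_enabled("Allow"))
print(vbox_jobs_enabled("Skip"))
```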
Rob Hounsell

Joined: 26 Sep 05
Posts: 6
Credit: 2,891,985
RAC: 1,266
Message 105979 - Posted: 20 Apr 2022, 1:04:14 UTC - in response to Message 105720.  

IDK if this helps, but I wasn't getting any work units after installing the 64-bit version of VirtualBox. It wasn't until recently, when I tried to create a new 64-bit machine, that I noticed it would only allow me to select a 32-bit OS. It turned out that to get 64-bit VMs I had to enable a virtualization setting in my motherboard BIOS. Odd that it worked for 32-bit VMs when it was disabled, but I'm getting Rosetta work units now.




©2024 University of Washington
https://www.bakerlab.org