out of work

Message boards : Number crunching : out of work



Profile anders n

Joined: 19 Sep 05
Posts: 403
Credit: 537,991
RAC: 0
Message 25828 - Posted: 1 Sep 2006, 14:40:18 UTC

I get this message.

2006-09-01 16:40:33|rosetta@home|No work from project

Anybody?

Anders n
Profile dcdc

Joined: 3 Nov 05
Posts: 1832
Credit: 119,951,714
RAC: 4,571
Message 25829 - Posted: 1 Sep 2006, 14:52:06 UTC

Getting same here - do the admins know?
The_Bad_Penguin
Joined: 5 Jun 06
Posts: 2751
Credit: 4,271,025
RAC: 0
Message 25830 - Posted: 1 Sep 2006, 14:54:51 UTC - in response to Message 25829.  
Last modified: 1 Sep 2006, 14:56:02 UTC

If you go to the R@H homepage (top right corner, "Server Status"), it presently says 6 WUs in queue.

Usually, there's a few thousand.


Getting same here - do the admins know?

Profile dcdc

Joined: 3 Nov 05
Posts: 1832
Credit: 119,951,714
RAC: 4,571
Message 25831 - Posted: 1 Sep 2006, 14:56:10 UTC

Just emailed DK in case they don't know about it...
Ethan
Volunteer moderator

Joined: 22 Aug 05
Posts: 286
Credit: 9,304,700
RAC: 0
Message 25833 - Posted: 1 Sep 2006, 15:07:38 UTC

I've emailed the staff, it's just after 8am here so it may take a bit for the coffee to kick in.
TestPilot

Joined: 23 Sep 05
Posts: 30
Credit: 419,033
RAC: 0
Message 25834 - Posted: 1 Sep 2006, 15:15:29 UTC - in response to Message 25833.  

it's just after 8am here

Are you guys not in Washington?

TestPilot, AKA Administrator
Profile Christoph Jansen
Joined: 6 Jun 06
Posts: 248
Credit: 267,153
RAC: 0
Message 25835 - Posted: 1 Sep 2006, 15:18:01 UTC - in response to Message 25834.  

it's just after 8am here

Are you guys not in Washington?


They are in Seattle in the State of Washington, not Washington D.C.
tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 25839 - Posted: 1 Sep 2006, 15:53:23 UTC

Now that CASP is over and turnaround times are no longer so critical, I suggest increasing the work buffer again so that work stays available for longer even if the make_work process dies. It was also requested that the deadline be increased again to 2 weeks (or 10 days).
James Thompson

Joined: 13 Oct 05
Posts: 46
Credit: 186,109
RAC: 0
Message 25840 - Posted: 1 Sep 2006, 16:33:15 UTC - in response to Message 25828.  

I've submitted 11,000 new work units to the queue, the home page should be updated soon. I'll put a quick message into the Active Workunits Thread in the next hour. Thank you all for keeping an eye on this, and my apologies for letting the workunit queue run dry.

Increasing the buffer is a great idea, I'll talk to David Kim about doing this.

Cheers,

James

I get this message.

2006-09-01 16:40:33|rosetta@home|No work from project

Anybody?

Anders n


SuperG //1.303.02%

Joined: 4 May 06
Posts: 14
Credit: 1,561,763
RAC: 0
Message 25928 - Posted: 3 Sep 2006, 6:37:14 UTC - in response to Message 25839.  

Now that CASP is over and turnaround times are no longer so critical, I suggest increasing the work buffer again so that work stays available for longer even if the make_work process dies. It was also requested that the deadline be increased again to 2 weeks (or 10 days).



Hey Tralala -- Noticed you once mentioned a reason you liked Rosetta was
"User setable length of Work units!"

Unfortunately, I'm not finding where that setting is.
In general preferences, or in boinc manager, or?
Profile [B^S] thierry@home
Joined: 17 Sep 05
Posts: 182
Credit: 281,902
RAC: 0
Message 25931 - Posted: 3 Sep 2006, 8:09:35 UTC

You'll find this feature in 'Your account' > Rosetta@home preferences > Target CPU run time.
SuperG //1.303.02%

Joined: 4 May 06
Posts: 14
Credit: 1,561,763
RAC: 0
Message 25955 - Posted: 3 Sep 2006, 16:20:03 UTC - in response to Message 25931.  

You'll find this feature in 'Your account' > Rosetta@home preferences > Target CPU run time.


Thanks for that. In trying to keep my computers from waiting for work, perhaps you can recommend a very short or very long interval to maximize work done?

For context, the machines have lots of CPU power, terabytes of disk, T3+ networks, and are 100% testing/dedicated to Rosetta. They seem to return a result approx. every 2.5 hours and have 4-8 cores each.


tralala

Joined: 8 Apr 06
Posts: 376
Credit: 581,806
RAC: 0
Message 25956 - Posted: 3 Sep 2006, 16:38:40 UTC - in response to Message 25955.  

You'll find this feature in 'Your account' > Rosetta@home preferences > Target CPU run time.


Thanks for that. In trying to keep my computers from waiting for work, perhaps you can recommend a very short or very long interval to maximize work done?

For context, the machines have lots of CPU power, terabytes of disk, T3+ networks, and are 100% testing/dedicated to Rosetta. They seem to return a result approx. every 2.5 hours and have 4-8 cores each.



In order not to run out of work, you may want to increase your reconnect time. This is in your general settings: "Connect to network about every" (it determines the size of the work cache; maximum 10 days). The default is 0.1; you may want to up this to 0.5 or even 1. More is not needed, since a downtime of more than 24 hours is not likely.
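As a rough back-of-the-envelope sketch of what that cache setting buys you (the formula and the example numbers are illustrative assumptions, not BOINC's actual work-fetch calculation):

```python
def buffered_results(cores, hours_per_result, connect_days):
    """Rough estimate of how many results a host keeps buffered for a
    given 'Connect to network about every' setting (in days).
    Illustrative only; not the real BOINC work-fetch logic."""
    results_per_day = cores * 24.0 / hours_per_result
    return results_per_day * connect_days

# An 8-core host returning a result roughly every 2.5 hours per core:
print(buffered_results(8, 2.5, 0.1))  # ~7.7 results buffered at the 0.1 default
print(buffered_results(8, 2.5, 1.0))  # ~77 results buffered at a setting of 1
```

On these assumed numbers, bumping the setting from 0.1 to 1 would roughly carry such a host through a day-long outage.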
Profile Feet1st
Joined: 30 Dec 05
Posts: 1755
Credit: 4,690,520
RAC: 0
Message 25957 - Posted: 3 Sep 2006, 16:43:53 UTC - in response to Message 25955.  

SG, Tralala just posted what I was going to. But I also wanted to point you to the caution mentioned in the Q&A item on the WU runtime pref. Basically, don't change BOTH your WU runtime preference and your General preference for "connect every ... days" at the same time, nor in large steps.
Add this signature to your EMail:
Running Microsoft's "System Idle Process" will never help cure cancer, AIDS nor Alzheimer's. But running Rosetta@home just might!
https://boinc.bakerlab.org/rosetta/
SuperG //1.303.02%

Joined: 4 May 06
Posts: 14
Credit: 1,561,763
RAC: 0
Message 25969 - Posted: 3 Sep 2006, 23:39:15 UTC - in response to Message 25957.  

SG, Tralala just posted what I was going to. But I also wanted to point you to the caution mentioned in the Q&A item on the WU runtime pref. Basically, don't change BOTH your WU runtime preference and your General preference for "connect every ... days" at the same time, nor in large steps.


Tralala and Feet1st:
Thank you, that is most helpful for quickly understanding how to maximize these computers' output for Rosetta. (ah, the beauty of community...)

Your data suggests setting runtime to 1 day & reconnect to 2 days. If anyone
has a better idea, I'd appreciate hearing it. Our compute environment:
- Each computer = 2x/4x fast Opteron; dual-core; 1Gig memory/core
- 4-12 terabytes of disk per computer; RAID5
- multi-T3 network,
- 100% dedicated to Rosetta, as each computer is brought through testing.


Profile JChojnacki
Joined: 17 Sep 05
Posts: 71
Credit: 10,927,714
RAC: 7,827
Message 25975 - Posted: 4 Sep 2006, 3:11:36 UTC - in response to Message 25969.  


Your data suggests setting runtime to 1 day & reconnect to 2 days. If anyone
has a better idea, I'd appreciate hearing it.


That is a great suggestion.
It should work for you, very nicely.

~Joel
SuperG //1.303.02%

Joined: 4 May 06
Posts: 14
Credit: 1,561,763
RAC: 0
Message 25979 - Posted: 4 Sep 2006, 6:18:01 UTC - in response to Message 25975.  


Your data suggests setting runtime to 1 day & reconnect to 2 days. If anyone
has a better idea, I'd appreciate hearing it.


That is a great suggestion.
It should work for you, very nicely.

~Joel


Thanks, Joel. Hope this gets more work done...the true objective. We only powered on the current computers on Aug. 21. A gratifying two weeks. It will be good to see the actual Rosetta results with a testing node at 1/2 power (128 cores vs. 32 now), and in a few months with quad-cores (256 cores = full node).

We remain open to more suggestions to optimize for actual work results...
Profile [B^S] thierry@home
Joined: 17 Sep 05
Posts: 182
Credit: 281,902
RAC: 0
Message 25980 - Posted: 4 Sep 2006, 7:13:13 UTC

Be sure (though I'm sure it's already done) to select 'Leave applications in memory while suspended'. This is important, but only if you run several projects.
SuperG //1.303.02%

Send message
Joined: 4 May 06
Posts: 14
Credit: 1,561,763
RAC: 0
Message 26062 - Posted: 5 Sep 2006, 1:45:27 UTC - in response to Message 25979.  

Getting a "no work sent" message from the Rosetta servers, due to "reached
daily quota." This may be a simple problem to fix, but hasn't been easy so far (looked in FAQ's, etc.). It only effects the faster (8 core) machines, all the others are getting and crunching jobs just fine.

Any suggestions on how to get new jobs to these big idle machines??

Your data suggests setting runtime to 1 day & reconnect to 2 days. If anyone
has a better idea, I'd appreciate hearing it.


That is a great suggestion.
It should work for you, very nicely.

~Joel



doc :)

Joined: 4 Oct 05
Posts: 47
Credit: 1,106,102
RAC: 0
Message 26065 - Posted: 5 Sep 2006, 3:18:06 UTC

The daily quota is per CPU (core), not per machine.
Errors and aborted WUs decrease the quota; valid returned WUs increase it again.
A quick and dirty workaround would be to reset the project (or re-attach? not 100% sure there), but find out why the quota got down that far first. :)
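A toy model of the quota behaviour described above (the halving/doubling rules, floor, and starting maximum here are assumptions for illustration; the real BOINC server code may use different constants):

```python
class DailyQuota:
    """Toy per-core daily quota: errors shrink it, valid results grow it,
    clamped to [1, max_quota]. Illustrative only, not BOINC server code."""
    def __init__(self, max_quota=100):
        self.max_quota = max_quota
        self.quota = max_quota

    def on_error(self):
        # an errored or aborted WU halves the quota, down to a floor of 1
        self.quota = max(1, self.quota // 2)

    def on_valid(self):
        # a valid returned result doubles it again, capped at the maximum
        self.quota = min(self.max_quota, self.quota * 2)

q = DailyQuota(max_quota=100)
for _ in range(5):      # a burst of 5 errored WUs
    q.on_error()
print(q.quota)          # 3  (100 -> 50 -> 25 -> 12 -> 6 -> 3)
q.on_valid()
q.on_valid()
print(q.quota)          # 12 (recovering as valid results come back)
```

This is why a host that errored out a batch of WUs can hit "reached daily quota" long before its cores are saturated: the quota only recovers as valid results are returned.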


