Servers?

Sid Celery

Joined: 11 Feb 08
Posts: 2125
Credit: 41,249,734
RAC: 9,368
Message 68167 - Posted: 22 Oct 2010, 3:51:08 UTC

Why don't you take a look at the posts in the current "Houston, we have a problem..." thread?

Is that the one I mentioned in my last post that reported a task shortage rather than a server issue, which is the subject of this thread? The one the thread-starter correctly describes to you as "completely different in nature"?

There are, of course, other posts in other threads, but you probably don't want to look at them because...

Probably nothing. I asked for any mention from anyone else of the same issue you had at the time you had it (or even between your 1st and 2nd posts here) not something completely different 9 days later.

Sid Celery's post is so inane that it doesn't deserve any more of a reply than it has already been given, or that I include here.

I thought I'd established you didn't answer any questions before and neither have you now.

BTW, my machine repeatedly asked for more work and was told that none was available, and that, perhaps, the servers were down.

In the BOINC Manager's Messages tab you can select a range of messages with times, so you can copy and paste the relevant range when you have an issue. It's really as simple as that. Vague assertions don't help anyone to help you when you need and ask for help.

The problem might have been different, but the symptoms appear to have been similar

What portion of Rosetta contributors, however, do we imagine understand or care about specific server or IP protocol issues when they simply visit a “Server Status Page” that tells them all servers are up and running, but their computers are idle, waiting for work?

It appears that, regardless of whether boinc.org responds to the ping or nslookup commands, rosetta.org does not always respond without an error. When you were unable to resolve the name, I had no problems. Earlier this evening I had no problems. Now, however, as I write this message (11:45 PM PDT on 10/20/2010) I receive the following message:

*** cdns2.cox.net can't find rosetta.org: Server failed

What conclusions can be drawn from this intermittent error message?

I have been trained throughout my career to identify problems, analyze them, and offer solutions or prompt others to offer solutions.

All I can say to that lot is "wow".

1. You can't tell the difference between a server not being contactable and a contactable server that just has nothing to send?
2. Others aren't asking at all. You are. Different problem, different solution.
3. I conclude that you're looking to connect to a completely different site for "Rosetta Inpharmatics". Who on earth they are, I don't know. I don't think you'll get any WUs from there.
4. I can only assume you failed the training.

If you neither understand the question you ask nor the answers you receive I don't see the point in trying either. If you have another problem, either try praying or just don't worry about it so much. I won't be.
Ingleside

Joined: 25 Sep 05
Posts: 107
Credit: 1,514,472
RAC: 0
Message 68298 - Posted: 31 Oct 2010, 12:38:48 UTC - in response to Message 68023.  

My machine received no new work for at least 24 hours, even though you say that the site was down for only ten hours. Do you know the algorithm for reconnecting after an outage?

It doesn't seem anyone has answered this yet...

If the user doesn't manually hit "update", the v6.10.xx clients use the following rules:

1: If the client can't make a connection to the scheduling server, it does a random backoff between 1 minute and 4 hours.
If it makes 10 failed connections in a row, it tries to download the project's home page. If the client fails to download the home page, it takes a 24-hour backoff.

2: If the client makes a connection but gets a message like "Project is shut down" or "Can't open database" or similar, the server orders the client into a 1-hour backoff.

3: If the project is up, but the client for one of many possible reasons doesn't get any work, the client does a random backoff. You won't see this random backoff unless you select the project on the Projects tab and hit "Properties". The random backoff starts with a 1-minute upper limit for the 1st failed work request, and for each successive failed work request the upper limit is doubled, meaning 1 minute, 2 minutes, 4 minutes, 8 minutes, ..., up to a maximum upper limit of 24 hours.

4: Rosetta@home AFAIK uses such old server code that if you hit your daily quota (due to many errors), you'll be deferred until midnight server time, plus up to 1 hour of random backoff.

5: Depending on Rosetta@home's server code, if it's been long enough since it was last upgraded, it's also possible you'll get a 24-hour deferral if you hit limits like "not enough memory" or "not enough free disk space" and so on.
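The doubling upper limit in rule 3 can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the actual BOINC client code: the function names are made up, and drawing the delay uniformly between 0 and the upper limit is an assumption (the rule only gives the upper limit).

```python
import random

MIN_UPPER_LIMIT_S = 60            # 1 minute: upper limit after the 1st failed work request
MAX_UPPER_LIMIT_S = 24 * 60 * 60  # 24 hours: cap on the upper limit


def backoff_upper_limit(failed_requests):
    """Upper limit (seconds) after this many failed work requests in a row."""
    if failed_requests < 1:
        return 0
    # Doubles per failure: 60, 120, 240, 480, ... capped at 24 hours.
    return min(MIN_UPPER_LIMIT_S * 2 ** (failed_requests - 1), MAX_UPPER_LIMIT_S)


def random_backoff(failed_requests, rng=None):
    """Pick a delay in seconds. ASSUMPTION: uniform between 0 and the upper limit."""
    rng = rng or random
    return rng.uniform(0, backoff_upper_limit(failed_requests))
```

With these numbers the upper limit reaches the 24-hour cap at the 12th failed work request in a row, since 60 s doubled 11 times is already more than 24 hours.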

The client also has a couple of additional reasons for not asking for work, even if work is needed:
6: If one (or more) downloads is currently backing off, all work requests to the project are blocked. (The project-wide deferral on downloads doesn't count, only an individual download's backoff.)
7: If the number of tasks that have one or more files waiting to upload or uploading is > 2 * the number of CPUs, all work requests to the project are blocked.

For #7, since your computers are dual-core, each counts as 2 CPUs in BOINC, meaning that if you've got 5 or more tasks that want to upload file(s), work requests are blocked.
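As a quick check of #7, here is the same comparison in Python. The function and parameter names are hypothetical and only restate the rule above; this is not how the client itself is written.

```python
def work_request_blocked_by_uploads(tasks_waiting_to_upload, n_cpus):
    """Rule 7 as described above: blocked when uploads pending > 2 * number of CPUs."""
    return tasks_waiting_to_upload > 2 * n_cpus


# Dual-core host: 4 pending uploads is still fine, 5 blocks work requests.
print(work_request_blocked_by_uploads(4, 2))
print(work_request_blocked_by_uploads(5, 2))
```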

As for which of these rules affected you, I don't know...




BTW, since a bank-analogy apparently is popular in this thread, you can look at BOINC as a "bank" that only handles 3 types of bank-jobs: customers putting money into their account, taking money out of their account, and asking how much money they've got in their account.

For each customer, something like this will happen:
a: Customer tells their account-number.
b: Bank-employee looks-up account-number.
c: Customer shows their identification.
d: Bank-employee verifies the identification is ok (like name is correct for account-number).
e: Customer specifies he wants to take-out some money.
f: Bank-employee checks how much money is available on the account.
g: If there is enough money in the account, the bank-employee grabs the stack of 10-dollar-bills, and starts counting-out 10-dollar-bills until:
g1: There are no more 10-dollar-bills.
g2: The account doesn't contain enough money to get another 10-dollar-bill.
g3: The customer's specified amount is reached.
h: The customer needs to write his signature, before he gets the money the bank-employee has counted-out for him.
i: The bank-employee records the number of 10-dollar-bills given-out to the customer, and the account-info is updated with the new amount of money on it.

If instead customer wants to deposit money, step a-d is the same, while step e is changed:
e: Customer specifies he wants to deposit some money.
f: Customer hands the bank-employee some money.
g: Bank-employee counts-up how much money it is.
i: The bank-employee records how much money he's got, and updates the account-info with the new amount of money on it.

Or, instead the customer only wants to know how much money is on his account. Step a-d is still the same, while step e is changed:
e: Customer specifies he wants to know how much money is currently on his account.
f: Bank-employee checks how much money is available on the account.
g: Bank-employee tells the customer how much money is on the account.


As you can see from this, even just looking-up the account-info is 7 steps, while giving-out some money (the work, in BOINC terms) is only 2 additional steps, meaning 9 steps total.

Also, when the bank-employee in step g is sitting with a stack of 10-dollar-bills, counting-out for example 12 bills instead of only 2 bills doesn't take much extra time, so both for the customer and the bank it's better that he gets 12 bills at once than that he gets only 2 bills and must stand in line 5 more times to get the additional 10 bills...

For the bank, 12 bills at once is 9 steps, while 6 x 2 bills is 6 x 9 = 54 steps...


Well, for BOINC it's not exactly the same, since each task has its own update. But for each scheduler-request you'll still need to look-up user_id, host_id and preferences, update computer-info, and possibly update preferences. So, if we assume that on average you don't need to update preferences, each scheduler-request is 4 steps + 1 per task.
Meaning, 12 tasks as a single scheduler-request means 4 + 12 = 16 database-hits.
12 tasks as 6 scheduler-requests means 6 * 4 + 12 = 36 database-hits.
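The same arithmetic as a small Python sketch, if you want to try other numbers. The 4 fixed hits per request and 1 hit per task are the rough estimates from this post, not measured values, and the names are made up for the example.

```python
FIXED_HITS_PER_REQUEST = 4  # user_id, host_id, preferences, computer-info (estimate above)
HITS_PER_TASK = 1           # each task has its own update


def database_hits(total_tasks, n_requests):
    """Estimated database-hits for handing out total_tasks over n_requests scheduler-requests."""
    return n_requests * FIXED_HITS_PER_REQUEST + total_tasks * HITS_PER_TASK


print(database_hits(12, 1))  # 12 tasks in a single scheduler-request
print(database_hits(12, 6))  # 12 tasks spread over 6 scheduler-requests
```

The per-task part stays the same either way; only the fixed cost per scheduler-request gets multiplied.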

And this is only counting the work-requests; on top of this you'll get the hits when you report finished tasks...

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."



©2024 University of Washington
https://www.bakerlab.org