Message boards : Number crunching : SERVER PROBLEMS.
Author | Message |
---|---|
David E K Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Jul 05 Posts: 1018 Credit: 4,334,829 RAC: 0 |
Why is it that when the server goes "down" the status on the home page does not change? Is it an internal issue with the server that is not bad enough to trigger a status change, or is it that the server has to be "offline" completely? The status page is cached and updates every 10 minutes or so. On the homepage under "Server Status", you will see either "Scheduler running" or "Scheduler disabled." This page is not cached so it should be up-to-date. It obviously doesn't give specifics about each server daemon, but it will tell you if the project/scheduler is down. |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
This one doesn't seem to want to go home. All the failed ones went back with no problem, but this result isn't moving; it's been trying for hours. Thu 30 Jul 2009 16:52:14 EST||Project communication failed: attempting access to reference site Thu 30 Jul 2009 16:52:14 EST|rosetta@home|Temporarily failed upload of lr8_B_seq_score12_ss5.0_rlbd_1l6p_IGNORE_THE_REST_DECOY_14598_298_0_0: HTTP error Thu 30 Jul 2009 16:52:14 EST|rosetta@home|Backing off 3 hr 53 min 24 sec on upload of lr8_B_seq_score12_ss5.0_rlbd_1l6p_IGNORE_THE_REST_DECOY_14598_298_0_0 Thu 30 Jul 2009 16:52:16 EST||Internet access OK - project servers may be temporarily down. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Why is it that when the server goes "down" the status on the home page does not change? Is it an internal issue with the server that is not bad enough to trigger a status change, or is it that the server has to be "offline" completely? Ahh, thanks. I was referring to the server status page in my post. So what has been happening recently is that a daemon goes down and causes the error messages we have been seeing, but the "whole" server is still running, which is why we still see green on the server status page? |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Come on guys... this is getting old. Do the lr tasks have bugs in them? I haven't had much luck with them lately. 8/2/2009 1:39:29 AM|rosetta@home|Started download of relax_options_lr10_seq_score12_mtyka 8/2/2009 1:43:10 AM||Project communication failed: attempting access to reference site 8/2/2009 1:43:10 AM|rosetta@home|Temporarily failed download of lr5_5cro.out.zip: HTTP error 8/2/2009 1:43:10 AM|rosetta@home|Started download of boinc_rb1_1a19.pdb 8/2/2009 1:43:12 AM||Internet access OK - project servers may be temporarily down. 8/2/2009 1:44:31 AM||Project communication failed: attempting access to reference site 8/2/2009 1:44:31 AM|rosetta@home|Temporarily failed download of relax_options_lr10_seq_score12_mtyka: HTTP error 8/2/2009 1:44:31 AM|rosetta@home|Started download of lr10_1a19.out.zip 8/2/2009 1:44:32 AM||Internet access OK - project servers may be temporarily down. |
Murasaki Send message Joined: 20 Apr 06 Posts: 303 Credit: 511,418 RAC: 0 |
come on guys...this is getting old..... I have completed 1 lr task under 1.87 and another under 1.88. Both came out valid. I have a third lr underway on 1.90 and it seems to be going fine so far. |
jay Send message Joined: 12 Jan 08 Posts: 20 Credit: 195,801 RAC: 0 |
Greetings! I am having intermittent problems uploading. Sometimes the WUs go through on the first try; other times it may take several retries. I live in Florida and use DSL, and *assumed* that the network was not at fault. I would like to test this with a ping. First of all, what is the address of the upload server: boinc.bakerlab.org or srv4.bakerlab.org? I tried a short test on each: PING boinc.bakerlab.org (140.142.20.103): 100 data bytes 108 bytes from 140.142.20.103: icmp_seq=0 ttl=45 time=93 ms 108 bytes from 140.142.20.103: icmp_seq=1 ttl=45 time=93 ms 108 bytes from 140.142.20.103: icmp_seq=2 ttl=45 time=93 ms ----boinc.bakerlab.org PING Statistics---- 3 packets transmitted, 3 packets received, 0.0% packet loss round-trip (ms) min/avg/max/med = 93/93/93/93 PING srv4.bakerlab.org (140.142.20.112): 100 data bytes 108 bytes from 140.142.20.112: icmp_seq=0 ttl=45 time=93 ms 108 bytes from 140.142.20.112: icmp_seq=1 ttl=45 time=203 ms 108 bytes from 140.142.20.112: icmp_seq=2 ttl=45 time=406 ms ----srv4.bakerlab.org PING Statistics---- 3 packets transmitted, 3 packets received, 0.0% packet loss round-trip (ms) min/avg/max/med = 93/234/406/203 I turned on debug logging for BOINC file transfers. Here is what it said: 8/3/2009 4:32:59 PM rosetta@home Scheduler request completed 8/3/2009 4:35:50 PM Project communication failed: attempting access to reference site 8/3/2009 4:35:50 PM rosetta@home [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval -184 8/3/2009 4:35:50 PM rosetta@home [file_xfer_debug] file transfer status -184 8/3/2009 4:35:50 PM rosetta@home Temporarily failed upload of lr8_newhb_run02_rlbn_1t2i_IGNORE_THE_REST_NATIVE_NOCON_14611_66_0_0: HTTP error 8/3/2009 4:35:50 PM rosetta@home Backing off 2 min 27 sec on upload of lr8_newhb_run02_rlbn_1t2i_IGNORE_THE_REST_NATIVE_NOCON_14611_66_0_0 8/3/2009 4:35:51 PM Internet access OK - project servers may be temporarily down. 
8/3/2009 4:38:18 PM rosetta@home Started upload of lr8_newhb_run02_rlbn_1t2i_IGNORE_THE_REST_NATIVE_NOCON_14611_66_0_0 8/3/2009 4:38:18 PM rosetta@home [file_xfer_debug] URL: http://srv4.bakerlab.org/rosetta_cgi/file_upload_handler 8/3/2009 4:38:39 PM Project communication failed: attempting access to reference site 8/3/2009 4:38:39 PM rosetta@home [file_xfer_debug] FILE_XFER_SET::poll(): http op done; retval -107 8/3/2009 4:38:39 PM rosetta@home [file_xfer_debug] file transfer status -107 8/3/2009 4:38:39 PM rosetta@home Temporarily failed upload of lr8_newhb_run02_rlbn_1t2i_IGNORE_THE_REST_NATIVE_NOCON_14611_66_0_0: connect() failed 8/3/2009 4:38:39 PM rosetta@home Backing off 6 min 22 sec on upload of lr8_newhb_run02_rlbn_1t2i_IGNORE_THE_REST_NATIVE_NOCON_14611_66_0_0 8/3/2009 4:38:41 PM Internet access OK - project servers may be temporarily down. Suggestions? In the 30 minutes it took me to look at other posts and write this up, the WU uploaded. Still, I would like some insight into the process. Many thanks! Jay |
Mod.Sense Volunteer moderator Send message Joined: 22 Aug 06 Posts: 4018 Credit: 0 RAC: 0 |
Yes, Jay, srv4 is the upload server, and as you can see from your PING, its responsiveness is rather inconsistent. It is under heavy load and is still recovering from recent difficulties. The BOINC software that you run on your machine is all set up for these eventualities. It will retry sending the file for you. Rosetta Moderator: Mod.Sense |
jay Send message Joined: 12 Jan 08 Posts: 20 Credit: 195,801 RAC: 0 |
Thank you Mod.Sense for the informative response!! I appreciate the time you take giving answers... I don't always stay connected and usually connect; enable network activity; get some coffee; do updates; disable network activity; and disconnect.... Thanks again, Jay |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2125 Credit: 41,249,734 RAC: 8,235 |
Yes, Jay, srv4 is the upload server, and as you can see from your PING, its responsiveness is rather inconsistent. It is under heavy load and is still recovering from recent difficulties. I'm not sure if I just got lucky, but I'd guess the servers finally caught up round about the time you posted. No error messages for upload or download in the last 15 hours and no retries. |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
I have copied these from the mini 1.90 thread so they might be seen by the admins, as some people are still having problems. Message 62781 - Posted 5 Aug 2009 16:52:59 UTC - in response to Message ID 62773. 05/08/2009 17:06:15|rosetta@home|Started download of boinc_rb1_1aiu.pdb 05/08/2009 17:06:17|rosetta@home|Finished download of boinc_rb1_1aiu.pdb 05/08/2009 17:06:17|rosetta@home|Started download of lr8_1aiu.out.zip 05/08/2009 17:06:45|rosetta@home|Finished download of minirosetta_database_rev31588.zip 05/08/2009 17:06:45|rosetta@home|Started download of boinc_rb1_1acf.pdb 05/08/2009 17:06:45|rosetta@home|[error] Signature verification failed for minirosetta_database_rev31588.zip 05/08/2009 17:06:45|rosetta@home|[error] Checksum or signature error for minirosetta_database_rev31588.zip How do I fix these errors? :( https://boinc.bakerlab.org/rosetta/results.php?hostid=986605&offset=20 ================================================================================ Message 62787 - Posted 6 Aug 2009 0:22:00 UTC - in response to Message ID 62786. 05/08/2009 17:06:15|rosetta@home|Started download of boinc_rb1_1aiu.pdb 05/08/2009 17:06:17|rosetta@home|Finished download of boinc_rb1_1aiu.pdb 05/08/2009 17:06:17|rosetta@home|Started download of lr8_1aiu.out.zip 05/08/2009 17:06:45|rosetta@home|Finished download of minirosetta_database_rev31588.zip 05/08/2009 17:06:45|rosetta@home|Started download of boinc_rb1_1acf.pdb 05/08/2009 17:06:45|rosetta@home|[error] Signature verification failed for minirosetta_database_rev31588.zip 05/08/2009 17:06:45|rosetta@home|[error] Checksum or signature error for minirosetta_database_rev31588.zip How do I fix these errors? :( https://boinc.bakerlab.org/rosetta/results.php?hostid=986605&offset=20 You could try to reset project if you have not tried already, could not hurt! I have just added Rosetta to the list on two boxes, one Windoze, one Linux and have this error on both. The project hasn't even started, not much to reset. 
Running BOINC 5.10.45 |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
It looks like the system has slowed to a crawl once again with the release of the new app! My three are all having problems with U/L & D/L. |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
8/7/2009 12:25:33 AM rosetta@home update requested by user 8/7/2009 12:25:36 AM rosetta@home Sending scheduler request: Requested by user. 8/7/2009 12:25:36 AM rosetta@home Reporting 2 completed tasks, not requesting new tasks 8/7/2009 12:25:58 AM Project communication failed: attempting access to reference site 8/7/2009 12:25:59 AM Internet access OK - project servers may be temporarily down. 8/7/2009 12:26:01 AM rosetta@home Scheduler request failed: Failure when receiving data from the peer then: 8/7/2009 12:28:05 AM rosetta@home update requested by user 8/7/2009 12:28:06 AM rosetta@home Sending scheduler request: Requested by user. 8/7/2009 12:28:06 AM rosetta@home Reporting 2 completed tasks, not requesting new tasks 8/7/2009 12:28:41 AM rosetta@home Scheduler request completed |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
There seem to be a few of us having this problem. Wed 12 Aug 2009 07:42:34 EST|rosetta@home|Temporarily failed upload of abinitio_withrelax_homfrag__no_native_1uzc_0001A__SAVE_ALL_OUT_14620_1950_0_0: HTTP error Wed 12 Aug 2009 07:43:38 EST|rosetta@home|Temporarily failed upload of abinitio_withrelax_homfrag__no_native_1uzc_0001A__SAVE_ALL_OUT_14620_2922_0_0: HTTP error |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. Is there a problem with the validator now? I have a couple of tasks that have been sitting there for hours waiting. Is something else broken? B.T.W. thanks for fixing the other problem with the U/Ls. |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Hi. Is there a problem again? I looked at the server page and there is a lot of red there. There is nothing on the front page about any work being done on the servers! |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
It looks like you are missing the DB file for some or all tasks with the new app! ERROR: in::file::zip minirosetta_database.zip does not exist! ERROR:: Exit from: ....srcappspublicboincminirosetta.cc line: 97 BOINC:: Error reading and gzipping output datafile: default.out What happened to not releasing new things just before a weekend? |
Speedy Send message Joined: 25 Sep 05 Posts: 163 Credit: 808,098 RAC: 0 |
According to the front page, the last time they brought out a new application was Aug 6, and there's no mention of a new application over on RALPH either. Have a crunching good day!! |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Aug 21, 2009 The project is offline for the moment as we deal with an error in the recent application update. Hopefully we will have the project back online within the next hour or so. Sorry for any inconvenience. |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
A shiny new app for all, 1.97, plus a few files! Sorry, couldn't help myself ;) |
P . P . L . Send message Joined: 20 Aug 06 Posts: 581 Credit: 4,865,274 RAC: 0 |
Morning all. Is there a problem? None of my rigs are getting any work this morning, and I see some others are having the same issues. Mon 07 Sep 2009 07:24:25 EST|rosetta@home|Sending scheduler request: To fetch work. Requesting 6794 seconds of work, reporting 0 completed tasks Mon 07 Sep 2009 07:24:53 EST|rosetta@home|Scheduler request succeeded: got 0 new tasks |
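
jay's ping output earlier in the thread reports min/avg/max/med for each host. As a minimal sketch (the function name `summarize_rtt` is made up for illustration), the same summary can be recomputed from the three round-trip times quoted in that post, which makes the difference between the two hosts easy to see:

```python
from statistics import mean, median

def summarize_rtt(samples_ms):
    """Return (min, avg, max, median) for a list of round-trip times in ms."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return (min(samples_ms), mean(samples_ms), max(samples_ms), median(samples_ms))

# Round-trip times copied from the ping output quoted in the thread.
samples = {
    "boinc.bakerlab.org": [93, 93, 93],
    "srv4.bakerlab.org": [93, 203, 406],
}

for host, rtts in samples.items():
    lo, avg, hi, med = summarize_rtt(rtts)
    # The spread (max - min) is a rough indicator of how uneven the replies are.
    print(f"{host}: min={lo} avg={avg:.0f} max={hi} med={med} spread={hi - lo} ms")
```

Three samples per host is a very small test; a longer run would give a more reliable picture of how the upload server is responding.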
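
Mod.Sense notes earlier in the thread that the BOINC client retries failed uploads on its own, and jay's log shows the wait growing between attempts (2 min 27 sec, then 6 min 22 sec). The sketch below illustrates the general idea of retrying with a growing, randomized delay. It is not BOINC's actual back-off policy: the function names, the 60-second base, and the 4-hour cap are all assumptions chosen for illustration.

```python
import random
import time

def backoff_delay(failures, base_s=60.0, cap_s=4 * 3600.0, rng=random):
    """Seconds to wait after `failures` consecutive failed attempts.

    The window doubles with each failure up to cap_s. All numbers here are
    illustrative assumptions, not values taken from BOINC.
    """
    if failures < 1:
        raise ValueError("failures must be >= 1")
    ceiling = min(cap_s, base_s * 2 ** (failures - 1))
    # Pick a random point in the upper half of the window so that many
    # clients do not all retry at the same instant.
    return rng.uniform(ceiling / 2, ceiling)

def upload_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns True; return the attempt number, or None."""
    for attempt in range(1, max_attempts + 1):
        if send():
            return attempt
        if attempt < max_attempts:
            sleep(backoff_delay(attempt))
    return None

# Simulated server that rejects the first two attempts, then accepts.
outcomes = iter([False, False, True])
attempts = upload_with_retry(lambda: next(outcomes), sleep=lambda s: None)
print(f"upload succeeded on attempt {attempts}")
```

The example passes a no-op `sleep` so it runs instantly; a real client would actually wait between attempts.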
©2024 University of Washington
https://www.bakerlab.org