Compute error

Message boards : Number crunching : Compute error

Chris Holvenstot
Joined: 2 May 10
Posts: 220
Credit: 9,106,918
RAC: 0
Message 70004 - Posted: 10 Apr 2011, 1:34:35 UTC

Hank said:

I suppose that when there are problems, the folks whose job it is to solve them spend time on that rather than getting into potentially endless conversations on the forum. ;)


I agree - they don't need to spend their workday on endless conversations - I think that 90% of us would be thrilled to see just a simple entry in the "News" section of the home page.

If you've never read it, take a moment and check it out. I understand that if you actually read it, they will give you 3 credit hours in ANCIENT AMERICAN HISTORY.

ID: 70004 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,178,626
RAC: 3,201
Message 70005 - Posted: 10 Apr 2011, 10:42:01 UTC - in response to Message 70004.  

Hank said:

I suppose that when there are problems, the folks whose job it is to solve them spend time on that rather than getting into potentially endless conversations on the forum. ;)


I agree - they don't need to spend their workday on endless conversations - I think that 90% of us would be thrilled to see just a simple entry in the "News" section of the home page.

If you've never read it, take a moment and check it out. I understand that if you actually read it, they will give you 3 credit hours in ANCIENT AMERICAN HISTORY.



Yes you would! The latest entry in the News section is:
"Feb 23, 2011
Outage Notice: We are going to update our scheduler tomorrow, Thursday the 24th. The project will be offline intermittently throughout the day."

It is not only old, it is about an outage that is long over!!! Someone stopping by to check out Rosetta would NOT be inclined to stay when the latest news item is about an outage!! How would they know it is over? Their first assumption would be that it is STILL going on!! They NEED to put some time into the web page; if they are going to have it, it NEEDS to be updated!!
ID: 70005 · Rating: 0
Kirby54925
Joined: 4 Feb 10
Posts: 4
Credit: 6,423,293
RAC: 0
Message 70007 - Posted: 10 Apr 2011, 16:13:29 UTC

Hopefully Dr. Baker will remember his promise to communicate the project's status to us better using Twitter and Facebook. I mean, it's not that hard to compose a 140-character message on Twitter updating us on what's going on. It really should only take about 10-20 seconds at most, making the whole "I'm busy, blah blah blah" excuse moot.
ID: 70007 · Rating: 0
Chris Holvenstot
Joined: 2 May 10
Posts: 220
Credit: 9,106,918
RAC: 0
Message 70009 - Posted: 11 Apr 2011, 5:31:52 UTC
Last modified: 11 Apr 2011, 5:38:09 UTC

Kirby said ...

Hopefully Dr. Baker will remember his promise to communicate the project's status to us better using Twitter and Facebook.


You know, I would not look to or expect the good doctor to get involved with giving status on the technical glitches we have been seeing - rather, I would like to hear from him some sort of short blurb on a regular basis concerning the project's current direction and accomplishments.

In simple "layman's terms"

I think that it is more in the realm of the sysadmins to provide us some sort of status when things start going wrong. Or heck, enslave a few grad students to be responsible for that - after all from my time on various campuses it would appear that the average grad student is somewhere between an indentured servant and chattel.

Maybe what we all need to do is just quietly hit the suspend button for a couple days and get their attention.

But that's just frustration talking.
ID: 70009 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,178,626
RAC: 3,201
Message 70012 - Posted: 11 Apr 2011, 10:42:50 UTC - in response to Message 70009.  

Kirby said ...

Hopefully Dr. Baker will remember his promise to communicate the project's status to us better using Twitter and Facebook.


You know, I would not look to or expect the good doctor to get involved with giving status on the technical glitches we have been seeing - rather, I would like to hear from him some sort of short blurb on a regular basis concerning the project's current direction and accomplishments.

In simple "layman's terms"

I think that it is more in the realm of the sysadmins to provide us some sort of status when things start going wrong. Or heck, enslave a few grad students to be responsible for that - after all from my time on various campuses it would appear that the average grad student is somewhere between an indentured servant and chattel.

Maybe what we all need to do is just quietly hit the suspend button for a couple days and get their attention.

But that's just frustration talking.


Unfortunately Dr. Baker knows that those on the boards make up less than 5% of the total users and most users NEVER participate in any kind of discussion, meaning that most users will just continue on obliviously trying, regardless of anything the project does.
ID: 70012 · Rating: 0
Jack Johnson
Joined: 27 Apr 10
Posts: 5
Credit: 467,860
RAC: 0
Message 70014 - Posted: 11 Apr 2011, 14:53:28 UTC

Yes, you're right, fatbozz. Not only are the work units failing with the TO..., but also ilv.... So correct me if I'm wrong, but the majority of work units with errors start with TO... and ilv....
Be safe,
Jack
One light to illuminate all darkness.
ID: 70014 · Rating: 0
dondrusco
Joined: 2 Jan 07
Posts: 3
Credit: 4,772,623
RAC: 0
Message 70020 - Posted: 12 Apr 2011, 13:37:49 UTC

For information only - my computer has started producing too many computation-error WUs.
It started after the latest MS Win XP patch, KB968930, was installed on my computer.

https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1358049


ID: 70020 · Rating: 0
TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 70024 - Posted: 12 Apr 2011, 20:37:25 UTC
Last modified: 12 Apr 2011, 20:37:55 UTC

I am not twittering, facebooking or whatever. I would like to read posts about issues on this site. The home page of Rosetta.

Yesterday and today I had a lot of compute errors again. As there is little or no information about them, I am off this project for a few weeks.
Greetings,
TJ.
ID: 70024 · Rating: 0
Kirby54925
Joined: 4 Feb 10
Posts: 4
Credit: 6,423,293
RAC: 0
Message 70029 - Posted: 13 Apr 2011, 2:56:38 UTC

I am not twittering, facebooking or whatever. I would like to read posts about issues on this site. The home page of Rosetta.


I would rather have updates be both on the main site and Twitter because while the main site is the most obvious place to put it in, Twitter has more robust servers. Besides, you don't need an account to read public tweets. Don't be pretentious about it.
ID: 70029 · Rating: 0
TJ
Joined: 29 Mar 09
Posts: 127
Credit: 4,799,890
RAC: 0
Message 70031 - Posted: 13 Apr 2011, 21:14:18 UTC - in response to Message 70029.  

I am not twittering, facebooking or whatever. I would like to read posts about issues on this site. The home page of Rosetta.


I would rather have updates be both on the main site and Twitter because while the main site is the most obvious place to put it in, Twitter has more robust servers. Besides, you don't need an account to read public tweets. Don't be pretentious about it.


Good point.

However, I won't check the public tweets either.
Greetings,
TJ.
ID: 70031 · Rating: 0
.clair.
Joined: 2 Jan 07
Posts: 274
Credit: 26,399,595
RAC: 0
Message 70037 - Posted: 14 Apr 2011, 20:59:17 UTC

If they have the time to put it on Twit then why not the home page as well ????
ID: 70037 · Rating: 0
Kirby54925
Joined: 4 Feb 10
Posts: 4
Credit: 6,423,293
RAC: 0
Message 70039 - Posted: 15 Apr 2011, 2:19:50 UTC

That's the thing: Dr. Baker hasn't updated either one in a while.
ID: 70039 · Rating: 0
Chris Holvenstot
Joined: 2 May 10
Posts: 220
Credit: 9,106,918
RAC: 0
Message 70043 - Posted: 16 Apr 2011, 4:59:30 UTC
Last modified: 16 Apr 2011, 5:00:12 UTC

How about it, mod.sense - any word about the t0xxx type jobs which fail after only a few seconds? I am still getting a trickle of them, with matching wingman results of course. Any chance of getting the problem fixed, or are they just going to let the series of jobs burn out and die a natural death?

I am also starting to see a new type of task with errors flow into my systems - with matching wingman results. Tasks have the prefix of "dck_rhoA_rhoA" and fail after zero seconds with the error message:

ERROR: Option matching -docking:no_filters not found in command line top-level context


Sample tasks would include:

414237964
414202168
414127519
414134968
414192201
414167204

Additionally, I had previously mentioned in this thread a series of tasks which run for a while (half an hour and up), end with 100 decoys generated, and then fail with a validate error - matching wingman results here too.

The names for these tasks seem to all be prefixed with "ProteinG_abinitio_SAVE_ALL_OUT"

Sample tasks would include:

414750044
414771827
414705765
414691538
413762471
413761980

Thanks in advance for any information you can squeeze out of the admins on these issues - it has been almost two weeks since they were first reported and the crunchers have been provided with no feedback yet.

The admins seem to be asleep at the wheel - are they studying to be air traffic controllers when they grow up?
ID: 70043 · Rating: 0
Michael Gould
Joined: 3 Feb 10
Posts: 39
Credit: 15,440,990
RAC: 4,348
Message 70050 - Posted: 17 Apr 2011, 21:20:16 UTC

This air traffic controller hasn't seen any of the problems on his computer you guys are talking about. I've got a "ProteinG_abinitio..." running right now, and a "TO533..." ready to report. Maybe the problem is with sleeping aerospace computer engineers (from Texas) ;-)

ID: 70050 · Rating: 0
svincent
Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 70051 - Posted: 17 Apr 2011, 22:13:14 UTC - in response to Message 70043.  


I am also starting to see a new type of task with errors flow into my systems - with matching wingman results. Tasks have the prefix of "dck_rhoA_rhoA" and fail after zero seconds with the error message:

ERROR: Option matching -docking:no_filters not found in command line top-level context



This has already been answered in the minirosetta 2.17 thread.

As the first post in that thread states: that's where bugs should be reported. There's no need to have a separate one here: it just gets confusing.

ID: 70051 · Rating: 0
Chris Holvenstot
Joined: 2 May 10
Posts: 220
Credit: 9,106,918
RAC: 0
Message 70065 - Posted: 18 Apr 2011, 16:12:17 UTC

svincent said ... As the first post in that thread states: that's where bugs should be reported. There's no need to have a separate one here: it just gets confusing.


OK - I posted fresh examples of my previously reported errors to the minirosetta thread. Is there really hope that it will cut down on the number of weeks it takes to get a response / resolution?
ID: 70065 · Rating: 0
Chilean
Joined: 16 Oct 05
Posts: 711
Credit: 26,694,507
RAC: 0
Message 70068 - Posted: 18 Apr 2011, 17:35:53 UTC - in response to Message 70050.  
Last modified: 18 Apr 2011, 17:39:46 UTC

This air traffic controller hasn't seen any of the problems on his computer you guys are talking about. I've got a "ProteinG_abinitio..." running right now, and a "TO533..." ready to report. Maybe the problem is with sleeping aerospace computer engineers (from Texas) ;-)



Same here... I have a "T0591..." ready to report. It DID take 6 hours... when my run time is set at 2 hrs.

But I have a "Ross2X3" that failed after 44 seconds of run-time.

By the way... David seems to keep posting new content about the research @ https://boinc.bakerlab.org/rosetta/forum_thread.php?id=1177
ID: 70068 · Rating: 0
Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 70090 - Posted: 22 Apr 2011, 21:30:18 UTC

latest compute error issue: could not open file cs_frags.9mers.gz

This has killed 26 tasks on my system in so many days, 3 of them today.
No one from the team has bothered to say boo about this problem.
ID: 70090 · Rating: 0
Chris Holvenstot
Joined: 2 May 10
Posts: 220
Credit: 9,106,918
RAC: 0
Message 70093 - Posted: 23 Apr 2011, 4:53:42 UTC

Greg - this is one of the errors people have been trying to get the sysadmins to address for several weeks. As you noted, there has been no status given by the project.
ID: 70093 · Rating: 0
James Thompson
Joined: 13 Oct 05
Posts: 46
Credit: 186,109
RAC: 0
Message 70136 - Posted: 27 Apr 2011, 0:50:38 UTC - in response to Message 70093.  

Greg - this is one of the errors people have been trying to get the sysadmins to address for several weeks. As you noted, there has been no status given by the project.


This is fixed now, and more detail is in this thread.
ID: 70136 · Rating: 0



©2024 University of Washington
https://www.bakerlab.org