Message boards : Number crunching : Minirosetta v1.40 bug thread
Author | Message |
---|---|
A Few Good Men Send message Joined: 25 Mar 07 Posts: 14 Credit: 2,031,382 RAC: 0 |
Please send email to my account when an alternate to mini 1.40 test is available. Thanks in Advance. |
FalconFly Send message Joined: 11 Jan 08 Posts: 23 Credit: 2,163,056 RAC: 0 |
FalconFly, i noticed that you are crunching for LHC@home as well. Darn, it seems you could be right on the spot with that. Nice catch! I haven't seen any anomalies for >24hrs now, as the most recent batch of LHC WorkUnits have been processed. Given the somewhat shaky state of LHC@Home, I'd say Rosetta is off the hook concerning my recent problems :) |
sarha1 Send message Joined: 23 Sep 05 Posts: 5 Credit: 6,339,735 RAC: 0 |
Validate error. WTH? Extremely high claimed credit (100x more than expected). https://boinc.bakerlab.org/rosetta/result.php?resultid=210214915 https://boinc.bakerlab.org/rosetta/result.php?resultid=210214913 Athlon 64 3200+ 1GB RAM WIN XP prof. SP3 |
Alec Rosa Send message Joined: 11 Nov 08 Posts: 18 Credit: 2,635 RAC: 0 |
Please send email to my account when an alternate to mini 1.40 test is available. Thanks in Advance. I second that! |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Please send email to my account when an alternate to mini 1.40 test is available. Thanks in Advance. I'll stay, as my error rate is low, but I have to agree: the team needs to revamp all these tasks with stupid errors, such as NaNs, checkpoint-recovery failures, and lock file errors, along with all the other stupid problems that could be taken care of if the tasks were tested properly on Ralph before being released here. The idea of Rosetta is research on proteins, not research on bad programming. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1233 Credit: 14,284,221 RAC: 1,121 |
A workunit where my computer completed some models successfully without getting any credit: https://boinc.bakerlab.org/rosetta/workunit.php?wuid=191865519 |
robertmiles Send message Joined: 16 Jun 08 Posts: 1233 Credit: 14,284,221 RAC: 1,121 |
Please send email to my account when an alternate to mini 1.40 test is available. Thanks in Advance. The problems seem to be mainly in workunits that use the new features, so an option to avoid getting any of the workunits using those features would be useful. |
rochester new york Send message Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
am i doing something wrong here or what? https://boinc.bakerlab.org/rosetta/results.php?hostid=267483 |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
am i doing something wrong here or what? https://boinc.bakerlab.org/rosetta/results.php?hostid=267483 Nothing is wrong, other than that you need to try the steps I pointed out about lockfiles in a previous message to you. If you give that a try, it should clear up the problem. The others, as I pointed out last time, seem to time out (10 days with no processing or reporting) for some unknown reason: too much work, not enough on time or CPU time dedicated to Rosetta, or just a rash of bad luck. Try solving the lockfile issue, then don't accept any new work until you have completed what is in your queue. When that is done, accept new work and see what results you get. |
rochester new york Send message Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
I looked and can't find the procedure. What do I do? am i doing something wrong here or what? https://boinc.bakerlab.org/rosetta/results.php?hostid=267483 |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Go here for my original post, and go here for the BOINC wiki description. Here is where you will find the files you need to remove after you shut down all BOINC processes: the lockfile is actually called boinc_lockfile, and it is in the BOINC folder, then subfolder projects, and then subfolder slots. |
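The lockfile cleanup described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the BOINC folder path is something you supply yourself, the function names `find_lockfiles` and `remove_lockfiles` are made up for this sketch, and the search is recursive by file name because the exact folder layout may differ between installations. The only detail taken from the post is the file name `boinc_lockfile`. Shut down all BOINC processes before running anything like this.

```python
from pathlib import Path


def find_lockfiles(boinc_dir):
    """Return a sorted list of files named boinc_lockfile under boinc_dir.

    boinc_dir is an assumption: pass the BOINC folder used on your machine.
    """
    root = Path(boinc_dir)
    return sorted(p for p in root.rglob("boinc_lockfile") if p.is_file())


def remove_lockfiles(boinc_dir, dry_run=True):
    """Delete every boinc_lockfile found; with dry_run=True only report them."""
    found = find_lockfiles(boinc_dir)
    for path in found:
        if not dry_run:
            path.unlink()  # remove the stale lockfile
    return found
```

Calling `remove_lockfiles(path_to_boinc_folder)` with the default `dry_run=True` only lists what would be deleted, which seems the safer first step; pass `dry_run=False` once the list looks right.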
robertmiles Send message Joined: 16 Jun 08 Posts: 1233 Credit: 14,284,221 RAC: 1,121 |
I just suspended BOINC entirely for my weekly antiviral and antispyware checks, then noticed that a rosetta@home workunit was still using CPU time on my computer: https://boinc.bakerlab.org/rosetta/results.php?userid=264600 I then also suspended the rosetta@home project and that specific task; this didn't stop it from using CPU time. Since this is using only one core of my dual core PC, I'm going to try running the antiviral and antispyware programs as usual, even with that workunit still running. 11/28/2008 8:40:30 AM|rosetta@home|Starting 1shfA_BOINC_ABRELAX_SPLIT_SPLIT_NOHATR_IGNORE_THE_REST-S25-9-S3-3--1shfA-_4844_644_1 11/28/2008 8:40:31 AM|rosetta@home|Starting task 1shfA_BOINC_ABRELAX_SPLIT_SPLIT_NOHATR_IGNORE_THE_REST-S25-9-S3-3--1shfA-_4844_644_1 using minirosetta version 140 |
Alec Rosa Send message Joined: 11 Nov 08 Posts: 18 Credit: 2,635 RAC: 0 |
Please send email to my account when an alternate to mini 1.40 test is available. Thanks in Advance. Very well put. So why do the project developers say nothing about this here? |
robertmiles Send message Joined: 16 Jun 08 Posts: 1233 Credit: 14,284,221 RAC: 1,121 |
Please send email to my account when an alternate to mini 1.40 test is available. Thanks in Advance. I suspect it's because they're too busy reading all the problem reports. Do you think it would be enough to move just the workunits using the new features introduced in 1.39 and 1.40 back to Ralph, so they'd still have something for the rest of the participants to do until they fix the new problems? |
Alec Rosa Send message Joined: 11 Nov 08 Posts: 18 Credit: 2,635 RAC: 0 |
Please send email to my account when an alternate to mini 1.40 test is available. Thanks in Advance. Wouldn't that be best? |
Greg_BE Send message Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Please send email to my account when an alternate to mini 1.40 test is available. Thanks in Advance. I agree with you guys on this. |
rochester new york Send message Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0 |
This is getting too crazy. I'll give it 2 more days, disconnect, and then I'll be back in a couple of weeks to see if this is back to working. Please send email to my account when an alternate to mini 1.40 test is available. Thanks in Advance. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1233 Credit: 14,284,221 RAC: 1,121 |
I just suspended BOINC entirely for my weekly antiviral and antispyware checks, then noticed that a rosetta@home workunit was still using CPU time on my computer: The Ad-Aware 2008 program apparently ran correctly even with that workunit still running, without taking longer than usual. It found about twice as many cookies as usual, which makes me suspect that I forgot to run it last week. It was unable to remove all these cookies without restarting Vista - something which happens about half the time even when all workunits respond correctly to a suspend - so I let it restart Vista. Since I have to restart BOINC manually every time Vista restarts, I was then able to run the remaining antispyware programs and the antivirus program before restarting BOINC. What filename should I expect for the cookie from Rosetta@home, so I can tell that program not to delete it? When that workunit got a CPU core again, it repeated the same problem of continuing to run even after BOINC tries to give another workunit a turn on that CPU core. I'm going to tell BOINC not to download any more Rosetta@home workunits until I have more time to watch for such behavior. |
David Baker Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 17 Sep 05 Posts: 705 Credit: 559,847 RAC: 0 |
Very sorry about all the problems, we are working to fix them as fast as possible. One source of the problems is that we are now running a broader range of applications on rosetta@home so there are more sources of error. I do apologize for the problems; we have an absolute rule to check all work units first on ralph, but there are some errors which don't get caught this way. Our top priority now is to find the source of the problems and to fix them. |
©2024 University of Washington
https://www.bakerlab.org