Problems with Minirosetta v1.54

Greg_BE
Joined: 30 May 06
Posts: 5691
Credit: 5,859,226
RAC: 0
Message 59363 - Posted: 5 Feb 2009, 20:19:02 UTC

lr5_D_hybrid_rlbd_1e6i_IGNORE_THE_REST_DECOY_6250_347_0 died at 3 hrs out of 4 and also kicked up a dialogue box on my desktop.

the error - exit code -1073740791 (0xc0000409)
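For anyone comparing the two forms of that error number: the negative decimal value and the hex value are the same 32-bit pattern read as signed versus unsigned. A minimal sketch follows; the helper names are hypothetical, and the small lookup table lists only the two Windows status codes relevant to this thread (0xC0000409 is the stack buffer overrun status, 0xC0000005 is an access violation).

```python
# Hypothetical helpers: reinterpret a signed 32-bit exit code as the
# unsigned NTSTATUS value that Windows tools usually print in hex.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def to_ntstatus(exit_code: int) -> int:
    """Mask to 32 bits so a negative exit code becomes its unsigned form."""
    return exit_code & 0xFFFFFFFF


def describe(exit_code: int) -> str:
    """Format the exit code as hex plus a name, if the code is in the table."""
    status = to_ntstatus(exit_code)
    name = KNOWN_STATUS.get(status, "unknown status")
    return f"0x{status:08x} ({name})"


print(describe(-1073740791))  # 0xc0000409 (STATUS_STACK_BUFFER_OVERRUN)
```

The name only says what kind of fault Windows recorded; it does not by itself say which part of the application triggered it.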

ID: 59363 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,179,826
RAC: 3,209
Message 59365 - Posted: 5 Feb 2009, 21:33:33 UTC - in response to Message 59355.  

mikey, whatever the problem is, it stands a good chance of clearing itself when the next Rosetta version comes out. The .exe will have a different name, after all. So please monitor the new release thread and give it another try at that time.


I will, I like the premise of Rosetta and that is what brought me here in the first place. I will certainly try again in the future, probably when you put out a new version as you suggest.
Thanks for all your help


I just had a thought...NOT dangerous this time, I am off for a few days here and I have finally figured out how to make Ubuntu Linux work for me and crunch Boinc projects too. I will try switching one of the machines that won't download the Windows app to Linux and see if that works.


Guess what...............NO PROBLEM, it is crunching just fine. Here is the pc: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=1001897
It is on its first unit, so no results yet, but one unit is crunching just fine so far!!


I just thought of something... I wonder if changing the "Skip image file verification?" setting to yes would have let my Windows PCs download the file? Hmmmmm
ID: 59365 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59367 - Posted: 5 Feb 2009, 21:48:57 UTC

The image verification can't occur until the download completes. So, that's not what's causing the download problem.
Rosetta Moderator: Mod.Sense
ID: 59367 · Rating: 0
epcorian

Joined: 1 Jan 09
Posts: 16
Credit: 253,062
RAC: 0
Message 59371 - Posted: 6 Feb 2009, 1:21:48 UTC

Mod.Sense asked me to post my results in here. A little history: I've been getting Compute Errors for every Minirosetta WU I try to crunch; they usually crash and burn within the first 60 seconds or so. I am running a Q6600 with everything at stock speeds, but I was throttling my processor to use only 3 of 4 cores, so it was suggested that I let all 4 cores run unthrottled. Here's what happened:

I changed it to: "On multiprocessor systems, use at most 100% of the processors" so that it would run completely unthrottled and use all 4 cores. I let it download Minirosetta WUs; it got 5 of them and all failed, after 0:33, 1:39, 0:56, 0:38, and the last one at 0:51, which crashed with a Vista popup saying "minirosetta_1.54_windows_x86_64.exe has stopped working".

So it didn't seem to help. I don't know what else to try, but I'm a little ashamed of all the compute errors when you look at my results page, so I think I may have to give up on Minirosetta and just stick to Beta WUs; they seem to work great when I'm not messing around with the BOINC client.

I think it may have something to do with Vista 64, because I have an E8500 running Vista 64 and they fail on there too. But the E8500 is throttled to 1 core and is OC'ed from 3.16GHz to 3.8GHz (I've been told OC'ing will affect Minirosetta). The E8500 is my gaming rig, so I don't mind if it doesn't crunch WUs, because it's crunching games! :)
ID: 59371 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59374 - Posted: 6 Feb 2009, 2:58:56 UTC

And epcorian is not overclocked. Running BOINC version 6.4.5

They consistently fail with Access Violations on the Mini tasks. The "Rosetta Beta" tasks are the successes you will find.

Is it possible you've got something like an antivirus application that's conflicting on Vista?

The only other thought is to go back to the prior stable version of the BOINC client. There have been a number of fishy issues with the 6.4.x level. You can download older BOINC versions here
Rosetta Moderator: Mod.Sense
ID: 59374 · Rating: 0
epcorian

Joined: 1 Jan 09
Posts: 16
Credit: 253,062
RAC: 0
Message 59376 - Posted: 6 Feb 2009, 3:29:44 UTC

That's right, the Q6600 isn't overclocked. The system contains an Intel DQ35JO MB, Q6600 processor, 4GB (2x2GB) Kingston Value RAM, Corsair HX-520W PS, 36GB WD Raptor HD, 2x750GB WD HDs in RAID 1, and a Zalman HSF, running Vista 64 SP1 with no external video card. I use it as a home file and print server and, recently, a BOINC cruncher, as I leave it on 24/7. No issues with Beta WUs or SETI.

I do have NOD32 installed on there, but I tried disabling it (I haven't gone as far as uninstalling it) and they would still fail.

Maybe I should try an older version of the BOINC client, I will give it a go this weekend and post back.

Thanks!
ID: 59376 · Rating: 0
NewtonianRefractor

Joined: 29 Sep 08
Posts: 19
Credit: 2,350,860
RAC: 0
Message 59382 - Posted: 6 Feb 2009, 9:38:53 UTC
Last modified: 6 Feb 2009, 9:45:30 UTC

can someone please explain what happened here?

Here is another one.
ID: 59382 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,179,826
RAC: 3,209
Message 59384 - Posted: 6 Feb 2009, 12:28:02 UTC - in response to Message 59367.  

The image verification can't occur until the download completes. So, that's not what's causing the download problem.


DARN, I was hoping that would solve my problem, oh well. Thanks!
ID: 59384 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,179,826
RAC: 3,209
Message 59385 - Posted: 6 Feb 2009, 12:33:32 UTC - in response to Message 59374.  

And epcorian is not overclocked. Running BOINC version 6.4.5

They consistently fail with Access Violations on the Mini tasks. The "Rosetta Beta" tasks are the successes you will find.

Is it possible you've got something like an antivirus application that's conflicting on Vista?

The only other thought is to go back to the prior stable version of the BOINC client. There have been a number of fishy issues with the 6.4.x level. You can download older BOINC versions here


He is running a 64 bit OS though, I read on one of the projects that you need to do something to make 32 bit units work on a 64 bit system, is that true with Rosetta units too? That is NOT true for all projects and I do not remember where I read it.
ID: 59385 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59391 - Posted: 6 Feb 2009, 14:26:53 UTC
Last modified: 6 Feb 2009, 14:29:47 UTC

Moved NewtonianRefractor's post here. They report a validation error on tasks that had a visit from the watchdog. The tasks ended at target runtime plus 4 hrs, but show validation errors.
Rosetta Moderator: Mod.Sense
ID: 59391 · Rating: 0
rembertw

Joined: 21 Apr 07
Posts: 14
Credit: 628,529
RAC: 0
Message 59394 - Posted: 6 Feb 2009, 16:32:31 UTC - in response to Message 59161.  

rembertw
Please open the advanced view of the BOINC Manager, go to the tasks tab, and note the "application" name shown, this will have the application version. The only reports of tasks running that long are from the prior version. If it is not Rosetta mini 1.54, please select that task, and abort it with the button on the left. There were some problems like that on the prior version that are corrected now.


Same problem again on at least one of my computers. This time I have more details:
Application: Rosetta Mini 1.54
Task name: lr6_E_score12_rlbd_1ail_IGNORE_THE_REST_DECOY_6254_459_0

Total runtime before manual cancellation: 72:21:22
Total Progress: 0%
Time to go: 6:42:30 (as usual on my computers)
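
To put those numbers in perspective, here is a rough sketch of the check one could apply by hand. It assumes, based on the moderator's note earlier in the thread, that the watchdog is expected to end a task at about target runtime plus 4 hours. The function names and the 4-hour target used in the example are hypothetical, not values taken from this host.

```python
WATCHDOG_GRACE_S = 4 * 3600  # assumption: watchdog steps in ~4 h past target


def hms_to_seconds(text: str) -> int:
    """Convert an 'H:MM:SS' string as shown in BOINC Manager to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def looks_stuck(runtime: str, progress_pct: float, target_runtime_s: int) -> bool:
    """True if the task ran past target + grace and still shows 0% progress."""
    overdue = hms_to_seconds(runtime) > target_runtime_s + WATCHDOG_GRACE_S
    return overdue and progress_pct == 0.0


# Values from the report above, with a hypothetical 4-hour target runtime.
print(hms_to_seconds("72:21:22"))                 # 260482
print(looks_stuck("72:21:22", 0.0, 4 * 3600))     # True
```

A task that trips this check has run far past the point where the watchdog would normally have ended it, which suggests the application itself has hung rather than just running slowly.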

Any comments/ideas?
ID: 59394 · Rating: 0
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 59395 - Posted: 6 Feb 2009, 17:10:50 UTC

Rembertw, which machine are you having the problem with? What version of BOINC are you running? Was this a newly installed machine? Or was it working before?
Rosetta Moderator: Mod.Sense
ID: 59395 · Rating: 0
svincent

Joined: 30 Dec 05
Posts: 219
Credit: 12,120,035
RAC: 0
Message 59406 - Posted: 7 Feb 2009, 1:36:32 UTC

Similar error to that reported by Paul Buck

Task: 226615095
Workunit: 206537670
Name: loopbuild_ref_tex_cst_hombench_loopbuild_tex_cst_t326__IGNORE_THE_REST_1R9GA_7_6642_18_0

Mac OSX 10.4.11

<core_client_version>6.2.18</core_client_version>
<![CDATA[

*** Probably irrelevant stuff deleted

End of unzipping.
Unpacking WU data ...
Unpacking data: ../../projects/boinc.bakerlab.org_rosetta/loopbuild_ref_tex_cst.loopbuild_tex_cst.t326_.tex.boinc_files.zip
<unzip> <-oq> <../../projects/boinc.bakerlab.org_rosetta/loopbuild_ref_tex_cst.loopbuild_tex_cst.t326_.tex.boinc_files.zip> <-d./>
Firstarg=true; pp=-d./
firstarg: <-d./>
End of unzipping.
Setting database description ...
Setting up checkpointing ...
BOINC:: Worker startup.
Starting watchdog...
Watchdog active.
# cpu_run_time_pref: 14400
Hbond tripped.

ERROR: dis==0 in pairtermderiv!
ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 338
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish

</stderr_txt>
]]>
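
When comparing several failed tasks, it can help to pull just the key lines out of stderr output like the above. The sketch below is one way to do that; the patterns are written against the lines shown in this post only, the function name is hypothetical, and reading `cpu_run_time_pref` as the target runtime in seconds is an assumption.

```python
import re

# Excerpt copied from the stderr output above.
STDERR = """\
Watchdog active.
# cpu_run_time_pref: 14400
Hbond tripped.

ERROR: dis==0 in pairtermderiv!
ERROR:: Exit from: src/core/scoring/methods/PairEnergy.cc line: 338
BOINC:: Error reading and gzipping output datafile: default.out
called boinc_finish
"""

EXIT_RE = re.compile(r"^ERROR:: Exit from: (?P<path>\S+) line: (?P<line>\d+)$", re.M)
PREF_RE = re.compile(r"^# cpu_run_time_pref: (?P<secs>\d+)$", re.M)


def summarize(stderr_text: str) -> dict:
    """Collect the ERROR lines, the exit location, and the runtime preference."""
    exit_match = EXIT_RE.search(stderr_text)
    pref_match = PREF_RE.search(stderr_text)
    return {
        "errors": [ln for ln in stderr_text.splitlines() if ln.startswith("ERROR:")],
        "exit_site": (exit_match["path"], int(exit_match["line"])) if exit_match else None,
        # Assumption: the preference is expressed in seconds.
        "target_runtime_h": int(pref_match["secs"]) / 3600 if pref_match else None,
    }


summary = summarize(STDERR)
print(summary["exit_site"])
print(summary["target_runtime_h"])
```

Grouping reports by `exit_site` would make it easier to see whether different hosts are hitting the same source line.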


ID: 59406 · Rating: 0
epcorian

Joined: 1 Jan 09
Posts: 16
Credit: 253,062
RAC: 0
Message 59409 - Posted: 7 Feb 2009, 3:40:46 UTC

So I took Mod.Sense's advice and downgraded to the 6.2.19 64-bit version of the BOINC client, and so far so good with the minis. I've crunched for 30 minutes thus far with no errors yet, much better than the 30-60 seconds I was getting before.
ID: 59409 · Rating: 0
rembertw

Joined: 21 Apr 07
Posts: 14
Credit: 628,529
RAC: 0
Message 59411 - Posted: 7 Feb 2009, 8:39:55 UTC - in response to Message 59395.  

Rembertw, which machine are you having the problem with? What version of BOINC are you running? Was this a newly installed machine? Or was it working before?

- This specific problem occurs with computer ID 586996
- BOINC version 6.2.14, as on most of my computers currently
- Not newly installed, but hardly a prize winner with Rosetta. It crunches successfully for other projects though

Extra comments: I have the impression that it is Rosetta that crashes. This morning I noticed 2 other tasks at 7+ hours of runtime and 0% progress. When cancelling these tasks I got the Windows crash notice where I can "inform Microsoft of the problem". The only thing "special" about this computer is that it doesn't have 24/7 internet access.
ID: 59411 · Rating: 0
mikey
Joined: 5 Jan 06
Posts: 1895
Credit: 9,179,826
RAC: 3,209
Message 59412 - Posted: 7 Feb 2009, 12:17:19 UTC - in response to Message 59409.  

So I took Mod.Sense's advice and downgraded to the 6.2.19 64-bit version of the BOINC client, and so far so good with the minis. I've crunched for 30 minutes thus far with no errors yet, much better than the 30-60 seconds I was getting before.


ALRIGHT!!! Glad you guys found the problem, I guess the reports of the newer versions being released without proper testing were true in your case.
ID: 59412 · Rating: 0
Please, erase me.
Joined: 16 Dec 07
Posts: 3
Credit: 63,423
RAC: 0
Message 59416 - Posted: 7 Feb 2009, 13:56:34 UTC

Hello,

First of all, apologies for writing in Spanish, but my English is not good enough.

Since August 2008, 99% of my Rosetta Mini tasks have been ending with a computation error. After a while I decided to stop crunching for this project. Even so, I try again from time to time, but nothing changes, even with the new versions of Rosetta Mini, including this latest one.

The thing is, the Rosetta Beta tasks don't fail for me, but I am sent proportionally very few of those. It's a pity that this project doesn't offer the option to select subprojects, as many others do.

I would like to keep crunching for this project, but there is no way, and there's no point in throwing away hours of wasted computation. I hope this problem gets resolved soon. For my part, I will keep trying from time to time.

Warm regards to everyone,

Juan
ID: 59416 · Rating: 0
Klimax

Joined: 27 Apr 07
Posts: 44
Credit: 2,800,788
RAC: 495
Message 59418 - Posted: 7 Feb 2009, 14:14:01 UTC

Hello.
The following task (https://boinc.bakerlab.org/rosetta/result.php?resultid=225859224) is suspended because it has produced "accepted energy": QNAN (Not a Number?) and RMSD: QO. Model number 25, step 9518. Running time: 20h 2min 21sec.
Set runtime: 24h.
Suspended for now. No crash before.
OS: Windows 7 beta. I can create a dump file using Task Manager.
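
On the QNAN question: that label usually denotes a "quiet NaN", i.e. a floating-point Not a Number value. The sketch below only illustrates general NaN behaviour in floating-point arithmetic and says nothing about Rosetta's internals; the energy values and the helper name are made up for the example.

```python
import math


def first_nan_step(values):
    """Return the index of the first NaN in a sequence, or None if there is none."""
    for step, value in enumerate(values):
        if math.isnan(value):
            return step
    return None


nan = float("nan")
energies = [-120.5, -121.3, nan, -119.8]  # hypothetical per-step energies

print(first_nan_step(energies))                  # 2
print(nan < -100.0, nan > -100.0, nan == nan)    # False False False
print(math.isnan(nan + 1.0))                     # True
```

Because every ordering comparison with NaN is false and arithmetic on NaN stays NaN, a score that has gone NaN generally does not recover on later steps.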

Should I let it try to finish?

Thanks
ID: 59418 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1232
Credit: 14,281,662
RAC: 1,150
Message 59419 - Posted: 7 Feb 2009, 15:04:48 UTC - in response to Message 59172.  

mikey, have you tried a different version of BOINC?


Yes I was originally using version 6.4.3 but upgraded to version 6.6.3 and have since downgraded back to 6.4.3. Nothing has worked. I even had a couple of computers on 6.2.19 and they couldn't finish the download either!



Just a wild shot. ..

How is your disk space?

How about BOINC settings for disk space? Are you at BOINC's limit?


No I am fine on both. Boinc still has 10 gig available to it and there is over 20 gig total available.


How many BOINC projects do you have set up? I've seen signs that BOINC divides the available space equally among projects, even if some projects don't even try to use all of their share. I'm currently allowing BOINC to share up to 30 GB among 8 BOINC projects (not all making workunits available recently). I had problems getting Rosetta@home to run workunits on both cores of my dual-core CPU at the same time before that. Also, I believe I've seen a maximum percentage of the available free space on the hard drive BOINC is allowed to use, which can reduce the limits even further.
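
As a back-of-the-envelope version of that reasoning: the sketch below assumes, as suspected above and not confirmed anywhere in this thread, that the disk allowance is split equally among attached projects and that a percentage-based cap can lower the overall limit. The function names, the 40 GB disk figure, and the 50% cap are hypothetical.

```python
def effective_limit_gb(max_gb: float, cap_pct: float, disk_gb: float) -> float:
    """Overall allowance: the smaller of the fixed limit and the percentage cap."""
    return min(max_gb, disk_gb * cap_pct / 100.0)


def per_project_share_gb(limit_gb: float, n_projects: int) -> float:
    """Equal split of the allowance among projects (an assumption, see above)."""
    return limit_gb / n_projects


# 30 GB shared among 8 projects, as in the post above.
print(per_project_share_gb(30.0, 8))                                  # 3.75

# Hypothetical: a 50% cap on a 40 GB disk lowers the limit to 20 GB first.
print(per_project_share_gb(effective_limit_gb(30.0, 50.0, 40.0), 8))  # 2.5
```

If the split really works this way, a project needing more than its share could fail to download files even though the disk as a whole has plenty of room.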
ID: 59419 · Rating: 0
robertmiles
Joined: 16 Jun 08
Posts: 1232
Credit: 14,281,662
RAC: 1,150
Message 59420 - Posted: 7 Feb 2009, 15:25:32 UTC

I recently had a 1.54 workunit with a validate error for no reason I could spot in the Task ID details file. A wingman got a Success, but apparently with a much shorter preferred workunit length than the 14 hours I request.

https://boinc.bakerlab.org/rosetta/workunit.php?wuid=204095976

Could you check for problems in parts of the workunit the wingman probably never reached?
ID: 59420 · Rating: 0

©2024 University of Washington
https://www.bakerlab.org