Message boards : Number crunching : minirosetta v1.19 bug thread
Author | Message |
---|---|
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
huge debug dump on this task: rb_05_16_11639_20372_T0405_IGNORE_THE_REST_08_11_3323_227_0 https://boinc.bakerlab.org/rosetta/result.php?resultid=164159635 it completed most of its computing before hitting a big error: -1073741819 (0xc0000005) CPU time 8536.469 stderr out <core_client_version>5.10.45</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> # cpu_run_time_pref: 14400 # cpu_run_time_pref: 28800 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x005321B4 read attempt to address 0x3FC662A7 Engaging BOINC Windows Runtime Debugger... |
The_Bad_Penguin Joined: 5 Jun 06 Posts: 2751 Credit: 4,271,025 RAC: 0 |
rb_05_16_11639_20371_T0405_IGNORE_THE_REST_05_11_3322_57_0 <core_client_version>5.10.30</core_client_version> <![CDATA[ <stderr_txt> ====================================================== DONE :: 1 starting structures 9119.05 cpu seconds This process generated 4 decoys from 4 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>rb_05_16_11639_20371_T0405_IGNORE_THE_REST_05_11_3322_57_0_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Validate state Invalid Claimed credit 23.6733820084193 Granted credit 0 application version 1.19 |
Path7 Joined: 25 Aug 07 Posts: 128 Credit: 61,751 RAC: 0 |
Running Ubuntu 7.10 x86, this task: 1opd__BOINC_ABRELAX_SAVE_ALL_OUT_IGNORE_THE_REST-S25-11-S3-4--1opd_-_3252_12 ended with a validate error for me after 11,727.76 seconds, and ended successfully on the second run after 7,642.63 seconds running on Windows XP Professional Edition. I switched my computer off while this WU was running (no issues with that before). Have a nice day, Path7. |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Another long and scary debug thread here. I think it has to do with my trying to install a USB card reader, which caused the system to go nuts. It's here if you want to read the post-mortem: rb_05_16_11639_20372_T0405_IGNORE_THE_REST_06_11_3323_425_0 |
Scott McInness Joined: 15 Mar 08 Posts: 1 Credit: 393,032 RAC: 0 |
I've just updated BOINC on a PC that I haven't used for BOINC for about 12 months (wow, there's an x64 version now!) and every work unit initiated with mini 1.19 x86_64 crashes after less than a second. It also seems to run as a 32-bit process... 165005856 - Access Violation (0xc0000005) at address 0x73010175 read attempt to address 0x73010175 165012983 - Access Violation (0xc0000005) at address 0x73010175 read attempt to address 0x73010175 165017550 - Access Violation (0xc0000005) at address 0x73010175 read attempt to address 0x73010175 165018958 - Access Violation (0xc0000005) at address 0x73010175 read attempt to address 0x73010175 165019984 - Access Violation (0xc0000005) at address 0x73010175 read attempt to address 0x73010175 There is a Rosetta Beta 5.96 x86_64 task running atm (which is also running as a 32-bit process) just on 13% without problem, and SETI tasks (32-bit only) seem to work too. |
AMD_is_logical Joined: 20 Dec 05 Posts: 299 Credit: 31,460,681 RAC: 0 |
This WU was marked "invalid" despite having a completely normal looking stderr. |
Speedy Joined: 25 Sep 05 Posts: 163 Credit: 808,098 RAC: 0 |
Task ID 164939423 Name rb_05_19_11641_20436_T0407_IGNORE_THE_REST_04_16_3332_224_0 had a Compute error CPU time 4351.235 stderr out <core_client_version>5.10.45</core_client_version> <![CDATA[ <stderr_txt> # cpu_run_time_pref: 3600 # cpu_run_time_pref: 3600 ====================================================== DONE :: 1 starting structures 4351.19 cpu seconds This process generated 1 decoys from 1 attempts ====================================================== BOINC :: Watchdog shutting down... BOINC :: BOINC support services shutting down... called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>rb_05_19_11641_20436_T0407_IGNORE_THE_REST_04_16_3332_224_0_0</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> Validate state Invalid Claimed credit 17.070482774889 Granted credit 0 application version 1.19 Is this a ghost WU? Cheers Speedy Have a crunching good day!! |
Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
This WU was marked "invalid" despite having a completely normal looking stderr. Bad luck, I suppose it was because of the wrong WU settings: minimum quorum: 1 IMHO you should not have got the task resent after your wingman failed - a task born to be cancelled? Devs? Peter |
Speedy Joined: 25 Sep 05 Posts: 163 Credit: 808,098 RAC: 0 |
This is not a bug. I was wondering whether there are any plans to display which model the work unit is up to. Thanks for your hard work on this application on behalf of all of the crunchers. Cheers Speedy Have a crunching good day!! |
nouqraz Joined: 8 Apr 08 Posts: 6 Credit: 354,461 RAC: 1,070 |
One of my systems seems to be having issues running minirosetta v1.19 WUs. It is a 4 processor Intel Xeon CPU X3210 (two dual core chips) running Server 2003 R2. It seems to be crunching through Rosetta Beta 5.96 WUs with no problem, but when it goes to start a mini 1.19 WU, it switches the task to "running" but no CPU time is ever used and the task stays at 0%. If I suspend all of the mini 1.19 WUs that are queued up, the system immediately begins crunching on any Rosetta Beta 5.96 WUs without any problem. I have left the system sitting in the "running" @ 0% state on mini units for hours and it hasn't gotten anywhere; my only option seems to be to suspend or abort the work units. I have two other machines - one an Intel P4, the other a Core 2 Quad 9300, both running XP - that seem to have no problems running mini or beta WUs. Is it possible to get the client to not receive mini WUs? Or is there some known reason behind these stalled work units that there is a workaround for? Thanks, Adam |
Jeremy Joined: 15 May 08 Posts: 13 Credit: 2,636 RAC: 0 |
I have had nothing but compute errors with the mini version of Rosetta. See this page: https://boinc.bakerlab.org/rosetta/results.php?userid=259031 I'd rather only have the normal ones, for two reasons. One, it keeps giving errors, so the CPU time isn't put to use. Two, it doesn't have proper graphics, but I've read that that is not a priority. I'd like to help debug this application by sending whatever information you need. Here is my host sheet: https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=812509 |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
5croA_BOINC_ABRELAX_SAVE_ALL_OUT_IGNORE_THE_REST-S25-7-S3-6--5croA-_3325_1_0 crashed and burned in a compute error. Long error dump yet again. <core_client_version>5.10.45</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> # cpu_run_time_pref: 21600 Unhandled Exception Detected... |
Greg_BE Joined: 30 May 06 Posts: 5691 Credit: 5,859,226 RAC: 0 |
Another one: h001__BOINC_ABRELAX_IGNORE_THE_REST-S25-11-S3-5--h001_-_3324_45140_0 Client error Client state Done Exit status -1073741819 (0xc0000005) Computer ID 293392 Report deadline 30 May 2008 19:09:52 UTC CPU time 19774.5 stderr out <core_client_version>5.10.45</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> # cpu_run_time_pref: 21600 Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x005C3030 write attempt to address 0x00000004 Engaging BOINC Windows Runtime Debugger... It did grant me credit, amazingly enough. |
Pepo Joined: 28 Sep 05 Posts: 115 Credit: 101,358 RAC: 0 |
It should not. Which client, 5.10.45? The "crash on project detach" bug should be fixed in the next 6.2 release (changeset [trac]changeset:15407[/trac]). Peter |
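
Several reports above quote the exit status both as a signed decimal (-1073741819) and as hex (0xc0000005), followed by an "Access Violation" line giving the faulting address and the address being read or written. For anyone collecting these from their own stderr output, here is a minimal Python sketch of how the two numbers relate and how the line could be pulled apart. The function names and the regular expression are hypothetical helpers written for this illustration, not part of BOINC or Rosetta, and the pattern only assumes the line layout seen in the posts in this thread.

```python
import re


def exit_code_to_hex(code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    return f"0x{code & 0xFFFFFFFF:08x}"


# Hypothetical pattern: matches the "Access Violation" line layout quoted in this thread.
AV_PATTERN = re.compile(
    r"Access Violation \((0x[0-9a-fA-F]+)\) at address (0x[0-9a-fA-F]+) "
    r"(read|write) attempt to address (0x[0-9a-fA-F]+)"
)


def parse_access_violation(stderr_text: str):
    """Return the fields of the first access-violation line, or None if absent."""
    match = AV_PATTERN.search(stderr_text)
    if match is None:
        return None
    code, instruction, access, target = match.groups()
    return {
        "code": code,                # exception code, e.g. 0xc0000005
        "instruction": instruction,  # address where the fault occurred
        "access": access,            # "read" or "write"
        "target": target,            # address the program tried to access
    }


if __name__ == "__main__":
    print(exit_code_to_hex(-1073741819))
    line = ("Reason: Access Violation (0xc0000005) at address 0x005C3030 "
            "write attempt to address 0x00000004")
    print(parse_access_violation(line))
```

A write attempt to a very low address such as 0x00000004, as in the h001 task above, is the kind of value that usually points at a null pointer being dereferenced, though only the developers can confirm that from the full dump.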