Message boards : Number crunching : error - exited with zero status but no 'finished' file
Author | Message |
---|---|
casio7131 Joined: 10 Oct 05 Posts: 35 Credit: 149,748 RAC: 0 |
i've just come back to rosetta after a long break and i keep getting errors like this: * Task lrm_jorj_combined_torsion_it04_run01_A_rlbd_1s12_SAVE_ALL_OUT_IGNORE_THE_REST_lr13_DECOY_19701_218_0 exited with zero status but no 'finished' file * If this happens repeatedly you may need to reset the project. (full log here: http://pastebin.com/PpmZhiwF). i've reset, and then reattached, the project. i've managed to get some work done, but the error keeps coming back (so basically my machine isn't doing any work). if i look at the graphics, it says it's "initialising" and then i seem to get the above error after a while (haven't bothered to time it). am i just getting some dodgy workunits? or do i have a problem with my machine? or is it something else? |
mikey Joined: 5 Jan 06 Posts: 1896 Credit: 9,873,822 RAC: 36,527 |
"i've just come back to rosetta after a long break and i keep getting the errors like this:" Do you only run Rosetta or do you run other projects on this pc too? If you run other projects too, make sure the setting under Your Account, Computing Preferences, "Leave applications in memory while suspended? (suspended applications will consume swap space if 'yes')" is set to YES!! |
casio7131 Joined: 10 Oct 05 Posts: 35 Credit: 149,748 RAC: 0 |
yes, i run other projects - mainly enigma at home, which doesn't use much memory and doesn't cause me any problems. yes, i've set leave applications in memory. here is the machine https://boinc.bakerlab.org/rosetta/show_host_detail.php?hostid=944122 - it's a quad cpu with 3300MB of ram - i assume that i've enough memory (plus it's set to use a max of 3 cpus). strange thing is that i'm sure that i've previously successfully run rosetta on this machine (maybe 1-2 years ago). anyway, i might run a memtest over lunch, just to rule that out as a problem. also, the machine is a standard dell - straight from the factory with no modifications. |
casio7131 Joined: 10 Oct 05 Posts: 35 Credit: 149,748 RAC: 0 |
i ran memtest86+ for 1h10min over lunch and no errors occurred. i will try running prime95 overnight. i also had a look at the workunits that succeeded/failed to try to find any patterns: *** success: 1tifA_BOINC_ABRELAX_CASP9_FS12VF_NP1_IGNORE_THE_REST_S25_15_S3_5_-1tifA-_20320_2, rb_05_06_135_365_rs_stg0_lrlx_t000__casp9_SAVE_ALL_OUT.IGNORE_THE_REST_B_20322_3140, rs_stg0_lrlx_t312__fullseq_run1_SAVE_ALL_OUT_19393_5041, rs_stg1_core_reb_sasa_vol_t293__run1_SAVE_ALL_OUT_19429_4840 *** failure: lrm_jorj_combined_torsion_it04_run01_A_rlbd_1s12_SAVE_ALL_OUT_IGNORE_THE_REST_lr13_DECOY_19701_218, lrm_jorj_combined_torsion_it04_run01_A_rlbn_1a68_SAVE_ALL_OUT_IGNORE_THE_REST_NATIVE_NOCON_19702_73, rb_05_06_131_347_rs_stg0_lrlx_t000__casp9_SAVE_ALL_OUT.IGNORE_THE_REST_B_20317_1806, rb_05_06_135_365_rs_stg0_lrlx_t000__casp9_SAVE_ALL_OUT.IGNORE_THE_REST_B_20322_3141, rb_05_07_137_368_rs_stg0_lrlx_t000__casp9_SAVE_ALL_OUT.IGNORE_THE_REST_B_20327_2008, rb_05_07_137_368_rs_stg0_lrlx_t000__casp9_SAVE_ALL_OUT.IGNORE_THE_REST_B_20327_2033. the failures seem to occur for combined "SAVE_ALL_OUT" and "IGNORE_THE_REST" units, but strangely 1 of these units did succeed so there goes my pattern. |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
Two possibilities: (1) An AV or anti-spyware scan has locked the directory - make sure to exclude the BOINC directories from the scans while BOINC is actively running. (2) You are using CPU throttling - if so, in BOINC preferences increase the "use at most x percent of CPU" to 100% and, to compensate (if you have heat or other usage concerns), decrease the number of processors BOINC is allowed to use. Best, Snags |
casio7131 Joined: 10 Oct 05 Posts: 35 Credit: 149,748 RAC: 0 |
ok, i will try disabling virus scanning on the boinc data directory and see if that helps. as for the 3 cpu/throttling - i can't remember setting that myself - i think it may have been done by the installer when i recently (~3mon ago) upgraded boinc. i don't think that i have a heat or stability problem on this machine since i just ran prime95 for 22.5 hours with no errors (and the memtest86+ test seemed fine, so it seems that my system is stable enough). |
casio7131 Joined: 10 Oct 05 Posts: 35 Credit: 149,748 RAC: 0 |
i've reset/reattached the project, and restarted my machine, and i still get the same errors - even with mini rosetta 214. i'm starting to realise that this machine and rosetta just don't go together for some strange reason. i'm still willing to try some more suggestions, if anybody can think of any. also, in case it might be important, my boinc data directory is on a usb stick, rather than on a proper hard drive. |
Adam Gajdacs (Mr. Fusion) Joined: 26 Nov 05 Posts: 13 Credit: 2,949,983 RAC: 2,832 |
What FFT table size did you set in P95 to use for testing? You should go for at least 75% of the total physical memory by using the custom settings, as the default torture test uses only relatively small table sizes IIRC. I just recently ran into memory problems that memtest86 completely failed to detect, and that hardly ever affected P95 as long as it was set to use only smaller FFT tables. The USB stick could also be the cause if it's started to develop cell faults because of aging or intense use (or was simply defective from the start). You could try to move the whole BOINC installation to a hard drive to see if the errors persist. I think simply copying it to somewhere and manually starting the client from there should work, for testing purposes at least (don't quote me on it tho :). |
Snags Joined: 22 Feb 07 Posts: 198 Credit: 2,888,320 RAC: 0 |
If Adam Gajdacs's ideas don't help, here are some more suggestions from Ageless at the BOINC FAQ Service: Ageless regularly posts on the BOINC forum. The folks there also might have information on unintended effects of using a usb stick. Best, Snags |
casio7131 Joined: 10 Oct 05 Posts: 35 Credit: 149,748 RAC: 0 |
it was the default/blend prime95 test which used 1.5 gb of memory (virtual size and private bytes according to process explorer), so i think it's using plenty of memory. i did a scan of the usb stick and it didn't find any errors - it's a transcend jetflash v30 4gb drive (model ts4gjfv30). then i moved the data onto the hard drive and rosetta was running fine. i then copied that boinc data back onto the usb drive and restarted boinc. now rosetta went back to its old problems. so my error is due to me having my boinc data directory on the usb drive. i might have a quick look on the boinc forum. note: * my boinc directories are: C:\boinc\BOINC - boinc program on hdd; P:\boinc\boinc_data - boinc data on usb stick * so i don't forget how to change boinc data directories: - in regedit: HKLM \ SOFTWARE \ Space Sciences Laboratory, U.C. Berkeley \ BOINC Setup => change the DATADIR value. - in boinc_data/slots/x directories: edit the paths in "init_data.xml" files. |
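
The last step in casio7131's note (editing the paths in each slot's init_data.xml after moving the data directory) could be scripted roughly as below. This is only a sketch under assumptions: it treats init_data.xml as plain text and does a literal prefix replacement, the function name `rewrite_slot_paths` is made up for illustration, and the exact form in which BOINC stores the paths (slash direction, trailing separators) is not confirmed anywhere in this thread. It defaults to a dry run, and the BOINC client should be stopped before any real change.

```python
from pathlib import Path


def rewrite_slot_paths(data_dir, old_prefix, new_prefix, dry_run=True):
    """Replace old_prefix with new_prefix in every slots/*/init_data.xml.

    Hypothetical helper, not part of BOINC. Returns the list of files that
    contain old_prefix. With dry_run=True nothing is written.
    """
    old_bytes = old_prefix.encode("utf-8")
    new_bytes = new_prefix.encode("utf-8")
    matched = []
    for xml_path in sorted(Path(data_dir).glob("slots/*/init_data.xml")):
        # Work on raw bytes so line endings and encoding are left untouched.
        content = xml_path.read_bytes()
        if old_bytes not in content:
            continue
        matched.append(xml_path)
        if not dry_run:
            # Keep a copy of the original next to the file before rewriting.
            backup = xml_path.with_name(xml_path.name + ".bak")
            backup.write_bytes(content)
            xml_path.write_bytes(content.replace(old_bytes, new_bytes))
    return matched
```

For the directories mentioned above, a dry run would look like `rewrite_slot_paths(r"C:\boinc\boinc_data", r"P:\boinc\boinc_data", r"C:\boinc\boinc_data")`; it only lists the files that still point at the usb stick. Passing `dry_run=False` writes the change and leaves an init_data.xml.bak beside each edited file.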
©2025 University of Washington
https://www.bakerlab.org