Problems and Technical Issues with Rosetta@home

Message boards : Number crunching : Problems and Technical Issues with Rosetta@home

Sid Celery
Joined: 11 Feb 08
Posts: 2166
Credit: 41,629,484
RAC: 5,494
Message 109787 - Posted: 25 Sep 2024, 14:36:01 UTC

Somehow grabbed 16 rb tasks this morning to fill my cache
Just checked how many tasks were issued and it appears to be next to none - seems I was just lucky with the odd few
Then I saw boinc-process is down again - not so lucky after all <sigh>
ID: 109787 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1743
Credit: 18,534,891
RAC: 3,108
Message 109792 - Posted: 30 Sep 2024, 7:33:47 UTC

A glitch is back on the main page
Notice: Undefined variable: stats in /projects/boinc/rosetta/html/user/index.php on line 81
Just under the Server Status heading.
Grant
Darwin NT
ID: 109792 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1743
Credit: 18,534,891
RAC: 3,108
Message 109793 - Posted: 1 Oct 2024, 7:31:00 UTC
Last modified: 1 Oct 2024, 7:50:19 UTC

Glitch fixed, and lo and behold- new work!


And an interesting batch it is. It's Beta 6.06 work, but the tasks are using 1.4 to 1.7GB of RAM each, and it looks like their target Runtime is 8 hours (unlike the usual 200-400MB of RAM and 3 hrs runtime of previous Beta Tasks).
Grant
Darwin NT
ID: 109793 · Rating: 0
Bill Swisher
Joined: 10 Jun 13
Posts: 43
Credit: 35,474,078
RAC: 33,162
Message 109795 - Posted: 1 Oct 2024, 16:30:53 UTC - in response to Message 109793.  
Last modified: 1 Oct 2024, 16:32:11 UTC

Ouch! At first glance this beta does not seem to play well with my processors. I had two of them go slightly insane, so to speak; incidentally, they were the only two to download the beta. Both computers are running openSUSE Leap 15.6. The first (an old AMD Ryzen Threadripper 2950X) jumped up to 99 active users and started swapping like mad (32GB of memory). I had to hit the power button to get to where I could suspend BOINC. The second (a newer AMD Ryzen 9 7950X) only jumped up to 64 users before I suspended BOINC.
I'm heading out of town for a couple of days, so none of the machines are getting any Rosetta work until I get back and can watch them.
ID: 109795 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2166
Credit: 41,629,484
RAC: 5,494
Message 109796 - Posted: 1 Oct 2024, 21:16:44 UTC - in response to Message 109793.  

Glitch fixed, and lo and behold- new work!

And an interesting batch it is. It's Beta 6.06 work, but the tasks are using 1.4 to 1.7GB of RAM each, and it looks like their target Runtime is 8 hours (unlike the usual 200-400MB of RAM and 3 hrs runtime of previous Beta Tasks).

I polled 2hrs before your msg and got nothing, then not again until 3hrs ago - and only just noticed. Argh
Not many left to grab now either - a small batch.

On runtime, I've said before I use a 12hr runtime, but didn't mention that the last batch of 16 I sneaked a few days ago only ran 8hrs too.
Not sure if that's a coincidence as my 12hr setting usually overrides the default set to the individual tasks.

Anyway, work is work. I'll take whatever I can get.
ID: 109796 · Rating: 0
Jean-David Beyer
Joined: 2 Nov 05
Posts: 200
Credit: 6,668,482
RAC: 3,812
Message 109797 - Posted: 1 Oct 2024, 22:03:33 UTC - in response to Message 109795.  

Ouch! At first glance this beta does not seem to play well with my processors.


The first three I got ran just fine. My machine is running Red Hat Enterprise Linux release 8.10 (Ootpa)
using kernel 4.18.0-553.22.1.el8_10.x86_64

1583987342	1409332410	1 Oct 2024, 10:50:55 UTC	1 Oct 2024, 18:23:18 UTC	Completed and validated	26,366.61	25,872.01	369.98	Rosetta Beta v6.06 x86_64-pc-linux-gnu
1583987355	1409332393	1 Oct 2024, 10:50:55 UTC	1 Oct 2024, 18:26:58 UTC	Completed and validated	27,345.06	26,815.76	383.71	Rosetta Beta v6.06 x86_64-pc-linux-gnu
1583987363	1409332409	1 Oct 2024, 10:50:55 UTC	1 Oct 2024, 18:23:18 UTC	Completed and validated	26,759.98	26,251.39	375.50	Rosetta Beta v6.06 x86_64-pc-linux-gnu
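
As a rough sanity check on the three results above, the figures can be converted to hours and compared with the 8 hour target runtime mentioned earlier in the thread. This is only a sketch: it assumes the columns follow the usual BOINC task list order (sent time, reported time, run time in seconds, CPU time in seconds), which the post itself does not label, and the `summarize` helper and its field names are made up for illustration.

```python
from datetime import datetime

# Timestamp layout used in the rows above, with the trailing " UTC" dropped.
FMT = "%d %b %Y, %H:%M:%S"

# (sent, reported, run time in s, CPU time in s), copied from the post.
# The meaning of each column is an assumption (standard BOINC layout).
rows = [
    ("1 Oct 2024, 10:50:55", "1 Oct 2024, 18:23:18", 26366.61, 25872.01),
    ("1 Oct 2024, 10:50:55", "1 Oct 2024, 18:26:58", 27345.06, 26815.76),
    ("1 Oct 2024, 10:50:55", "1 Oct 2024, 18:23:18", 26759.98, 26251.39),
]

def summarize(sent, reported, run_s, cpu_s):
    """Convert one result row into hours and a CPU/run time ratio."""
    turnaround = datetime.strptime(reported, FMT) - datetime.strptime(sent, FMT)
    return {
        "run_hours": run_s / 3600,
        "cpu_efficiency": cpu_s / run_s,
        "turnaround_hours": turnaround.total_seconds() / 3600,
    }

for row in rows:
    s = summarize(*row)
    print(f"run {s['run_hours']:.2f} h, "
          f"CPU/run {s['cpu_efficiency']:.1%}, "
          f"sent to reported {s['turnaround_hours']:.2f} h")
```

All three tasks come out a little under 8 hours of run time, which is in line with the target runtime reported for this batch.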

ID: 109797 · Rating: 0
Klimax
Joined: 27 Apr 07
Posts: 44
Credit: 2,801,226
RAC: 45
Message 109798 - Posted: 2 Oct 2024, 5:01:05 UTC - in response to Message 109793.  

Glitch fixed, and lo and behold- new work!


And an interesting batch it is. It's Beta 6.06 work, but the tasks are using 1.4 to 1.7GB of RAM each, and it looks like their target Runtime is 8 hours (unlike the usual 200-400MB of RAM and 3 hrs runtime of previous Beta Tasks).

So that explains why WUs are failing on one of my computers: 20 threads, only 16GB of RAM and a fairly small paging file. They could have warned us...
ID: 109798 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1743
Credit: 18,534,891
RAC: 3,108
Message 109799 - Posted: 2 Oct 2024, 6:46:45 UTC - in response to Message 109798.  
Last modified: 2 Oct 2024, 6:53:06 UTC

So that explains why WUs are failing on one of my computers: 20 threads, only 16GB of RAM and a fairly small paging file. They could have warned us...
It's been the general rule of thumb since I've been here (a bit over 4 years): 1.5GB of RAM per core/thread is needed in order to do Rosetta work.
It's only recently, with the Beta application, that Tasks have used less (there were batches of Rosetta 4.20 work with Tasks that used 2-4GB each).


Edit-
Interestingly- on one system all running tasks are using up to 1.6GB of RAM, on the other only 2 are using more than 1GB of RAM, the rest 400-700MB.
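
The 1.5GB per core/thread rule of thumb can be turned into a quick check of how many tasks a machine can realistically run at once. A minimal sketch follows; the function name and the `reserve_gb` allowance for the operating system are assumptions made for illustration, not anything BOINC or Rosetta@home defines. The two systems are the ones described in this thread (20 threads with 16GB, and 16 threads with 128GB).

```python
def max_concurrent_tasks(total_ram_gb, threads, per_task_gb=1.5, reserve_gb=2.0):
    """Return how many tasks fit in RAM, capped at the thread count.

    per_task_gb: the rule-of-thumb figure from the post above.
    reserve_gb:  hypothetical headroom left for the OS and other programs.
    """
    usable = max(total_ram_gb - reserve_gb, 0.0)
    return min(threads, int(usable // per_task_gb))

# (description, RAM in GB, threads), taken from the posts in this thread.
systems = [
    ("20 threads, 16GB RAM", 16, 20),
    ("16 threads, 128GB RAM", 128, 16),
]

for name, ram_gb, threads in systems:
    fit = max_concurrent_tasks(ram_gb, threads)
    print(f"{name}: room for {fit} of {threads} threads at 1.5GB per task")
```

On these assumptions the 16GB machine has room for fewer than half of its threads, while the 128GB machine can keep every thread busy. If the result is lower than the thread count, limiting the number of CPUs BOINC may use in the computing preferences is one way to stay within RAM.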
Grant
Darwin NT
ID: 109799 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1743
Credit: 18,534,891
RAC: 3,108
Message 109800 - Posted: 2 Oct 2024, 6:49:47 UTC - in response to Message 109797.  

Ouch! At first glance this beta does not seem to play well with my processors.
The first three I got ran just fine. My machine is running Red Hat Enterprise Linux release 8.10 (Ootpa)
using kernel 4.18.0-553.22.1.el8_10.x86_64
128GB of RAM on a 16 core/thread system leaves plenty of RAM available for the system even when all cores/threads are doing Rosetta work.
Grant
Darwin NT
ID: 109800 · Rating: 0
Klimax
Joined: 27 Apr 07
Posts: 44
Credit: 2,801,226
RAC: 45
Message 109801 - Posted: 2 Oct 2024, 8:56:55 UTC - in response to Message 109799.  

So that explains why WUs are failing on one of my computers: 20 threads, only 16GB of RAM and a fairly small paging file. They could have warned us...
It's been the general rule of thumb since I've been here (a bit over 4 years): 1.5GB of RAM per core/thread is needed in order to do Rosetta work.
It's only recently, with the Beta application, that Tasks have used less (there were batches of Rosetta 4.20 work with Tasks that used 2-4GB each).


Edit-
Interestingly- on one system all running tasks are using up to 1.6GB of RAM, on the other only 2 are using more than 1GB of RAM, the rest 400-700MB.

Argh. I just (after writing a reply) realized what happened. NumberFields uses OpenCL for multiprecision arithmetic, and the OpenCL compiler uses a lot of RAM during compilation (a sharp increase to a few GB, after which it returns to a fairly small footprint). So I do have enough virtual memory, as long as it's not being exhausted by another project...
ID: 109801 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1743
Credit: 18,534,891
RAC: 3,108
Message 109802 - Posted: 2 Oct 2024, 10:32:01 UTC

The boinc-process host has died again, so Validations are building up.
Grant
Darwin NT
ID: 109802 · Rating: 0
kotenok2000
Joined: 22 Feb 11
Posts: 273
Credit: 511,834
RAC: 207
Message 109803 - Posted: 2 Oct 2024, 11:48:48 UTC
Last modified: 2 Oct 2024, 11:54:44 UTC

Graphics process crashes

e:\programdata\BOINC\slots\5>E:\programdata\BOINC\projects\boinc.bakerlab.org_rosetta\rosetta_graphics_6.06_windows_x86_64.exe -database e:\programdata\BOINC\projects\boinc.bakerlab.org_rosetta\database_f5ae1de8e1\database
********  (C) Copyright Rosetta Commons Member Institutions.  ***************
* Use of Rosetta for commercial purposes may require purchase of a license. *
********  See LICENSE.md or email license@uw.edu for more details. **********
core.init: Checking for fconfig files in pwd and ./rosetta/flags
core.init: Rosetta version: 2024.24.post.dev+4.main.f5ae1de8e1 f5ae1de8e146ed3da2662da903342c9c1ad0b046 https://github.com/RosettaCommons/rosetta 2024-08-12T12:35:30
core.init: Rosetta extras: []
core.init: command: E:\programdata\BOINC\projects\boinc.bakerlab.org_rosetta\rosetta_graphics_6.06_windows_x86_64.exe -database e:\programdata\BOINC\projects\boinc.bakerlab.org_rosetta\database_f5ae1de8e1\database
basic.random.init_random_generator: 'RNG device' seed mode, using 'RtlGenRandom', seed=2029587963 seed_offset=0 real_seed=2029587963
basic.random.init_random_generator: RandomGenerator:init: Normal mode, seed=2029587963 RG_type=mt19937
Attached shared memory segment
core.chemical.GlobalResidueTypeSet: Finished initializing fa_standard residue type set.  Created 985 residue types
core.chemical.GlobalResidueTypeSet: Total time to initialize 1.737 seconds.

e:\programdata\BOINC\slots\5>cat stderrgfx.txt
14:43:58 (40372): Starting graphics application.
Opened semaphore

ERROR: The residue SER:NtermTruncation could not be generated.  Has a suitable params file been loaded? (Note that custom params files not in the Rosetta database can be loaded with the -extra_res or -extra_res_fa command-line flags.)
ERROR:: Exit from: src/core/chemical/ResidueTypeSet.cc line: 116
14:44:00 (40372): called boinc_finish(0)

e:\programdata\BOINC\slots\5>
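
For anyone collecting reports like this, the error message and the source location can be pulled out of `stderrgfx.txt` mechanically. The following is only a sketch, assuming the file uses the same `ERROR:` and `ERROR:: Exit from:` line shapes seen in the output above; `extract_errors` is a made-up helper, not part of BOINC or Rosetta.

```python
import re

# Sample text copied from the stderrgfx.txt output above.
stderr_text = """\
14:43:58 (40372): Starting graphics application.
Opened semaphore

ERROR: The residue SER:NtermTruncation could not be generated.  Has a suitable params file been loaded? (Note that custom params files not in the Rosetta database can be loaded with the -extra_res or -extra_res_fa command-line flags.)
ERROR:: Exit from: src/core/chemical/ResidueTypeSet.cc line: 116
14:44:00 (40372): called boinc_finish(0)
"""

# Matches the "Exit from" line and captures the source file and line number.
EXIT_RE = re.compile(r"^ERROR:: Exit from: (?P<file>\S+) line: (?P<line>\d+)$")

def extract_errors(text):
    """Return (error messages, (source file, line number) or None)."""
    messages, location = [], None
    for raw in text.splitlines():
        line = raw.strip()
        match = EXIT_RE.match(line)
        if match:
            # Check this first: "ERROR::" also starts with "ERROR:".
            location = (match.group("file"), int(match.group("line")))
        elif line.startswith("ERROR:"):
            messages.append(line[len("ERROR:"):].strip())
    return messages, location

messages, location = extract_errors(stderr_text)
for message in messages:
    print("error:", message)
print("raised at:", location)
```

Here the only error is the missing `SER:NtermTruncation` residue type, raised from `ResidueTypeSet.cc`, which is the detail worth including when reporting the graphics crash.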
ID: 109803 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2166
Credit: 41,629,484
RAC: 5,494
Message 109804 - Posted: 2 Oct 2024, 18:05:49 UTC - in response to Message 109802.  

The boinc-process host has died again, so Validations are building up.

Just as well tasks ran out to download at about the same time... <sigh>
ID: 109804 · Rating: 0
kasdashdfjsah
Joined: 15 Jan 24
Posts: 10
Credit: 0
RAC: 0
Message 109806 - Posted: 2 Oct 2024, 20:43:29 UTC - in response to Message 109804.  
Last modified: 2 Oct 2024, 20:44:21 UTC

Yeah, no tasks for me either.

Also, please have the Ralph@home project removed from the website, since it's not active anymore, so people don't waste time reading about it and trying to create an account, which doesn't work.
ID: 109806 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1743
Credit: 18,534,891
RAC: 3,108
Message 109808 - Posted: 3 Oct 2024, 10:42:01 UTC - in response to Message 109802.  

The boinc-process host has died again, so Validations are building up.
Still dead, backlog up to 109,000.
Grant
Darwin NT
ID: 109808 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2166
Credit: 41,629,484
RAC: 5,494
Message 109809 - Posted: 3 Oct 2024, 23:02:26 UTC - in response to Message 109808.  

The boinc-process host has died again, so Validations are building up.
Still dead, backlog up to 109,000.

Midnight in the UK, I'm back from work and all servers running and the backlog fully cleared.
No new tasks yet, but credits rolling in at least
ID: 109809 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1743
Credit: 18,534,891
RAC: 3,108
Message 109813 - Posted: 6 Oct 2024, 3:30:50 UTC

The web site took ages to come up and the forums are extremely sluggish. The Server Status page is showing all green; however, I'm unable to upload any results.
Grant
Darwin NT
ID: 109813 · Rating: 0
Grant (SSSF)
Joined: 28 Mar 20
Posts: 1743
Credit: 18,534,891
RAC: 3,108
Message 109814 - Posted: 6 Oct 2024, 10:34:29 UTC

And now it's all working again.
Grant
Darwin NT
ID: 109814 · Rating: 0
Bill Swisher
Joined: 10 Jun 13
Posts: 43
Credit: 35,474,078
RAC: 33,162
Message 109830 - Posted: 9 Oct 2024, 18:20:55 UTC

Four days, maybe it's been five as I've lost count, and no new work.
ID: 109830 · Rating: 0
Sid Celery
Joined: 11 Feb 08
Posts: 2166
Credit: 41,629,484
RAC: 5,494
Message 109831 - Posted: 9 Oct 2024, 18:32:52 UTC - in response to Message 109830.  
Last modified: 9 Oct 2024, 18:35:34 UTC

Four days, maybe it's been five as I've lost count, and no new work

I think it's now 7 days since the final tasks of the last batch were all picked up, going by the comments above
Not to mention WCG has been down for maintenance the last few days and I've got 150 tasks waiting to upload there too
With the weather turning colder recently I could do with some tasks from somewhere to warm the place up a little
ID: 109831 · Rating: 0

©2025 University of Washington
https://www.bakerlab.org