Are really low energy structures worth it if the RMSD isn't so good?

Message boards : Rosetta@home Science : Are really low energy structures worth it if the RMSD isn't so good?


Otto

Joined: 6 Apr 07
Posts: 27
Credit: 3,567,665
RAC: 0
Message 47976 - Posted: 23 Oct 2007, 22:16:13 UTC

What has always puzzled me is whether finding really low energy structures is worthwhile if the RMSD accompanying it isn't so spectacular. And the same could be asked conversely - is a very good RMSD worth it if the energy structure isn't? What kind of relationship is there between RMSD and energy structure in terms of relating to real-world proteins? Are the computed findings only useful for real-life action if BOTH the energy structure AND the RMSD are extremely low? (For example, the RMSD near-zero while the energy structure way, way below zero - of course depending on a given protein.)
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 47982 - Posted: 24 Oct 2007, 5:03:24 UTC
Last modified: 24 Oct 2007, 5:06:52 UTC

You need to picture it this way: you have no RMSD, because the protein has an unknown structure. In other words, you have no map, because we're in the middle of an uncharted forest. Now how do you find your way out, without running around in circles and expending needless energy climbing the mountainous terrain?

The energy level is the only predictor you have to go by. It is kind of like being told your GPS coordinates, but without a map to tell you where that really is, or where the best route out of the forest might be. But if you keep following lower energy levels, you will find your way out, which is like saying that if you keep following the water downhill, eventually you will find a direct line out of the forest. Perhaps not the fastest route out, but one of many ways that achieve the objective. And the place where the river exits the forest (even though you haven't seen a river yet) has proven in the past to be a very good route to take, saving much needless climbing of mountains.

You see, looking at RMSD is "cheating". In the real application of Rosetta, you will not know what the RMSD is, because you will study proteins that have an unknown structure. Knowing the RMSD is like having a map of the forest: you only have it if you already know the protein's structure. And if I already have the structure, then there is no point in trying to discover it. The reason Rosetta studies these known structures is to test the scientists' approach to solving the problem. It tells them, as the model progresses, how rapidly they are approaching the correct structure, and whether their new approach is producing a better model than their last approach produced.

You might want to compare the "really low" energy levels of your models with those of others working on the same protein. Check out the graphs here.
Rosetta Moderator: Mod.Sense
adrianxw
Joined: 18 Sep 05
Posts: 653
Credit: 11,840,739
RAC: 42
Message 47986 - Posted: 24 Oct 2007, 7:28:50 UTC

I find myself again saying: be careful about taking the analogy too far. In a forest, on a planet, a river is likely to run to the coast.

In a protein energy landscape, that is absolutely not certain, or even very likely. The protein landscape is full of dips and hollows.
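To make the "dips and hollows" point concrete, here is a minimal sketch on a made-up one-dimensional landscape. The energy values and the `descend` helper are purely illustrative assumptions, not anything Rosetta actually does; the sketch only shows that always stepping downhill can strand you in a hollow that is not the lowest point.

```python
def descend(landscape, start):
    """Step to the lower neighbour until neither neighbour is lower.

    `landscape` is a hypothetical list of energy values; positions are indices.
    """
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
        best = min(neighbours, key=lambda j: landscape[j])
        if landscape[best] >= landscape[i]:
            return i  # no lower neighbour: a (possibly local) minimum
        i = best


# Invented energies: the true lowest point is at index 5,
# but there is a shallow hollow at index 1.
landscape = [5, 3, 4, 6, 2, 0, 1, 7, 4, 5]

stuck = descend(landscape, 0)   # starts on the wrong side of a ridge
found = descend(landscape, 3)   # starts where downhill leads to the bottom

print(f"start 0 -> stops at index {stuck}, energy {landscape[stuck]}")
print(f"start 3 -> stops at index {found}, energy {landscape[found]}")
```

Where you start decides where you end up, which is one reason many independent starting points are tried rather than one long downhill walk.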

I realise that the planet analogy is easy for people to grasp, but I have seen several threads now where people have got so far into the details of the analogy that the details of the REAL problem have been totally lost.
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
Otto

Joined: 6 Apr 07
Posts: 27
Credit: 3,567,665
RAC: 0
Message 47991 - Posted: 24 Oct 2007, 10:14:22 UTC

Ok, thanks for the answers.
buren

Joined: 18 Nov 07
Posts: 21
Credit: 132,158
RAC: 0
Message 48865 - Posted: 20 Nov 2007, 15:57:36 UTC - in response to Message 47991.  

Currently the models finish quite fast on modern PCs, about 1-2h each. Are the times only that short for the test runs, or will they be that short with real untested structures as well?

In that case it won't take that long to test a whole lot of proteins, making the project not really dependent on DC, so I guess it will take longer with the real ones. But why are the test runs that short?
Luuklag

Joined: 13 Sep 07
Posts: 262
Credit: 4,171
RAC: 0
Message 48866 - Posted: 20 Nov 2007, 17:18:13 UTC - in response to Message 48865.  

Currently the models finish quite fast on modern PCs, about 1-2h each. Are the times only that short for the test runs, or will they be that short with real untested structures as well?

In that case it won't take that long to test a whole lot of proteins, making the project not really dependent on DC, so I guess it will take longer with the real ones. But why are the test runs that short?


The models aren't finished; you just have the runtime preference set that short. There are people who give 48 hours to a WU.
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 48869 - Posted: 20 Nov 2007, 17:37:05 UTC

Rosetta tasks will run for as long as the runtime preference established in the Rosetta preferences... give or take the time of one complete model.

Some combinations of protein and the method being used take over an hour per model. Others only take 5 or 10 minutes.

The models for unknown protein structures take the same amount of time to complete. But it takes finding the best model out of 10,000 or even 100,000 to feel you are getting close to the correct prediction. So it takes all of us to come up with that one silver bullet.

The problem with crunching unknown structures is... well, how do you know if your answer is correct? Or how close your new energy calculations have brought you to where you want to be? So, instead, you crunch structures that are known, but you don't peek at the answer while working on the models. The graphic shows the RMSD, which is a comparison to the known native structure, but that information is not used to guide the course the model takes.
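The "don't peek" idea can be sketched in a few lines. This is only an illustration under stated assumptions: the `Model` record, the `pick_prediction` helper, and every number below are invented for the example and are not real Rosetta output. The prediction is chosen from energy alone, and RMSD is looked at only afterwards to judge how well that choice worked.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    energy: float  # score from the energy function; available for any protein
    rmsd: float    # distance to the native structure; only known in benchmarks


def pick_prediction(models):
    """Choose the prediction using energy alone; RMSD is never consulted."""
    return min(models, key=lambda m: m.energy)


# Invented benchmark results for three hypothetical models.
models = [
    Model("model_a", energy=-120.5, rmsd=2.1),
    Model("model_b", energy=-98.0, rmsd=1.4),
    Model("model_c", energy=-131.2, rmsd=6.8),
]

chosen = pick_prediction(models)
# Only possible on a known structure: which model was actually closest?
closest = min(models, key=lambda m: m.rmsd)

print(f"picked {chosen.name}: energy {chosen.energy}, RMSD {chosen.rmsd}")
print(f"closest to native was {closest.name}: RMSD {closest.rmsd}")
```

In this made-up case the lowest-energy model is not the one closest to the native structure. On a benchmark protein, that kind of mismatch is exactly the feedback the RMSD is there to provide about the energy calculations.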


Rosetta Moderator: Mod.Sense
buren

Joined: 18 Nov 07
Posts: 21
Credit: 132,158
RAC: 0
Message 48953 - Posted: 22 Nov 2007, 17:15:51 UTC

Okay, I read about being able to set the time per WU but couldn't find the preferences and so I thought it was a fixed time.

What's preferred: more WUs that are less detailed, or fewer WUs that are more detailed? I guess the longer the WU runs, the better you know if the model works well, but there is probably a time after which you mostly know whether the model is useful or not. So setting the time too high might waste resources.

Most of my models ended with an RMSD of around 10 after 2h. I don't know exactly how the RMSD is calculated or what RMSD values are acceptable, but at least to the eye the structures still look very different from the real-life structure. So have there been any improvements so far, and how good must a model be, i.e. how close to the real structure, to be considered "it"?
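On how RMSD is calculated: in general it is the square root of the average squared distance between corresponding atoms of two structures, normally measured after the structures have been superimposed. Below is a minimal sketch of that general formula. It assumes the superposition has already been done, and it does not claim to match which atoms or which alignment Rosetta itself uses, since the thread does not say; the coordinates are invented.

```python
import math


def rmsd(model, native):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the two structures are already superimposed; no alignment is done.
    """
    if len(model) != len(native) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (mx - nx) ** 2 + (my - ny) ** 2 + (mz - nz) ** 2
        for (mx, my, mz), (nx, ny, nz) in zip(model, native)
    )
    return math.sqrt(total / len(model))


# Invented coordinates: three atoms on a line, and a copy shifted by 3 units.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
shifted = [(x, y + 3.0, z) for x, y, z in native]

print(rmsd(native, native))   # identical structures
print(rmsd(shifted, native))  # every atom displaced by the same distance
```

An RMSD of 0 means the two sets of coordinates coincide, and larger values mean the atoms are, on average, further from their native positions.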
Keck_Komputers
Joined: 17 Sep 05
Posts: 211
Credit: 4,246,150
RAC: 0
Message 48971 - Posted: 23 Nov 2007, 8:28:49 UTC - in response to Message 48953.  

What's preferred: more WUs that are less detailed, or fewer WUs that are more detailed? I guess the longer the WU runs, the better you know if the model works well, but there is probably a time after which you mostly know whether the model is useful or not. So setting the time too high might waste resources.

The project would probably prefer a longer run time to reduce server traffic. The run time setting has no effect on the science.
BOINC WIKI

BOINCing since 2002/12/8
juniper

Joined: 17 Aug 07
Posts: 4
Credit: 4,724
RAC: 0
Message 48977 - Posted: 23 Nov 2007, 20:27:44 UTC - in response to Message 48971.  

The run time setting has no effect on the science.

Surely that can't be true? Otherwise why not run dozens of WUs for 5 minutes each, rather than 1 WU every 2 (or more) hours?
Or is it the case that running a WU for a very short period of time will result in it being issued again to another cruncher until a certain total amount of time has been spent on a given WU?
Mod.Sense
Volunteer moderator

Joined: 22 Aug 06
Posts: 4018
Credit: 0
RAC: 0
Message 48980 - Posted: 23 Nov 2007, 22:56:06 UTC

juniper, you are confusing the terms "models" and "WUs" (work units... now BOINC calls them "tasks"). When you select a runtime preference, you are setting a preferred runtime per task. That runtime is achieved by continuing to crunch models until we arrive as close to the preference as possible. All completed models are then reported back.

So, if you run the default 3 hour runtime preference and complete 10 models, or if you run with a preference of 12 hours and you complete 40 models... the science done on each model is the same. Obviously the project would rather get back 40 models than 10, but on the other hand, the machine that only did 10 models still has 9 more hours left to do something with.

So, you see, running longer doesn't change how each model is done. The science is the same.
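The arithmetic in that example can be sketched as a simple loop. This is a hedged illustration, not the actual Rosetta client logic: the `run_task` function and its stopping rule (start another model only while the elapsed time is still under the preference) are assumptions, and the 18 minutes per model is just what "10 models in 3 hours" works out to.

```python
def run_task(runtime_preference_min, minutes_per_model):
    """Count the models completed in one task.

    Hypothetical rule: begin another model only while elapsed time is below
    the runtime preference, so a task can overshoot by at most one model.
    """
    elapsed = 0
    models = 0
    while elapsed < runtime_preference_min:
        elapsed += minutes_per_model  # each model is computed the same way
        models += 1
    return models, elapsed


print(run_task(180, 18))  # 3 hour preference
print(run_task(720, 18))  # 12 hour preference
print(run_task(60, 25))   # model time that does not divide the preference
```

Changing the preference only changes how many times the loop runs; nothing inside a single model's computation depends on it.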
Rosetta Moderator: Mod.Sense
buren

Joined: 18 Nov 07
Posts: 21
Credit: 132,158
RAC: 0
Message 49039 - Posted: 25 Nov 2007, 13:07:25 UTC
Last modified: 25 Nov 2007, 13:08:05 UTC

Okay, with DC it probably really doesn't matter how many models a single computer runs for each structure/WU. I forgot that other computers can resume the same WU with different models.

So the total number of tested models per WU stays the same, because if one computer only tests 5 models per WU, another computer might check the remaining 45.

Does anyone know how many models per structure are actually tested? Or does it depend on the structure?
dcdc
Joined: 3 Nov 05
Posts: 1831
Credit: 119,627,225
RAC: 11,586
Message 49043 - Posted: 25 Nov 2007, 14:16:08 UTC

it ranges from thousands to millions - looks like the record so far is 28 million:

CNTRL_01ABRELAX_SAVE_ALL_OUT_-1ubi_-_filters 28,755,667
CNTRL_01ABRELAX_SAVE_ALL_OUT_-1di2_-_filters 6,936,746
CNTRL_01ABRELAX_SAVE_ALL_OUT_-1cc8A-_filters 5,193,077
CNTRL_01ABRELAX_SAVE_ALL_OUT_-1bq9A-_filters 4,338,378

rochester new york
Joined: 2 Jul 06
Posts: 2842
Credit: 2,020,043
RAC: 0
Message 71774 - Posted: 9 Dec 2011, 23:22:32 UTC





©2024 University of Washington
https://www.bakerlab.org