Message boards : Number crunching : Output versus work unit size
rjs5 · Joined: 22 Nov 10 · Posts: 273 · Credit: 23,054,272 · RAC: 6,536
Whether it is a good idea or not depends on how frequently tasks are getting preempted. I recommend people set it so it DOES leave them "in memory", and they get swapped out if the system gets busy with other work. By NOT keeping tasks in memory, you are expressing a willingness to throw away partially completed work, i.e. a willingness to lose credit in favor of more quickly getting out of the way when other demands arrive on the machine.

The i7 4771 Windows machine is the one giving you the problem. Looking at the recent results, there are some things that stick out.

1. You have hyperthreading turned OFF. IMO, turning OFF hyperthreading has never been a "throughput" win. An individual job may take more time, but the AVERAGE time per job when running twice as many jobs is lower. Enable hyperthreads unless you are SURE that it is a problem. I might leave the 3 Rosetta WU limit in place for a while to see what changes.

2. Looking at the current results that I can see, it looks like Rosetta 4.07 Windows is the source of the problem. I would open the Windows TASK MANAGER and monitor the DETAILED information to see if other programs are taking unexpected resources. Windows is USUALLY faster than Linux UNLESS the Linux version is using HUGE PAGES (unlikely). The newer Haswell CPU (4771) does include improvements in the instruction set, but that is UNLIKELY to be the source of a 400% performance difference (180 -> 774 credits).

Results summary, sorted by CREDITS.

i7 3770

| Credit | Peak working set size (MB) | Peak swap size (MB) | Peak disk usage (MB) | Application version |
|---|---|---|---|---|
| 949.70 | 513.58 | 660.98 | 536.76 | Rosetta v4.07 x86_64-pc-linux-gnu |
| 908.77 | 501.71 | 648.84 | 536.96 | Rosetta v4.07 x86_64-pc-linux-gnu |
| 890.51 | 494.95 | 642.43 | 537.02 | Rosetta v4.07 x86_64-pc-linux-gnu |
| 856.81 | 513.75 | 661.08 | 536.75 | Rosetta v4.07 x86_64-pc-linux-gnu |
| 806.60 | 610.37 | 755.43 | 549.96 | Rosetta v4.07 x86_64-pc-linux-gnu |
| 714.55 | 485.88 | 552.25 | 536.42 | Rosetta v4.07 i686-pc-linux-gnu |
| 426.06 | 427.27 | 484.19 | 436.63 | Rosetta Mini v3.78 x86_64-pc-linux-gnu |

i7 4771

| Credit | Peak working set size (MB) | Peak swap size (MB) | Peak disk usage (MB) | Application version |
|---|---|---|---|---|
| 854.55 | 422.21 | 407.02 | 432.13 | Rosetta Mini v3.78 windows_x86_64 |
| 774.94 | 410.25 | 393.73 | 425.93 | Rosetta Mini v3.78 windows_x86_64 |
| 464.71 | 285.15 | 269.19 | 415.76 | Rosetta Mini v3.78 windows_x86_64 |
| 183.17 | 647.66 | 627.78 | 528.75 | Rosetta v4.07 windows_x86_64 |
| 180.90 | 443.88 | 426.99 | 512.69 | Rosetta v4.07 windows_intelx86 |
| 179.00 | 791.43 | 776.73 | 547.89 | Rosetta v4.07 windows_intelx86 |
| 178.36 | 429.46 | 414.06 | 514.25 | Rosetta v4.07 windows_intelx86 |
| 177.59 | 525.55 | 505.02 | 514.56 | Rosetta v4.07 windows_x86_64 |
| 176.21 | 764.33 | 744.05 | 527.87 | Rosetta v4.07 windows_x86_64 |
| 172.69 | 668.47 | 652.33 | 546.11 | Rosetta v4.07 windows_intelx86 |
| 172.02 | 511.33 | 496.50 | 525.72 | Rosetta v4.07 windows_intelx86 |
| 167.39 | 851.91 | 833.11 | 545.81 | Rosetta v4.07 windows_x86_64 |
| 166.32 | 787.81 | 768.35 | 544.73 | Rosetta v4.07 windows_x86_64 |
| 164.31 | 787.73 | 768.40 | 545.32 | Rosetta v4.07 windows_x86_64 |
| 161.73 | 489.06 | 468.77 | 514.08 | Rosetta v4.07 windows_x86_64 |
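On point 1, a quick way to confirm what the OS actually sees: a minimal sketch, assuming a Linux host with the standard util-linux tools (on Windows the same information is on Task Manager's Performance > CPU tab, "Cores" vs "Logical processors").

```
# "Thread(s) per core: 2" means hyperthreading is enabled;
# "Thread(s) per core: 1" means it is turned off in the BIOS.
lscpu | grep -E '^CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket'
```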
mmonnin · Joined: 2 Jun 16 · Posts: 59 · Credit: 24,222,307 · RAC: 66,706
Yes, R@h is memory intensive. Any memory-intensive application is potentially going to be labelled as not playing well with others. It is just how memory contention works in a system. So I don't see a specific problem with your scenario. But I wanted to assure you that the developers do look at memory usage and attempt to improve the algorithms used to dial back the use of memory where possible.

Also wanted to point out that you said in prior posts that R@h doesn't play well with others, which always sounds like a skirmish for resources, and people often invent logic that says it is the application being aggressive, when in fact such things are controlled by the operating system. But I wanted to point out that your last post essentially boils down to you saying that R@h doesn't play well with itself either. So, at least there is no bias in what is being impacted. As you say, L2 cache contention is going to crop up with any memory-intensive application. The larger the L2 cache, the faster any memory-intensive application will run.

It'd be great if we could at least select one or both apps in preferences. I'd assume that would limit the models to an extent.

Is it that hard to take an existing, functioning app as a baseline and add any new models/code as a separate app, instead of piling it all into one app? Some projects have many apps that do different things. PrimeGrid has different algorithms (right term?) to find primes set up as different kinds of apps. Maybe then we wouldn't have like a gig download for two apps plus the task files.
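As an aside on the cache point, here is a hedged sketch of how to see how much L2/L3 cache the tasks are actually sharing, assuming a Linux box with util-linux and glibc's getconf available.

```
# Cache sizes as reported by the CPU (shared L3 plus per-core L1/L2 on most Intel parts).
lscpu | grep -i cache

# Per-level detail on most distributions.
getconf -a | grep -i CACHE
```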
[VENETO] boboviz · Joined: 1 Dec 05 · Posts: 1994 · Credit: 9,623,704 · RAC: 7,594
It'd be great if we could at least select one or both apps in preferences. I'd assume that would limit the models to an extent.

Something like GPUGrid, with "Long runs" and "Short runs" WUs?

Is it that hard to take an existing, functioning app as a baseline and add any new models/code as a separate app, instead of piling it all into one app? Some projects have many apps that do different things. PrimeGrid has different algorithms (right term?) to find primes set up as different kinds of apps. Maybe then we wouldn't have like a gig download for two apps plus the task files.

Forking the code to produce different, specialized apps (one for "folding", one for "ab initio", etc.) may be a solution. But I don't know how complex that would be.
mmonnin · Joined: 2 Jun 16 · Posts: 59 · Credit: 24,222,307 · RAC: 66,706
It'd be great if we could at least select one or both apps in preferences. I'd assume that would limit the models to an extent.

There are currently two apps, Rosetta and Rosetta Mini, but no way in the preferences to select one or the other. Off the top of my head I don't recall another project that doesn't allow selecting between its apps. In the recent past, the Rosetta app has had a much higher chance of running for much longer than set in preferences before returning results. Could I not select them? No. Rosetta also had computation errors at like 5 seconds on one computer, with error 193. Mini was fine. This project already allows for multiple length options.
[VENETO] boboviz · Joined: 1 Dec 05 · Posts: 1994 · Credit: 9,623,704 · RAC: 7,594
There are currently two apps, Rosetta and Rosetta Mini, but no way in the preferences to select one or the other. Off the top of my head I don't recall another project that doesn't allow selecting between its apps. In the recent past, the Rosetta app has had a much higher chance of running for much longer than set in preferences before returning results. Could I not select them? No. Rosetta also had computation errors at like 5 seconds on one computer, with error 193. Mini was fine.

OK, I understand. I know that the 4.xx branch is in development, but I don't know whether 3.xx is still being developed or debugged. I thought, months ago when 4.x started, that R@h would abandon the 3.x version. But 3.x is still here, so I don't know what they want to do with it.
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
FWIW, I am in the early testing phase of running a Coffee Lake (i7-8700) CPU on Rosetta (Ubuntu 18.04). At the moment, the 12 cores are divided equally between Rosetta and Universe, with one core reserved to support a GPU on GPUGrid. The results are encouraging thus far, though I may have to change the mix of projects to optimize it a little more: https://boinc.bakerlab.org/rosetta/results.php?hostid=3399951&offset=0&show_names=0&state=4&appid=

In general, my Haswell gives more consistent output than my Ivy Bridge, and the Coffee Lake may do even better. Intel probably improved the cache performance in the later chips, which may account for it.

Universe has very small work units, about 3 MB, and I was earlier also running Einstein with Rosetta, which is about the same size. But it seems it is not as simple as just work unit size, as I had thought. Even leaving cores free does not necessarily fix it. Maybe how the cache is shared among the cores has something to do with it, which is far beyond my ability to investigate. But chip architecture seems to affect it somehow.
Sid Celery · Joined: 11 Feb 08 · Posts: 2125 · Credit: 41,228,659 · RAC: 8,784
Whether it is a good idea or not depends on how frequently tasks are getting preempted. I recommend people set it so it DOES leave them "in memory", and they get swapped out if the system gets busy with other work. By NOT keeping tasks in memory, you are expressing a willingness to throw away partially completed work, i.e. willingness to lose credit in favor of more quickly getting out of the way when other demands arrive on the machine.

Again relating to issues long ago that I can't properly recall, leaving "non-GPU tasks in memory while suspended" solved problems for me and is the preferred option on all my machines.
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
My Coffee Lake (i7-8700) credits running under Ubuntu 18.04 range from excellent (1,727.88 points) to miserable (178.94 points), with everything in between. https://boinc.bakerlab.org/rosetta/results.php?hostid=3399951&offset=20&show_names=0&state=4&appid= On the other hand, my more modest Ivy Bridge (i7-3770) machine running under Win7 64-bit has been much more consistent, averaging around 700 points or so. https://boinc.bakerlab.org/rosetta/results.php?hostid=3381276&offset=0&show_names=0&state=4&appid= These machines have sometimes run only Rosetta, but sometimes other projects as well. Universe (BHspin v2) is probably the best, since it uses very little memory, only about 3 MB, and is very stable (does not crash). But they have had all the cores busy with something, and none left free (they are dedicated machines which I do not use for desktop purposes). The Ubuntu machine seems to react most negatively to something. And that something may be crashes of work units, whether of Rosetta itself (as with the 4.07 x64 work units), or of GPUGrid Quantum Chemistry, which has had its own problems recently. Even running just Rosetta (or Rosetta and Universe) on the Ubuntu machine does not entirely stabilize it. Otherwise, I see no real rhyme or reason to it. But for whatever reason, Windows is more stable. There is no point wasting a very capable Coffee Lake chip when a lowly Ivy Bridge can average about as well. Until the crashes end, I think I will take the i7-8700 off of Rosetta and use it elsewhere, and maybe try again later. |
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
My Ivy Bridge is running on only seven cores now, with the other one free (Win7 64-bit). That is why the credits are a little high, since the cores are not fully loaded. I could probably get a little more total output by running on eight cores, but I like it this way, for a while at least. And it may help to ensure a little extra cache is available, for whatever that is worth (not clear at present). https://boinc.bakerlab.org/rosetta/results.php?hostid=3381276&offset=0&show_names=0&state=4&appid= I will just let it run until something comes along to break it. |
shanen · Joined: 16 Apr 14 · Posts: 195 · Credit: 12,662,308 · RAC: 0
Frankly amusing to see people worrying about such details as regards THIS particular BOINC project. I actually stopped by to wonder how many other volunteers are nuking 3-day projects on sight. #1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech) |
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
It is one of the most important projects for its science and potential benefits. They just rush the work into production without testing it thoroughly I believe. I was surprised that the OS makes a difference, as well as the CPU type (Intel v AMD). They really should look into that for their own benefit. But it is a great project otherwise. |
rjs5 · Joined: 22 Nov 10 · Posts: 273 · Credit: 23,054,272 · RAC: 6,536
It is one of the most important projects for its science and potential benefits. They just rush the work into production without testing it thoroughly I believe.

Rosetta IS important, as is BOINC. There are many small things that can impact performance substantially. The time for every CPU operation can be broken down into getting/decoding the instructions, getting the data, and executing/storing the results.

One example: Sandy Bridge had a major change in the CPU cache. Before, if one CPU had a modified data value and another CPU wanted it, the owning CPU would write it to memory and then the new CPU would READ it from memory. Sandy Bridge changed that: the owning CPU just hands the modified value to the new CPU and invalidates that line of its own cache. Big performance difference. Intel CPUs before Sandy Bridge suffer.

Agner Fog, a professor at the Technical University of Denmark, has done a lot of work comparing CPU performance at the INSTRUCTION level and has published some interesting data, including the cycle counts of all the Intel/AMD CPUs so you can see some of the differences: http://www.agner.org/optimize/instruction_tables.pdf
shanen · Joined: 16 Apr 14 · Posts: 195 · Credit: 12,662,308 · RAC: 0
It is one of the most important projects for its science and potential benefits. They just rush the work into production without testing it thoroughly I believe. You are repeating one of my oft-repeated concerns: Any programs that are supposed to be producing scientific results need to be "tested thoroughly" or the research itself becomes questionable. The project staff seems to have a rather cavalier attitude towards testing, but maybe that's only on the side of the software that the volunteers see. Looks buggy to us, but maybe it's perfect on the results side. (But I doubt it and I strongly hope that they are running all crucial results several times in several ways.) From what I've seen, if I were still a senior referee for the IEEE Computer Society and if I was reviewing a paper that relied on data from Rosetta@home calculations, I would start out with a highly skeptical attitude. At a minimum I would want to know that the code was audited, but more likely I would ask for replication of the key calculations by some other researchers. Right now I'm just a volunteer, and my main annoyance is the 3-day deadlines. I'm mostly nuking those pending tasks on sight and NOT feeling sorry about wasting the project's bandwidth. Not at all. #1 Freedom = (Meaningful - Constrained) Choice{5} != (Beer^3 | Speech) |
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
I was going on the opposite assumption: they have focused too much on getting the science right, at the expense of basic computer operation. At least I am in no position to suggest otherwise, given their eminent position in the field. I am sure they have plenty of peer review when they publish to validate their results; not that they could not benefit from a basic computer science review as you suggest also. But while we are on the subject, I thought I would fire up my i7-8700 on Ubuntu 18.04 again, having made the fix for the x64 crashes as taught by Juha (great work) and implemented by rjs5 (https://boinc.bakerlab.org/rosetta/forum_thread.php?id=12242&postid=88954#8895). At the moment, things are going fine, and the output is much more constant. Whether that is a long term fix we will see. If so, it would suggest that the crashes somehow affected the running work units. So basic computer operation is not to be neglected either of course. (I started out running with Universe BHspin v2 too, but now all 12 cores are on Rosetta; I don't even have a GPU installed.) https://boinc.bakerlab.org/results.php?hostid=3399951&offset=0&show_names=0&state=4&appid= EDIT: The initial group are all x64, so an alternate explanation is just that the x64 have more consistent output than the 32-bit ones. It will take a while to see. |
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
FWIW, I just can not get Rosetta to run consistently on my i7-8700 (Ubuntu 18.04). It will start out great when I first attach (1200 points for the 24 hour work units), and then go downhill. It is now at around 170 PPD. https://boinc.bakerlab.org/results.php?hostid=3399951&offset=0&show_names=0&state=4&appid= Leaving cores free, or running with or without other projects does not seem to help. On the other hand, my Windows 7 64-bit machine (i7-3770) does consistently well, at around 800 PPD (6 cores, leaving 2 free). So that is how I will go. https://boinc.bakerlab.org/results.php?hostid=3381276&offset=0&show_names=0&state=4&appid= |
Mod.Sense (Volunteer moderator) · Joined: 22 Aug 06 · Posts: 4018 · Credit: 0 · RAC: 0
@Jim1348, that sounds odd. Does the machine start thrashing memory over time? The credit system is structured in such a way that if the particular work unit you were crunching were harder, then the credit per model would be higher. In other words, the machines of other users did not find those WUs any harder than normal, but for some reason your machine isn't getting as much work done.

You used the phrase "...when I first attach...", so I wasn't certain whether you meant the client attaching to BOINC Manager, or BOINC Manager attaching to the project. If you remain attached to the project, and power off the machine for 30 seconds and then restart things, does your credit per hour improve again? If a reboot restores more normal credit, this would tend to indicate something is up with your machine. Perhaps there is a memory leak somewhere. Given that the work units begin anew each day, it would seem more likely that any such memory leak is in the operating system somewhere. Rosetta Moderator: Mod.Sense
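One low-effort way to answer the thrashing/leak question is to log memory over a day of crunching. A minimal sketch, assuming a Linux host with the usual procps tools (free, vmstat); the log path is just an example.

```
# Log memory and swap activity once a minute. A slow leak shows up as a
# steadily shrinking "available" figure from free; thrashing shows up as
# sustained non-zero si/so (swap-in/swap-out) columns from vmstat.
while true; do
    echo "=== $(date '+%F %T') ===" >> ~/mem_log.txt
    free -m >> ~/mem_log.txt                  # RAM and swap usage in MB
    vmstat 1 2 | tail -n 1 >> ~/mem_log.txt   # second sample = current si/so rates
    sleep 60
done
```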
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
Those are all good questions. The credit has usually been good when I first attach to the project, but then becomes erratic before finally settling down to a low value. Rebooting the machine does not help. I would normally think that it is something on my machine also, but I can't find it. It is a dedicated machine with only BOINC running; not even a GPU now. I thought perhaps the work units were different between Windows and Linux, and that somehow the BOINC credit system is to blame. But it would have to be a large discrepancy for that to be the case. I have plenty of memory - 32 GB on both machines by the way, and devote several GB to a write cache on the Ubuntu machine. There is always plenty of free memory when I check it. PS - One of these days I am going to turn off hyper-threading and see what that does. But it is a bit of a nuisance to attach a monitor, etc. and I won't get around to it for some time. But if something dramatic happens, I will post about it. Until then, it will have to be a known unknown. |
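On the hyper-threading test: on a reasonably recent kernel (the sysfs SMT switch appeared around 4.19, so a stock Ubuntu 18.04 kernel may not have it), SMT can be toggled without attaching a monitor or entering the BIOS. A hedged sketch, not a tested recipe:

```
# Check whether the runtime SMT switch exists and what state it is in.
cat /sys/devices/system/cpu/smt/active     # 1 = hyperthreading currently in use

# Disable SMT until the next reboot (half the logical CPUs go offline);
# restart the BOINC client afterwards so it picks up the new CPU count.
echo off | sudo tee /sys/devices/system/cpu/smt/control

# Re-enable it later.
echo on | sudo tee /sys/devices/system/cpu/smt/control
```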
rjs5 · Joined: 22 Nov 10 · Posts: 273 · Credit: 23,054,272 · RAC: 6,536
Those are all good questions. The credit has usually been good when I first attach to the project, but then becomes erratic before finally settling down to a low value. Rebooting the machine does not help. I would normally think that it is something on my machine also, but I can't find it. It is a dedicated machine with only BOINC running; not even a GPU now.

Is the disk write caching in the default Ubuntu state or did you change the settings? Most drives already have a GB or so of write caching on the drive and for SSD ... I don't think it is really needed. Did you measure the impact of changing the setting (if you did)? I have not seen Rosetta behavior that would benefit from write caching.
Jim1348 · Joined: 19 Jan 06 · Posts: 881 · Credit: 52,257,545 · RAC: 0
Is the disk write caching in the default Ubuntu state or did you change the settings?

The large cache was not set for Rosetta but for other projects that have high write rates, in order to protect the SSD. However, I have never found that too much hurts anything, though you are right that Rosetta does not need it. Here are the settings:

Swappiness, to never use swap:
sudo sysctl vm.swappiness=0

Set write cache to 12 GB / 12.5 GB (for 32 GB main memory):
sudo sysctl vm.dirty_background_bytes=12000000000
sudo sysctl vm.dirty_bytes=12500000000
sudo sysctl vm.dirty_writeback_centisecs=500   (checks the cache every 5 seconds)
sudo sysctl vm.dirty_expire_centisecs=720000   (flush pages older than 2 hours)

Insofar as I know, it just means that Rosetta will be operating mainly out of the DRAM cache rather than accessing the SSD most of the time. I will set it back to the default and try again in a few days.
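For reference when reverting: a hedged sketch of restoring what I believe are the stock Ubuntu values (treat the numbers as assumptions and check against a default install). Writing the ratio-based settings switches the byte-based limits back off, and command-line sysctl changes do not survive a reboot unless they were also added to /etc/sysctl.conf or /etc/sysctl.d/.

```
# Assumed stock values; verify against a default installation.
sudo sysctl vm.swappiness=60
sudo sysctl vm.dirty_background_ratio=10   # writing a ratio disables the *_bytes limits
sudo sysctl vm.dirty_ratio=20
sudo sysctl vm.dirty_writeback_centisecs=500
sudo sysctl vm.dirty_expire_centisecs=3000

# Confirm what is currently in effect.
sysctl vm.swappiness vm.dirty_background_ratio vm.dirty_ratio vm.dirty_background_bytes vm.dirty_bytes
```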
rjs5 · Joined: 22 Nov 10 · Posts: 273 · Credit: 23,054,272 · RAC: 6,536
Is the disk write caching in the default Ubuntu state or did you change the settings?

I don't think the SSD needs protecting. Only a fraction of the SSD is active at any one time. The SSD firmware has "wear-levelling algorithms" built into the drive that map the LOGICAL drive block number to a PHYSICAL block. The wear algorithms move stuff around so wear is uniform AND any reliability problem is handled automatically. Any block that exhibits write or retention problems will be detected by the drive's multi-bit error detection and removed from the active drive.

The Linux kernel guys have probably implemented the write caches properly, but I always worry about someone using memory copy code that purges the CPU caches. I like that you will be looking at the performance. IMO, Rosetta spends entirely too much time trying to make things "fair"; it just takes too much time to explain AND the results are unstable.
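If anyone wants to check actual wear rather than guess: a hedged sketch, assuming the smartmontools package is installed and the drive is /dev/sda or /dev/nvme0 (adjust the device name); the exact wear attribute name varies by vendor.

```
sudo apt install smartmontools          # if not already present

# SATA SSDs usually expose something like Wear_Leveling_Count,
# Media_Wearout_Indicator or Total_LBAs_Written.
sudo smartctl -A /dev/sda | grep -Ei 'wear|percent.*used|total.*written'

# NVMe drives report "Percentage Used" in the SMART/Health log.
sudo smartctl -A /dev/nvme0
```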