[TA]Assimilator1 2007-03-26 19:39:15 | Turtleblue: There's something wrong with your P4 3GHz scores, if you look at [TA]z's score it's almost double yours! :Q Are you sure nothing else was gobbling up CPU power? Jon: Interesting that dropping the RAM speed had such a small impact, do you know what the RAM timings were at the 2 different RAM speeds? Don't forget when benchmarking DPAD to be sure sample files downloading is switched off in the config.txt. Nice boost btw, is the 165 dual core? Pascal: You won't get errors in results from overclocking as long as the rig has been tested properly, starting with a 24hr stress test with Prime 95. Also I posted the 'how to use muon1 bench program' instructions further up, dated 27/1/07. AFAIK there isn't a problem with the benchmarking program (but you never know!), I've run the program on my XPM on win2k with no probs, so it's not that. Insaniti: Great single core CPU scores! & you're running with a ~15% disadvantage compared to the scores in the chart (due to the new DPAD version). K`Tetch: Good idea on using a std simulation, similar to what was done for SETI classic by TLC, they grabbed a 'typical' Work Unit & it became widely used for benchmarking SETI. Couldn't we do the same? Just grab a typical 'WU' & use it for benchmarking? Is that possible? |
iNSaNiTi 2007-03-26 22:50:57 | Hey Thnkz for the nice comment And u mean that i must try the new muon version ? ( simply... why me got disadvantage ? ) Does anyone knows who is updating the chart ? OR should i update it ? Greetz, |
TurtleBlue 2007-03-27 02:22:05 | Hi, Assimilator1, Thanks for the query. Yeah, [TA]z sure looks golden! I've used PC Wizard for some of the info about my 3.06GHz box. Nothing else was running at the time I was using Muon1benchmark & dPad (latest version). Windows Task Manager shows 100% CPU utilization and about 40% of the 1 gig of DDR mem in use. This is running the 3.06GHz WITHOUT Hyperthreading. When I first tried with Hyperthreading the CPU utilization was never 50/50 (like in F@H, which was set at 45/45) and it did not return better results than without Hyperthreading. I have my nephew's DOA Athlon Thunderbird PC, which I built some years ago, at home now, and once I get the Raritan KVM 4 port I bought on eBay this week (to hook up my 15" NEC LCD with my 2.4GHz Northwood box & IBM A21m 500MHz Pentium III ThinkPad) I will see if I can bring it back to life (one of the 133 mem chips was bad when I tried them at work on my HP desktop PIII 1.0GHz box) and see if it can beat my pathetic 3.06GHz Northwood! If I can't bring that Frankenstein box to life I will use its workable guts to construct a budget (sub $200) Sempron setup or something better (after AMD drops prices around April 7th, so I heard...) and maybe let it cook some dPad for about a week or 2 & do a "dump". I will also have to learn how to overclock as well so my results may look a little better. |
[TA]JonB 2007-03-27 11:46:05 | I seem to have reached the boundary between stability and instability with my current DDR400 memory. It seems to be my limiting factor. So, here's where my tweaking leaves me: Opteron 165, dual-core Toledo, air cooled, temp max at 49C, on an ASRock Dual-Sata939 motherboard. FSB is 265MHz, HyperTransport at 4x multiplier, so 1060MHz. CPU multiplier is x9, so 2.38GHz (up from 1.8GHz stock). Memory is running at 265MHz with timings of 2.5,2,2,2,5,7,1T (btw, 1T didn't improve DPAD over 2T). Best Muonbench with these settings is officially declared as: 441. So, until I get some PC5000 RAM, I'm leaving it alone. |
Pascal 2007-03-28 07:35:05 | Assimilator1, I did not overclock my CPUs, but now I know where it comes from. Later I'll give you the data. |
Pascal 2007-03-28 15:33:08 | Ok, I have analysed it. On my second machine Athlon XP 2400+, muon1 crashes, if I start the muon1bench. Not directly, but perhaps when the bench reads the result.dat file. On the first machine Athlon 64, 4200+, I am not able to start the bench twice, as I have two muon clients started. The second bench crashes - after a while. How do you deal with this? I think there is some documentation needed. Thanks. |
Pascal 2007-03-28 17:42:16 | Raw results for Athlon 64 X2 4200+, Windsor 2x 512 kb Cache, 2x2200 MHz data for one core, as the benchmark does only work for one core: 63569,193011.7,0.00 64020,193154.5,317.00 64920,193296.5,210.74 65371,193421.3,227.32 65821,193503.5,218.35 66722,193645.2,200.89 67173,193745.3,203.56 68074,193880.0,192.75 68524,193960.9,191.55 68975,194104.1,202.07 69425,194122.6,189.69 69876,194242.9,195.21 70327,194323.9,194.19 70777,194397.7,192.29 71949,194590.8,0.00 72550,194718.1,211.97 73151,194787.2,163.51 73751,194966.7,208.64 74352,195108.0,215.30 74952,195208.0,205.55 75553,195313.0,200.43 76153,195392.2,190.64 76754,195473.3,183.69 77354,195605.8,187.80 77955,195760.7,194.81 78555,195841.7,189.37 79156,195979.9,192.76 722,196199.5,0.00 1323,196302.3,171.20 1923,196381.4,151.46 8457,14086.4,0.00 9057,14166.9,134.15 11457,14569.4,160.99 12057,14713.7,174.23 27841,17927.7,0.00 28441,18072.0,240.47 29641,18214.8,159.48 33242,18925.4,184.74 36242,19639.3,203.74 36842,19715.3,198.60 38042,19931.4,196.42 40442,20334.7,191.01 41042,20479.8,193.32 41642,20631.6,195.91 42242,20682.3,191.27 42843,20868.4,196.03 43443,20987.8,196.14 Average is to be computed, multiply it by two and you have a bench for the above mentioned CPU. |
rhughart 2007-04-05 00:25:01 | Here is mine, C2D E6400 2.13 @ 2.93GHz (366 FS, : 33105,19273.3,519.01 33405,19467.8,520.94 33705,19595.9,519.56 34905,20303.0,523.43 35205,20324.6,517.25 36405,21038.7,521.29 36705,21203.0,521.62 37005,21347.9,521.13 37305,21504.9,521.16 37605,21641.0,520.33 37905,21785.7,519.86 38205,22007.4,522.50 39105,22430.0,520.65 39405,22627.7,522.24 39705,22802.4,522.92 40605,23206.2,520.47 40905,23351.8,520.09 41505,23755.6,523.34 41805,23902.2,522.98 42706,24305.8,520.69 43606,24830.6,522.54 43906,24910.2,520.02 44206,25065.5,520.00 44806,25469.0,522.90 45106,25573.1,521.24 Pretty good until you compare it to similarly clocked X2. |
[XS]riptide 2007-04-08 16:25:53 | 301439 344347.6 601.33 301740 344459.8 601 302041 344701.5 601.29 302342 344863 601.19 302643 345036.5 601.16 302944 345231.9 601.23 303245 345414.2 601.23 303546 345602.5 601.27 303847 345696.2 600.86 304148 345957.9 601.24 304449 346122.2 601.16 304750 346299.8 601.14 305051 346328.5 600.43 305653 346814.2 601.01 305954 346975.5 600.92 306255 347086.2 600.59 306857 347598.2 601.29 307158 347761.6 601.21 307459 347947.1 601.23 307760 348072.5 600.97 308061 348234.1 600.88 308362 348470.3 601.14 308663 348670.8 601.23 308964 348823.8 601.1 309265 348921.5 600.72 309566 349202.9 601.18 309867 349393.8 601.23 310168 349555.7 601.14 310469 349747 601.19 310770 349907.3 601.09 311071 350084.3 601.07 311372 350263.1 601.06 311673 350448.3 601.08 311974 350582.9 600.87 312275 350807.1 601.07 312576 350962.6 600.95 312877 351180.3 601.12 313178 351328.4 600.97 313479 351520.3 601.02 313780 351686.3 600.95 314081 351851.9 600.89 314382 352074.1 601.07 314683 352260.9 601.1 314984 352447.3 601.12 315285 352610.2 601.04 315586 352773.6 600.96 315887 352968.2 601.02 316188 353131.7 600.95 317091 353714.5 601.13 317392 353874.5 601.03 317693 354005.8 600.81 317994 354140.6 600.61 318295 354410.4 601 318597 354514 600.66 E6600 @ 3.6. 1 instance of DPAD at roughly 90-95% CPUusage with 5-10% running 2 x instances of Seventeenorbust. |
[XS]riptide 2007-04-08 16:38:36 | I also have a QX6700 on this at 3.2 at the moment although no bench figures as of yet. |
iNSaNiTi 2007-04-09 11:47:05 | Uptime (secs),Mpts in file,Estimate kpts/sec 18547,4212560.5,0.00 19757,4212990.5,355.29 20967,4213425.5,357.34 22178,4213859.2,357.68 23388,4214289.7,357.18 24598,4214722.4,357.24 25809,4215154.6,357.21 27019,4215587.5,357.27 33372,4217752.1,350.18 34583,4218172.9,349.99 35793,4218606.5,350.57 36701,4219018.2,355.71 38213,4219449.8,350.31 39120,4219852.6,354.44 40331,4220283.9,354.54 41541,4220717.5,354.73 42752,4221150.4,354.88 43962,4221583.9,355.03 45474,4222014.9,351.11 46684,4222445.2,351.30 47895,4222879.1,351.59 49106,4223312.1,351.83 50316,4223744.9,352.05 51527,4224175.7,352.19 52738,4224602.3,352.19 53948,4225035.3,352.38 55158,4225464.1,352.45 56368,4225897.4,352.63 57578,4226320.1,352.53 58788,4226753.1,352.69 59998,4227184.9,352.81 61208,4227594.1,352.40 62418,4228028.3,352.57 63628,4228460.3,352.69 64838,4228890.4,352.76 70893,4231048.4,353.18 72103,4231482.9,353.32 73313,4231916.3,353.42 74524,4232348.4,353.50 75734,4232783.1,353.62 76945,4233215.9,353.70 78155,4233647.0,353.75 * DFI Lanparty nF4 Ultra-D rev A3 (s939) * * MP: 317 x9 * * AMD Athlon 64 3000+ 1800MHz @ 2853MHz * * 2x 512Mb G.Skill @ 285Mhz 2,5-3-3-5 1T * Average is again even higher with less Mhz than before, but tighter timings. Muonbench was still rising but dindt have more time Hopefully next try some better results... (expecting around 355 kpts) Screen : [url=http://imageshack.us][img=http://img138.imageshack.us/img138/6205/38519380br6.jpg][/url] |
Haiya-Dragon 2007-04-10 00:01:58 | Time to increase the graph again Stephen 2x Xeon 5150 (Woodcrest) 2.66GHz dual core 125657,100230.6,776.22 125957,100311.2,774.81 126257,100714.8,776.39 126557,100795.4,774.98 126858,101223.0,776.77 127158,101480.9,777.00 127458,101624.0,776.18 128058,102026.4,775.60 128358,102429.7,777.15 128658,102685.8,777.35 128958,102774.7,776.05 129259,102919.0,775.26 130159,103645.6,775.51 130459,104049.0,777.03 130759,104178.0,776.10 131059,104405.3,776.05 131360,104550.0,775.28 131660,104971.7,776.94 132260,105466.7,777.19 |
[TA]Assimilator1 2007-04-11 19:28:47 | Awesome score! What is a Woodcrest core? Insaniti: The current DPAD client is ~15% slower than the older one used in the majority of benchmarks on that graph. And Stephen was keeping that chart up to date. TurtleBlue: Is the 845GE chipset dual channel RAM? Have you confirmed that your CPU is actually running at 3.06GHz when crunching DPAD? (using CPU-Z for example) Pascal: Have you done a fresh install of DPAD as white panther suggested? Btw here's the only documentation for the benchmark:- How do I use the Muon1Bench.exe program? First, make sure sample files downloading is switched off in config.txt. Then put the program in your Muon1 directory (same place as muon1.exe) and simply start and leave it running while Muon1 is working. It will produce a file BenchCSV.log which has three columns: one logging the time in seconds, the second showing the number of new Mpts Muon1 has produced and the third showing a running estimate of the speed in units of kpts/sec (1000 kpts = 1 Mpts). Once you've run it for a day or so (enough for the estimate to be stable), post your results to the benchmarking thread in the forum. ***************************************************************************************************** I don't know what you do for dual cores but other people have already benchmarked DPAD with dual cores, ask them. rhughart: That chart shows benchmarks using an older & slightly faster version of DPAD. On my XPM the older version was 15% faster; IF that happens to all CPUs with DPAD then your score would have been nearer 600, which is a lot more competitive. Interesting though that the C2D doesn't dominate here. |
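The BenchCSV.log layout described above (time in seconds, Mpts produced, running kpts/sec estimate) can be checked with a short script. This is only a minimal Python sketch: it assumes the log is plain comma-separated text that may start with the header row seen in results posted in this thread, the function names are made up for illustration, and the first-to-last rate is a guess at how the running estimate is formed, not Muon1Bench's documented method. The sample rows are copied from waffleironhead's P4 Prescott post.

```python
import csv
import io


def parse_bench_csv(text):
    """Return (uptime_s, mpts, est_kpts) tuples, skipping non-numeric rows."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) != 3:
            continue  # blank or malformed line
        try:
            rows.append(tuple(float(f) for f in fields))
        except ValueError:
            continue  # header row such as "Uptime (secs),..."
    return rows


def cumulative_kpts_per_sec(rows):
    """Rate between the first and last row; 1 Mpts = 1000 kpts."""
    (t0, m0, _), (t1, m1, _) = rows[0], rows[-1]
    return (m1 - m0) * 1000.0 / (t1 - t0)


# A few rows from one posted log (assumed file layout).
SAMPLE = """Uptime (secs),Mpts in file,Estimate kpts/sec
1274,62601.3,0.00
1575,62649.5,160.60
12979,64881.3,194.79
13280,64920.2,193.16
"""

rows = parse_bench_csv(SAMPLE)
print("last running estimate:", rows[-1][2])
print("recomputed rate: %.2f" % cumulative_kpts_per_sec(rows))
```

On this sample the recomputed rate lands within a few hundredths of the logged 193.16 kpts/sec, which suggests the third column is close to a cumulative average, though the two are not identical.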
iNSaNiTi 2007-04-11 19:58:37 | Tnkz [TA]Assimilator1 for your explanation's Great support !!! Haiya-Dragon u have amazing kpts average with a woodcrest core. and only at 2.66Ghz. Thats +388 kpts a core ...superb. Imagine @ same Mhz as mine CPU, that would go over 400 kpts a core. Are there any faster (kpts a core) dual/quad CPU's ? C2D isnt heheheh Greetz, iNSaNiTi |
Pascal 2007-04-12 16:02:36 | [TA]Assimilator1, thanks for this information. But - as long as there is no official description how to use the benchmark on multi-core CPUs, and I believe I am not the only one with a dual core, I am not encouraged to do anything upon this is definitely given. Please also think about the fact that there is still no MUON client in version 5 available. We were told about it more than a year ago. So you rather should think about the project and about where to spend your CPU power. Thanks. Pascal |
TurtleBlue 2007-04-16 00:37:25 | Thanks [TA]Assimilator1 for the post reply & the great suggestions... TurtleBlue Is the 845GE chipset dual channel RAM? Have you confirmed that your CPU is actually running at 3.06GHz when crunching DPAD? (using CPU-Z for example) I'll check this soon...Tax Deadline is NOW!!! (Yeah, I know...should have started this months ago, LOL - am expecting refund {got a 2006 Prius last July! - Tax Credit!}) |
TurtleBlue 2007-04-16 00:48:51 | Ok, [TA]Assimilator1, on the question of: TurtleBlue Is the 845GE chipset dual channel RAM? I went to Crucial's web site and did a memory/system scan - Crucial reported that the 845GE chipset DOES NOT support dual channel RAM. Does this really impact dpad processing in a big way (Looking to fill a DOA AMD Thunderbird 1.0ghz with some new CHEAP AMD parts)? Will check with CPU-Z soon... Thanks, TurtleBlue |
TurtleBlue 2007-04-17 02:54:34 | OK, [TA]Assimilator1, this post is in response to your 2nd and last query: TurtleBlue Have you confirmed that your CPU is actually running at 3.06GHz when crunching DPAD? (using CPU-Z for example) I have CPU-Z up and running while dPad is crunching, and under the CPU tab in the Clocks (core#0) section it shows 3066.6 MHz for core speed, with a multiplier of x23.0, bus speed of 133.3 MHz and rated FSB of 533.3 MHz. Thanks, Turtleblue |
Stephen Brooks 2007-04-18 14:39:52 | Which two versions are you referring to, one of which is 15% slower than the other? I can believe this has happened, as I added some detail to the collision detection in the code, but would like to know exactly which release you observed the effect on. |
iNSaNiTi 2007-04-19 00:12:14 | Meself noticed a diff in V 4.43d compared to V 4.42c. After testing (bench mpts) with the sam Lattice, my foundings where same as [TA]Assimilator1 But im not really sure, if i only look to the mpts benchmarks its maybe 5 up to 15 % slower (depends on the diff lattices) then the old V 4.42c. This is my opinion and are my foundings, but it certainly could be that there are others who dont have any of this. Greetz, [iNSaNiTi] |
iNSaNiTi 2007-04-19 00:14:09 | OWz, i forgot to tell that i didnt have exact evidance en info on this ( no screens, numbers...) im sorry maybe [TA]Assimilator1 does Bye again... |
waffleironhead 2007-04-21 05:03:55 | Uptime (secs),Mpts in file,Estimate kpts/sec 1274,62601.3,0.00 1575,62649.5,160.60 1875,62714.8,189.09 3075,62951.9,194.70 3375,63001.2,190.35 4576,63238.4,192.98 4876,63301.9,194.53 5176,63366.4,196.10 5476,63413.4,193.28 6677,63650.8,194.27 6977,63695.1,191.82 8177,63932.6,192.86 9378,64169.5,193.53 10578,64407.0,194.08 11779,64644.6,194.52 12979,64881.3,194.79 13280,64920.2,193.16 this is a p4 prescott 3.0 fsb 200 |
iNSaNiTi 2007-05-10 15:04:20 | Bench : Uptime (secs),Mpts in file,Estimate kpts/sec 1545,708422.2,0.00 2746,708853.9,359.54 3946,709284.9,359.25 5147,709717.8,359.68 6348,710148.8,359.50 7248,710543.3,371.91 8749,710974.9,354.34 9950,711405.3,354.93 11150,711838.6,355.67 12351,712271.4,356.20 13552,712704.4,356.64 14753,713135.4,356.85 15953,713565.8,356.99 17154,713996.8,357.14 18355,714429.4,357.36 19555,714862.2,357.57 20756,715294.7,357.73 26760,717458.3,358.37 33063,719620.3,355.29 33964,720038.8,358.33 Setup : * DFI Lanparty nF4 Ultra-D rev A3 (s939) * * MP: 321 x9 * * AMD Athlon 64 3000+ 1800MHz @ 2889MHz * * 2x 512Mb G.Skill @ 289Mhz 2,5-3-3-6 1T * Raised Fsb +4 Some Tighter Mem timings except tras +1 If i had to do nothing on this machine i would run it for longer, so maybe another day ( expecting hopefully around 359 mpts) Greetz, iNSaNiTi |
Stephen Brooks 2007-05-11 12:58:42 | Here are all the results since the introduction of v4.43d. The "efficiency" figures are interesting. Newer Core 2s etc. may have more raw power, but per clock per core, the old Athlons with very overclocked RAM frequencies seem to do best. I had to move the legend out of the way to fit Badger's bar in. So Muon1 seems to be very much RAM bound in some way. |
Stephen Brooks 2007-05-11 13:05:07 | Note also... the ordering in kpts/s is linear in number of cores, there's a couple 4-core machines, then some dual core, then all the single cores, with no overlap (though iNSaNiTi is trying his best). |
iNSaNiTi 2007-05-11 13:42:23 | Sow awesome Stephen !!! Tnkz for the chart its superb |
iNSaNiTi 2007-05-11 14:10:26 | I forgot to tell that in my last and highest bench, i didnt run dual channel. Sow i think bandwith (Mb/s) has little or no effect on muon1. |
[TA]Assimilator1 2007-05-11 16:42:29 | Pascal >>>>But - as long as there is no official description how to use the benchmark on multi-core CPUs, and I believe I am not the only one with a dual core, I am not encouraged to do anything upon this is definitely given.<<<< Eh? What does that matter as long as it works?? Turtle Blue: Single channel RAM has a big impact on P4s, the reduced bandwidth really hampers them IIRC (though not as much as SDRAM!). Ath64s OTOH are only mildly affected by SCh RAM, about 5-10%, again IIRC. Stephen: I'm not 100% sure about this but I'm fairly certain my 1st benchmarks were done on v442c, with the 2nd lot on the current version 443d. You could probably get confirmation of this by looking at our post dates & when you released new clients. So far I've only tested this on my XPM (which btw has a FSB/RAM of 178 MHz), I'll test it on my Sempron soon. Thanks for the new chart btw. iNSaNiTi: You're getting very close to X2 scores. Btw you say findings not foundings |
Stephen Brooks 2007-05-11 17:06:01 | --[Stephen: I'm not 100% sure about this but I'm fairly certain my 1st benchmarks were done on v442c, with the 2nd lot on the current version 443d. You could probably get confirmation of this by looking at our post dates & when you released new clients.]-- I only used ones posted here after the release date of 4.43d for this graph. Are you saying there is another number I should be putting on that chart? I only see one bar on there for you at the moment (the XP-M). |
iNSaNiTi 2007-05-11 18:13:23 | [TA]Assimilator1 : I dont get that ! Do u mean this literal ? What did i wrong ? :*( |
Haiya-Dragon 2007-05-12 11:19:27 | I ran a small experiment last night, because I wasn't satisfied with the 4 thread performance of muon. I made a copy of the muon directory and set up the two clients to use 2 threads each. The results are, well.. 110 kpts/sec higher than a single client with 4 threads! Client 1: 299606,21865.2,442.13 299906,21959.4,440.77 300206,22123.7,441.89 300506,22218.7,440.59 301406,22695.4,443.29 302607,23265.6,444.52 303807,23823.6,445.29 304107,23908.5,443.79 304407,23971.9,441.65 304707,24200.9,444.58 avg: 442.85 kpts/sec Client 2: 301146,304098.6,443.98 301447,304244.1,444.37 301747,304344.1,443.29 302048,304505.0,444.17 302649,304659.4,440.64 302949,304900.8,444.03 303550,305155.3,443.66 304151,305301.2,440.04 304451,305461.1,440.86 304752,305676.3,443.30 avg: 442.83 kpts/sec So that makes 885.68 kpts/sec total for the dual Xeon 5150 Woodcrest. Now I wonder why this is happening.. it could either be the cpu scheduling for 4 threads being inefficient or it's muon.. |
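The multi-client total above is the sum of each client's averaged estimate column. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and averaging the posted running estimates simply mirrors what was done in the post rather than any official Muon1 procedure.

```python
from statistics import mean


def combined_kpts(per_client_estimates):
    """Sum of each client's mean running estimate, in kpts/sec."""
    return sum(mean(estimates) for estimates in per_client_estimates)


# Third-column values copied from the two client logs in the post above.
client1 = [442.13, 440.77, 441.89, 440.59, 443.29,
           444.52, 445.29, 443.79, 441.65, 444.58]
client2 = [443.98, 444.37, 443.29, 444.17, 440.64,
           444.03, 443.66, 440.04, 440.86, 443.30]

total = combined_kpts([client1, client2])
print("combined: %.2f kpts/sec" % total)
```

The same helper works for any number of clients, for example four single-threaded instances.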
iNSaNiTi 2007-05-12 11:48:13 | i was wrong a few post's ago bout Haiya-Dragon's Cpu, cauze i thought it was a dual core but it seems to be a quad So i have to take my words back... Im srry |
Stephen Brooks 2007-05-12 17:55:43 | There seems to be some inefficiency because there are small parts where the program will go back down to 1 thread again. My next workstation will be a quad-core so I can investigate this further when I get it. The parts of the simulation where you have a low number of particles may be where it reverts to single-thread, or alternatively the "initialisation" stage. Running 2 clients in parallel seems a good idea. You may even want to increase the number of threads each one uses to 4, so if one has gone back to single-thread for an initialisation stage, the other can use the full CPU. Ideally I'd figure out a way for a single instance to replicate this behaviour (somehow). More investigation when I get my hands on the 4-core workstation and a 16-core opteron is installed for us as an HPC server. |
[DPC] Eclipse~NaWA 2007-05-13 13:01:44 | I benched my system too and took the same approach as Haiya-Dragon did: 2 Clients each set to 2 threads. info: Clientname: [DPC] Eclipse~Flandre CPU: 2x AMD Opteron 2212 2.00 GHz (dualcore) Corename: Santa Rosa The results: Client 1: 313703,10865.1,380.94 314003,10953.5,379.97 314303,11115.1,381.73 314903,11269.3,379.02 315203,11423.6,380.48 315503,11536.2,380.42 316704,11999.6,380.65 317004,12090.7,379.87 317304,12252.5,381.47 317604,12328.4,380.19 avg: 380.47 Kpts/sec Client 2: 312811,10521.0,379.42 313111,10615.3,378.67 313411,10803.7,381.53 313711,10881.0,380.12 315811,11682.2,380.22 316112,11810.4,380.71 316412,11885.7,379.37 316712,12047.5,381.00 317312,12263.6,380.58 317612,12379.0,380.62 avg: 380.22 Kpts/sec Total: 760.70 Kpts/sec ((Kpts/sec)/core)/Ghz: 95.09 |
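The "((Kpts/sec)/core)/Ghz" figure quoted above is the total throughput divided by the core count and then by the clock speed. A small Python sketch of that normalisation, with a hypothetical function name and the numbers taken from the post (2 sockets x 2 cores at 2.00 GHz):

```python
def kpts_per_core_per_ghz(total_kpts, cores, ghz):
    """Normalise total throughput by core count and clock speed."""
    return total_kpts / cores / ghz


# 2x dual-core Opteron 2212 at 2.00 GHz, total taken from the post above.
efficiency = kpts_per_core_per_ghz(760.70, cores=4, ghz=2.0)
print("%.2f kpts/sec per core per GHz" % efficiency)
```

This makes machines with different core counts and clocks roughly comparable, though it ignores RAM speed, which other posts in this thread suggest matters for Muon1.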
[TA]Assimilator1 2007-05-14 13:21:14 | Stephen >>>>Are you saying there is another number I should be putting on that chart?<<<< Nope, I was referring to your previous post about different DPAD clients & the 15% speed difference on my XPM. Btw a 16-core Opteron should bang the Mpts & rocket you up the charts. Oh, maybe you could add to the DPAD FAQ that running 1 client per core is the most efficient way to use multicore/CPU PCs. iNSaNiTi: Don't panic, I was just picking on your English, you used a wrong word that's all. Eclipse~NaWA: What about running 4 clients? |
Stephen Brooks 2007-05-14 14:51:51 | Well interestingly I did try this 16 core machine on Muon1 once, but it only gave a 4.5x speedup. I suspect it is because it's 8xOpteronx2 i.e. 8 sockets, and a lot of time was being wasted distributing *some variable* to the other 7 L2 caches every time one socket changed it. When we get the 16core installed in the server room, I might have another look and see if I can remove the "shared variable" that's causing this overhead for multi-socket systems. It doesn't seem to affect the multi-core systems so much because the caches are closer together (and in AMD's case for Barcelona/Phenom, actually shared at L3). |
iNSaNiTi 2007-05-14 15:04:01 | [TA]Assimilator1 Haha ok , first i didnt know about whot ur tallking about so thats why i freaked out but, now i C that i have said ''foundings'' a few post's ago and now its not hard at all to understand what u ment. Yeah my english is really build on weak foundings Srry again for the off-talkz... |
Pascal 2007-05-14 15:10:56 | How do I use the benchmark and the client correctly on an AMD Athlon X2? There is no description anywhere how to use. Thanks in advance. |
[DPC] Eclipse~NaWA 2007-05-14 16:55:01 | assimilator1: going to test with 4 clients later then. I think it's prolly best to set them to 1 thread each too.. Anyway, I tested muon with 2 clients and a variable number of threads.. my findings are available here: http://gof-lan.mine.nu/muon.htm Still benching 2 clients with auto threads atm. Stats of that will be up there in a few hours. It doesn't look that great yet. Score will be between single client auto threads and 2 clients 2 threads. |
Stephen Brooks 2007-05-14 17:36:49 | This is going to be interesting - you've got a 2xOpteronx2 system, so perhaps when you do 2 clients with 2 threads each, one lives on each socket? --[How do I use the benchmark and the client correctly on an AMD Athlon X2?]-- Besides the simple explanation at the bottom of this page, on a dual-core machine you can either start one client with auto threads (this is what most people do) and run the benchmark program alongside for a couple of days. Or you can create two Muon1 directories, each with a benchmark in them and set to 1 thread, this may end up being slightly faster when you add the totals together. |
[DPC] Eclipse~NaWA 2007-05-14 23:38:24 | i did the 4 client/1 thread test and this are the results: Clientname: [DPC] Eclipse~Flandre CPU: 2x AMD Opteron 2212 2.00 GHz (dualcore) Corename: Santa Rosa Client 1: 436730,1928.3,207.94 437030,1954.0,203.05 439131,2386.3,203.66 441231,2849.4,206.68 441531,2930.3,208.25 443932,3398.9,206.08 444532,3553.2,208.12 444832,3580.2,205.81 445132,3673.0,207.80 445432,3691.3,205.03 avg: 206.24 Kpts/sec Client 2: 436439,1856.9,204.99 437339,2015.7,200.70 439439,2484.3,206.51 440039,2633.9,209.46 440640,2718.6,205.06 441240,2834.0,204.28 441840,2998.4,208.26 442140,3074.1,209.49 442740,3165.6,206.49 443040,3259.8,209.24 avg: 206.45 Kpts/sec Client 3: 433144,1529.9,207.97 437045,2369.5,211.12 437945,2539.5,209.10 438845,2703.8,206.89 439145,2773.3,207.56 439445,2834.7,207.48 440045,2993.8,210.36 442146,3463.0,212.30 444546,3931.6,209.82 444846,4003.4,210.34 avg: 209.29 Kpts/sec Client 4: 437353,2435.9,211.45 441254,3241.2,209.98 441854,3389.4,211.58 442154,3411.3,208.64 442754,3572.6,211.09 443354,3660.3,208.54 443954,3823.2,210.92 444554,3918.9,209.05 445154,4013.8,207.26 445754,4173.4,209.25 avg: 209.78 Kpts/sec Total: 831.76 Kpts/sec ((Kpts/sec)/core)/Ghz: 103.97 this setup is a total record on this machine, it tops my last record (2 clients/2 threads) with 70 Kpts/sec. for more benches i did just check: http://gof-lan.mine.nu/muon.htm |
[OCAU] badger 2007-05-15 04:46:04 | I just got my new dual opteron 2216 machine (ie 2x 2 cores at 2.4 GHz), so I'm trying it out on muon. It's running fedora core 6, so I'm running muon1 with wine. Seems to work ok, though I got a few crashes yesterday. The multi threading seems to be strange: when I ran one instance with threading set to 4 or 'auto' it only used about 50% of all 4 cpus. I tried running 2 threads and it still ran on all 4 cores according to the system monitor. I stepped up to 3 instances with 4 threads each and this gives a consistent 80% load on all 4 cpus. Muonbench instantly crashes under wine, so I did the calculation by hand over nearly 3 hours to get 609 kpts/s. I'll have a go at 4x single threaded instances and see what I get. My barton 2500 got some new ram, which is faster, but in single channel. It's running at 11x199 (2189 MHz) now, so I'll give it another benchmark when I get home.. |
[OCAU] badger 2007-05-15 07:15:57 | OK I ran the 4 instances single threaded for a bit over an hour, it still didn't give 100% cpu load, about 95% on each core, and the total results: core 1 : 270kpts/s, core 2 : 385 kpts/sec, core 3 179kpts/s, core 4 : 180 kpts/s for a total of 1015 kpts/s. I'll try it over a longer period to get a better estimate... My previous post (the 4 thread x 3 instance) I forgot to add the points on the currently running sims, so it was actually 641 kpts/s, not 608. |
[DPC] Eclipse~NaWA 2007-05-15 08:31:23 | I've let it run for a few more hours now and the more final results are these: client 1: avg: 210.24 Kpts/sec client 2: avg: 210.90 Kpts/sec client 3: avg: 211.42 Kpts/sec client 4: avg: 211.43 Kpts/sec Total: 843.99 Kpts/sec ((Kpts/sec)/core)/Ghz: 105.50 It's using most of the cpu power as you can see here: http://gof-lan.mine.nu/muon4.jpg (http://gof-lan.mine.nu/muon.htm) |
sssf 2007-05-15 08:45:22 | From the numbers [DPC]Eclipse~NaWA produced, the behavior of the threads auto setting looks really strange. I assume that on a quad core machine the auto setting goes for 4 threads (Stephen, can you confirm?), so running 2 clients on the auto thread setting results in 4 threads for each client. Stephen's suggestion of adding threads to make use of the time a client runs with one thread is clearly not paying off in performance: look at the speed drop between running 2 clients with 2 threads and 2 clients with 3 threads. But on the other hand, the behavior is strange if the auto setting runs 4 threads on the PC. [DPC]Eclipse~NaWA: did you fix the affinity of the clients to specific cores (in any test)? It might increase performance when running multiple clients. What are the results for dual core machines running 2 clients (on auto and on the 1 thread setting) compared to running one client (on auto, 2 threads)? |
Stephen Brooks 2007-05-15 11:48:51 | I suppose the only thing to notice is that "Threads: auto" on a 4-core machine will try and run single simulations faster rather than doing 4 at once. If you create 4 clients with 1 thread each, they will run completely in parallel (of course). It appears that a single client with 4 threads runs simulations serially and about 3x as fast, though it would be good to improve this when I get my hands on a 4-core machine to do some profiling. |
TurtleBlue 2007-05-15 12:03:54 | My latest addition to the family (Built in April 2007): Am setting up a budget box using a DOA Athlon 1.0GHz Thunderbird that I built for my nephew some years ago. Reusing the Inwin 500 Miditower, floppy, a Plextor 716 burner I had lying around, an Antec True Blue 430W 20 pin power supply, Windows XP O/S and a round Primary/Secondary IDE cable. New budget items to complete the build: Zalman CNPS9700LED, ASRock AM2NF6g-VSTA mobo, AMD 1.9GHz Brisbane AM2 cpu, Seagate 80 gig Sata2 perpendicular drive and a Crucial Ballistix DDR2 PC2-6400 1gig kit (512MBx2). Am only running dPad (for now, plan to bounce back to F@H) on this box and nothing else. Overclock info from PC Wizard 2007 (Current): Overclock Info: General - Processor: Frequency: 2137.46 MHz - [initial: 1900 MHz] FSB Frequency: 225 MHz - [initial: 200 MHz] Multiplier: 9.5x - [initial: 9.5x] Chipset: DDR2-SDRAM: PC2-3400 (214MHz) Timings: 4-4-4-12 (CL-RCD-RP-RAS) RAM Ratio: CPU/10 Activated Channels: Dual Memory: DDR2-SDRAM PC2-6400 @ 399 MHz (CPU/10) DDR2-SDRAM PC2-6400 @ 399 MHz (CPU/10) Video: GPU Frequency: 425.00 MHz Memory GPU: 400.00 MHz Sensor: Processor Temp: 34.5C Mainboard Temp: 32.0C Processor Fan: 1480 rpm Ran MuonBench at the original equipment configuration, and the second part is when I tried some O/C'ing (not too well versed in the intricacies like some of you pros!) 
Uptime (secs),Mpts in file,Estimate kpts/sec 2502,21860.9,0.00 2802,22044.5,611.87 3102,22114.3,422.24 3402,22176.5,350.59 3703,22332.5,392.91 4303,22450.7,327.59 4603,22583.1,343.83 4903,22741.1,366.67 5203,22811.6,352.03 5503,22938.2,359.02 5803,23047.6,359.53 6103,23111.0,347.17 6403,23249.7,356.02 6703,23282.3,338.35 7003,23435.7,349.88 7303,23560.8,354.07 7603,23572.0,335.44 7903,23751.8,350.09 8204,23826.9,344.84 8504,23940.9,346.59 8804,24064.3,349.67 9104,24112.0,341.00 9404,24275.9,349.92 9704,24296.4,338.19 10004,24449.7,345.10 10304,24571.8,347.47 10604,24658.1,345.26 12404,25303.9,347.70 12705,25353.8,342.37 13005,25476.5,344.27 13605,25629.8,339.47 13905,25785.4,344.18 14205,25901.8,345.30 14505,26020.5,346.56 14805,26126.6,346.73 15105,26177.0,342.47 16906,26797.1,342.71 17206,26915.1,343.74 17506,27032.9,344.72 17806,27152.0,345.74 18106,27279.1,347.24 18706,27481.0,346.84 19306,27644.2,344.16 19606,27771.9,345.59 19906,27851.6,344.21 20206,27906.3,341.47 20507,28058.8,344.25 22307,28700.3,345.34 22607,28767.1,343.51 22907,28888.9,344.43 23207,29042.2,346.84 23507,29067.5,343.09 23807,29228.3,345.80 24108,29272.3,343.04 24408,29394.9,343.93 24708,29516.1,344.75 25008,29650.0,346.10 25308,29672.2,342.52 25608,29844.9,345.54 25908,29885.7,342.86 26208,30038.0,344.94 26508,30068.3,341.89 26808,30251.9,345.22 27108,30309.0,343.33 27408,30462.3,345.35 28009,30633.6,343.94 28309,30748.7,344.40 28609,30808.9,342.75 28909,30920.2,343.07 29209,31037.4,343.60 29509,31185.9,345.28 30109,31329.9,342.99 30409,31453.4,343.73 30709,31606.7,345.51 31009,31683.6,344.57 31310,31804.8,345.19 31610,31927.6,345.85 31910,32000.6,344.80 32210,32125.0,345.50 32810,32247.9,342.72 33110,32401.2,344.37 33410,32526.9,345.09 33710,32648.8,345.68 34311,32769.6,342.95 34611,32948.8,345.33 34911,33070.1,345.87 36711,33686.9,345.70 37312,33840.2,344.14 37612,33960.4,344.62 37912,34096.5,345.55 38212,34205.2,345.69 38512,34311.8,345.77 38812,34358.1,344.18 39112,34502.3,345.30 
39412,34624.2,345.80 40012,34782.4,344.48 40312,34917.8,345.33 40613,34947.7,343.39 40913,35102.5,344.74 41513,35310.9,344.78 41813,35328.1,342.58 42113,35460.0,343.32 42413,35598.8,344.21 42713,35710.2,344.42 43313,35863.5,343.11 43614,36008.6,344.13 43914,36027.5,342.09 44214,36152.0,342.62 44814,36332.5,342.02 45114,36451.8,342.41 45414,36576.1,342.92 47215,37184.9,342.72 47515,37301.5,343.03 47815,37417.9,343.32 48115,37521.9,343.35 48715,37645.4,341.56 49015,37768.2,342.00 49316,37911.0,342.85 51116,38526.1,342.81 51416,38564.9,341.50 51716,38718.8,342.54 52017,38830.6,342.72 52317,38953.6,343.13 54117,39592.1,343.53 54718,39707.5,341.79 55018,39852.3,342.59 55318,39973.9,342.95 55918,40197.7,343.28 56218,40272.4,342.76 56518,40302.1,341.40 56818,40494.1,343.05 57119,40599.9,343.10 57419,40685.9,342.79 57719,40808.6,343.15 Uptime (secs),Mpts in file,Estimate kpts/sec MuonBench after O/C (see above info from PC Wizard 2007 O/C stats) FSB 227 CPU 2156.4 651,40886.9,0.00 952,41009.8,409.52 1552,41191.7,338.54 1852,41348.7,384.69 2152,41374.1,324.68 2452,41496.5,338.54 2752,41597.0,338.01 3353,41750.4,319.69 3653,41888.4,333.71 3953,42005.9,338.96 4253,42112.8,340.40 4553,42266.1,353.51 5153,42419.4,340.43 6954,43074.6,347.12 8755,43703.3,347.57 9055,43864.3,354.32 hmm...doesn't seem to be much of a difference pre & post O/C...strange Thanks, Assimilator1 for your comments! Thanks, Stephen for including me in your new graph (someone had to bring up the rear ) |
Haiya-Dragon 2007-05-15 17:59:17 | Here's the results of 4 clients with 1 thread on the 2.66GHz Xeons: Client 1: 583187,45439.3,237.50 583487,45534.9,237.89 583788,45630.7,238.27 584088,45724.2,238.61 584988,45886.0,237.80 585588,46047.2,238.08 586188,46209.4,238.37 586488,46282.7,238.40 586788,46377.0,238.74 587388,46470.9,238.00 avg: 238.17 kpts/sec Client 2: 575691,48057.8,235.79 578991,48868.7,236.34 579291,48947.5,236.47 581392,49425.8,236.17 581692,49521.4,236.57 583793,49999.9,236.28 585593,50478.7,237.09 585893,50571.1,237.41 586193,50631.6,237.25 586793,50727.4,236.56 avg: 236.59 kpts/sec Client 3: 583181,15352.4,236.74 583781,15497.2,236.79 584081,15571.7,236.84 584681,15726.0,237.03 585281,15820.3,236.29 585581,15904.8,236.50 585881,15991.6,236.74 586181,16087.4,237.12 586781,16197.9,236.64 587081,16294.2,237.02 avg: 236.77 kpts/sec Client 4: 580781,14773.3,237.08 581381,14935.6,237.40 581681,14994.3,237.20 582281,15155.8,237.51 582582,15235.9,237.65 582882,15331.7,238.05 584982,15809.2,237.70 585282,15903.6,238.05 586182,16064.7,237.24 586782,16260.7,238.05 avg: 237.59 kpts/sec Total: 949.12 kpts/sec pts/Mclock/core: 89 Not bad, 22% extra performance with only a few minor tweaks |
Haiya-Dragon 2007-05-15 17:59:44 | eww doesn't look like the forum supports tabs |
[DPC] Eclipse~NaWA 2007-05-16 01:38:43 | sssf: I tried 2 clients, 2 threads before with client 1 set to cpu 0 and 1, and client 2 set to cpu 2 and 3. The results were only around 5 Kpts/sec lower (755 Kpts/sec) than 2 clients/2 threads (760 Kpts/sec) without affinity set. All other benches I did were without affinity fixes. |