Benchmarking with Mpts thread
[TA]Assimilator1
2008-03-11 18:12:43
Ok, but the only reason I benchmark with d/l samples off is because your FAQ says to do so!  lol
From your FAQ:-

>>>How do I use the Muon1Bench.exe program?  First, make sure sample files downloading is switched off in config.txt.  <<<

Is it out of date?
[OCAU] badger
2008-03-11 23:34:44
Actually, my machines are connected to the internet but can't download the sample files (maybe a firewall issue?). They upload results fine, though.
[OCAU] badger
2008-03-11 23:35:44
(can't get servers.csv either)
Stephen Brooks
2008-03-12 10:42:43
Yes, not sure what I was thinking there.  Probably back then I thought downloading the sample files would take a long time but didn't realise not having sample files (at all) would make all the simulations short and unrepresentative.

Badger should probably check that the sample files URL in the config.txt file hasn't been mistyped somehow?
[TA]Assimilator1
2008-03-14 18:17:57
lol, ok

I see you've updated the FAQ, though you might want to update this too:

>>> Or you can create as many Muon1 directories as you have cores, each set to 1 thread with their own copy of the benchmark program and add the resulting numbers together.  This ends up being slightly faster in the present version (v4.43d) of Muon1 as running single-threaded simulations in parallel leaves fewer spare CPU cycles than running a single (faster) multi-threaded simulation.<<<

Seeing as autothreading is so much better now & 1 client is fastest, maybe you want to just chop that lot out?  Unless there are still people running v4.43x, in which case you could just add something like 'However with the present version v4.44d, a single client is now the fastest method'
(unless you're waiting on more results to confirm that?).

Oh btw there's a grammatical error in the 1st line >>>On a multi-core machines .....<<<
Stephen Brooks
2008-03-15 14:28:45
I'm pretty sure 1 client is not fastest.  Multiple clients has to be the theoretical max, though my multithreaded version has got better too, so you'll only see a couple of percent difference.  Maybe I should stop confusing people with the multiple clients idea.
[TA]Assimilator1
2008-03-16 10:15:05
Didn't you see my benchmarks on the previous page?  On my quad core rig I tested 1, 2 & 4 clients: 4 single-threaded was slowest, 2 much quicker & 1 slightly quicker again.  And that was with download samples on or off.
Unless you're saying it'd be different for dual core rigs?  But I can't see why.
Stephen Brooks
2008-03-16 15:27:18
Try the same simulation on each of the setups (using queue.txt): does it get the same number of Mpts?  I think my current version might be over-scoring multi-threaded runs a bit.
[TA]Assimilator1
2008-03-18 23:51:03
Do I delete queue.txt from the 2nd, 3rd & 4th client dirs & then copy the one from the main client?
Stephen Brooks
2008-03-19 20:19:50
No, you just need to run it once on one client at each of the thread settings.  What I'm saying is that I think the Mpts count is currently affected by the number of threads.  Running the same result on two different clients ought to give the same number of Mpts on these new deterministic runs.
[TA]Assimilator1
2008-04-05 16:36:06
A?  run it on one client??
Stephen Brooks
2008-04-05 18:22:18
Run an individual simulation on one client.  I needn't have said that, because you can't split a simulation between two of them.
K`Tetch
2008-04-05 21:54:29
Stephen, the reason the FAQ says to turn off samplefile downloads is that the samplefiles used to be added to the results files when loaded, so when the samplefiles updated during benchmarking, the benchmark program would suddenly see all these new Mpts from the samplefiles, and it'd screw things up royally.  Now, with them being separate, that doesn't happen.
Stephen Brooks
2008-04-07 02:56:40
Ah!  Though there's also that option between "accumulate" and "latest".  Without looking at the code I can't remember whether the "accumulate" option adds to the results.txt or still keeps the samplefiles in their own folder.
Zerberus
2008-04-07 18:00:33
'Accumulate' adds to 'results.dat'.
Stephen Brooks
2008-04-08 16:29:24
Now I can look at the source code, it appears 'accumulate' accumulates in 'samplefiles\LatticeName.txt' and Muon1Bench looks at 'results.dat'.  So sending results or downloading samplefiles shouldn't affect the performance numbers, apart from the overhead for connecting to the internet and possibly timing out.
[TA]Assimilator1
2008-04-26 10:44:03
Ok, I've re-read your posts about running simulations & it doesn't make sense, lol

Firstly, I don't know how to run the same simulation on different clients or how to re-run the same simulation on the same client (for the purpose of testing different thread numbers).

I guess what you're saying is that you believe a single multi-threaded client is over-scoring Mpts on any particular simulation & that this could be proved or disproved by running the same simulation on the client, then setting the thread number to 1 & multiplying the score by 4 to give a comparative number.  Then compare that score to the multi-threaded score (4 threads in my case), is that right?

Btw if you are right, I'm still going to run the setup that gives me the most points!  If you want us to run the setup which gives the most science then you're going to have to alter the scoring of the client.
Stephen Brooks
2008-04-28 17:50:49
You don't know how to run a single simulation using queue.txt?  I think it's in the FAQ.

Right now it looks like the multithreaded single-client gives the most Mpts even though it's technically slightly slower than multiple single clients.  However there's a bit of an advantage from a single client of 3.8x the speed (in the case of a quad core) because it can evolve faster than if the results were in 4 separate pools.
[TA]Assimilator1
2008-04-29 19:53:49
Cool.

Nope, I don't see it in the FAQ.  At least not under a title including 'simulation'.
RGtx
2008-04-29 20:09:44
Try under " // Manual Seeding of Designs" in the FAQ.
[TA]Assimilator1
2008-05-01 20:13:42
Thanks, but what do I do for this?

'write in it your genome/design line'
[XS]riptide
2008-05-08 02:38:48
Gawd!  Has nobody come along and topped me off the top of the Chart yet?  tut tut
Stephen Brooks
2008-05-08 10:25:22
Seems not much beats a highly-overclocked Core 2 Quad.  How did you get it running that fast anyway, did you use water on it?
[XS]riptide
2008-05-10 00:29:12
Water at the time, yes.  However, the newer 45nm quads would be able to reach higher speeds (>4GHz) on the same water and add about 10% IPC on top of that.  And to top that off there's the new range of DP Xeon systems like the Intel Skulltrail Platform with unlocked-multiplier Xeons and the Asus Z7S.  Basically anything with 2 quad core parts (65nm or 45nm) should be able to destroy my score, regardless of overclocked or not.  I'm a little surprised that nobody here with a DP/MP system has tried.
[TA]Assimilator1
2008-05-17 12:17:07
Hi Riptide, yeah, your awesome rig is still way out in front.  If my rig's score was on there (now a Q6600 @3GHz) I would hold a distant 2nd place (just) at 1055 Kpts/s.  I might be able to bump that up IF I can get my quad's clock up higher, but I'll never be able to reach 3.69GHz!  :Q  3.3GHz if I'm lucky.

Also, about you not being bumped down: if you look at this stats page, http://stephenbrooks.org/muon1/?allteams=1 , & compare the number of active users now to the previous quarter & then the previous year, you'll see that the number of active users has plummeted!  So fewer people running naturally means fewer people likely to have a doom rig that'll bump ya down.

Oh & just for a laugh I'm thinking of benchmarking my old Celeron 366 @550, which I just got back.
[TN]opyrt
2008-05-18 02:41:17
I've got unofficial sources telling me about 1650Kpts/s while running auto detection on cores (i.e. one threaded instance, not separate for each core). 
We'll see if we'll get an official benchmark next week.
runesk
2008-05-18 23:23:36
The source is now official: here are the results of my dual quad X5450 (3.0GHz) running on 8 cores.


Uptime (secs),Mpts in file,Estimate kpts/sec
2775525,1406667.6,0.00
2775825,1407210.3,1808.81
2776125,1407505.1,1395.69
2776425,1407763.0,1216.98
2776725,1408527.0,1549.34
2780325,1414465.5,1624.40
2780925,1415646.7,1662.62
2781225,1416220.5,1675.77
2781525,1416702.8,1672.36
2781825,1416733.5,1597.59
2782426,1417877.7,1624.48
2783626,1420211.5,1671.90
2783926,1420675.8,1667.46
2784526,1421700.2,1670.11
2784826,1422121.7,1661.55
2785126,1422716.9,1671.62
2785426,1423182.8,1668.02
2785726,1423659.7,1665.71
2786026,1424124.0,1662.33
2786926,1425324.4,1636.38
2787226,1425790.0,1634.21
2787526,1426290.8,1635.09
2788126,1427474.3,1651.14
2788426,1427977.0,1651.71
2789026,1429154.1,1665.48
2789926,1430452.5,1651.54
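
The "Estimate kpts/sec" column in logs like the one above appears to be a cumulative average measured from the first row: the Mpts gained since the first sample, converted to kpts, divided by the seconds elapsed since then. A minimal Python sketch assuming that formula (it is inferred from the posted numbers, not taken from the Muon1Bench source; the function name is hypothetical):

```python
def cumulative_kpts_per_sec(rows):
    """rows: list of (uptime_secs, mpts_in_file) tuples in log order.

    Returns one estimate per row: Mpts gained since the first row,
    converted to kpts, divided by seconds elapsed since the first row.
    The first row has no elapsed time, so its estimate is 0.0.
    """
    t0, m0 = rows[0]
    estimates = []
    for t, m in rows:
        elapsed = t - t0
        estimates.append(0.0 if elapsed == 0 else (m - m0) * 1000.0 / elapsed)
    return estimates


# First three rows and the last row of the X5450 log above.
sample = [
    (2775525, 1406667.6),
    (2775825, 1407210.3),
    (2776125, 1407505.1),
    (2789926, 1430452.5),
]
for (t, m), est in zip(sample, cumulative_kpts_per_sec(sample)):
    print(f"{t},{m},{est:.2f}")
```

The recomputed values agree with the posted column to within a fraction of a kpts/sec; the small differences are presumably because the "Mpts in file" column is rounded to one decimal place.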
[TA]Assimilator1
2008-05-20 19:44:01
That's a really poor score for dual quads, you really need to run 2 clients with each set to 4 threads it seems.  Somewhere in this forum there's a discussion about dual quads with DPAD & Stephen mentions that beyond 4 cores you're probably better off running more clients, seems you've proved him right.

My single quad (Q6600) @3GHz manages 1055 Kpts/s btw.

I'd love to see what score you get with 2 clients! 
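
To put that comparison in per-core terms: both machines run at 3.0GHz, so dividing each score by its core count gives a rough like-for-like figure. A small sketch using only numbers quoted in this thread (the last estimate in the X5450 log and the 1055 Kpts/s Q6600 figure); the helper name is hypothetical:

```python
def kpts_per_core(total_kpts_per_sec, cores):
    """Average throughput per core, in kpts/sec."""
    return total_kpts_per_sec / cores


dual_quad = kpts_per_core(1651.54, 8)   # 2 x X5450 @ 3.0GHz, one 8-thread client
single_quad = kpts_per_core(1055.0, 4)  # Q6600 @ 3GHz, one 4-thread client

print(f"dual quad:   {dual_quad:.1f} kpts/sec per core")
print(f"single quad: {single_quad:.1f} kpts/sec per core")
print(f"ratio:       {dual_quad / single_quad:.2f}")
```

On these figures each core of the dual-quad box delivers roughly 78% of what a Q6600 core does. The arithmetic does not say why; it only quantifies the gap.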
runesk
2008-05-21 08:42:58
I'll give it a shot this weekend (as ppl are using it for matlab during weekdays, I can't run muon1 dpad other than weekends)
[XS]riptide
2008-05-21 10:04:48
Excellent runesk.  Dual quads is what we're talking about.  Sweet.  Remind me... is that the Harpers, or the Clovers?
runesk
2008-05-21 11:37:25
I think X5450 is a Harpertown.. but I'm not sure
[XS]riptide
2008-05-21 14:22:58
Yes, Harpertown.  It would be interesting to see what 45nm can bring clock for clock above 65nm in Muon/DPAD.  In many applications it is roughly a 10% increase in performance over 65nm parts at the same clock speed.
Stephen Brooks
2008-05-21 16:10:46
Memory access is very important for Muon1, and I think this becomes quite an issue on multi-socket systems, Intel in particular.  On the other hand, running 2x4, 4x2 or 8x1 thread clients might make it better too.  So it's going to be very architecture-dependent; HPC workloads are notorious for their heavy RAM usage and I'm pretty sure Muon1 is no exception.
[XS]riptide
2008-05-22 04:46:44
Well.... FBDIMMS will be a little slower than regular DDR2/3 ram sticks.  No doubt runesk is using them on his server boards.  However, at a guess I'd say the board also has independant FSB's to each chip, and depending again on the baords and the chipset possible have 4 channel mememory configuration that you get on boards like Intels SkullTrail Platform.  Maybe runesk can share more details about this system.
[XS]riptide
2008-05-22 04:47:25
^^^ Dayum spelling LOL
runesk
2008-05-22 22:46:00
I sure can:
Dell PowerEdge 2950
- 2 x Quad-Core Xeon X5450 3.0GHz/2x6MB 1333FSB
- 24GB 667Mhz FBD (4x4GB + 4x2GB dual rank DIMMs)
- 2 x 146GB SAS (15,000rpm) running raid1 on the PERC 6/iR controller

That should be the interesting components in this server.

As I've already told you, it's a MatLab work iron, and as far as I know, the people using it on a daily basis are quite happy with its performance

.Rune
[XS]riptide
2008-05-23 06:22:01
Nice machine.  The one thing, however, is the FSB.  I believe the Poweredge has an Intel 5000X chipset which only has one FSB channel for both processors.  Nonetheless it's a monster as it stands.

Now it's time to bench it.
[XS]riptide
2008-05-23 06:22:21
Poweredge 2950 that is
[TA]Assimilator1
2008-05-27 18:24:29
Well, just for a laugh I thought I'd benchmark my Celeron 366 @550, but it's been crunching since 08.00 (18.23 now) & there still isn't a results.txt yet!  :Q  So far it's done about 850 Mpts.  I hope it gives a result soon, I'm not running the darn thing for days just to get a benchmark!  (that would be too much of a waste of electricity).

At what point does DPAD create a results.txt file?
[TN]opyrt
2008-05-27 19:35:39
Afaik, as soon as a simulation is done, it creates a results.txt. 
[XS]riptide
2008-05-28 10:06:58
It seems it's a long single simulation.  AFAIK it could be a high yield one.  Man... that's a crap Celeron.  LOL
Silverthorne
2008-06-02 23:26:02
Here's a benchmark for my Core 2 Quad Q9300 @ 3225 (FSB 430), 4GB OCZ Reaper 5-5-5-15.  This processor can be overclocked more but I have not attempted to get it higher yet.  I will do so later and post more results.  This was just one client running on all 4 cores; if my math is right I'm averaging 1422 kpts/sec

Uptime (secs),Mpts in file,Estimate kpts/sec
15939,87958.5,0.00
16839,89204.7,1384.60
21039,95327.6,1444.84
25239,101401.4,1445.40
26140,102598.4,1435.21
27040,103812.6,1428.22
27640,104598.7,1422.16
28240,105674.3,1440.23
29140,106885.0,1433.75
29740,107972.4,1450.21
30640,109165.1,1442.55
31540,110402.5,1438.64
35740,116490.6,1440.94
36640,117709.7,1437.18
37540,119082.3,1440.84
38140,119872.3,1437.49
38440,120119.4,1429.30
42640,126171.6,1431.13
43540,127299.3,1425.32
44140,128255.3,1428.89
45041,129483.1,1426.89
45941,130700.0,1424.64
46841,131761.0,1417.48
47441,132966.9,1428.76
48341,134162.6,1425.98
49241,135376.8,1423.90
49841,136284.9,1425.48
50741,137501.1,1423.56
51641,138563.7,1417.44
52541,139821.5,1416.94
56741,146087.7,1424.65
57641,147372.5,1424.71
57941,147647.8,1421.09
58241,148216.2,1424.45
59142,149432.3,1422.92
60042,150648.0,1421.45
60942,151868.2,1420.13
61842,153083.4,1418.76
62742,154297.3,1417.41
63342,155200.1,1418.51
63942,156074.3,1418.99
64842,157339.1,1418.74
65742,158554.6,1417.50
66642,159705.5,1415.04
67542,160882.0,1413.16
68442,162107.4,1412.27
69042,163302.7,1418.82
69642,164119.0,1418.17
70242,165031.2,1419.30
70542,165084.5,1412.47
71443,166304.2,1411.54
72043,167520.4,1418.12
72343,167580.1,1411.64
72643,168337.8,1417.53
73543,169338.6,1412.76
74143,170222.5,1413.38
75043,171445.4,1412.55
75943,172645.3,1411.35
76543,173862.0,1417.46
76843,173893.6,1410.99
77443,174983.7,1414.95
78343,176198.2,1414.00
78643,176710.1,1415.40
78943,176786.2,1409.87
79843,177998.2,1408.98
[XS]riptide
2008-06-03 06:24:01
Nice result.  Well done.  Seems I might have to get my hands dirty again and rebench!  Just goes to show the power of 45nm Intel.
[XS]riptide
2008-06-03 06:27:07
PS: Your rate seems to be declining all the time.  Does your PC have OS issues?  Maybe some services or something cutting in after a while and hogging CPU power?  The best way to bench DPAD is to set up 4 DPAD folders and assign one instance to each core.  Stephen's new version never addressed the non-100% use of multiple cores when multithreading.  It can still drop to mid-90% CPU usage instead of full and total 100% usage all the time.
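
Because the "Estimate kpts/sec" column looks like an average taken since the first row, it is slow to show a mid-run slowdown. One way to check whether the rate really is declining is to compute the rate between consecutive rows instead. A sketch assuming the CSV layout posted above; the function name is hypothetical:

```python
import csv
import io


def interval_kpts_per_sec(log_text):
    """Parse a Muon1Bench-style CSV log and return the rate between
    each pair of consecutive rows as (uptime_secs, kpts_per_sec)."""
    reader = csv.reader(io.StringIO(log_text.strip()))
    next(reader)  # skip the header line
    rows = [(int(r[0]), float(r[1])) for r in reader if r]
    rates = []
    for (t_prev, m_prev), (t, m) in zip(rows, rows[1:]):
        rates.append((t, (m - m_prev) * 1000.0 / (t - t_prev)))
    return rates


# First rows of the Q9300 log above.
log = """Uptime (secs),Mpts in file,Estimate kpts/sec
15939,87958.5,0.00
16839,89204.7,1384.60
21039,95327.6,1444.84
25239,101401.4,1445.40
26140,102598.4,1435.21
"""
for uptime, rate in interval_kpts_per_sec(log):
    print(f"{uptime}: {rate:.1f} kpts/sec")
```

Individual intervals are noisy, presumably because Mpts only land in the file when a simulation finishes, so averaging over longer windows may be needed before reading a trend into them.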
Stephen Brooks
2008-06-03 11:00:04
Maybe he has that OpenGL pipes screensaver turned on? 
[DPC]white_panther
2008-06-04 02:24:36
I benched my PC.
It's an Intel Core 2 Quad Q6600 @ 2.40 GHz with 2 GB of 8500 DDR2 memory.
If you need something else let me know.

averaging 874.29 kpts/sec

Uptime (secs),Mpts in file,Estimate kpts/sec
28836,2212908.8,0.00
30046,2213972.0,878.74
31256,2215097.0,904.31
32466,2216204.9,908.10
33676,2217131.8,872.58
34281,2217702.8,880.51
35490,2218814.8,887.55
36700,2219985.2,899.83
38213,2221159.2,879.89
39119,2222122.7,896.07
40327,2223146.7,890.95
41838,2224285.4,875.00
42745,2225044.3,872.48
49399,2230955.9,877.65
50306,2231791.1,879.46
51513,2232970.2,884.66
52725,2233984.9,882.24
53936,2235018.1,880.85
54844,2235715.1,876.91
56054,2236900.2,881.44
57265,2237888.5,878.68
58778,2239057.9,873.31
59989,2240254.4,877.78
61502,2241479.4,874.62
63016,2242674.1,870.85
64226,2243801.8,872.93
65437,2244933.8,874.98
66648,2246052.1,876.54
68161,2247167.0,871.15
69372,2248309.5,873.32
70583,2249486.1,876.17
72096,2250665.3,872.79
73307,2251836.7,875.36
74820,2253033.7,872.59
76030,2254115.9,873.14
77241,2255170.5,873.09
78754,2256351.9,870.29
79965,2257526.6,872.65
81176,2258628.9,873.53
82386,2259732.8,874.40
83899,2260907.9,871.71
84201,2261041.7,869.37
85412,2262228.9,871.75
86623,2263284.5,871.75
87833,2264468.9,873.94
89347,2265656.9,871.71
90255,2266557.7,873.48
91764,2267761.1,871.67
98686,2273608.5,869.00
99889,2274826.3,871.43
101393,2276009.3,869.67
102596,2277194.2,871.55
103799,2278310.1,872.45
105303,2279494.0,870.77
111921,2285412.6,872.65
112823,2286238.0,873.10
114026,2287436.0,874.84
115530,2288624.0,873.36
116733,2289814.7,874.96
118237,2290994.8,873.44
119139,2291906.3,874.80
[DPC]white_panther
2008-06-04 02:25:40
[edit]
running 1 client
[/edit]
Silverthorne
2008-06-04 04:52:55
It could have just been something running in the background like an automatic defragger or antivirus.  I'm running again tonight with a little more on the FSB and some other tweaks in the BIOS.  I'm running this on the Asus Rampage Formula so there are a lot of tweaks in the BIOS.
[TA]Assimilator1
2008-06-04 18:38:01
[XS]riptide
I take it you didn't read my benchmarks on the previous page then.  The new client is sufficiently better than the old one at multi-threading that running 4 clients is no longer the highest scoring setup; a single client on autothreading is the fastest now, despite ~95% CPU loading (at least up to 4 cores anyway).
Check out my scores on the previous page for hard stats.

Silverthorne
Awesome score!  Whilst your quad is clocked ~7% faster than mine, its score is about 35% faster!!  Holy crap, that's a huge jump!
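
The clock and score percentages in that comparison can be checked with a few lines of arithmetic. A sketch using the figures quoted in this thread (Q9300 @ 3225 averaging 1422 kpts/sec, Q6600 @ 3GHz at 1055 Kpts/s); the helper name is hypothetical:

```python
def percent_faster(new, old):
    """How much larger `new` is than `old`, as a percentage."""
    return (new / old - 1.0) * 100.0


clock_gain = percent_faster(3225, 3000)      # Q9300 @ 3225MHz vs Q6600 @ 3000MHz
score_gain = percent_faster(1422.0, 1055.0)  # kpts/sec, one client on 4 cores each
per_clock_gain = percent_faster(1422.0 / 3225, 1055.0 / 3000)

print(f"clock: +{clock_gain:.1f}%")
print(f"score: +{score_gain:.1f}%")
print(f"score per MHz: +{per_clock_gain:.1f}%")
```

This gives about 7.5% on clock and just under 35% on score, which leaves roughly a 25% gain per MHz. The two machines also differ in FSB and memory, so that per-clock figure should not be attributed to the 45nm core alone.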
[TA]Assimilator1
2008-06-04 18:40:43
Grr, darn lack of editing function

Btw my Celeron was good in its time, which was 9yrs ago!
Incidentally its score reset to zero not long after hitting 1000 Mpts, so I never did get a results.txt.  Anyone know why?