Benchmarking with Mpts thread
Stephen Brooks
2003-05-26 08:11:56
Last night I decided to make a simple program that you can place in your Muon directory and it will estimate the rate of calculation from the growth of results.dat.  Obviously it gets more accurate with time and is best run for a few days.  It also produces a benchcsv.log file which you can graph in Excel or whatever, and if you fit a line through the peaks of the sawtooth you'll probably get a better estimate.  It can be downloaded here:

By default it only samples results.dat every 15 minutes so won't really affect much on your computer.

I'm also interested to see, if anyone has one of the new Pentiums with hyper-threading, whether increasing the threads to 2 makes this go any faster or not.

My P-II-400 here got roughly 16¾ kpts/sec.  This sort of thing will enable me to produce some statistics on the "GHz rate" of calculation, so my PC's architecture gives 16.75/0.4 = 41.9 (kpts/sec)/GHz = 41.9 particle timesteps per million CPU cycles.
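
The conversion above can be written out as a small helper.  This is only a sketch of the arithmetic in this post; the function name `pts_per_million_cycles` is made up for illustration.

```python
def pts_per_million_cycles(kpts_per_sec: float, clock_ghz: float) -> float:
    """Convert a benchmark rate to particle timesteps per million CPU cycles.

    (kpts/sec)/GHz = (1e3 pts/sec) / (1e9 cycles/sec) = pts per 1e6 cycles,
    so the two units are numerically identical.
    """
    if clock_ghz <= 0:
        raise ValueError("clock speed must be positive")
    return kpts_per_sec / clock_ghz


# The P-II-400 figure quoted above: 16.75 kpts/sec at 0.4 GHz.
print(round(pts_per_million_cycles(16.75, 0.4), 1))  # 41.9
```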

The above is for example use only because I've just found my version of Muon was running in full-debug mode, so your values for similar machines should be higher.

Today's weather in %region is Sunny/(null), max.  temperature #NAN°C

[This message was edited by Stephen Brooks on 2003-May-30 at 21:45.]
2003-05-26 09:39:28
Muon1Bench started.  Interval = 900.0 sec
Call 'muon1bench 300' or similar from DOS to change the interval.

Initial values: uptime=5492 Mpts=302692.5

Not sure what this is telling me Confused

However, I just found that this box has been crashed since lunch, for whatever reason Wink

I'd say more, but I can't reach the keyboard from the floor.
Stephen Brooks
2003-05-26 09:56:12
You've got to leave it running for about 12 hours like I did for it to give any estimates, since it has to time the program and see how fast results are added.
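
Why the early output says so little can be seen from a rough sketch of the estimate.  This assumes the figure is simply the Mpts gained in results.dat divided by the time elapsed since the first sample; that is a guess at the idea, not necessarily what muon1bench does internally, and the function name is hypothetical.

```python
from typing import Optional


def estimate_kpts_per_sec(uptime_gain_sec: float, mpts_gain: float) -> Optional[float]:
    """Average rate since the benchmark started, in kpts/sec.

    uptime_gain_sec: seconds elapsed since the first sample.
    mpts_gain: growth of the Mpts total in results.dat over that time.
    Returns None when no time has elapsed, i.e. nothing to estimate yet.
    """
    if uptime_gain_sec <= 0:
        return None
    # 1 Mpts = 1000 kpts
    return mpts_gain * 1000.0 / uptime_gain_sec


# One interval after start-up, before any new result has been written:
print(estimate_kpts_per_sec(900, 0.0))
# Hypothetical long run: 1500 Mpts added over 12 hours.
print(round(estimate_kpts_per_sec(12 * 3600, 1500.0), 2))
```

Results only land in the file when a simulation finishes, so a short run sees either nothing or one big step; the average only settles once many results have been added.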

Zph [Team WAU]
2003-05-26 16:34:28
Nice, I already made that feature in the mIRC addon script, let me see if I can fix up a graph that logs all sim.  times (it now only shows the last one) to plot it out, graphs graphs graphs!!!  Smile
[OCAU] badger
2003-05-26 19:59:41
I had previously calculated that my K6-400 does 59.7 Mpts/hour (16.6 kpts/sec).  Was a fair bit of work, so I will try this out...

Proudly sponsored by GRX-Computers
2003-05-27 01:12:47
i wonder if 12 hours is enough for slow machines ; i've got one working on a queue.txt ... not good for the average Big Grin

Goner - [DPC]TeamNWW
[DPC]TeamNWW - Huub
2003-05-27 01:30:09
I did some calculations by hand, but they should be fairly accurate:
P4-2.4 - 156Kpts/s (65Kpts/s/Ghz)
P4-1.8 - 66.42Kpts/s (36.8Kpts/s/Ghz)
P3-866 - 41.86 Kpts/s (48.33Kpts/s/Ghz)

The P4-1.8 was running without a top results file the last couple of days.. resulting in lots of low percentage results.. looks like that was not very efficient..
2003-05-27 01:44:46
P4-1.8 - 66.42Kpts/s (36.8Kpts/s/Ghz)
P3-866 - 41.86 Kpts/s (48.33Kpts/s/Ghz)

Now I see why both my 866's aren't that bad in relation to those other P4s (1.6/1.8/2.0) Cool

I'd say more, but I can't reach the keyboard from the floor.
2003-05-27 01:48:50
i don't think so, because the Mpts per time should be independent of the yield of the result.  (stephen, please correct me if i'm wrong). 
2003-05-27 01:50:20
i was talking about:

...The P4-1.8 was running without a top results file the last couple of days...

[DPC]TeamNWW - Huub
2003-05-27 01:58:03
Markus.. it is the only reason I can see that accounts for the difference in Kpts/s/GHz.. If what you say is true then my P4-2.4 and the P4-1.8 should give approx. the same amount of Kpts/s for every GHz..

I just put a top250 file on that p4-1.8.. let's see if that makes any difference...
2003-05-27 04:07:26
My AMD Athlon 1 Ghz does about 80 kpts/sec (running on win xp).

Proud member of DPC.
Stephen Brooks
2003-05-27 08:17:45
Originally posted by MaFi:
i don't think so, because the Mpts per time should be independent of the yield of the result.  (stephen, please correct me if i'm wrong). 
Yes, that's right, or at least, that's how it's meant to go.  I'd be a bit suspicious of the middle result of those three (with the low efficiency for the 1.8GHz machine). 
Though actually the early P4s weren't nearly as efficient as the P3s. They've only just caught up with their "clock-rate/performance" proportionality with the new FSB and HT improvements.

I've also figured out that it would be more accurate for the program to fit a trendline through the tops of the peaks of the Mpts-so-far vs. time graph.  Might improve the program to do this automatically at some stage.


[This message was edited by Stephen Brooks on 2003-May-27 at 16:29.]
2003-05-27 08:22:16
@[DPC]TeamNWW - Huub
i don't exactly know, but didn't Intel increase the L2 cache from the Willamette to the Northwood core?  and what about the FSB (100 MHz or 133 MHz?)

edit: BTW: my athlon xp2400+ (15*133=2000MHz, nForce2 chipset) does approx.  230 kpts/sec.  with the unixmuon client.  windows-client(under win2000): approx.  165 kpts/s on the same machine. 
that is 115 resp.  82.5 pts/million clock cycles.

[This message was edited by MaFi on 2003-May-27 at 17:08.]
2003-05-27 09:25:31
What I've noticed, running the console anyway, is that near the end of simulations, when the number of simulated particles drops to only a handful, the timesteps/second does not increase in proportion.  The graphical client seems to do much better with this, and I don't know about the background (because there's no way to monitor the progress).  That could explain the slower simulations for the lower-yield machine; since more simulations are being done, a higher proportion of the time is being spent on the few particles at the end, so in the long run the simulations become less efficient.  Stephen, is there a procedure in the program that is being done between the timesteps or something, so the time taken on it is independent of the number of particles?  That would limit the speed increasing proportionally with a decreasing number of particles.
Who knows, maybe I'm just talking nonsense.

EDIT: Is the source available for download anywhere?  I'd be interested in going through it, even if only to become more familiar with how it works.

[This message was edited by Joey2034 on 2003-May-27 at 22:16.]
John Kitchen
2003-05-27 11:29:17
Zero Mpts???

I am running the benchmark program on 2 machines.  Initially on my dual Athlon MP2100+ and just started on a 1.2GHz Pentium.

In both cases, I saw the "Initial Mpts" was 0.0, and on the dual, which has run for many hours, the Mpts is still zero, but results.dat is growing (quite rapidly of course).  The viewresults.exe program works fine and shows the new workunits in both results.dat and results.txt.

The second machine has only just started, so is inconclusive.  ***

Any ideas as to why?  Thanks, John

*** PS The same thing is happening on the second machine:-
Muon1Bench started.  Interval = 900.0 sec
Call 'muon1bench 300' or similar from DOS to change the interval.

Initial values: uptime=24204 Mpts=0.0

uptime+900 Mpts+0.0 Estimate 0.00 kpts/sec
uptime+1800 Mpts+0.0 Estimate 0.00 kpts/sec

[This message was edited by John Kitchen on 2003-May-27 at 20:00.]

[This message was edited by John Kitchen on 2003-May-27 at 20:01.]
John Kitchen
2003-05-27 11:33:26
I can't as yet use the benchmark program to confirm real statistics, but I do see that on my dual machine with threads=2, the CPU consumption is low when the particle count drops (either due to poor design or in the final stages of simulation when most of the particles have gone to that great orbit in the sky).

I would guess that on multi-CPU machines the particles per GHz ratio would vary with yield.
2003-05-27 12:07:44
did you put muon1bench.exe in the same directory as your muon program files ?
John Kitchen
2003-05-27 12:35:38
Originally posted by ZeRoC00L:
did you put muon1bench.exe in the same directory as your muon program files ?

Sure did.
[OCAU] badger
2003-05-27 17:40:14
to answer those who are wondering about whether the yield makes a difference to the mpts per unit time:
when I did my benchmarking of my K6-400, I ran it with a top 250 results file (best result 8.9%) and then later starting with no results.dat (best result 0.86%).  both gave 59ish Mpts per hour.

Proudly sponsored by GRX-Computers
2003-05-27 19:18:07
Stephen, I just wanted to report a strange occurrence with the benchmark utility.
Here is a section of my log file, some of them are negative values:
Uptime (secs),Mpts in file,Estimate kpts/sec
Uptime (secs),Mpts in file,Estimate kpts/sec

'Normal' values are 190-200 kpts/sec.  The lower values from 29-99 were logged while playing Unreal Tournament and would be expected.  After closing the game, the values dropped to the negative readings, with the computer idle and all apps closed.  During the negative readings, there was a long re-check of a high result and a few junk results which appear to be valid results, no problem there.  The normal readings at the bottom were after re-starting the benchmark utility.  Not sure if this is a bug or just an odd quirk.
Stephen Brooks
2003-05-29 13:32:33
OK, this new version only considers data points when the Mpts in the file has "just" risen.  I.e. on the sawtoothy Mpts vs. time graph, it takes fit lines from the tops of the teeth.
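
A rough sketch of that idea, for anyone who wants to try it on their own benchcsv.log numbers.  It keeps only the samples where Mpts is higher than at the previous sample and fits an ordinary least-squares line through them; the function names and the made-up staircase data are illustrative assumptions, not the actual muon1bench code.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (uptime in seconds, Mpts in file)


def rising_samples(samples: List[Sample]) -> List[Sample]:
    """Keep only samples where Mpts is higher than at the previous sample."""
    return [cur for prev, cur in zip(samples, samples[1:]) if cur[1] > prev[1]]


def fit_rate_kpts_per_sec(samples: List[Sample]) -> float:
    """Least-squares slope through the 'just risen' samples, in kpts/sec."""
    pts = rising_samples(samples)
    if len(pts) < 2:
        raise ValueError("need at least two rising samples to fit a line")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_m = sum(m for _, m in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (m - mean_m) for t, m in pts)
    return 1000.0 * sxy / sxx  # Mpts/sec -> kpts/sec


# Made-up staircase: a 30 Mpts result lands every 200 s, sampled every 100 s.
log = [(t, 30.0 * (t // 200)) for t in range(0, 1101, 100)]
print(fit_rate_kpts_per_sec(log))  # 150.0
```

A plain end-to-end average of the same made-up log gives about 136 kpts/sec, because the last sample falls just before the next result would land; fitting through the tops of the teeth avoids that bias.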

The irregularities you're seeing with the benchmark program are probably due to the fact it uses a cut-down results-reader function rather than the full one.  This expects all your linefeeds to be _correct_, so for instance perhaps John Kitchen's 0 Mpts problem is caused by having a carriage return right at the beginning of his file, putting everything beyond that out-of-whack.

One quick way of making sure all your CRLFs are in the right place is to load results.dat into WordPad and get it so that a PageDown operation advances the view by exactly an even number of lines.  Then if you whizz down the file holding PageDown, you'll see a jump if you go past an incorrect linefeed.
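
If WordPad is too fiddly, the same check can be scripted.  The sketch below assumes results.dat is meant to use CR+LF line endings throughout (which is my reading of the above) and simply reports the byte offset of any CR or LF that is not part of a pair; the function name is made up.

```python
from typing import List


def find_bad_line_endings(data: bytes) -> List[int]:
    """Return byte offsets of any CR or LF that is not part of a CR+LF pair."""
    bad = []
    for i, byte in enumerate(data):
        if byte == 0x0D:  # CR must be followed by LF
            if i + 1 >= len(data) or data[i + 1] != 0x0A:
                bad.append(i)
        elif byte == 0x0A:  # LF must be preceded by CR
            if i == 0 or data[i - 1] != 0x0D:
                bad.append(i)
    return bad


# Usage on a real file (run from your Muon directory):
# with open("results.dat", "rb") as f:
#     print(find_bad_line_endings(f.read()))

sample = b"line one\r\nline two\nline three\r\n\rline four\r\n"
print(find_bad_line_endings(sample))  # [18, 31]
```

An empty list means every line ending is a proper CR+LF; an offset of 0 would correspond to a stray carriage return right at the start of the file.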

Of course the other thing I could do is put the full version of the result-reader into this program.

As for some designs going "proportionately slower", there is (inevitably) a small overhead per loop, but from looking at the background client, it adds up to maybe 10 seconds per simulation (at a guess).  That is, when there are only 1 or 2 particles left, the ns count increases very fast indeed.  However, with 2 threads, there is the issue of waiting for them all to finish before continuing.  Since the threads are split apart and rejoined on every timestep, this introduces a kernel-dependent fixed overhead per loop too.  So for multithread users this overhead might be a bit higher, though I'd guess not huge.
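
To put rough numbers on the fixed-overhead point, here is a toy model in which each timestep costs a fixed amount plus a per-particle amount.  The cost figures are invented purely for illustration and are not measurements of Muon1.

```python
def timesteps_per_sec(n_particles: int, per_particle_sec: float, overhead_sec: float) -> float:
    """Timesteps completed per second when each step costs a fixed
    overhead plus a per-particle cost."""
    return 1.0 / (overhead_sec + n_particles * per_particle_sec)


def particle_timesteps_per_sec(n_particles: int, per_particle_sec: float, overhead_sec: float) -> float:
    """Useful work rate in pts/sec: particles advanced per second."""
    return n_particles * timesteps_per_sec(n_particles, per_particle_sec, overhead_sec)


# Hypothetical costs: 10 microseconds per particle, 20 microseconds fixed per loop.
for n in (1000, 100, 10, 1):
    print(n, round(particle_timesteps_per_sec(n, 10e-6, 20e-6)))
```

In this model the pts/sec figure barely changes while there are many particles and only falls off once a handful remain, which is when the fixed cost per loop (larger with thread split and rejoin) starts to dominate.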

2003-06-14 06:15:45
Dual Athlon MP2200- 1.8 GHz - 240 kpts/sec (66.7 kpts/sec/GHz per CPU)
Win2k Server

ARS Team Atomic Milkshake
Unofficial Muon1 FAQ
John Kitchen
2003-06-18 06:40:46
Dual Athlon MP2100- 1.735 GHz - I am seeing various yield-dependent rates

Threads=1 with CPU affinity
High Yield (13+%) = 125 kpts/sec
Low yield (6%) 115 kpts/sec

High Yield (13+%) 250 kpts/sec (and typically I see 92% approx CPU util)
Low yield (4-5%) 205 kpts/sec (lower CPU util about 82%)

I calculated the CPU utilizations from the long term (>24 hours) data in Windows Task Manager using the formula: "CPU Time" / Elapsed Time / number of processors (2)
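
That formula written out as a small helper, in case anyone wants to repeat the calculation.  The function name and the example reading are hypothetical; only the formula itself comes from the post above.

```python
def cpu_utilization(cpu_time_sec: float, elapsed_sec: float, n_processors: int) -> float:
    """Fraction of total CPU capacity used by a process:
    "CPU Time" / elapsed time / number of processors."""
    if elapsed_sec <= 0 or n_processors <= 0:
        raise ValueError("elapsed time and processor count must be positive")
    return cpu_time_sec / elapsed_sec / n_processors


# Hypothetical Task Manager reading: 44:09:36 of CPU time after 24 hours
# on a dual-processor machine.
cpu_time = 44 * 3600 + 9 * 60 + 36
print(round(cpu_utilization(cpu_time, 24 * 3600, 2), 2))  # 0.92
```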
2003-06-20 21:29:31
Could anyone suggest a method to benchmark the Linux client?  Although i'm thinking that because it's based on v4.3 it may not be accurate to directly compare it to v4.3c Windows clients.
[OCAU] Noodles
2003-06-20 22:22:44
The Muonbenchmark program doesn't work for me... It just says "No change in file size" when I ran it for 4+ hours (yet the results file went from 0kb to 53kb during the benchmark thingo). 

ps- I'm running the muonbenchmark from my Muon directory.
Mike Malis
2003-06-20 22:40:11
uptime+111921 Mpts+12288.9 Estimate 110.09 Kpts/ sec running on Athlon 2000+ @1.67 Ghz therefore approximately 65.9 Kpts/Ghz-sec

P.S. The above thread is new also.  It was posted a couple minutes before this one.  Smile

Mike Malis
2003-07-25 10:28:07
Results from me

Old P2-333 averaging about 14 kpts/sec (128 MB RAM, WinME)
this dual P3-550 averaging about 32 kpts/sec (256 MB RAM, Win2k)
2003-07-27 08:24:55
I should point out that the dual P3 machine's score is low because I'm a computer hater, and this machine is being punished.

No, really it's because Solidworks grabs so much stuff (as does bittorrent and Nero) that muon1 must get what it can, which often isn't much.  The P2 is left alone though, since it's my wife's machine, despite the fact that when you sit at one, you sit at both.

Liverdyne Robotics

[OCAU] badger
2004-04-18 16:28:36
I just tried downloading this again to see if running CPU/MEM FSB at 185/185 or 200/166 is better for muon production.

Anyway, I get an error: "muonbench.exe is not a valid Win32 application"

any ideas
Stephen Brooks
2004-04-20 13:58:56
That might be because it got corrupted in an upload I did last month, which didn't seem to happen properly... I'll try to upload it again tomorrow, see if it helps.

[edit] Done that.  Does this work now?
2004-04-21 06:16:20
Yes, it works.  Thanks Stephen.
[OCAU] badger
2004-04-21 18:30:01
thanks stephen, works fine now, I'll let you all know the result...
[OCAU] badger
2004-05-03 20:13:55
just in case you were hanging out for the result, here it is:

running cpu/mem at 185/185 gives 229 kpts/s, 200/166 gives 253 kpts/s.  no real surprises there.  I get a higher 3dmark with 185/185 though...
the CPU is a XP 2400+ Barton (multi = 11) so I have it running at 11x202 = 2235 Mhz in the nice cold weather we have been getting here in southern Australia.

BTW I have DDR 400 ram, I think that there is a mobo problem which won't allow the ram to run at 200 if cpu is much higher than stock (166)
(mobo is asus a7n8x-deluxe rev 1.3)
Stephen Brooks
2004-05-04 06:37:17
Interesting to see the scaling with CPU speed, Badger.  Obviously Muon1 is not stressing the RAM too much.
[OCAU] badger
2004-05-05 22:41:28
yes quite interesting.

I realised yesterday that my Nforce2 board allows me to change the multiplier.

after a bit of a play I tried 12.5x179 (2237Mhz) (and could thus run the ram at 179 too) and got it up to 259kpts/s
3dmark is higher too
2004-05-06 13:23:37
here's mine so far:

uptime+33300 Mpts+6043.0 Estimate 181.47 kpts/sec

Athlon XP 2000+ (1.67 Ghz)
512mb PC2100 RAM
2004-05-09 08:56:29
uptime+141901 Mpts+18460.2 Estimate 130.09 kpts/sec

P4A 2,3 MHz
512MB pc2100
Asus P4B533
WinXP Prof SP1

Quite slow, isn't it?
Stephen Brooks
2004-05-10 01:04:17
2.3 MHz _is_ kind of slow, yes.  Razz
2004-05-13 21:26:00
130.09 kpts/sec for a 2.3 MHz damn...
And I'm currently running a 1GHz box for a measly 53 kpts/sec.
[OCAU] badger
2004-05-14 00:32:03
Originally posted by Stephen Brooks:
2.3 MHz _is_ kind of slow, yes.  Razz

for a p4... my 2500+ is running at only 2255MHz and gets >250kpt/s

go AMD
2004-08-20 18:22:22
How long should I run muon1bench on my version 4.4x before I have enough data to post here and have you guys tell me if I'm doing well?  And what do I post?  The contents of benchcsv.log or is there a way to copy and paste what's in the DOS box?
2004-08-21 07:28:46
What can this info tell me?
I let it run overnight.  12 hours or so.  I started it about 10 hours after muon was started.

Muon1Bench started.  Interval = 300.0 sec
Call 'muon1bench 900' or similar from DOS to change th

uptime+0 Mpts+0.0 No estimate so far
uptime+300 Mpts+15.1 Estimate 50.31 kpts/sec
uptime+1200 Mpts+457.8 Estimate 381.36 kpts/sec
uptime+2100 Mpts+139903.4 Estimate 66596.51 kpts/sec
uptime+2400 Mpts+139924.8 Estimate 58280.34 kpts/sec
uptime+3601 Mpts+140372.6 Estimate 38977.27 kpts/sec
uptime+4801 Mpts+140848.7 Estimate 29332.05 kpts/sec
uptime+5702 Mpts+141282.3 Estimate 24774.86 kpts/sec
uptime+6002 Mpts+141302.3 Estimate 23539.50 kpts/sec
uptime+7203 Mpts+141740.8 Estimate 19677.38 kpts/sec
uptime+8103 Mpts+142135.7 Estimate 17539.78 kpts/sec
uptime+9304 Mpts+142584.0 Estimate 15324.80 kpts/sec
uptime+10504 Mpts+143020.6 Estimate 13614.98 kpts/se
uptime+11705 Mpts+143441.3 Estimate 12254.52 kpts/se
uptime+12305 Mpts+143676.1 Estimate 11675.81 kpts/se
uptime+13505 Mpts+144114.7 Estimate 10670.45 kpts/se
uptime+14706 Mpts+144536.2 Estimate 9828.06 kpts/sec
uptime+15306 Mpts+144820.1 Estimate 9461.20 kpts/sec
uptime+16207 Mpts+145104.1 Estimate 8953.10 kpts/sec
uptime+17407 Mpts+145584.0 Estimate 8363.22 kpts/sec
uptime+17707 Mpts+145634.9 Estimate 8224.34 kpts/sec
uptime+18308 Mpts+145915.6 Estimate 7970.02 kpts/sec
uptime+19508 Mpts+146314.5 Estimate 7500.00 kpts/sec
uptime+20709 Mpts+146753.6 Estimate 7086.42 kpts/sec
uptime+21009 Mpts+146810.9 Estimate 6987.92 kpts/sec
uptime+22209 Mpts+147248.9 Estimate 6629.91 kpts/sec
uptime+22810 Mpts+147531.2 Estimate 6467.82 kpts/sec
uptime+23710 Mpts+147808.5 Estimate 6233.90 kpts/sec
uptime+24310 Mpts+148101.6 Estimate 6092.03 kpts/sec
uptime+25511 Mpts+148568.1 Estimate 5823.63 kpts/sec
uptime+26711 Mpts+148997.9 Estimate 5577.99 kpts/sec
uptime+27912 Mpts+149445.0 Estimate 5354.10 kpts/sec
uptime+28512 Mpts+149728.0 Estimate 5251.31 kpts/sec
uptime+29713 Mpts+150166.5 Estimate 5053.90 kpts/sec
uptime+30313 Mpts+150449.0 Estimate 4963.14 kpts/sec
uptime+31213 Mpts+150789.2 Estimate 4830.88 kpts/sec
uptime+31813 Mpts+151072.9 Estimate 4748.65 kpts/sec
uptime+33014 Mpts+151510.0 Estimate 4589.21 kpts/sec
uptime+33914 Mpts+151941.3 Estimate 4480.09 kpts/sec
uptime+34214 Mpts+151956.6 Estimate 4441.24 kpts/sec
uptime+34815 Mpts+152290.8 Estimate 4374.27 kpts/sec
uptime+36015 Mpts+152703.0 Estimate 4239.91 kpts/sec
uptime+36615 Mpts+152972.5 Estimate 4177.76 kpts/sec
uptime+42018 Mpts+155158.6 Estimate 3692.66 kpts/sec
uptime+43218 Mpts+155603.9 Estimate 3600.39 kpts/sec
uptime+43818 Mpts+155887.2 Estimate 3557.54 kpts/sec
uptime+45019 Mpts+156324.5 Estimate 3472.38 kpts/sec
uptime+46219 Mpts+290940.7 Estimate 6294.71 kpts/sec
uptime+47120 Mpts+291379.6 Estimate 6183.74 kpts/sec
uptime+48320 Mpts+291815.9 Estimate 6039.13 kpts/sec
uptime+49521 Mpts+292295.9 Estimate 5902.41 kpts/sec
uptime+50121 Mpts+292499.2 Estimate 5835.78 kpts/sec
uptime+51022 Mpts+292924.4 Estimate 5741.12 kpts/sec
Stephen Brooks
2004-08-23 01:49:18
You've not switched auto-download of sample files off!  That's why your benchmark above has the big jumps in it.  However I can subtract those off.
2004-08-24 05:46:10
Is that the way I want to normally run muon or is that just the way it should be run for a bench?  I turned off the sample file thing.  It was set for 3.

Is all the benching so far useless?  I can't learn anything from that?
Stephen Brooks
2004-08-24 09:01:53
The above data is fine as long as I miss out the bits where it jumped and average over everything else.
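
One way to do that subtraction, sketched under the assumption that a download shows up as an interval with an implausibly high rate.  The 5000 kpts/sec cutoff and the function name are arbitrary choices for illustration, not how the figure was actually worked out.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (uptime gain in seconds, Mpts gain)


def rate_excluding_jumps(samples: List[Sample], max_kpts_per_sec: float) -> float:
    """Average kpts/sec over all intervals whose own rate is below the cutoff."""
    kept_time = 0.0
    kept_mpts = 0.0
    for (t0, m0), (t1, m1) in zip(samples, samples[1:]):
        dt, dm = t1 - t0, m1 - m0
        if dt <= 0:
            continue  # ignore repeated or out-of-order samples
        if dm * 1000.0 / dt > max_kpts_per_sec:
            continue  # implausibly fast: treat as a sample-file download
        kept_time += dt
        kept_mpts += dm
    if kept_time == 0:
        raise ValueError("no usable intervals")
    return kept_mpts * 1000.0 / kept_time


# First few lines of the log posted above.
log = [(0, 0.0), (300, 15.1), (1200, 457.8), (2100, 139903.4),
       (2400, 139924.8), (3601, 140372.6)]
print(round(rate_excluding_jumps(log, 5000.0), 1))
```

These few lines cover well under an hour, so the number they give is not a meaningful benchmark on its own; the point is only that the jump between uptime+1200 and uptime+2100 is left out of the average.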

You should probably only switch off the sample files download for benchmarking purposes, unless you want to run muon in an isolated breed.
2004-08-24 17:04:38
OK.  I turned off the samples download and restarted both Muon and the muonbench.  My benchcsv.log looks very different now Eek
What can I learn from this?

Uptime (secs),Mpts in file,Estimate kpts/sec
Stephen Brooks
2004-08-26 06:44:03
That's very good.  Your rate is evidently about 404kpts/sec.
2004-08-26 16:41:34
Thanks for the response.  I just wanted to make sure I had everything set up properly.  I might throw in the rest of the machines in the house this weekend :dunno:
Stephen Brooks
2004-09-17 01:16:22
spldart, can you tell me what processor you were using to get the above 404 kpts/s?