I've been running DPAD for over a year, until a few days ago on a slow machine, but last week I bought a Core 2 Duo E6600.
So far the performance has been less than satisfying: about 35k Mpts per day total (client version 4.4.3d).
A friend of mine running the same client on an AMD X2 4600+ makes about 80-90k Mpts per day.
Another friend, also running the same client on an overclocked Opteron 175, does a maximum of 135k Mpts per day.
I thought the new Intel beasts were slaying the AMDs, but it appears that this is not the case with DPAD.
Any suggestions to get more performance on DPAD?
Intel E6600, 2 x 2.4 GHz, 4 MB L2 cache
2 x 512 MB (1 GB) GeIL PC6400 at 4-4-4-12 timings
Windows 2000 Server operating system
I have reinstalled DPAD, and now CPU utilization is about 95-97% as a simulation starts, dropping off towards the end of a simulation to about 80%. This happens when the particle count goes down to about 500-1000. This is with the command-line version.
As the install is fresh, I have no idea what my scores with the new install are. I suspect they will be about the same.
|BaDu, my dual-core Opteron on WinXP does the same thing. I pretty much ignore it now.|
If someone has found a good fix, perhaps they'll post it.
|An AMD X2 4400+ does 40k Mpts a day when untouched (480 kpts/sec × 60 × 60 × 24 ≈ 41.5k Mpts theoretically), from my own experience (mine's overclocked to 2 x 2.42 GHz). I doubt 80-90k on a 4600+ is possible, but if it is possible I would want to know how to crank mine up. I believe Intel's architecture is the reason DPAD isn't that good on these machines. Can't say for sure though, since I no longer own an Intel.|
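The conversion in the parentheses above is just units arithmetic (a sustained kpts/sec rate over one day, then kpts converted to Mpts); a quick Python check, using the 480 kpts/sec figure quoted in the post:

```python
# Convert a sustained simulation rate in kpts/sec to Mpts/day.
# 1 Mpts = 1000 kpts; one day = 24 * 60 * 60 = 86400 seconds.
def kpts_per_sec_to_mpts_per_day(rate_kpts_per_sec):
    return rate_kpts_per_sec * 86400 / 1000

# An X2 4400+ at ~480 kpts/sec gives roughly 41.5k Mpts/day,
# consistent with the ~40k observed figure above.
print(kpts_per_sec_to_mpts_per_day(480))  # -> 41472.0
```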
This is the only Extreme benchmark I found around: Intel Core 2 Duo X6800 2.93 GHz - v4.43d - 495 kpts/sec
|Xanathorn: I saw the same benchmark, 495 kpts/sec. My E6600 does about 400 kpts/sec over the last 24 hours.|
JonB: my friend had an overclocked Opteron, but I doubt it makes such a difference; I'll ask some more about his rig.
I have no doubt that both reported scores (Opteron and X2) are real (I'm in the same DC team).
Xanathorn should know us: Kuuke, SSSF and BaDu (and others).
Season's Greetings from team DeApen (temporary name).
|This sounds a bit strange... I hope the client is recording the right number of Mpts on those machines - it might be a good idea to compare the "average Mpts per simulation" as well as Mpts rate, just to make sure nothing fishy is going on.|
Intel's architecture *ought* to do pretty well. At worst, I'd hope for something similar to AMD at the same clock rate (dual core vs dual core).
|We are currently actively investigating the "problem". I will soon post results here (within a few days).|
|I notice that my HT Intel chip doesn't like using all the CPU, especially near the end of a run. I actually run 2 instances, one with 2 threads and the other a single, to mop up all the CPU cycles.|
I've also noticed a significant decrease in output under the following conditions: results.dat is very large; results.txt is very large; autosend is enabled in HTTP mode; and auto-updates are on but nothing is available on the website (which seems to always be the case for me).
|note: this thread: http://www.stephenbrooks.org/forum/?thread=1136&bork=acvnmbjzqn has a bunch of discussion on getting the most out of multi cpu and HT machines.|
|OUCH! Most of my results.dat files are 200 to 300 MB in size and growing day by day.
I had asked Stephen in an e-mail whether I could remove or archive them,
but never got a reply to it.
That would free up 13 GB on my pharm.
Is it OK to delete them, or is it best to just replace the entire program folder?
|Ah! Most of that e-mail was about the stats, so I went to sort that out and didn't see the sentence at the end about the 200 MB results.dat.
I have on the features list for the next version of Muon1 an option to automatically shorten the local results.dat when it exceeds a certain size, either by a random filter or keeping only the best percentage. Perhaps I should try and do a small release that just adds that feature.
The thing is, for small sizes (up to a few tens of MB), the more results the better; but once the file gets so large that it can't all be loaded into RAM at once, it will cause thrashing and a big slowdown.
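Either trimming strategy mentioned above (a random filter, or keeping only the best percentage) is simple to sketch. Purely for illustration, this assumes each result is a single record string containing an `Mpts=` field; the actual results.dat layout may be more complex, so treat this as a sketch rather than a drop-in tool:

```python
import random
import re

def trim_results(records, keep_fraction=0.5, best=True):
    """Keep either the best `keep_fraction` of records by Mpts,
    or a uniform random sample of the same size."""
    if best:
        def score(rec):
            # Assumed record format: an "Mpts=<number>" field somewhere in the record.
            m = re.search(r"Mpts=([0-9.]+)", rec)
            return float(m.group(1)) if m else 0.0
        kept = sorted(records, key=score, reverse=True)
    else:
        kept = random.sample(records, len(records))  # shuffled copy
    return kept[:max(1, int(len(records) * keep_fraction))]

# Hypothetical example records:
recs = ["a;Mpts=1.0", "b;Mpts=9.0", "c;Mpts=5.0", "d;Mpts=3.0"]
print(trim_results(recs, 0.5))  # -> ['b;Mpts=9.0', 'c;Mpts=5.0']
```

With `best=False` it takes a random sample instead, which preserves the overall score distribution rather than biasing the file toward the top results.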
|Is this why I have been seeing DPAD error out and crash more often? I also see it eating up all 512 MB of the RAM installed, or more.
Is it because it is loading a 200 MB .dat file right into memory?
I looked at the computers that have been crashing DPAD, and they all have results.dat files close to or over 400 MB. OUCH!
I archived the old DPAD folder and replaced it with a fresh one. MUCH, MUCH FASTER.
Is there no way to make the .dat files work from the swap file, or compress them?
If not, only a PC with 2 or more GB of RAM will be able to get past this point without crashing.
|Any new info on the low performance of DPAD on C2Ds?|
>>>>This sounds a bit strange... I hope the client is recording the right number of Mpts on those machines - it might be a good idea to compare the "average Mpts per simulation" as well as Mpts rate, just to make sure nothing fishy is going on.<<<<<
Was that done? If not, how do I do it?
|Damn the lack of an editing function!!|
Anyway, forgot to mention I have a C2D now.
|I'm getting the impression Muon1 responds to RAM latency. Core 2 may be good at a lot of things, but it's still on the old slow Intel FSB. AMD's memory latency is lower.|
|Hmm, maybe, but that doesn't explain why C2D is slower than Athlon XPs (clock for clock on a single thread). I can't help feeling there's something else going on here.|
How do I do that comparison you asked about?
|You want to know how to do a "comparison", by which I suppose you mean when I said:
"compare the 'average Mpts per simulation' as well as Mpts rate, just to make sure nothing fishy is going on." Well, you look at the results.dat file on each of the two computers and average the Mpts values over the many results that you made on that machine.
However this measure is pretty sensitive to whether you use samplefiles or not and which lattices you've decided to run. It's probably a good idea to divide all results with "#runs=5" in their first line by 5, since the Mpts for those is actually the sum of 5 simulations.
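To automate that comparison, here is a minimal sketch. It assumes, for illustration, that you've already pulled out the first line of each result, and that those lines carry fields like `Mpts=<value>` and optionally `#runs=5` — the real results.dat layout may differ, so the parsing here is illustrative only:

```python
import re

def average_mpts_per_simulation(first_lines):
    """Average Mpts per individual simulation. Results marked
    #runs=5 report the *sum* of 5 runs, so divide those by 5."""
    per_sim = []
    for line in first_lines:
        m = re.search(r"Mpts=([0-9.]+)", line)
        if not m:
            continue  # skip lines without an Mpts field
        mpts = float(m.group(1))
        r = re.search(r"#runs=(\d+)", line)
        runs = int(r.group(1)) if r else 1
        per_sim.append(mpts / runs)
    return sum(per_sim) / len(per_sim)

# Hypothetical example lines: one normal result, one 5-run quarantine result.
lines = ["Mpts=2.0;", "#runs=5;Mpts=20.0;"]
print(average_mpts_per_simulation(lines))  # -> 3.0
```

Running this on each machine's file, with the same lattices and samplefile settings on both, gives per-simulation averages you can compare directly.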
|Reading around, I've found that one of the areas where AMD still beats Intel chips is in "HPC" (high performance computing with a lot of floating point and memory bandwidth). So I guess it's not so weird that Muon1 is better on those machines still.|
|I wish your forum had email subscription; I keep losing track of these various threads.|
I've no idea what "#runs=5" means nor what it relates to; is it an Excel formula?
Btw, I no longer have the Ath XPM system; that got retired when I upgraded to C2D last June. However, my 2nd rig is a S754 Sempron @ 2.5 GHz (which was about 10-15% faster than my old XPM), so I could use that. Though to complicate things further, my main rig runs at 3 GHz and not 2.5 GHz; I just did a brief test at 2.5 GHz shortly after I got it. Maybe there are just too many variables now?
|Forgot to say, what you said about HPC performance makes sense if those two performance factors are heavily used by DPAD.|
(could really do with an edit function too)
|runs=5 refers to a result that was "quarantined" as a potential high value. Muon1 will then run the same parameters 4 more times to make sure the first run wasn't a fluke. The end number is the average of the 5 runs. The Mpts is the actual total for all 5 runs.|