Stephen Brooks 2003-05-16 13:15:54 | I plotted the 600`000 or so results for v4.3 so far, with muon score against the raw Mpts needed in the calculation. Here you can see a few things. The most intense, shallow line low down is the run-of-the-mill runs. The one with about 5x the gradient is the rechecked results (when the rechecking has worked). In between you can see fainter lines with 2, 3, 4x the gradient, which I am pretty sure are the result of the "design slips out of queue when autosave is interrupted" bug. The dots way up at the top are most likely John Kitchen and others who have increased their desired number of rechecks. I can see _one_ rogue 17% result that is clearly fake, and an odd dot below the bottom line that by all rights shouldn't be there. As for the rest... nothing from this view screams "Erroneous" at me. It looks like we're genuinely up to 13%-something. But I'll now plot the results I've got through from v4.31c so far to see if the bugfixes have changed the distribution any. Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
Stephen Brooks 2003-05-16 13:40:16 | Here is the one for the ~4`500 of those above results that have been marked as coming from v4.31c. Well it's not perfect, but there seems to be less rubbish flying around, and in particular (so far) no rogue high results. We can see the 1x and 5x lines, plus some high dots which are 10x rechecks. There's an odd sequence where something appears to have got stuck and submitted itself after 6,7,8,9,10 rechecks. Maybe I should look into what happens when the user changes the number of rechecks half-way through a queuing cycle or something of that nature. There are a few points in between the 1x and 5x lines still, which I still have to look into. The next step is to divide the Mpts down by the number of rechecks, and then we _should_ get just one streak. Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
András Horváth 2003-05-16 14:16:30 | Hi Stephen, The graphs are very interesting. Do you have similar graphs for Linux clients only and for Windows clients only? If the strange values are coming from an operating-system-specific source (e.g. the C library), the results for Linux and Windows clients should be different. Is it a stupid idea? (I don't know the programming details of the clients.) András |
Stephen Brooks 2003-05-16 14:58:43 | The Windows and Linux clients do not identify themselves in any way when they send their results to the FTP server. I could do something like encode the source of client in the 5 random letters/numbers in the filename, or something, but I can't do that retrospectively for results I've already got. What I can do is find out the username who generated some of the 'rogue' points. Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
John Kitchen 2003-05-16 15:13:04 | I will own up to some of the "rogue" points. Since I have been seeing some very high (>13%) yields, I decided to set the recalculation parameter to 10x instead of 5x with the intention of ensuring a more accurate and repeatable result. Some of those points out in the top right corner are mine, (here is one - 13.391895 (2687.5 Mpts) [v4.31c] ) and since they are the result of 10x recalc, they should be statistically significant. At this moment I think I own the highest verifiable yield. But for how many more hours? |
András Horváth 2003-05-16 15:31:28 | Hi, All fizika.sze.hu (formerly horvatha) results are generated by Linux clients. (You could use this information to generate platform-specific graphs.) Does the recalculation protect against all kinds of errors? I think it does not. The platform-specific graphs could be a good test of whether some operating-system- or C-library-specific bug is in the background. (I think...) It is only one type of error source, but it may be worth checking. András |
Stephen Brooks 2003-05-18 14:44:48 | OK, here's a graph that's a little less confusing. The Mpts score is now divided down so that it's the average Mpts used for each run of a particular result (if there were more than one run). So those multiple-line things have been "folded down" into one. (Note ~17`000 points this time due to the v4.31c submissions over the weekend while I was out). I've also labelled the users corresponding to "rogue" points where the amount of calculation does not appear to key in properly with the yield percentage. John Kitchen seems to have managed to produce a few results after much rechecking, but with a really low yield, and perhaps somehow hasn't included the #runs= data to tell me how many rechecks that corresponds to (either that or he nulled out the score so it didn't affect other people's stats too much). The only other rogues are a triplet from [DPC] plus Diablos.de. Dunno what sort of client they were using... perhaps they could suggest anything nonstandard about their results. Also I'm trying to remember if the version-number tag itself is checksummed, because it's possible that label could have been altered. Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
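The "folding down" described in this post (dividing the total Mpts by the number of runs) can be sketched roughly as below. This is only an illustration, not the actual graph program: it assumes a result has a design line that may carry a `#runs=NNN` tag and a score line of the form quoted earlier in the thread, e.g. `13.391895 (2687.5 Mpts) [v4.31c]`. The function names, and the choice to treat a missing tag as a single run, are assumptions.

```python
import re

# Score line as quoted in this thread, e.g. "13.391895 (2687.5 Mpts) [v4.31c]".
SCORE_RE = re.compile(r"(\d+\.\d+)\s+\((\d+(?:\.\d+)?)\s+Mpts\)\s+\[(v[^\]]+)\]")
# Recheck-count tag; the exact "#runs=NNN" layout is an assumption.
RUNS_RE = re.compile(r"#runs=(\d+)")


def parse_score_line(line):
    """Return (yield_percent, total_mpts, version) from a score line."""
    match = SCORE_RE.search(line)
    if match is None:
        raise ValueError("unrecognised score line: %r" % line)
    return float(match.group(1)), float(match.group(2)), match.group(3)


def mpts_per_run(design_line, score_line):
    """Average Mpts per run; a missing #runs tag is treated as one run."""
    _, total_mpts, _ = parse_score_line(score_line)
    runs_match = RUNS_RE.search(design_line)
    runs = int(runs_match.group(1)) if runs_match else 1
    return total_mpts / max(runs, 1)
```

Under these assumptions, a heavily rechecked result whose `#runs` tag is missing keeps its full Mpts value and therefore lands well off the main streak, which is one of the explanations offered in the post.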
John Kitchen 2003-05-18 15:55:10 | quote: Beats me! Could it be the result of changing the "Rechecks for best-so-far results" parameter in mid-re-calculation? Stephen, is it possible to report on just those results which had at least 5 calculations? I think if you eliminate the single calculation results from the chart, we will see a much tighter pattern. John |
Stephen Brooks 2003-05-18 16:47:08 | I've now subtly changed the above image. If you look at the clusters you'll find there are "bright" white points, which are the ones that have been rechecked 5 or more times. As you'd expect, the clusters are smaller (by about a factor of sqrt(5)) and they are towards the right of the larger 1-sim clusters because only the best ones are ever rechecked. You'll probably also notice that the shape of the entire 1-sim cluster is not elliptical any more (like it was in previous versions); rather, it's pointed at the high-yield end, because the strays in the high direction have all been rechecked to nearer the centre of the thing. If you can't see the white dots very clearly, try loading the image into an art package and decreasing the "gamma". Then maybe convert to black and white and decrease it again. Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
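The sqrt(5) shrinkage mentioned here is the usual behaviour of an average of independent runs: the scatter of a mean of n runs is about 1/sqrt(n) of the single-run scatter. A small simulation sketch, assuming independent Gaussian noise on each run (the noise level `sigma`, the sample size and the function name are made up for illustration and are not taken from the client):

```python
import math
import random
import statistics


def spread_ratio(n_rechecks=5, n_designs=20000, sigma=0.1, seed=0):
    """Ratio of single-run spread to the spread of n_rechecks-run averages."""
    rng = random.Random(seed)
    # One noisy simulation per design.
    singles = [rng.gauss(0.0, sigma) for _ in range(n_designs)]
    # The same kind of noise, but averaged over n_rechecks runs per design.
    averages = [
        sum(rng.gauss(0.0, sigma) for _ in range(n_rechecks)) / n_rechecks
        for _ in range(n_designs)
    ]
    return statistics.pstdev(singles) / statistics.pstdev(averages)


if __name__ == "__main__":
    # The two printed numbers should be close to each other.
    print(spread_ratio(), math.sqrt(5))
```

This only covers the size of the clusters. The shift to the right comes from the selection effect described in the post (only the best results get rechecked), which the sketch does not model.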
Stephen Brooks 2003-05-19 13:57:53 | tantalumrodz=558;tantalumrodr=027;s1l=995;s1f=998;d1l=000;s2l=990;s2r=745;s2f=991;d2l=165;...s26l=993;s26r=990;s26f=996; Ah, yes, now _these_ are those rogue results labelled in the graph above. I've put the "..." in the first lines and added the "by (name)" in the second line, though note the last result already had a comment on it when I added that. Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
[dpc]McPhisto 2003-05-19 14:15:37 | Hello Stephen, What's this about rogue results? To be honest, I don't really understand it at all. For your info, I only modified the config file to No autosend; I used the common .dat file of DPC. Regards, McPhisto |
John Kitchen 2003-05-19 15:58:00 | quote: I keep hoping there is a prize for the WORST design. |
Stephen Brooks 2003-05-26 09:51:35 | Distribution from v4.31c so far with 87613 points. We have some more rogues, and the one on the right is [DPC]DemerZel. Seems like about 1 in 5000 results are wrong - if I manage to find the bug in v4.31c some people are talking about, we'll see if this can be decreased further. Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
Herb[Romulus2] 2003-05-26 10:06:11 | I've produced some of those rogues since yesterday on one box in the class of Macal's. I hope I've caught them all; maybe one or another made it through. After rebooting the box, the problem seems to have disappeared. I have no clue how this came up. ------------------------------- I'd say more, but I can't reach the keyboard from the floor. |
[DPC]Stephan202 2003-05-26 11:19:22 | This last graph clearly shows that the lower limit of mpts / yield is 19, not 18.5 So it's not a bug in my script --- Dutch Power Cow. MOOH! |
[DPC]Stephan202 2003-05-26 12:00:05 | Btw, check this graph, after some big adds of DPCers (have you seen our MF? ): There's now a second peak, more to the lower boundary. I was actually expecting these values to rise. --- Dutch Power Cow. MOOH! |
Pollock[Romulus2] 2003-05-26 13:09:36 | Stephan, I noticed that just by the scoring. The percentages of the scores are climbing at a rapid pace, but the Mpts. have remained almost exactly the same. It may be that we are just creating a more efficient design (high yield with less effort) or focusing too much in one direction. Not sure what to think of it. My results still seem to be moving upward, so I guess that is a good sign. Regularly running 13.64, but the re-checks bring them down a few points. It was good to see some of the DPC folks moving up, a lot of big numbers from different sources will give a wider variety of results to choose from. It was getting lonely at the top of the list. |
[DPC]Stephan202 2003-05-26 13:36:41 | True. These new 'seeds' may even lead to a slightly higher maximum yield. Let's hope so. --- Dutch Power Cow. MOOH! |
John Kitchen 2003-05-27 14:42:39 | quote: Stephen, how about removing the rogues from the stats so that we can see what we are REALLY shooting for! Thanks!! John (a convicted Rogue) |
Pollock[Romulus2] 2003-05-27 18:57:01 | quote: Good idea, John. Many of them are obvious and have been claimed by their owners. |
px3 2003-05-28 10:01:41 | Good idea to remove the rogues. Stephen, if you need info about the "negative" results we found, I'll send you the score+checksum+data. PX3 |
Stephen Brooks 2003-05-29 05:45:04 | My normal way of dealing with those rogue results would be to release a corrected version of the client, and then when it seems to be working correctly, look at the results only from that as valid. This is what I was hoping would happen with v4.31c, but there's still something like 1 in 5000 incorrect results. Since there is logic in my graph program for finding the rogue results, I can make it zero-out the checksums for those results, so they'll be removed from the database the next time that user sends a result in. That would be a temporary solution to the problem; I just haven't pursued that so much because I'm really aiming at fixing the client so that these rogues don't happen in the first place (and it operates without intervention from me all the time). If that proves to be impossible, I would have to instead add in a system for the bestNNN results to be sent up-and-down to a few completely different users and 'doubly verified' that way. Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
John Kitchen 2003-05-29 08:18:07 | quote: Fixing the client is goodness, true, but we, the members of this project, are a part of this project too, and it is important to us that the results you publish are meaningful. Right now, the top yield positions in the stats are all occupied by results of your programming mistakes. Put yourself in our shoes just for a while, and try to keep us happy. We are important to you. |
Stephen Brooks 2003-05-29 14:13:17 | I've removed the rogue results in the last graph I made - but now realise that only did v4.31c, and there are rogues from v4.31b, so I'm making it do another scan now, to try and get rid of those Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
John Kitchen 2003-05-29 14:27:05 | quote: But the stats are still showing the rogues:- 1. [DPC]DemerZel 0 0 4346 22329 3439457.4 20.355300 0 2. [AVE] Mr. Sledge Hammer 0 0 90 17051 2801950.5 17.346667 2732399 3. [AVE]Caesar 0 0 4490 9314 1067412.4 13.940349 3 4. [AVE] PX3 0 0 17979 24881 3378757.0 13.879626 6 |
Stephen Brooks 2003-05-29 18:26:34 | 4 HOURS LATER... Well, it took rather a lot of faffing around, but I've removed a load of obvious rogues from the database. The 20 and 17% should be gone soon anyway. [edit] Yes, yes, I know the page was still showing the rogues. That's what reminded me that my graph only showed the v4.31c rogues, and the stats were showing ones from earlier v4.3x versions as well. If the stats are STILL showing the rogues an hour or so from now, it'll be because some cached total somewhere hasn't got flushed properly... [/edit] Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
Stephen Brooks 2003-05-29 18:30:06 | quote: I noticed you and [AVE]Sledge Hammer had rather a lot of results removed in this latest sweep. Was this the pthreads library problem? They seem to occur in "chunks" of your files so far as I could see. Today's weather in %region is Sunny/(null), max. temperature #NAN°C |
px3 2003-05-29 20:11:26 | Hi Stephen, yes, that's the problem we discussed via e-mail. One of the first Solaris ports produced high Mpts with low percentages because of the pthread lib on one of my development Suns. As you might remember, I needed about 2 weeks to find this bug, so it's quite possible that you found a lot of misrated results. I've updated all machines running muon, but if these results still occur in my results then it's possible that I forgot some machines. What I'm wondering about is that the rogue 17.xxx and 13.xxx results are still in your stats. It seems your cleanup script didn't catch everything. PX3 |
px3 2003-05-29 20:23:12 | Me again. Stephen, please grep our results and look for: tantalumrodz=292;tantalumrodr=000;s1l=596;s1f=996;d1l=564;s2l=961;s2r=998;s2f=991; d2l=000;s3l=999;s3r=999;s3f=932;d3l=660;s4l=999;s4r=998;s4f=975;d4l=424;s5l=998; s5r=999;s5f=999;d5l=171;s6l=961;s6r=989;s6f=999;d6l=000;s7l=987;s7r=998;s7f=966; d7l=446;s8l=981;s8r=999;s8f=999;d8l=198;s9l=610;s9r=975;s9f=976;d9l=249;s10l=825; s10r=999;s10f=994;d10l=050;s11l=962;s11r=990;s11f=903;d11l=017;s12l=993;s12r=957; s12f=949;d12l=996;s13l=970;s13r=934;s13f=998;d13l=212;s14l=997;s14r=985;s14f=999; d14l=134;s15l=989;s15r=998;s15f=999;d15l=121;s16l=974;s16r=975;s16f=983;d16l=335; s17l=988;s17r=980;s17f=998;d17l=791;s18l=965;s18r=991;s18f=950;d18l=215;s19l=999; s19r=998;s19f=994;d19l=887;s20l=998;s20r=989;s20f=993;d20l=000;s21l=948;s21r=986; s21f=998;d21l=000;s22l=838;s22r=998;s22f=999;d22l=191;s23l=971;s23r=979;s23f=977; d23l=816;s24l=984;s24r=971;s24f=999;d24l=178;s25l=998;s25r=990;s25f=955;d25l=530; s26l=993;s26r=963;s26f=999;d26l=413;s27l=962;s27r=977;s27f=996; All results with a score higher than 12.4xx that contain this part are not reproducible and should be removed from the stats. PX3 |
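The search px3 asks for (results containing a given parameter fragment with a score above a threshold) could be sketched like this. The layout of `results` as (design string, score) pairs is a hypothetical stand-in, since the thread does not show the real database format, and the 12.4 cut-off only approximates the "12.4xx" in the post. Whitespace is stripped before comparing because the fragment as posted is wrapped across lines.

```python
import re


def normalise(design):
    """Drop whitespace so strings wrapped across lines still compare equal."""
    return re.sub(r"\s+", "", design)


def find_suspects(results, fragment, min_score=12.4):
    """Return the (design, score) pairs containing fragment with score > min_score.

    results is an iterable of (design_string, score) pairs; this layout is
    an assumption, not the project's real database format.
    """
    wanted = normalise(fragment)
    return [
        (design, score)
        for design, score in results
        if score > min_score and wanted in normalise(design)
    ]
```

A plain substring match like this would only flag candidates; whether each one is really irreproducible would still need a recheck.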
András Horváth 2003-05-30 07:52:15 | Hi, I have got a fresh result with the following line: 16.147025 (2676.4 Mpts) [v4.3] {339BB088} (Linux client) Does it seem OK? András |
px3 2003-05-30 08:53:52 | Hi András, what does the line before it say about "#runs"? But don't worry, I didn't have these problems with the Linux port. This "feature" was only available on Solaris. PX3 |
András Horváth 2003-05-30 12:21:18 | Hi! #runs=005 in the 16% case. Can it be a real result or an artifact? András |
AySz88 2003-05-30 20:37:35 | I think they would need the entire two lines, if possible. |
Stephen Brooks 2003-11-19 09:06:01 | While writing my report I figured out that the v4.3/SolenoidsOnly optimisation was actually quite useful for proving one particular point, so went about extracting the best results from it. I saw ~16% and wondered if it was genuine, so plotted another one of these graphs for that particular optimisation: ...this time with 3`341`845 points! I'm now re-testing one of the top SolenoidsOnly results to see what it's like. The odd thing is how there's that 'island' of results above 14% that has appeared. I don't know if this is me fixing something in the client that was giving wrongly low yields or what, as yet. HB Pencils, also sold as "Moron's Choice" Graphite Cigars. |
[DPC]Stephan202 2003-11-19 10:28:48 | That could of course be easily tested by resimulating a few of those 14% configurations. Nice graph btw. I wonder why there is such a straight 'line' or 'edge' around the 360mpts. Can you explain that? It doesn't make sense to me, and I don't see it on older graphs (of previous optimisations). --- Dutch Power Cow. MOOH! |
Stephen Brooks 2003-11-20 02:55:17 | quote: I did that overnight (and have been doing it historically for all simulations that I can, in order to write correct results into my report). In this case, the result tantalumrodz=552;tantalumrodr=192;s1l=999;s1f=999;...s29r=992;s29f=994;#runs=005; rechecked to give an average of 15.799422 on 55 runs, with a standard deviation of 0.051667, making 15.799422±0.013655% the 95% confidence interval. quote: That's been nagging me for a while - I think it may just be a more intense form of that "wind blur" effect you get to the left of some of the previous graphs, where a parameter relatively late in the simulation is changed to reduce the yield while keeping the Mpts around the same (e.g. shrinking the last solenoid). Also notice the extra lines at 1/5, 2/5, 3/5, 4/5 of the gradient of the main one. They are from the "queue-jumping" problem we had a month or so ago. HB Pencils, also sold as "Moron's Choice" Graphite Cigars. |
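The ±0.013655 figure quoted above is consistent with the usual normal-approximation interval for a mean, 1.96 × (standard deviation) / sqrt(number of runs). A short check, assuming that is the formula that was used (the post does not say so explicitly, and the function name is just for illustration):

```python
import math


def confidence_half_width(std_dev, n_runs, z=1.96):
    """Half-width of a normal-approximation confidence interval for the mean."""
    return z * std_dev / math.sqrt(n_runs)


# Figures quoted in the post: standard deviation 0.051667 over 55 runs.
half_width = confidence_half_width(0.051667, 55)
print("15.799422 +/- %.6f" % half_width)  # prints 15.799422 +/- 0.013655
```

With only 5 runs at the same standard deviation, the half-width would be roughly 0.045, which gives some idea of how much the extra rechecking tightens the estimate.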