Stephen Brooks
2005-10-31 08:36:31
Can you try and send those results again?  I've set up something on this end to monitor your file in case it shrinks (backups before and after the duplicate checker goes through).
2005-10-31 15:53:45
As spokesman of DPC I would like to know what happened yesterday (31-10-2005): we (as DPC) lost 253435 points

noizycows: 15391
white_panther: 56426
DanC: 181618

For me (white_panther): I have 2 PCs on autosend, so this can't be dupes!

So would you have a look at it?

2005-11-01 07:48:29
Stephen -

Just a thought here.  I've one machine that is fairly obstinate about sending results.  It's a WIN2K Pro, running ZoneAlarm.  If I just shut down ZA, it will at times upload just fine... at other times it takes multiple attempts.  Usually I then must reboot - shutdown ZA immediately - and then manualsend will work.

Is it possible that it's making some hosed files during the failed attempts - then finally somehow uploading those (which would logically be dupes)?

I never noticed a correlation between the two - but I'm beginning to wonder.  I don't dump this machine daily.

If you think it's possible - I'd be happy to monitor, and let you know when I dump this machine... going through all the failed attempts first.

I may be 'reaching' - but I figure it never hurts to toss out a theory.
2005-11-01 08:41:13
Exactly how am I supposed to send those results in again?  I don't know what results you deleted in the first place.  Basically your system destroyed around a month's worth of computing time from me.

I got tired of losing small increments of points every week or so, so I decided to turn off the auto-send option on all of my computers.  Then around once a week or so I would manually send the results in.  In this manner I would KNOW that the results sent in weren't some error from trying to send them in twice or 3 times due to an FTP error or something.  This was my way of basically doing quality control of the service, and it has been found lacking.

What's interesting is that the number of points taken off my account bears no obvious relation to the amounts I have uploaded.  741,000 points gone in one fell swoop is a pretty large amount.  I don't have banks of computers from work or a school that I'm able to tap into for computing power, and a hit like that hurts.

But as far as being able to "send those results again", I have no clue how that would be done, as the results are deleted after they are sent.  For sending results I have a "blank" muon directory on my main computer that has just the muon files but no results, and I move my results.txt files 1 at a time from each networked computer and send them in 1 at a time from there instead of running all around the house to each computer to do it.  From what I can tell there is nothing that I can do to fix this, and I can't see anything that I may have done wrong to cause this.

But having so much computer time deleted for reasons unknown is not something that I'm sure I want to have happen again.  I really like the project that you have set up and I feel like I'm actually contributing to something with a goal that is attainable in the end.  I know that searching for space aliens' communications, or trying to find a cure for AIDS, is interesting and fun and all, but in the end this project has a timeline and goal that I know will come to pass, and that is why I stuck with it.  But if something this simple can't be resolved then I'd rather test my luck with the space aliens, as even if we don't find them at least my attempts at trying to find them will be counted.  :(  Please figure this out, Stephen.
2005-11-01 09:22:56
Look in the sendlog.log file and see how many results you sent, open the results.dat file, copy the last X results and put them in a new results.txt file.

Simple, isn't it? :)
Stephen Brooks
2005-11-01 15:00:32
Ah, sorry, I thought you said you'd switched off sending for a month then had a BIG file (that I assumed you'd have backed up) and sent that in all at once.  But Panther is right - the end of your results.dat will be the last results your Muon program has generated - so in fact nothing is lost, it just remains on your machine.

I only needed some results to be sent in, results that might contain whatever is confusing the stats generator, so you could for instance just take the last 1000 results (or ~300K) of results.dat, rename that to results.txt and send it with manualsend.bat.

What DanC said could also be correct in some way - I think a couple of years ago I had a problem with "killer" linefeeds that would eat the result after them every time it was parsed.  That's now solved but a hosed file *might* still do something bad.  It's less likely now we've gone to binary format uploads though.
2005-11-01 20:58:06
I have 5 computers running Muon and I move the results.txt from their working directories to an empty muon directory with no results.dat file in it.  I have no clue how many results got sent from each machine as they all got moved to one machine to send in one at a time.  Here is a copy of the last few lines of my sendlog.log file.  As far as which results came from what computer, I have no idea.

20050925-220852 218 results sent to
20051011-003229 35 results sent to
20051011-003254 42 results sent to
20051011-003308 77 results sent to
20051011-003335 56 results sent to
20051011-003401 91 results sent to
20051011-004551 632 results sent to
20051017-221739 637 results sent to
20051019-043133 990 results sent to
20051026-153740 644 results sent to
20051026-153839 335 results sent to
20051026-153917 319 results sent to
20051026-154002 343 results sent to

Not sure if this is helpful info or just spam for you.
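
For what it's worth, the log lines above can be tallied mechanically. This is only a rough sketch: the line shape (timestamp, count, "results sent to") is taken from the excerpt above, and the function name is made up for illustration, not anything that ships with Muon.

```python
import re

# Assumed line shape, copied from the excerpt: "YYYYMMDD-HHMMSS N results sent to ..."
LINE_RE = re.compile(r"^(\d{8}-\d{6})\s+(\d+)\s+results sent to")

def total_results_sent(lines):
    """Sum the per-send result counts; lines that do not match are skipped."""
    total = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            total += int(match.group(2))
    return total

sample = """\
20050925-220852 218 results sent to
20051011-003229 35 results sent to
20051011-003254 42 results sent to
20051011-003308 77 results sent to
20051011-003335 56 results sent to
20051011-003401 91 results sent to
20051011-004551 632 results sent to
20051017-221739 637 results sent to
20051019-043133 990 results sent to
20051026-153740 644 results sent to
20051026-153839 335 results sent to
20051026-153917 319 results sent to
20051026-154002 343 results sent to
""".splitlines()

print(total_results_sent(sample))  # 4419
```

That total only says how many results were logged as sent, not which machine they came from.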
Stephen Brooks
2005-11-02 07:08:11
I found a line in a log file:
2005-Oct-28 18:11:04 ChicaneLinacB\v4.4\results_BOHICASETI.txt b.ofs=0 fileend=52012838 BPOSZERO b.ofsnew=45126359 time=67.3413

...which indicates one of your files was shrunk from 52MB to 45MB.  I'm now going to get it to back up any other shrinking files, as I think something not-quite-right is going on.  Unless you sent in a 7MB results.dat (i.e. duplicates), that shrinkage was wrong.
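
The arithmetic behind that "52MB to 45MB" can be checked from the log line itself. A minimal sketch, assuming `fileend` is the size before the duplicate check and `b.ofsnew` the size after (that reading matches the sizes quoted above, but the field meanings are an assumption, and `shrinkage_bytes` is a hypothetical helper):

```python
import re

def shrinkage_bytes(log_line):
    """Return fileend minus b.ofsnew, or None if either field is missing."""
    end = re.search(r"\bfileend=(\d+)", log_line)
    new = re.search(r"\bb\.ofsnew=(\d+)", log_line)
    if not (end and new):
        return None
    return int(end.group(1)) - int(new.group(1))

line = (r"2005-Oct-28 18:11:04 ChicaneLinacB\v4.4\results_BOHICASETI.txt "
        r"b.ofs=0 fileend=52012838 BPOSZERO b.ofsnew=45126359 time=67.3413")

lost = shrinkage_bytes(line)
print(lost)  # 6886479 bytes, i.e. just under 7MB
```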
2005-11-02 11:38:01
Hey Stephen, I lost another 10k points, what's happening?
Stephen Brooks
2005-11-04 13:22:12
The latest on this is that I'm logging all the shrinkages now, and finding there are a lot of "little" leaks going on as well as the occasional big one.  I fixed one obvious bug in the stats generator but it's still doing it.  I've identified a couple of users to whom this is happening a lot and putting extra logging on their files (backups every time they shrink) so I can see what's being deleted.  On Monday I should have some output waiting there.
2005-11-05 14:44:48
Stephen -

Update.  I dumped my "suspect" machine today.  Note the 172,000 mpts.  I should put out about 100,000 a day (give or take) with what I've got out there, so this is a definite spike.  For good measure, I took one of my other "failed" results files and uploaded it as well.  Both files were approx 190K.

To produce it -

A. I ran my machine for about 3 days without dumping. 

B. I attempted to upload without first disabling ZoneAlarm.  Went through a full 10 servers without success.

C. Repeat step B above... twice.

D. Shutdown windows, restarted.

E. Disabled Zone Alarm as soon as the icon showed in the task bar.

F. Run Manualsend

G. Rename other old hosed file to "results.dat"

H. Run Manualsend

The setup is: WinXP Pro, ZoneAlarm, McAfee

Coincidence?  Perhaps.  This output is also on a day that I did not dump my work machines.  Not sure where the buggy is... but there's something there, I'd submit.

I'd be curious to know if there is any correlation between my setup and any of the other folks that are experiencing these bizarre fluctuations.

It'll be interesting to see if the dupes engine nails me for about 220,000 mpts in dupes.  That's what seems to happen... you get a spike, then a subsequent loss greater than the sum of the spike + usual results.

I'm aware that this client hasn't always played real well with ZoneAlarm... but I have another machine running both, that has no problems, ever.  It's a wee bit on the odd side... and I may be barking up the wrong tree - but it seemed worth tossing out there.  :)
Stephen Brooks
2005-11-07 09:40:51
Here's a big one that appeared in the log file just this morning
2005-Nov-07 09:31:51 file shrank, not backed up
PhaseRotC_bigS1\v4.4\results_[DPC]Alphen.txt [534585181 bytes]
--> PhaseRotC_bigS1\v4.4\results_[DPC]Alphen.txt.RR [518226709 bytes]

The "not backed up" is annoying - I put in code specifically telling it to back up the big drops but I put "Min" instead of "Max" somewhere and hence it ignored that one.
Stephen Brooks
2005-11-08 06:58:40
Some of these are proving to be doubled-up uploads where the server has not terminated the connection nicely so sendresults tries a further time (or two).
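
To make the doubled-up case concrete, here is a toy sketch of what removing such repeats amounts to. It is not the real duplicate checker: it assumes each result can be treated as one opaque record, and the names are made up for illustration.

```python
def drop_repeated_results(records):
    """Keep the first copy of each record, preserving the original order."""
    seen = set()
    kept = []
    for record in records:
        if record not in seen:
            seen.add(record)
            kept.append(record)
    return kept

upload = ["result-a", "result-b", "result-c"]   # made-up records
received = upload + upload                       # same file sent twice after a bad disconnect
print(drop_repeated_results(received) == upload) # True
```

With this kind of check a re-sent file adds nothing, so removing it afterwards looks like a loss only if the repeats were counted in the meantime.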
2005-11-29 07:35:06
Another 40K of results lost, and that with 2 PCs on autosend :?  Stephen, can you see what happened?
Stephen Brooks
2005-11-30 03:57:04
I've just arrived back from a trip and found the stats generator has produced 13.7GB of logged files for me... 7 before/after pairs are yours; one ChicaneLinacB and the others PhaseRotC_bigS1.
2005-11-30 11:19:29
have fun reading all the logfiles :P
Stephen Brooks
2005-12-02 04:07:47
The logfiles seem to say it's removing multiple repeats of the same send (?) at the end of your file, which I think is just due to FTP servers occasionally not terminating the connection cleanly.

It seems that not all of the logfiles I picked up correspond to *real* stats reductions, so I'm now going to start cumulatively logging stats over time to identify the real losses.  The duplicate checker theoretically should never leave a user's stats after a full parse lower than they were last time.
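
A sketch of the kind of check a cumulative log allows, assuming it is just a time-ordered list of (label, total) pairs for one user. The numbers below are made up and the function name is hypothetical.

```python
def find_drops(history):
    """Return (label, loss) for every step where the cumulative total went down."""
    drops = []
    for (_, previous), (label, current) in zip(history, history[1:]):
        if current < previous:
            drops.append((label, previous - current))
    return drops

# Made-up totals for one user, one entry per stats build.
history = [("day1", 1000), ("day2", 1150), ("day3", 1100), ("day4", 1300)]
print(find_drops(history))  # [('day3', 50)]
```

Any entry this reports is a visible loss, which is exactly what should not happen if the totals only ever grow.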
Stephen Brooks
2005-12-05 06:54:27
The cumulative logs are now building up, I'll try visualising them to pick out the real losses from the "fake duplicates" caused by - I think - the 500 RT Not Understood FTP issue.
Stephen Brooks
2005-12-06 01:47:40
OK, so I got rid of the reductions that correspond to multiple FTP sends getting removed immediately.  I started to look at the others (rarer - the ones that cause visible stats decreases), where results have disappeared from the middle of the file.  The first one I've looked at appears to be also removal of results replicated via FTP, but with a time index #time=57515; that corresponds to early October this year.  I'll look at the others to see if there's any indication of a *genuine* bug in the statistics.

Looked at some BOHICASETI files - same thing, though more recent this time.  It looks like there is no bug in the script that is removing genuine stats, it is only removing repeats that have come about most probably via FTP re-sends (or perhaps some of them user re-sends).

The question that remains is why such duplicates weren't removed immediately from the file and had to wait for the next send to be discounted, giving the false impression of stats "losses".
Stephen Brooks
2005-12-06 08:36:14
The answer to that question is that the stats generator concatenates all of the newly-downloaded files onto the current files first, and then does a recount.  So if the stats build is interrupted (I sometimes switch it off, which is benign otherwise), the un-duplicate-checked results are left on the end.

Theoretically it should notice that the file-modified date has changed and rescan these files fully on the next stats sweep, still not producing losses visible to the outside.  But it looks like sometimes it forgets that these files are "dirty"... not currently sure how.
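
In outline, the "dirty" check described here could look like the sketch below. This is an assumption-laden illustration, not the stats generator's code: it keeps the last-checked modified time in a plain dict, and both function names and the file name are hypothetical.

```python
import os
import tempfile

def mark_checked(path, checked):
    """Record the file's modified time right after a full duplicate check."""
    checked[path] = os.stat(path).st_mtime_ns

def needs_full_rescan(path, checked):
    """True if the file was never checked or has changed since the last check."""
    return checked.get(path) != os.stat(path).st_mtime_ns

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results_example.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("result 1\n")
    checked = {}
    mark_checked(path, checked)
    clean_after_check = needs_full_rescan(path, checked)

    # Simulate an interrupted build: results get appended but are never checked.
    with open(path, "a") as f:
        f.write("result 1\n")
    # Set the modified time forward explicitly so the demo does not depend on
    # the filesystem's timestamp granularity.
    atime_ns = os.stat(path).st_atime_ns
    os.utime(path, ns=(atime_ns, checked[path] + 10_000_000_000))
    dirty_after_append = needs_full_rescan(path, checked)

print(clean_after_check, dirty_after_append)
```

If the recorded time were ever updated without the full recount actually finishing, a file would look clean while still holding unchecked repeats, which is one way the "forgetting" could happen.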
Stephen Brooks
2005-12-07 09:52:55
Starting tomorrow, a couple of users may see a few small oscillations in their stats, as I'm going to be using two of your uploads for debugging purposes.

What I'm going to do is check the effect of an interrupted run, so I'll imagine those two files got sent in, concatenated, but only one of them duplicate-checked, then I'll run the generator a second time to see if it removes the second one's duplicates on the next time around.
Stephen Brooks
2005-12-09 02:08:41
OK, done that.  The stats program acted correctly: detected the files had been modified without being recounted and did a full re-count on them.

So the only way it looks like I can catch this is to wait until it happens for real on someone's file and then work backwards somehow to see what caused it.  Though that means monitoring the stats in odd ways, not quite sure how, because I'll have to somehow detect when the repeats were "left in"...
2005-12-15 06:48:16

It looks like today might be the day.  Showing a very large debit on today's results.
2005-12-15 09:50:51
I miss DanC
Stephen Brooks
2007-06-01 11:53:06
I think it's time to reactivate this thread!  When I restored the top50 I had no intention of being unfair to smaller(?) users like [TA]GLeeM, I just had reason to believe the errors had been in the largest user files, which were segmented into parts.  If you want to reclaim Mpts from the May 2007 statsquake - see my first post in this thread.
2007-06-01 14:59:35
Thank you for putting in the extra effort for all who are upset with their losses.
2007-06-01 15:38:38
I have been asked to post for Bok: the last good # for him, on 5/17/07, was 15,314,319, and you have him listed at 9,735,974 now.
He also dumped 1.3 mil in results files on that D-Day.

Stephen, you can get the last numbers of any member from the Free-DC old stats.

Just click on any team and that will open a list of all members of that team and their points as of 5/17/07.

Perhaps you could contact Bok so he can provide you with a bit copy of it and you can do something to automate it.
Not sure if that is a good thing though.
Stephen Brooks
2007-06-01 23:34:25
Just a CSV file of names vs. total Mpts before and after the crunch would be all I need.
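
Something like the sketch below is all that is needed to use such a file. The column names (`name`, `before`, `after`) are an assumption about the layout; the Bok row uses the two totals quoted earlier in this thread, and the second row is made up.

```python
import csv
import io

def find_losses(csv_text):
    """Map each user whose total went down to the number of Mpts lost."""
    losses = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        before, after = int(row["before"]), int(row["after"])
        if after < before:
            losses[row["name"]] = before - after
    return losses

sample_csv = """name,before,after
Bok,15314319,9735974
example_user,1000,1200
"""

print(find_losses(sample_csv))  # {'Bok': 5578345}
```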
2007-06-02 00:24:02

Lost about 5-5.5million points.
2007-06-02 00:26:28
Nah.. I lost about 3 million.  when I look back closer.
2007-06-02 13:58:43
Thank you Mr. Brooks!

Here is the link you wanted:
2007-06-02 16:30:47
OK Stephen asked Bok about it and he posted this:

"I've just restored a backup from 29th May and written a quick PHP script to create a CSV of scores before and after.

And I emailed Stephen about it, you can post about it too on his forum if you like.

If it needs a different date for the second backup, that's simple enough..

DPAD csv" the file >>>
Stephen Brooks
2007-06-04 11:10:30
I've got the file and found 92 (!) other people who needed their stats back, besides the 7 I corrected earlier.  It seems that none of their usernames begin later than "L" in the alphabet (apart from those that start with a symbol).  We're now suspecting our office's NAS backup box has dodgy firmware (or that the network is at fault), so some files were backed up with part of them made of zero (NUL) bytes.
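
A quick way to spot that kind of damage is to look for long runs of NUL bytes. This is a sketch only: the 512-byte threshold is an arbitrary assumption, and `nul_runs` is a hypothetical helper, not part of the stats code.

```python
import re

def nul_runs(data, min_len=512):
    """Return (offset, length) for each run of at least min_len NUL bytes."""
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]

# Made-up file contents: one long zeroed stretch and one short, harmless one.
data = b"abc" + b"\x00" * 1000 + b"def" + b"\x00" * 10
print(nul_runs(data))  # [(3, 1000)]
```

For a real file the bytes would come from `open(path, "rb").read()`; a threshold is needed because a few isolated NULs need not mean corruption.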

The total lost was small on a project scale, but a significant fraction for each of these users, so I hope they'll forgive me for the inconvenience.

These changes should appear in the next stats update.
Stephen Brooks
2007-06-04 11:24:47
Oh, thanks also to Bok for making the historical copies of the total points available!
2007-06-04 14:37:19
Stephen, this is good to hear.  I am sure the membership will be very happy.
Thank you for spending the little extra time needed to set it right.
2007-06-04 20:16:42
Pretty much all of my points that I lost are back, and I am very happy.  Thank you for working to restore the stats Stephen. 
[OCAU] badger
2007-06-05 01:58:29
hmm, looks like I lost 49,946,896 mpts (26th June), then gained 11,799,216 the next day... lost 30,393,341 then gained 6,077,652 the next day, although he would have uploaded real results on both days (and I tend to upload once a week or so...)
Not sure if any others in OCAU are actually active right now...
[OCAU] badger
2007-06-05 02:03:58
bah, I scrolled across and found another +38,559,856 mpts, seems I'm back to normal, too
2007-06-06 23:20:46
"These changes should appear in the next stats update."

When will this next stats update take effect?

I still haven't got my points fixed
Stephen Brooks
2007-06-06 23:57:12
You ought to have had 3.5 million Mpts added on.  Tomorrow I can check the code to see if this is actually happening.
2007-06-07 06:44:25
Stephen, your stat server is broken AGAIN.  It is subtracting from everybody now; we all lost 245,000,000 points.
2007-06-07 08:51:18
I just lost 10386999 points.  I'm now sending in almost 1M/day and this is getting me down.
2007-06-07 09:57:45
Same here, 50,636 points gone ...
2007-06-07 13:08:43
i just lost about 7M
2007-06-07 15:25:56
Free-DC lost about 43,000,000 overall.  Very SAD.  Stephen, as a bean counter you have failed.
I'm sure glad I did not do that 1/2 gig dump.
2007-06-07 16:27:13
pff is that all
DPC lost today 50.388.322 points
Stephen Brooks
2007-06-07 16:27:35
Ah, no need to panic - this is just an indexing error on one of the old optimisations (PhaseRotC_bigS1), meaning it stopped adding that whole one to the totals!  I'm working on putting it back now.
2007-06-07 16:42:39
Ahhhh great news thanx Stephen!! 
Stephen Brooks
2007-06-08 08:46:45
All better now?
2007-06-08 10:31:27
Mine seems right!  Thanx Stephen!! 
