Stephen Brooks
2002-10-29 18:54:45
I've been told (by someone monitoring what's on my FTP server) that some people have apparently been "cheating" by submitting repeated units or files with repeats in them.  If this is accidental I don't mind.

However you should be aware that soon I plan to do a global check through the results and throw out all exact repeats.  Saves me disk space.  Even non-cheaters' stats will go down slightly because occasionally the program makes a duplicate all of its own.  But you see, the people who have deliberately cheated will lose far more than most :D. I plan to do that sort of check more often in future.

I also feel inclined to publish a list of the people who have had the most duplicates.  This might be unfair to a few people who have submitted a lot of results by accident, but I think we'll be able to see quite clearly who the habitual cheaters are.

Expect the repeat-check later this week.  I need to write a little app to do it, but as you know the results format isn't awfully complicated - I just need to use an efficient repeat-checking algorithm using a hash function and a sorted heap of stuff.
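A minimal sketch of an exact-repeat check along these lines, assuming each result can be treated as one text record. The actual results format is not shown in this thread, and this version uses a hash set rather than the sorted heap mentioned above; the function name is illustrative only.

```python
import hashlib


def remove_exact_repeats(records):
    """Return (unique_records, repeat_count), keeping the first copy of each record.

    `records` is any iterable of strings; what one record looks like is an
    assumption, since the real results format is not given in the thread.
    """
    seen = set()   # digests of the records encountered so far
    unique = []
    repeats = 0
    for record in records:
        # Hash a trimmed copy so a stray linefeed does not hide a repeat.
        digest = hashlib.sha1(record.strip().encode("utf-8")).digest()
        if digest in seen:
            repeats += 1
        else:
            seen.add(digest)
            unique.append(record)
    return unique, repeats


unique, repeats = remove_exact_repeats(["a 1", "b 2", "a 1\n", "c 3", "b 2"])
print(len(unique), repeats)
```

Storing only a fixed-size digest per record keeps memory use modest even when the records themselves are long.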

"As every 11-year-old kid knows, if you concentrate enough Van-der-Graff generators and expensive special effects in one place, you create a spiral space-time whirly thing, AND an interesting plotline"
2002-10-29 19:07:03
It's great that you are implementing this step early in the project :) By doing this, you are telling users that you will actively act against cheating...unlike one other big DC project that does not respond at all :D

Perhaps in the next version, you can implement code that will prevent people from submitting duplicate results.

And I find it worrying that individuals are able to browse through your FTP server :( Increase security?

One other thing is that I used the MuonProxy to harvest my results from my home farm, and I noticed that there are a lot of dupes.  I have reported this to Floppus.  But I need to go through my results to remove the dupes manually :(

The result file is 1.2MB and I am bound to miss out on a few.  What should I do?  Remove the dupes as best as I can and submit?
2002-10-30 13:40:13
Great. When i was running an eccp109 proxy, i built in a validity and dupe check from the beginning. You always see some results where people try to 'play' with their results file, just because they are curious what will happen if......
If i see some invalid tries, i'll inform the submitter in case of a corrupt client etc.
Dupes will always be a 'problem' because many users manually collect their results (sneakernet) and forget whether they already sent the points.
If i can be of any help programming a dupecheck / validator (if possible), let me know.
2002-10-30 14:00:38
I also have submitted a few (well im not sure how many) dupes, because of MuonProxy - i had mentioned it a few days ago in a thread about the proxy and Floppus said he was gonna look into it.

i also found an upset file permission on one of my linux boxes which may have caused a few but i fixed that also.
Stephen Brooks
2002-10-30 14:54:30
Do you know if MuonProxy is actually producing more duplicates, or it's just a matter of there being duplicates produced by Muon1 anyway and MuonProxy putting them all in one place where you can see them?

Note that I decided to check for 'cheating' on the advice of someone else who was looking at the FTP server, and not because I personally had seen a lot of evidence of it - it may be that he just saw the unfortunate cases where people did stuff by accident, although it was also mentioned that some of the multi-billionaire people appeared to be sending in rather a lot of duplicated results.
2002-10-30 16:16:25
as far as i can tell it's the proxy - it seems to fill the results.txt file with x copies of one simulation when it fails to do an FTP
Stephen Brooks
2002-10-30 16:52:42
Well I'm getting there with the repeat checker... It now reads in the files in a fairly error-checked way to remove glitches, although it will currently insist on echoing version "v4.21b" as "v4.210000b" because I've used floating point for the version number :D. I think I'll go and %g something.
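For illustration, the difference between the two format specifiers can be reproduced with Python's printf-style formatting, which follows the same `%f` and `%g` conventions. The helper names below are made up for this sketch.

```python
def version_with_f(number, suffix):
    # %f always prints six digits after the decimal point.
    return "v%f%s" % (number, suffix)


def version_with_g(number, suffix):
    # %g drops trailing zeros.
    return "v%g%s" % (number, suffix)


print(version_with_f(4.21, "b"))   # v4.210000b
print(version_with_g(4.21, "b"))   # v4.21b
```

One caveat with `%g`: because it strips all trailing zeros, a version such as 4.20 would print as "v4.2", so keeping the version as a string avoids the problem altogether.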
Stephen Brooks
2002-10-30 18:46:46

Hey this repeats-checker is quite nifty.  Parses from 0.5 to 1MB of results per second, on my P-II-400. The database is 660MB for the current versions (4.2x) so it'll take about 20 minutes to check the whole thing.
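As a quick check of that estimate, the two quoted parsing rates bracket the run time for a 660MB database. The function name is illustrative.

```python
def scan_minutes(database_mb, rate_mb_per_s):
    """Minutes needed to parse database_mb megabytes at a steady rate."""
    return database_mb / rate_mb_per_s / 60


print(scan_minutes(660, 1.0))   # 11.0 minutes at the fastest quoted rate
print(scan_minutes(660, 0.5))   # 22.0 minutes at the slowest quoted rate
```

"About 20 minutes" therefore sits near the slow end of the quoted range.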

So far there have been no nasty surprises.  Nobody worth mentioning has more than about 20% of their results as repeats.  Bluumi probably stands to lose about 1.5 billion points, but probably the others up there also do - harvesting all the results from a farm like that must produce a few repeats.
2002-10-30 19:20:36
Any idea on whether it would be possible to design a program capable of checking for tampered-with results that works as a background filter when results are uploaded?  Come to think of it, that program might work great even if it wasn't there ;)

Stephen Brooks
2002-10-30 19:28:35
Originally posted by astra412:
Any idea on whether it would be possible to design a program capable of checking for tampered-with results that works as a background filter when results are uploaded?

I'm way ahead of you... :) Such a thing is already in the manualsend module of v4.22. It isn't as powerful as the check at this end though, because it can only check for repeats or tampered results _within_ one file.  If someone sends the same file twice it can't "remember" the last file.  Its most important purpose is to rewrite the results in "grammatically correct" form, excluding any HDD-glitches or bad linefeeds or what-have-you.  That should make managing the results here much easier when they are of higher quality.
Stephen Brooks
2002-10-30 19:41:03
OK I've done that repeat-check now.  You can see the results of it here. cool

[edit] This rocks.  My team-stats have fixed themselves too.  Your results-last-updated clocks have all been reset though because I've just rewritten all the files.  [/edit]

[This message was edited by Stephen Brooks on 2002-Oct-31 at 2:52.]
2002-10-30 20:16:43
Stephen cool
The DPRGI team was using a lot of dual Athlon MPs, maybe it has something to do with their % of duplicates.
Also any word on the bug in the client?  Will 4.22 fix it or is it too hard to tell at this point in time?

ARS Team Atomic Milkshake
Unofficial Muon1 FAQ
Stephen Brooks
2002-10-30 20:43:35
With something so intermittent it's hard to even test whether what I've fixed in version 4.22 will fix it or not.  It's almost a 50-50 chance.  If it doesn't, I'll make a self-checking version of the program.
2002-10-30 21:38:20
I lost another 87 mil.  I hope I'm not sending duplicate particles.

John Campbell
2002-10-31 00:04:55
What does the "bestN left in" column signify?
Bluumi [SwissTeam.NET]
2002-10-31 02:12:47
Originally posted by Stephen Brooks:

Bluumi probably stands to lose about 1.5 billion points, but probably the others up there also do - harvesting all the results from a farm like that must produce a few repeats.

:( Yep, i saw ...
At first i thought there was a problem with the last upload... but i know a duplicate upload could have happened some weeks ago.  Back then i was running into disk space limits while uploading... anyway ... i'm lucky that i'm still quite near the 7 billion :D :D
But now my friend behind me is very near ;)
Stephen Brooks
2002-10-31 02:20:31
Well by the looks of the results it doesn't look as if anyone has deliberately been sending in thousands of repeated results.  Having 20% of repeats is just like sending in an accidental copy of 1 in every 4 that you sent, so not so hard to do.
Bluumi [SwissTeam.NET]
2002-10-31 02:34:46
Originally posted by Stephen Brooks:
Having 20% of repeats is just like sending in an accidental copy of 1 in every 4 that you sent, so not so hard to do.

Yep, no problem on my side... i was just wondering about the count... but the forum is right around the corner, and so i saw why ...

Now i need my grandma to help me... right behind me is a nice guy... Grandma must now run muon on her mp3-Walkman :D

2002-10-31 03:30:08
I hate Cheating
2002-10-31 04:11:40
Great, i see i have no duplicates.. that confirms my http-post- ftp- mail- proxy works correctly :)
2002-10-31 04:49:12
Having 20% of repeats is just like sending in an accidental copy of 1 in every 4 that you sent,

I thought that would have been 1 in 5 for 20%. Oh well, I always stunk at math :)
Stephen Brooks
2002-10-31 05:31:37
big grin

He sends 4, then another 1 that's a duplicate of one of the previous ones.  I get a total of 4+1 results, out of which 1 is a duplicate of a previous one, so the proportion is 1/(4+1) = 20%.
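The two ways of counting can be put side by side in a short sketch (function names are illustrative): measured against everything received, one repeat on top of four genuine results is 20%; measured against the genuine results alone, it is 25%.

```python
def repeat_share_of_total(genuine, repeats):
    """Repeats as a fraction of everything received (genuine + repeats)."""
    return repeats / (genuine + repeats)


def repeat_share_of_genuine(genuine, repeats):
    """Repeats as a fraction of the genuine results only."""
    return repeats / genuine


print(repeat_share_of_total(4, 1))     # 0.2, i.e. 1/(4+1) = 20%
print(repeat_share_of_genuine(4, 1))   # 0.25, i.e. one copy per 4 sent
```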
2002-10-31 05:43:38
If anyone is interested in doing something with this repeats-data by using PHP, I have a file in which all the data is listed like this: [SG]PvS#798#10#0#1.253
It is easy to parse with PHP (name#results#repeats#bestn#%)
You can find this file here

Do whatever you want with it (but download it to your account, i have enough traffic without you parsing this file every 5 minutes :D )
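A minimal sketch of parsing one line of that file, written in Python rather than PHP. The field order comes from the post; the class and function names are assumptions for illustration, and the meaning of the bestn field is not explained in this thread.

```python
from typing import NamedTuple


class RepeatStats(NamedTuple):
    name: str
    results: int
    repeats: int
    bestn: int
    percent: float


def parse_line(line):
    """Parse one 'name#results#repeats#bestn#%' line."""
    # Split from the right so a '#' inside a user name would not break the parse.
    name, results, repeats, bestn, percent = line.strip().rsplit("#", 4)
    return RepeatStats(name, int(results), int(repeats), int(bestn), float(percent))


row = parse_line("[SG]PvS#798#10#0#1.253")
print(row.name, row.results, row.repeats, row.bestn, row.percent)
print(round(100 * row.repeats / row.results, 3))   # 1.253
```

In the sample line the last field matches repeats divided by results, as a percentage. That is an observation from this single example, not a documented definition.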

2002-10-31 05:56:38
Originally posted by Stephen Brooks:
big grin

He sends 4, then another 1 that's a duplicate of one of the previous ones.  I get a total of 4+1 results, out of which 1 is a duplicate of a previous one, so the proportion is 1/(4+1) = 20%.

From the for-what-it's-worth dept: my 502 dups were apparently from a specific upload when there was a problem connecting/transferring to your server.  It connected, did a few things (yes, I wish I had written it all down and will do so if it happens again), sat there for a moment and then said something to the effect that it didn't happen, goodbye.  At that point the results.txt file was still on my hard drive so it obviously didn't work.  So I tried it again.  Same song second verse, no go.  A few hours later, it worked.

(So could it have actually transferred the info but not sent back a signal to the program that it had finished, so that the program didn't delete the .txt file?)


ps: I'm running the background program and halt it before running manualsend.exe
2002-10-31 09:02:17
Hi Stephen,

i got 162 duplicate results and IMHO there can't be such a high number of duplicates.
i think one time i resent about 20 duplicates, that's what i wrote you by e-mail, as far as i remember, but where does the rest come from ?

do you have some additional information, maybe dates when these results were sent or logfiles from your ftp ?

i really would like to figure out how this could happen.

Thanks in advance,

Stephen Brooks
2002-10-31 09:52:13
Might have been that FTP thing: if it can deposit the result on the server, but for some reason the connection fails or it can't download the signal.dat file, the program thinks that because the signal.dat isn't there, no contact could be made at all, so as a precaution it doesn't delete the results.

So on occasion they get sent multiple times.  This is not a major problem from the project's point of view, as I can now cull those repeats from time to time.  As it turns out, there weren't any obvious examples of deliberate cheating on that list, so nobody is being "accused" of anything.  Nor can I "give you back" any points because you can't understand why you sent duplicates.  To be honest, _I_ probably don't understand why you sent duplicates, but they were here, so they got deleted.  If I delete more regularly, you will never be misled into thinking that you have more points than you actually do.
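The failure mode described above can be simulated in a few lines. Everything here is a hypothetical stand-in (the `FakeServer` class, its methods, and the retry loop); only the idea of an acknowledgement file, signal.dat, comes from the post.

```python
class FakeServer:
    """Stand-in for the FTP server: stores every result that is uploaded."""

    def __init__(self, ack_failures):
        self.stored = []
        self.ack_failures = ack_failures   # how many times the signal fetch fails

    def upload(self, results):
        self.stored.extend(results)        # in this sketch the upload itself succeeds

    def fetch_signal(self):
        """Return True if the acknowledgement (signal.dat) could be downloaded."""
        if self.ack_failures > 0:
            self.ack_failures -= 1
            return False
        return True


def send_until_acknowledged(server, results, max_attempts=5):
    """Resend the same results until the signal is seen; return attempts used."""
    attempts = 0
    for attempts in range(1, max_attempts + 1):
        server.upload(results)
        if server.fetch_signal():
            break   # acknowledged: the client would now delete its results file
    return attempts


server = FakeServer(ack_failures=2)
attempts = send_until_acknowledged(server, ["result-1", "result-2"])
print(attempts, len(server.stored))   # 3 attempts, 6 stored results for 2 genuine ones
```

Two failed acknowledgements followed by a success leave three copies of every result on the server, which is the kind of pattern a later repeat sweep removes.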
2002-10-31 10:12:02
Hi Stephen,

ftp could really be the problem.  over the last weeks i got several timeouts while sending a resultfile - btw, a few minutes ago i got 2 send-timeouts.

i had that ftp idea too, so i asked for the logfiles from the ftp.  it's no problem for me to have fewer points now, that was not my reason for posting, i only wanted to check my idea and help reduce duplicates =.-)


2002-10-31 10:37:53
20% - i just noticed my results and they are down by 58% - that seems an awful lot of dupes :( surely the proxy never submitted THAT many!

i would be grateful if you would send me a copy of the dupes i have done - i would like to try to find out how this is possible (im still gobsmacked!!) and on which machine(s) they have originated :)
Stephen Brooks
2002-10-31 12:10:38
Originally posted by px3:
i had that ftp idea too, so i asked for the logfiles from the ftp.  it's no problem for me to have fewer points now, that was not my reason for posting, i only wanted to check my idea and help reduce duplicates =.-)

OK.  I think there _are_ server logfiles that I can download, so I might have a look (although if it's going to take ages to analyse them for anomalies I might not do that).

Also Infopop has been up and down a lot this month, and my site with it unfortunately.  They're also charging me for way more bandwidth than I think I've used this month - 13GB - and their web-usage counters disagree with each other on the same page.  I've sent a request for user support in.  If they continue to behave like this much into the new year I may well change hosts.
Stephen Brooks
2002-10-31 12:13:04
Originally posted by NicJA:
20% - i just noticed my results and they are down by 58% - that seems an awful lot of dupes :( surely the proxy never submitted THAT many!

I bet it did actually.  I've asked about how that proxy produces duplicates, and it is possible for it to actually produce rather a lot under bad conditions.
2002-10-31 14:27:43
Originally posted by Stephen Brooks:
Originally posted by NicJA:
20% - i just noticed my results and they are down by 58% - that seems an awful lot of dupes :( surely the proxy never submitted THAT many!

I bet it did actually.  I've asked about how that proxy produces duplicates, and it is possible for it to actually produce rather a lot under bad conditions.

When I checked the results harvested by the proxy, there were results that were duplicated as many as 6 times :( I also noticed that the dupes get worse the longer the proxy runs.

What I do now is run the proxy for a minute and then shut it down.  Cut and paste the results harvested from the proxyresult.txt file to the result.txt file in the muon dir.  Do a manual upload from there.

Works like a charm now :)
John Kitchen
2002-11-01 04:55:13
Originally posted by NicJA:
20% - i just noticed my results and they are down by 58% - that seems an awful lot of dupes :( surely the proxy never submitted THAT many!

i would be grateful if you would send me a copy of the dupes i have done - i would like to try to find out how this is possible (im still gobsmacked!!) and on which machine(s) they have originated :)

NicJA, I have sent you some examples via email.  Maybe you should take kenlow's advice, you are still uploading duplicates by the mile.  John
2002-11-01 10:04:41
thanx for the warning John, im just checking my mail.

I can't understand how im still sending dupes - i have been running the proxy (till last night) but was checking the results.txt file when it had crashed.

I then removed all the dupes from it, and sent it manually.

BTW - i just checked your attachments - all from 2 days ago or longer i noticed wink
2002-11-01 11:02:44
STEPHEN - should the stats pages be listing the number of results INCLUDING the dupes?  it seems that thats what its currently doing.
John Kitchen
2002-11-01 11:44:53
Originally posted by NicJA:
BTW - i just checked your attachments - all from 2 days ago or longer i noticed wink

I have emailed you 13 bad files from the last 24 hours.  However the VERY recent files, (since 1740 today your time), look clean.
Stephen Brooks
2002-11-01 14:11:38
Originally posted by NicJA:
STEPHEN - should the stats pages be listing the number of results INCLUDING the dupes?  it seems that thats what its currently doing.

No it's not.  Everyone's scores are lower than they were before I threw out the duplicates.... Of course any new duplicates you send in will be counted, but I'll do another sweep in a few weeks and get rid of those.
At some stage I might even build the duplicate checker into the dynamic stats-receiving script so it gets rid of them as it goes.
2002-11-01 14:20:31
are you sure?  im not on about the number of particles but the actual no of results ... mine doesnt drop when the number of particles do
Stephen Brooks
2002-11-01 14:27:26
That's because you just submitted a rather large number of results full of duplicates again... You've doubled your score since I actually dupe-checked the database.  I'm going to put the duplicate checker in the stats-generator now.

[edit] OK, tried that.  It's too slow having to re-parse the whole of [SG] DOA's file on every hour he sends in anything.  I'll go back to weekly or monthly full-sweeps through the database... [/edit]

[This message was edited by Stephen Brooks on 2002-Nov-01 at 22:28.]
2002-11-01 14:28:20
Thanx again John

... hmmmm.... well, i did leave the proxy running over night , but set to dump weekly as Floppus had suggested, then doing a manual dump so i dont understand what has happened there!

i killed it off totally this morning, so i should be totally clean now. 

STEPHEN - I would be extremely grateful if you could remove the dupes from my score again, and i have killed the proxy TOTALLY till floppus releases his next version :)
Stephen Brooks
2002-11-01 15:30:20
I've removed those duplicates.  Actually most of your 1000+ results score is genuine as it happens.
2002-11-01 15:41:05
thanx Stephen smile

i am pleased that i have actually done that amount big grin
2002-11-02 11:16:44
just out of curiosity - any reason why the results ive dumped since this morning arent showing up on the stats ?

i know theyve been sent since ive been doing it manually.
Stephen Brooks
2002-11-02 12:31:24
Oddly enough you just submitted some more results and your score decreased by about 40 results.  I think there was a glitch in the "Bookmark" for your file after I removed the repeats.  The current reading of 999 results is correct judging from the size of the file.  Wait a few days, and if your score increases as you'd expect over that period then the glitch should be gone.
2002-11-02 13:14:41
this is getting confusing.  im now showing -11,807,149 for today, when earlier it was showing 4,434,324 - what gives?  is this the result of the last lot of dupes being removed?

even at 4,434,324 thats still considerably less than id normally do - in fact it was at this at about 2 this morning, and i know ive dumped a lot more since then.
Stephen Brooks
2002-11-02 14:24:32
I have no time to trace individual queries like this.  I looked at the stats program and it seemed to be doing what it should do, I also got it to recount your stats again.  Maybe the duplicates weren't removed the first time I said they'd been.  I'm sure there are other projects you could join, which have more reliable stats, and don't rely on someone using their free time to maintain them...
