Distributed computing?
Joey2034
2003-05-04 04:13:34
Just a suggestion to make convergence of the muon percentages come a lot faster.  Chances are you've already thought this over, but I'm posting mainly to get feedback or an explanation of why it isn't done.

Here goes.  What this distributed project looks like to me is a gigantic collection of individual projects - i.e. each user evolves his/her own particle accelerator from scratch (unless they use the Best 250 of course).  With this, the muon percentage becomes mainly a function of time on individual machines, so the faster machines or those that have been around longer will have evolved the highest percentages.  Fine, whatever, but that means that the newer users or the users donating spare cycles from slower machines will never have even the slightest chance of competing with the power users, so their cycles are basically wasted on shoddy designs.  Another effect of this is that the global maximum percentage comes from just one machine, and doesn't get the benefit of all the other cycles out there (OK, so I'm exaggerating a little bit, because I'm neglecting the effect of the Best 250, which incidentally I haven't even been able to find for 4.31b, so I had to start my evolution from scratch).

What I'm suggesting might be a big change in the infrastructure of the entire project, but I think it would be worth the effort.  Instead of saving results in the results.txt and uploading to some server somewhere every 100 results, and then doing further work on those results, why not do the following: set up a server to manage _all_ of the results uploaded.  When each user's 100 or so results are uploaded, they are added to the global results database (GRD), then a server-side GA is done on the GRD and a list of 100 or so projects is put in the user's queue (the number should be set by the user, based on their connectivity, processing power, or maybe personal preference).  The client then crunches through the queue, placing the results in results.txt and results.dat.  At the end of the 100, if the user isn't connected to the internet or is otherwise unable to upload, the client can perform the GA on results.txt to simulate more projects until a connection with the server is established.  This way, instead of each machine working with an average population size of 1000 and an average muon% of 1 or 2, _all_ machines work together as a sort of global metaprocessor on the GRD of tens of thousands of configurations.  This way, the total processing power pushes up the max %, instead of just one user's. In other words, instead of users competing to get the highest muon %, they work together, in the spirit of truly distributed computing.
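
To make the idea concrete, here's a rough sketch of the server-side loop I have in mind (Python pseudocode; the names, pool sizes and GA operators are all hypothetical, nothing to do with Muon1's actual code):

[code]
import random

global_results = []  # the proposed "GRD": list of (genome, muon_percent) pairs

def crossover_and_mutate(a, b, rate=0.05):
    """Uniform crossover of two genomes (dicts of 0-999 ints), plus mutation."""
    child = {k: random.choice((a[k], b[k])) for k in a}
    for k in child:
        if random.random() < rate:
            child[k] = random.randint(0, 999)
    return child

def handle_upload(user_results, queue_size=100):
    """Merge a user's ~100 uploaded results into the GRD, breed their next queue."""
    global_results.extend(user_results)
    global_results.sort(key=lambda r: r[1], reverse=True)
    del global_results[10000:]           # keep the database bounded
    pool = global_results[:1000]         # breed only from the strongest entries
    queue = []
    for _ in range(queue_size):
        (a, _), (b, _) = random.sample(pool, 2)
        queue.append(crossover_and_mutate(a, b))
    return queue
[/code]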

The drawback that I can see with this is that some promising families of configurations might be lost due to random chance, but I say 'so what?' As it stands now, I think the problem is that there are far too many families of configurations.  And in the Best 250, that might result in incompatibility when it comes time for crossovers, just like a human can't mate with a cat. 

Looking through the stats page right now, I see that the highest percentage for solenoids-only comes from a user with 1672346.9 Mpts.  That's less than 5% of the total Mpts, which means the system could run upwards of 20 times more efficiently.

Let me know what you think of this idea.  And if it has already come up before, please let me know the outcome of the discussion.

thanks
Stephen Brooks
2003-05-04 04:33:45
Yes, v4.3x is currently a lot of separate projects.  They would normally be coupled by a best250 file or something of the sort, but I've still got some bugs in the client to work out, so in a way it's still in "beta".

Having a lot of separate optimisations is good, though, because the standard weakness of optimisation techniques is that they can get stuck in a "local" maximum.  There is no way around this other than to simulate a very large number of individual pathways from different starting points, to find the global maximum.  Then after I've let the paths go off in their own directions (and possibly towards different local maxima), the bestNNN is released, and typically within that the ones that were stuck at the lower local maxima are gradually competed out by the higher ones.

I think maintaining variety from different paths together in the bestNNN is important because often good ideas in engineering come from trying to combine the best aspects of two fairly different designs.  Muon1 has three different ways of combining genomes, so in other words, it'll try and mate a human with a cat, then it'll try and stitch a human's head onto a cat, and then it'll make a human dress up as a cat before it gives up.

The problem with a central results server as you suggest is that it is _not_ distributed.  Although the current genetic algorithm operations are relatively fast compared to the simulations themselves, it's possible I could in future use something more complicated (like a neural network, or some sort of local gradient estimation) to do this, which with 1000 users and a database of 1 million results might begin to be too much for one computer.

In truth I've been surprised at the number of people who have joined up to this project - I initially only needed a few users, but due to 'popular demand' have been investigating ways to gradually make better use of the computing power available.  The first of those was in v4.2 when I greatly increased the number of parameters in the optimisation - and the network still coped with it.  More recently I've been putting systems in place for me to simulate many different accelerator-design ranges in parallel on the network, to use even more power and also to increase the productivity in general.

Today's weather in %region is Sunny/(null), max.  temperature -99999°C
Joey2034
2003-05-04 08:07:47
To maintain variety in the genetic pool, what I have seen done in some artificial life programs is this: they keep a record of genetic combinations and mutations done - a family tree of sorts - and combine two genomes together only if they are closely enough related.  I don't know the specifics of the genetic algorithm used in this project, so I don't know if incompatibility of genomes is even a problem here.  Do you have any links to detailed information about the specific GA you're using?  Or could you outline the process a bit? 

The problem I've run into with the artificial life programs is that they don't store a very large genetic pool, so they end up deleting most branches of the family tree pretty easily.  But if the entire genetic history is maintained, I see no reason why diversity can't be upheld, even with a central results server.
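
To illustrate the relatedness idea, something like the following toy sketch (I have no idea how Muon1's real GA represents genomes, so I'm assuming dicts of integer parameters here):

[code]
def distance(a, b):
    """Mean absolute difference between two genomes (dicts of 0-999 ints)."""
    return sum(abs(a[k] - b[k]) for k in a) / len(a)

def compatible(a, b, threshold=150):
    """Allow crossover only between sufficiently similar genomes."""
    return distance(a, b) < threshold
[/code]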

I see your point, though, about the processing burden that would be placed on a central server, especially if a neural network were to be used.  But with the GA you're using right now, how much processing power is needed for each project?  When I'm running the console, it looks like information shows up when combining genomes, but it goes by too quickly to read.  I guess this question ties in with the request for more information on the GA.
wirthi [Free-DC]
2003-05-04 10:25:32
Some teams are generating best250 files for their users.  Just check with one of the big teams; at least for Rechenkraft.de I know that they (or some of their members) produce such best250 files.  Of course it would be unfair to "steal" this file without being a member of their team ...
[DPC]Stephan202
2003-05-04 12:12:29
Currently the DPC are also making such a topNNN file.  We have created a script that updates the topNNN file every hour.  Every member can download the file and upload his own results.

At this moment it's only accessible to DPC members, but in time we will make it public.

---
Dutch Power Cow.
MOOH!
AySz88
2003-05-10 11:06:08
    username                  results (v4.0x / v4.1 / v4.2 / v4.3)   total particles   best muon %   hours since last active
 1. [AVE] Mr. Sledge Hammer          0       0      90     9508          1978221.9      17.346667    3
 2. [AVE]Caesar                      0       0    4490     5654           601823.9      13.940349    2

Time for a Best NNN file, perhaps?  Big Grin I have a feeling someone's not trapped in a local maximum...
Or are we going to wait for more 15%+ independently-derived designs to make sure?
px3
2003-05-10 11:12:26
17.xxx and 13.9xxx are really great, but after rechecking these results it seems as if they're not really reproducible.

I did a check on these top results and was never able to reproduce them.

Our team-internal ranking shows that we have currently reached a high-water mark of about 12.9% (reproducible).

PX3
John Kitchen
2003-05-10 12:04:49
I'm pleased to see that my intuition on these high scores is vindicated by actual facts.  Thanks px3. Again, the issue of purging the official scores of artificially high, scientifically invalid yields comes up.

I am quite prepared to offer computing time to recalculation of work units to verify high yields in the interests of more exact science.

I understand this can be done by removing the results.dat and autosave files and saving the subject work unit as "queue.txt".

If any of you ftp site owners want to pass work units my way, please do so.  I should be able to turn requests round in 24 to 48 hours.  If it is too hard to purge the actual stats, we can at least publish verified numbers on this forum.

How many times do you think we'd need to recalculate to get a statistically valid mean?

Best to all, John
Stephen Brooks
2003-05-10 12:29:37
Personally I've been suspicious of everything generated so far that claims to be above about 7.5%, so I'm glad PX3 can reproduce some 12s. I noticed these high scores a while ago (the sudden jump to 13% looked odd) but figured I could get rid of them later, since if you plot the Mpts score against the percentage, these show up as having done maybe a quarter of the amount of calculation they ought to have.

The reason is that there's a bug in the queuing algorithm.  There's one known issue about client restarts causing a result in the process of being retested to just skip the rest of the retesting, which I was going to fix in v4.31c. It _might_ be causing the high results too.  This also explains the delay in me releasing any bestNNN.dat files because I wanted to (1) purge the existing database of false results and (2) release a client that seems not to produce 'freak' results, before doing that.

You can try doing manual retests of various sorts yourselves and come up with a sort of "safe" best100 of all-confirmed results, which I'd be happy to release on the website.  Work on v4.31c has been delayed as I'm revising for my final degree exams now (coming up in 3 weeks' time).

PX3 and John Kitchen, it might be an idea if you tell everyone else how you are doing the retesting.  I didn't specifically program such a capability into the released clients, but if you've got something that works, a HOWTO would be of interest.

Today's weather in %region is Sunny/(null), max.  temperature -99999°C
John Kitchen
2003-05-10 14:53:43
I haven't actually done any retesting, but I heard through the grapevine that it can be done as below.  Maybe you can confirm, Stephen, that this would work:

1. Stop the client
2. Delete the autosave file if any
3. Delete the results.dat file if any
4. Put the work unit into a file called "queue.txt"
5. Start the client.
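
If that recipe is right, steps 2 to 4 are easy to script; here's a sketch (the autosave filename is a guess on my part, hence the wildcard):

[code]
import glob, os

def prepare_retest(workunit_line, clientdir="."):
    """Steps 2-4 of the recipe above: stop the client first, restart it after."""
    targets = glob.glob(os.path.join(clientdir, "autosave*"))   # step 2
    targets.append(os.path.join(clientdir, "results.dat"))      # step 3
    for path in targets:
        if os.path.exists(path):
            os.remove(path)
    with open(os.path.join(clientdir, "queue.txt"), "w") as f:  # step 4
        f.write(workunit_line + "\n")
[/code]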

Does it work?  John
[DPC]Stephan202
2003-05-11 03:56:54
quote:
Originally posted by Stephen Brooks:
The reason is that there's a bug in the queuing algorithm.  There's one known issue about client restarts causing a result in the process of being retested to just skip the rest of the retesting, which I was going to fix in v4.31c. It _might_ be causing the high results too.  This also explains the delay in me releasing any bestNNN.dat files because I wanted to (1) purge the existing database of false results and (2) release a client that seems not to produce 'freak' results, before doing that.

I do not think this bug causes higher yields, only lower yields.  Afaik the program adds up all the found yields and divides by the number of runs.  A restart of the client will delete previous yields, but not the number of runs already done.  So what will happen is this:

NORMAL: 10.2% 10.3% 10.1% #RUNS=003 --> (10.2 + 10.3 + 10.1) / 3 = 10.2%
BUG-SITUATION: 10.1% #RUNS=003 --> 10.1 / 3 = 3.367%

As you can see, in the bug situation two yields are lost because of a restart after the second run, but it still says #RUNS=3.
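
In code, my guess at the effect looks something like this (a toy model only, not the client's actual code):

[code]
yields = [10.2, 10.3, 10.1]

def average_normal(ys):
    return sum(ys) / len(ys)              # (10.2 + 10.3 + 10.1) / 3 = 10.2%

def average_after_restart(ys, runs_done):
    # A restart wipes the stored yields but keeps the persisted run counter,
    # so only the post-restart yields get summed, yet divided by all runs.
    surviving = ys[runs_done - 1:]        # only the third yield survives
    return sum(surviving) / runs_done     # 10.1 / 3 = 3.367%

print(average_normal(yields))                        # 10.2
print(average_after_restart(yields, runs_done=3))    # 3.3666...
[/code]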

---
Dutch Power Cow.
MOOH!
px3
2003-05-11 03:58:52
I did the tests by stopping the client and deleting the results.dat and autosave files.  Then I put the result in my queue.txt and watched what happened.

In addition I raised the "Rechecks for best-so-far results (min. 5)" setting from its default of 5 to 15.

If the result after 15 runs is still about the original high score, then it's valid and reproducible for me.

What I found out is that this only works with results that already have "#runs" higher than 1.

Btw, I deleted all the rechecked results and didn't send them in to be rated.

And last but not least, here's the status of our team-internal database:
results > 10.xx
10.xxxx = 1781
11.xxxx = 989
12.xxxx = 399
13.xxxx = 45
17.xxxx = 1

rechecked results, as mentioned above:
10.xxxx = 531 (success)
11.xxxx = 321 (success)
12.xxxx = 211 (2 failed, rest success)
13.xxxx = 45 (all failed)
17.xxxx = 1 (failed)

All failed results have been removed from our team-internal top-xxxx file.

PX3

[This message was edited by px3 on 2003-May-11 at 12:29.]
px3
2003-05-11 05:01:22
OK, one additional piece of info:

check the results for
tantalumrodz=292;tantalumrodr=000;s1l=596;s1f=996;d1l=564;s2l=961;s2r=998;s2f=991;
d2l=000;s3l=999;s3r=999;s3f=932;d3l=660;s4l=999;s4r=998;s4f=975;d4l=424;s5l=998;s5r=999;
s5f=999;d5l=171;s6l=961;s6r=989;s6f=999;d6l=000;s7l=987;s7r=998;s7f=966;d7l=446;s8l=981;
s8r=999;s8f=999;d8l=198;s9l=610;s9r=975;s9f=976;d9l=249;s10l=825;s10r=999;s10f=994;
d10l=050;s11l=962;s11r=990;s11f=903;d11l=017;s12l=993;s12r=957;s12f=949;d12l=996;
s13l=970;s13r=934;s13f=998;d13l=212;s14l=997;s14r=985;s14f=999;d14l=134;s15l=989;
s15r=998;s15f=999;d15l=121;s16l=974;s16r=975;s16f=983;d16l=335;s17l=988;s17r=980;
s17f=998;d17l=791;s18l=965;s18r=991;s18f=950;d18l=215;s19l=999;s19r=998;s19f=994;
d19l=887;s20l=998;s20r=989;s20f=993;d20l=000;s21l=948;s21r=986;s21f=998;d21l=000;
s22l=838;s22r=998;s22f=999;d22l=191;s23l=971;s23r=979;s23f=977;d23l=816;s24l=984;
s24r=971;s24f=999;d24l=178;s25l=998;s25r=990;s25f=955;d25l=530;s26l=993;s26r=963;
s26f=999;d26l=413;s27l=962;s27r=977;s27f=996;

this candidate produced the 17.xxx and 13.xxx scores.

The actual highest reproducible score for this one is 12.171933; everything above that can be dropped.
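
For anyone scripting their own rechecks: these parameter strings are just semicolon-separated key=value pairs of three-digit integers, so a minimal parser is easy (format assumed from what the client writes; this is not official code):

[code]
def parse_genome(s):
    """Parse 'key=value;key=value;...' into a dict of ints."""
    return {k: int(v) for k, v in
            (field.split("=") for field in s.rstrip(";").split(";") if field)}

g = parse_genome("tantalumrodz=292;tantalumrodr=000;s1l=596;")
print(g)   # {'tantalumrodz': 292, 'tantalumrodr': 0, 's1l': 596}
[/code]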

PX3

forgot the linebreak Big Grin
Stephen Brooks
2003-05-11 05:42:01
It's interesting that PX3's false results are in a very small band (45 around 13%), with few failures lower down.  They look too common to be e.g. a floating-point glitch, which suggests a bug somewhere in the simulation.  I'll fix what I know to be wrong and release v4.31c today, to see if that helps.

Today's weather in %region is Sunny/(null), max.  temperature -99999°C
px3
2003-05-11 05:59:59
Stephen,

I did some more investigation and found out that all (!!!) failed results had been
produced by the v4.31b release.

There are no invalid results from v4.3 (Windows and my ported Unix versions).

Maybe you should check the differences between v4.3 and v4.31b.

PX3
Stephen Brooks
2003-05-11 06:19:23
quote:
Originally posted by px3:
Maybe you should check the differences between v4.3 and v4.31b.

Sounds like a plan.  It's at times like this I'm glad I remembered to update this page.

[edit]OK, instinct tells me that the 'modification to the rechecking algorithm' mentioned in v4.31 is what did it.  This is where the scores get stored in an array instead of just being added up, so I can do more detailed statistical stuff with them.  In terms of bugs, it means that something is probably writing into the array when it shouldn't.[/edit]
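
Purely as illustration of the kind of thing I mean, an out-of-step array write after a restart could mix stale scores into the average (this is guesswork, not the actual client code):

[code]
# Hypothetical: the persisted run counter and the in-memory score array
# fall out of step after a client restart.
scores = [17.3, 17.3, 17.3, 0.0, 0.0]  # stale slots left by a previous result
runs_done = 3                           # run counter restored from disk
scores[runs_done] = 10.1                # the only genuine recheck of this result

# Averaging over runs_done + 1 runs now mixes stale high values with the one
# real one, reporting ~15.5% for a design that really scores ~10%.
print(sum(scores[:runs_done + 1]) / (runs_done + 1))   # 15.5
[/code]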

Today's weather in %region is Sunny/(null), max.  temperature -99999°C
[DPC]DemerZel
2003-05-11 09:00:51
Hmmm, I don't know if it was possible or not.

But I just reran a 12% one in v4.3 and v4.31b
(using the queue file).

v4.31b came up with the same result, 12.4%.

But v4.3 altered the parameters and came up with a 0.0% one.

Original, as used by v4.31b:
tantalumrodz=504;tantalumrodr=027;s1l=994

v4.3 made it:
tantalumrodz=156;tantalumrodr=788;s1l=004;

They started with the same queue file, and after the first run v4.3 had the altered values in it!?
px3
2003-05-11 09:39:15
Some more info about rechecking:

You can use v4.31b to recheck v4.31 or v4.3 results.
You can't use v4.3 to recheck v4.31 results.

v4.3 won't be able to checksum v4.31b results.

PX3
px3
2003-05-11 10:30:57
quote:
Originally posted by John Kitchen:
I'm pleased to see that my intuition on these high scores is vindicated by actual facts.  Thanks px3. Again, the issue of purging the official scores of artificially high, scientifically invalid yields comes up.

I am quite prepared to offer computing time to recalculation of work units to verify high yields in the interests of more exact science.

If any of you ftp site owners want to pass work units my way, please do so.  I should be able to turn requests round in 24 to 48 hours.  If it is too hard to purge the actual stats, we can at least publish verified numbers on this forum.

How many times do you think we'd need to recalculate to get a statistically valid mean?

Best to all, John


Hi John,

Currently I'm updating our team homepage, not with graphics but with the recheck status for our results.db.

I'll try to code some small routines to automatically generate recheck queue files for the top results.

At the moment half of my Suns are doing checks on v4.3 results and half of my Windows PCs are checking the v4.31b results.  It would be helpful if some more users participated in this, so I hope to finish the coding within the next 2 or 3 days.

If you're interested, take a look at our team homepage.

In general I think about 10 reruns should be enough to verify a result.

Regards,

PX3
[DPC]DemerZel
2003-05-11 11:54:37
quote:
Originally posted by px3:
Some more info about rechecking:

You can use v4.31b to recheck v4.31 or v4.3 results.
You can't use v4.3 to recheck v4.31 results.

v4.3 won't be able to checksum v4.31b results.

PX3


Shouldn't it say "blurp" or something?
Ah well, it is an old version anyway.
John Kitchen
2003-05-12 11:48:35
Thanks px3. Anything you can do to help make the recalc process more automated will be appreciated!

Do you plan to publish validated results from other teams?
px3
2003-05-12 19:10:47
Hi John,

That's what I plan, once the recalc process is automated.
If Stephen agrees, I think a best-of file with high scores checked more than 10 times could be a possible way.

PX3
John Kitchen
2003-05-15 13:12:06
Meanwhile, I have changed the recalc parameter so that my new high watermarks are redone 10 times.  I am seeing a spread of 0.067 in the current file with a mean of 13.2811 across 8 values.  Taking an average of 10 recalcs should tighten it up and make the results more meaningful.
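
For reference, if the individual yields are independent, averaging n recalcs should shrink the scatter of the mean by a factor of sqrt(n).  A quick sanity check on my numbers (treating the 0.067 spread as per-value scatter, which is an assumption on my part):

[code]
import math

spread = 0.067    # observed spread across 8 values in the current file
n = 10            # planned recalcs per result
print(spread / math.sqrt(n))   # ~0.021: expected scatter of the 10-run mean
[/code]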

John