stephenbrooks.org Forum » Muon1 » General » fair reward of computing power
gogomaus
2002-08-13 15:48:50
Hi Stephen,
in the thread "muon1v4.211 crashed" you explained very well that particle counting is currently only a mediocre reward for computation time spent.  Fairer would be the product of particles times timesteps invested for each run, which you said you would consider taking into account for version 5 somewhere in the future.
I can imagine that this would cause enormous programming effort.
On the other hand it's a little bit frustrating to watch "newcomers", or people who do not use an accumulated results.dat, gain a multiple of the particle counts of "advanced" users.
So I had a simple idea for matching both groups' interests.
How about awarding "particle points" as the measured quantity, namely the sum over runs of particles times muon-%-yield?  (The latter is a more appropriate measure of the computing time contributed to your µon project.)
Historical results from v4.0 - v4.1x could be evaluated with a "standard yield" of 1.000%, but v4.2y should be weighted with its individual yield factors.

I don't know if that's possible to implement easily on your stats server.
I trust in your proven ability to find a compromise for the majority of users.

It is just a simple proposal to meet the overall interest - in my opinion.
It would be great if you consider this worthwhile enough to look into.

Kind regards, Wolfgang
GP500
2002-08-13 23:13:39
I get your point,
but I think versions 1, 2, 3 and 4 had better NOT be fused into one score, since they are
in themselves different projects, I believe.

T E A M: Grutte Pier [Wa Oars] / Fryslân wer mei Kening \
Stephen Brooks
2002-08-14 07:16:32
Sum of (Particles * Muon percentage) over all results.  I like it.  It would reward people who persist at the project and get higher scores.  It is also fairly easy to implement in the stats-generator, since that reads all the lines in the file anyway.

However, I'll have to ask the rest of the people on the project whether they mind the stats being changed like this.
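The "sum of (particles * muon percentage)" rule can be sketched in a few lines. This is an illustrative sketch only - the tuple format below is an assumption, not the real results.dat layout:

```python
# A minimal sketch of the proposed "particle points" rule, assuming each
# result is just a (particle_count, muon_yield_in_percent) pair; these
# names and the data format are illustrative, not the real stats code.
def proposed_score(results):
    """Sum of (particles * muon percentage) over all results."""
    return sum(particles * yield_pct for particles, yield_pct in results)

# The two strategies gogomaus compares later in the thread:
runs_a = [(30_000, 0.15)] * 8   # eight short runs from scratch
runs_b = [(44_000, 0.90)]       # one long run from a good results.dat
print(proposed_score(runs_a))   # ~36000 particle points
print(proposed_score(runs_b))   # ~39600 particle points
```

Under this rule the single long, high-yield run scores slightly better than the eight quick runs, which is exactly the rebalancing being proposed.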


"As every 11-year-old kid knows, if you concentrate enough Van-der-Graff generators and expensive special effects in one place, you create a spiral space-time whirly thing, AND an interesting plotline"
Michal Hajicek
2002-08-14 08:17:27
Gogomaus, good idea, I think.  (If the current columns won't be removed.)  But remember that luck plays a certain role in this scoring method (not really) and it's not so precise.  So Stephen's way is more appropriate.

Q: Is the particle number in the results.* files the total number of simulated particles for each run?  And if it is (I think I'm right), what does this number depend on?  (This number doesn't follow the muon yield (or vice versa), as I could see in my 900K dat file.)  And are all particles simulated to the end of each run?  And does every particle take the same CPU time?  ... I hope you can understand my poor English, and that it's not nonsense at all.
Stephen Brooks
2002-08-14 08:32:29
The particle count is the total number of particles that ever were simulated, including those produced by the pion decaying into muons multiple times.  Not all of these take the same CPU time because some are lost earlier than others.  The particle count does increase slightly with higher muon yields because more particles survive for longer and have more chances to decay.


Jim Reed
2002-08-14 09:06:29
Stephen,

Would it make sense to modify the scoring function for the optimization to consider the number of particles in addition to the % yield?  (I'm assuming that it only considers yield right now.)

It seems to me that such a change would help select configurations that allow particles to get farther down the channel even if the last stages keep them from getting out.

(Just to keep this on topic, I personally do not mind you changing the stats - or leaving them the same!  wink)
Stephen Brooks
2002-08-14 11:26:47
quote:
Originally posted by Jim Reed:
It seems to me that such a change would help select configurations that allow particles to get farther down the channel even if the last stages keep them from getting out.


That is a sensible idea, and I might have done such a thing if the v4.2 optimisation had "struggled" with zero or very-low results - the particle count would have given it something of a 'ramp' up to where the muon yield starts scoring.

Right now enough particles are getting through, it seems, that the optimiser is functioning correctly.  Since the yield is actually what we're worried about, I'm not going to include anything else in the scoring function this time.


gogomaus
2002-08-14 14:02:41
quote:
Originally posted by Michal Hajicek:
Gogomaus, good idea, I think.  (If the current columns won't be removed.)  But remember that luck plays a certain role in this scoring method (not really) and it's not so precise.  So Stephen's way is more appropriate.


Luck always plays a big role, in this sim project as in any simple way of scoring.
Maybe my first posting was too short and/or you were not aware of the time factor.  So I will give a short overview with concrete figures (taken from my slow PC, but valid in proportion for everyone).

I have the choice to start from scratch (way A) or to use my currently good results.dat (way B).  Either way I will spend 2 hours of computing power.

A) do 8 runs @ 15 min each with 30,000 part. and average yield 0.15%

B) do 1 run during 2 hours with 44,000 part. and muon-yield of 0.90%

The reward would be (SB-now) resp. (gogo):

A) (SB-now) = 240,000 part.  or (gogo) = 36,000 (part. * %)

B) (SB-now) = 44,000 part.  or (gogo) = 39,600 (part. * %)

I cannot see that my proposal is any less precise or less appropriate than what has happened until now.
Perhaps it would favour better results a little more, and one could consider that unfair too.  OK, but even then it is relatively much fairer than before.

And last but not least, the original purpose of this project is to find the _best_ design, measured as muon-yield-%, not to produce a maximum of particles...

Hope nobody fell asleep while reading this longer explanation.

It would be nice to get some more responses, especially from users who have experienced the strongly increasing calculation time at higher yields.
Stephen Brooks
2002-08-14 16:01:06
I've decided I'm not going to radically change the scoring until version 5, when I will make some other "system architecture" changes as well.  One reason for this is that people object to their stats being messed around.

However I think there should be a bigger reward for the higher scores, so for results above 1% I will do the following...

1% - 1.1999% particles score multiplied by 5
1.2% - 1.3999% particles score multiplied by 6
1.4% - 1.5999% particles score multiplied by 7
(etc.)

This is admittedly not as "neat" as the other system, but it does have the advantage of preserving the existing stats scores (as nobody has gone over 1% yet).  It won't reward very fairly in the 0.6% - 0.9999% range, but fairly soon I shall upload a best-results seed file that should allow more people to get into the higher >1% results range fairly quickly.
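Assuming the "(etc.)" pattern continues in 0.2% steps, the tiered bonus could be written as follows. This is a sketch of the rule as described in the post, not the actual stats-generator code (note that exact tier boundaries like 1.2% can need care with floating-point yields):

```python
# Sketch of the tiered particle-score bonus described above, assuming
# the "(etc.)" pattern continues: below 1% yield the plain particle
# count scores; from 1% up, the multiplier starts at 5 and rises by 1
# for each further 0.2% of yield.  Illustrative only.
import math

def scored_particles(particles, yield_pct):
    if yield_pct < 1.0:
        return particles          # existing scores unchanged below 1%
    multiplier = 5 + math.floor((yield_pct - 1.0) / 0.2)
    return particles * multiplier

print(scored_particles(40_000, 0.95))  # 40000 (no bonus below 1%)
print(scored_particles(40_000, 1.05))  # 200000 (x5 tier)
print(scored_particles(40_000, 1.45))  # 280000 (x7 tier)
```

This preserves every existing score (nobody is over 1% yet) while giving the >1% range a steeply growing reward, exactly as the tier table above states.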


gogomaus
2002-08-14 16:33:38
Thanks Stephen,
no revolution, but a smart idea which will pay off somewhere in the future.
I do understand the other users' concerns and the modern "team spirit".
These are hard times for lone fighters.

A look at my muon progress prediction module (MPPM) shows I will break through the 1% barrier on Saturday morning shortly after sunrise.... big grin
Stephen Brooks
2002-08-14 16:45:12
(Directed at gogomaus) This isn't strictly a complaint, but you could have TOLD me you had manually modified some of your results to get better yields.  I found this when I constructed the best250.dat (I allow 10 results max per user and other than that the file just contains the best ones).  Your best results had uncannily 'round' numbers and sequences in them.  I assume what you did was to put one manually-constructed result into results.dat so that the program would pick it up and test that design.  This is actually fine and I was thinking about adding in such a "manual design" capability to a future version.


gogomaus
2002-08-15 02:22:02
@ Stephen,
your analysis is correct.  After 250 runs my prog was lost optimizing some rather idiotic designs, so I manually overwrote the first entry in my results.dat with a "fairly good" one, giving it a virtual yield of 0.999999%.
Of course, I never sent this in for stats counting.
And indeed it worked: after a while it produced some 0.8x% results by repetition, mutation and coupling with others.
I chose artificially round parameter values to make it obvious and to be able to detect those "genes" easily in later generations.
The reason for not informing you was simply the _feeling_ that you were not interested in such an additional way of accelerating things.  I posted several times (under v4.0x?) to concentrate the search on a more limited area, especially in terms of zrsc and rotsrc (and now s1f).  I also proposed not starting from scratch, but distributing small seed packages from the beginning.  In the past I got no (satisfying) answer from you, so I did not like to raise this basic issue with you again.  As you nowadays seem more open-minded towards similar techniques, I'm happy to have demonstrated something worthwhile to you.  My intention was not to act smarter than you, and in future I will return to a more discussion-oriented mood.

@ community
It was a great pleasure for a tiny user like me to lead the %-top score for some days in competition with mega-crunchers, who can do more work units per day than I can produce in a month.
I have paid for those top results with an even slower rate of new results.

You are free to crucify me, but I'm still convinced I have made some intelligent and valuable contributions to the µon project...
Stephen Brooks
2002-08-15 03:27:27
quote:
Originally posted by gogomaus:
After 250 runs my prog was lost optimizing some rather idiotic designs, so I manually overwrote the first entry in my results.dat with a "fairly good" one, giving it a virtual yield of 0.999999%.


Looking at my results graph it seems that although you may get these occasional plateaus, the algorithm does a fairly good job of continuing to increase the yield.  I could have actually seeded this project with a ~1.8% result from v4.1x, converted in terms of the new genome, but I was more interested in seeing what happened with this algorithm if it was just given free rein within the bigger design space.

Your "intervention" does not really mess this up because you probably changed the design to something not quite the same as I was working on in v4.1x, so it is still looking at different sorts of things.

quote:
The reason not to inform you was simply the _feeling_ you were not interested in such an additional accelerating way.


I was mostly concerned that if I told people about this, some idiot would alter their results.txt and I'd get sent invalid results.  A safer way would be a "manual design" mode integrated with the graphical program, producing a queue of manually-submitted designs to test before the program goes back to its normal algorithm again.

quote:
I posted several times (under v4.0x ?) to concentrate the search to a more limited area especially in terms of zrsc and rotsrc (and now s1f).


That is really what I'm trying to get the algorithm to do by itself.  I have one or two ideas for other sorts of optimisation steps, and if I mix those in, the proportion of random runs will decrease from 1/4 to 1/6 or so - which I think was your problem: the number of random results being used.  If I put in a manual mode I could also let users control the proportions of these different methods.  There's no harm in allowing that control at this stage - in fact it can do very little except accelerate the optimisation further.

quote:
Also I made the proposal not to start from scratch, but to spend small seed packages from beginning.  In the past I got no (satisfying) answer from you, so I didnot like to repeat this basic issue again to you.  As you nowadays look more openminded for similar techniques, I´m happy to have demonstrated to you something worthful.


The main problem I had with many small seed-packets is that it meant doing something more with FTP, and I didn't want to risk breaking an already-working system.  If I ever implement auto-version-updates I will also have the chance to think about such a two-way system, because I'll have to manage downloads anyway.  The other issue is that searching the whole database (now 100MBish) for the best results takes quite a lot of time, and I would rather not do it every hour like the stats updates but maybe once per day.  Such a system would probably work by generating a best500 file once per day, then splitting it into 100 five-result "seeds", and when users send their results, they automatically download a seed which gets added to the end of their results.dat.  But that is not problem-free either, because you might get people who keep manualsending one result at a time until they are given the seed containing the highest result!  (Defeating the point of preserving variety by having limited access across results.)
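The once-per-day seeding scheme described above can be sketched briefly. The helper names and the idea that a "result" is one line of text are assumptions for illustration, not the real server code:

```python
# Sketch of the daily seeding scheme: split a best-first list of 500
# result lines into 100 seeds of 5, then hand one seed to each user who
# uploads.  Hypothetical helpers, not the actual Muon1 server code.
import random

def make_seeds(best500_lines, seed_size=5):
    """Split the best-500 list into five-result seed packets."""
    return [best500_lines[i:i + seed_size]
            for i in range(0, len(best500_lines), seed_size)]

def seed_for_upload(seeds):
    """Return a seed for a user who has just sent results in.
    (A random choice - which is why repeat manual-senders could
    keep fishing until they draw the seed with the top result.)"""
    return random.choice(seeds)

best500 = [f"result_{i}" for i in range(500)]  # placeholder lines
seeds = make_seeds(best500)
print(len(seeds), len(seeds[0]))  # 100 seeds of 5 results each
```

The random hand-out in `seed_for_upload` is exactly the weak point named in the post: nothing stops a user from uploading single results repeatedly to resample.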

It was always my intention to release these bestxxx.zip files from time to time, but I preferred to keep it a fairly long time between them to let the simulations diverge a bit first.


gogomaus
2002-08-15 05:06:59
quote:
Originally posted by Stephen Brooks
quote:
Also I made the proposal not to start from scratch, but to spend small seed packages from beginning.  In the past I got no (satisfying) answer from you, so I didnot like to repeat this basic issue again to you.  As you nowadays look more openminded for similar techniques, I´m happy to have demonstrated to you something worthful.


The main problem I had with many small seed-packets is that it meant doing something more with FTP and I didn't want to risk breaking an already-working system.  If I ever implement auto-version-updates I will also have the chance to think about such a two-way system because I'll have to manage downloads anyway.  The other issue is that searching the whole database (now 100MBish) for the best results takes quite a lot of time and I would rather not do it every hour like the stats updates but maybe once per day.  Such a system would probably work by generating a best500 file once per day, then splitting it into 100 5-results "seeds", and when users send their results, they automatically download a seed which gets added to the end of their results.dat.  But that is not problem-free either because you might get people who keep manualsending every 1 result until they get given the seed containing the highest result!  (Defeating the point of preserving variety by having limited access across results).

It was always my intention to release these bestxxx.zip files from time to time, but I preferred to keep it a fairly long time between them to let the simulations diverge a bit first.


What about using a counter, so that small seed packs are limited to one per user per day, just to avoid too much traffic from manual senders?
I know it is much easier to make proposals than to get them established on a live server.  So if that causes too many (new) problems, forget it and proceed as you pointed out.  Everybody can live with the current procedure.
Stephen Brooks
2002-08-15 07:20:10
Well I thought of another way around it which was to allocate each user a coordinate (x,y) on an integer grid and only let that person receive seeds from their 4 or 8 nearest "neighbours" on the grid.  But then I realised eventually the good results would spread everywhere throughout the grid anyway, so it's almost equivalent to a best250 file at less-regular intervals.

Your method would again serve to limit the spread of results, but the eventual effect would be the same - i.e. I think the good results will tend to spread around anyway, so it would be very difficult to gauge any difference between these methods and the current one.  There is also the question of choosing the "limiting rate" for seeding - and I'm afraid this would end up being a nearly arbitrary decision because it is difficult to test and compare different values' effects on the project unless I did something drastic like splitting it into five parts, each with a different seeding frequency.  And all this might just be to accomplish what best250 is already doing.


gogomaus
2002-08-21 09:06:49
Dear Stephen,
I mostly agree with your detailed considerations, and especially that some other ideas would cause much more effort and perhaps unforeseeable complications, so I defer to your way of general seeding at an adequate frequency.

Just a quick pre-announcement: I will manualsend an extraordinarily good top result later this evening.  Just to avoid any misunderstanding between us:
I definitely confirm its natural evolution (I just added the best 100 of your last Top 250 to my results.dat on Sunday); no "manual acceleration"!
You may delete this message after having read it, but I couldn't reach your e-mail SB@stephenbrooks.org (failure message).
Bye, Wolfgang
Pascal
2002-08-23 05:36:15
quote:
Originally posted by Stephen Brooks:
..
That is really what I'm trying to get the algorithm to do by itself.  I have one or two ideas for other sorts of optimisation steps, and if I mix those in, the proportion of random runs will decrease from 1/4 to 1/6 or something, which I think was your problem - the number of random results being used.  Also if I put in manual-mode I could also let the users control the proportions of these different methods used.  There's no harm in allowing that control at this stage - in fact it can do very little except accelerate the optimisation further.


Stephen, I've got a question:

How much do you want the client to produce these randomly generated results?  Is 1/6th enough?
Now I know about these four different kinds of generating results...
If my question doesn't make sense to you, delete this posting.

___________________________
Member of www.rechenkraft.net - German Website about Distributed Computing Projects

1: Athlon TB-C, 1.2 GC/s, 256 MB DDR-RAM, Erazor x², ADSL-Flatrate, NIC Intel, Win 98 SE Mainboard MSI-6380 Rev.  1
2: Pentium III, 600 MC/s, 256 MB RAM, NIC Intel, Win 98 SE
Stephen Brooks
2002-08-23 06:14:19
The problem is, it's very difficult to _tell_ what the correct proportions of these four types are.  I assumed that putting them equal would not be _too_ bad a choice in any circumstance, and have also attempted to test the algorithms on "toy" problems to see if mixing the types differently changed the speed of optimisation.  From those results it still looks like having an equal amount of each algorithm is close to being the best thing.
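The equal-proportions mixing can be sketched in a couple of lines. The four move names below are illustrative placeholders, not the actual Muon1 algorithm names:

```python
# Sketch of mixing four optimisation step types with equal probability,
# as in the equal-proportions choice discussed above.  The move names
# are hypothetical, not the real Muon1 internals.
import random

MOVES = ["random_design", "mutate_best", "crossover", "interpolate"]

def pick_move(weights=None):
    """Pick the next step type; equal weights (1/4 each) by default."""
    weights = weights or [1] * len(MOVES)
    return random.choices(MOVES, weights=weights, k=1)[0]
```

Passing non-uniform `weights` is how the earlier idea of dropping random runs from 1/4 to 1/6, or of user-controlled proportions, would slot in.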

