2007-06-02 00:20:54
Well Stephen it was just a suggestion.... Anything at all might do.

RGtx, why, thank you for your helpful comments.

Nobody saw a reason to complain in the thread quoted.  However, in this thread two people have complained about not-so-Puritan talk.

2007-06-02 00:49:33
riptide, I was jesting.  To be honest, the language and imagery employed in this thread would hardly make a maiden aunt blush.

On a lighter note, in all probability it was you who brought this fiasco on us - never ever denigrate a Dell, it will always pay you back!
2007-06-02 05:30:46
Oh right.  hehe!  Was it me, in my suggesting-new-hardware mode?  Oops... did I not include the disclaimer?
2007-06-02 17:27:13
Having participated in this project from the "early days" - and having been at one time a "many boxen" producer - I understand the frustration of some for whom stats are to live and die by.  There's certainly room in the project for the small producer, all the way up to those that command hulking datacenters of crunch, and that's one of the great things about DC.  We can choose to crunch large volumes, or crunch high quality - or anything in between - and everyone's motives are suitably pure.

May I suggest that, rather than berating Stephen (who has done a yeoman's job singlehandedly managing a huge project), it would be just ever-so-slightly more useful to provide positive input toward improving the reliability of the stats?

I would submit that there is a pretty substantial amount of expertise in a number of areas involved here.  Certainly it's all up to Stephen how or IF any participants can assist from the IT perspective... but in any event, I'd have to think that people as talented as the participants of this project can do better than whine about how things aren't "just so." I offer this - bearing in mind that I myself have been one of the biggest whiners about various issues that are meaningless in the grand scheme of things.  Mea Culpa.  My attitude about my contribution to this project changed when I began reading and trying to understand what it is we're doing -- and it's fascinating stuff.  So for me... it's now about more than stats.  (convenient for me since I now only have two machines, down from my former crack-rack and data lab days.)

It is notable that a number of people offered technical assistance, and that is laudable.  I do feel that some of the criticism is just a little bit over-the-top.  Yes - much of it was tongue-in-cheek, and sometimes humor doesn't translate well... but I think that some of us have treated Stephen quite a bit more roughly than he is deserving of.

The questions I pose are these:

A. Would Stephen consider a small virtual team to assist in some yet-to-be-determined way in the technical and computing aspects of the project?

B. Would there be participants who would be willing to donate their time to make (A.) possible?

If A. and B. are true, then shall we move forward?

Just some food for thought.

Stephen Brooks
2007-06-04 15:21:12
Well we've gone as far as distributing the FTP servers.  Distributing the datacenter would require me to come up with some agreed protocol for everyone to store and count the results.  Having said that, it might also make cheating a bit too easy, unless of course the clever "agree by consensus" arrangement between the datacenters makes that unlikely.

Also be aware the database as it stands is 136GB (89GB on a compressed drive) and will get larger in its current arrangement.  This would also be a rather high-bandwidth activity, and I'm not sure we really want to multiply up the network traffic just for the sake of it!  It'd be hard for me to receive a backup of a database of that size too if it's not on my LAN (100GB at 8Mbit/s is over a day...)
Stephen Brooks
2007-06-04 15:22:16
...not that you can usually upload at 8Mbit/s anyway over DSL!  Mine at home can receive at 6.5Mbit/s but I think the upload is smaller by some ratio like 8:1.
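
For anyone who wants to check the transfer-time figures, here is a rough sketch in Python. It assumes decimal units (1 GB = 10^9 bytes, 1 Mbit = 10^6 bits), a link that sustains its full rate, and no protocol overhead, so real transfers would be slower. The function name `transfer_hours` is made up for illustration, and the upload rate is only the 8:1 guess applied to 6.5Mbit/s.

```python
def transfer_hours(size_gb: float, rate_mbit_s: float) -> float:
    """Hours to move size_gb decimal gigabytes at rate_mbit_s megabits per second."""
    bits = size_gb * 1e9 * 8               # bytes -> bits
    seconds = bits / (rate_mbit_s * 1e6)   # ideal, overhead-free transfer
    return seconds / 3600


if __name__ == "__main__":
    download_rate = 6.5                    # Mbit/s, home DSL downstream
    upload_rate = download_rate / 8        # Mbit/s, assuming the guessed 8:1 ratio

    cases = [
        ("100GB at 8Mbit/s", 100, 8.0),
        ("136GB at 6.5Mbit/s (download)", 136, download_rate),
        ("136GB at 6.5/8 Mbit/s (upload guess)", 136, upload_rate),
    ]
    for label, size_gb, rate in cases:
        hours = transfer_hours(size_gb, rate)
        print(f"{label}: {hours:.1f} hours ({hours / 24:.1f} days)")
```

The first case works out to 100,000 seconds, or a little under 28 hours, which is the "over a day" figure.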
2007-06-05 02:14:02
What if you set a break point at 100 GB, move that first 100 GB to a drive of its own, and make a backup of it for safety?
Then only make backups of the last 35+ GB as it grows.  That way you could use a network drive backup.  They only cost $200.00+ for a 500 GB unit and $350.00 to $400.00 for a 1 TB unit.
They can work without human intervention too.

You can also use incremental backups to save on bandwidth.
If you can find a way to back up over the Internet to an IP address, I have 2 to 3 TB of free HD space in my system at home.
2007-06-08 08:26:50
136GB backed up to a decent backup server would take well under an hour.  If the database is something sensible, it should support incremental backups enabling the data to be continuously secured.
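
As a rough check on the "under an hour" figure, this Python sketch works out the sustained rate needed to move a given amount of data in a given time. It assumes decimal units and no overhead, and the name `required_mbit_s` is made up for illustration.

```python
def required_mbit_s(size_gb: float, hours: float) -> float:
    """Sustained rate in Mbit/s needed to move size_gb decimal gigabytes in the given hours."""
    bits = size_gb * 1e9 * 8          # bytes -> bits
    return bits / (hours * 3600) / 1e6


if __name__ == "__main__":
    rate = required_mbit_s(136, 1.0)
    print(f"136GB in one hour needs about {rate:.0f} Mbit/s sustained")
```

That is roughly 300 Mbit/s, so the estimate fits a backup server on a fast local network rather than a DSL link.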
Stephen Brooks
2007-06-08 08:46:27
I meant a remote backup as DanC was suggesting.  I was only following up his idea...
