Project offline while I move to the US
Stephen Brooks
2013-10-18 22:12:23
Some time this Monday (21st October) I will switch off the stats server - currently my home PC - so that it can be shipped to the US.  I'll send it by air freight but the main delay is whether the Brookhaven Laboratory network will play nicely with FTP or not.  The worst case is that I can't run the stats until I've moved into my own apartment (~2 months).  The best case would be stats back up in about 2 weeks.

So it's up to you how to handle this.  You may run in offline mode and then dump results when it comes back up (and I will find a way).  Or you can take a muon holiday and try other projects; the LHC has a good one on Monte Carlo particle collision analysis.

This warning is also so that you can upload any results you want credit for before the statistics freeze.

Sorry for the inconvenience - moving to another country is a bit tricky administratively!
Zerberus
2013-10-18 23:20:53
No problem, take your time.  I'll take a little muon time off and do other projects in the meantime.

Will flush the remaining results ASAP.
[OcUK]diogenese
2013-10-19 09:11:43
I have done the emigration thing myself and from experience some stuff can lose its priority when you are trying to adjust to a new normal.
My low power borgs will probably stay on but I think my main cruncher may find something else to do for a bit.  Enjoy the move!
Stephen Brooks
2013-10-21 12:09:49
Stats now stopped.
K`Tetch
2013-10-21 22:00:20
For those worried, any results you do send should still be counted, EVENTUALLY.

Three of the four current results servers are still running and should take your results when generated.  Normally, the stats server will then pull the results from these, but since it's not going to be online, it won't.

There are no promises about the intermediate results servers, as a multi-week hold may swallow up a lot of disc space, but the transfer method is a positive-response one (meaning unless everything checks out, it won't mark a transfer as completed).
You can keep an eye on the server stats on the server stats page.

OR you can set your config files to continue offline.
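
The "positive-response" idea described above can be illustrated with a minimal sketch. The thread does not say how Muon1's servers actually verify a transfer, so the SHA-256 comparison and the function names here (`sha256_of`, `transfer_verified`, `pull_result`) are assumptions for illustration only.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_verified(src, dest):
    """Copy src to dest and report success only if the copy matches the original."""
    shutil.copyfile(src, dest)
    return sha256_of(src) == sha256_of(dest)


def pull_result(src, dest, completed):
    """Record src in the 'completed' set only after a verified copy (hypothetical bookkeeping)."""
    if transfer_verified(src, dest):
        completed.add(src)
        return True
    return False  # leave the file queued so it is retried on the next run
```

The point of the design is that a failed or partial copy is never marked as done, so the worst outcome of an interrupted transfer is a retry rather than a lost result.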
Stephen Brooks
2013-10-23 18:53:13
The air freight has been loaded today, estimated arrival in US in 7-14 working days, which is 2013-Nov-01 to 2013-Nov-12. I still have to find out about FTP/network issues, though, plus factor in the time to set it up correctly.
Stephen Brooks
2013-10-24 20:57:03
Found some information on the FTP issue here. Looks hopeful: as long as the FTP is initiated from inside the network it's OK.  I can't run a server but I couldn't at the previous lab either: that's why the stats server runs an hourly batch system.

[edit] Just did a test of FTP scripting from here in my web backup utility.  Just needed the proxying as per that page, plus "quote pasv" immediately after logging in to activate passive mode.

[edit2] Hmm, it's working *intermittently*. Asking network guys about it.
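
For anyone scripting something similar, here is a rough Python sketch of a client-initiated, passive-mode download through an FTP proxy. It is not the Windows FTP script used above. The `user@remote_host` login convention is only one common proxy style and is an assumption, as are the function names; the actual Brookhaven proxy settings are on the linked page and are not reproduced here. Note that Python's `ftplib` already defaults to passive mode, so the `set_pasv(True)` call is just making that explicit.

```python
import ftplib
import os


def proxy_login_name(user, remote_host):
    """Assumed proxy convention: log in to the proxy as 'user@remote_host'."""
    return f"{user}@{remote_host}"


def connect_via_proxy(proxy_host, remote_host, user, password, timeout=30):
    """Open a control connection to the proxy, started from inside the network."""
    ftp = ftplib.FTP()
    ftp.connect(proxy_host, 21, timeout=timeout)
    ftp.login(proxy_login_name(user, remote_host), password)
    return ftp


def fetch_files(ftp, names, dest_dir):
    """Download each named file over an already-connected FTP session."""
    # Passive mode: the client also opens the data connections,
    # so nothing has to connect inwards through the firewall.
    ftp.set_pasv(True)
    fetched = []
    for name in names:
        path = os.path.join(dest_dir, name)
        with open(path, "wb") as out:
            ftp.retrbinary(f"RETR {name}", out.write)
        fetched.append(path)
    return fetched
```

Because both the control and data connections originate from the client, this fits a site policy where outbound FTP is allowed but running a server is not.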
Stephen Brooks
2013-10-31 20:51:15
Got UPS calling me because they hadn't received a customs form and my passport scan.  Have provided those.  Probably means my air freight will arrive soon.

Meanwhile the network help here are confused why my FTP is only working intermittently (it works from their computers on this site), they're offering to do packet traces etc. over the next few days.
[OcUK]diogenese
2013-11-01 06:36:05
Good news, I'm getting withdrawal symptoms!
Stephen Brooks
2013-11-01 21:04:51
Everyone here still puzzled by the FTP problem, it may be specific to the Windows FTP client (when used with FTP proxies).

[edit] I've got a workaround using WinSCP's scripting mode working for a simple case.  Don't know yet if it has the "multiple get" function required for Muon1's stats server too.
Stephen Brooks
2013-11-05 01:30:53
Well, this wasn't exactly what I intended.  My air freight boxes got delivered to my office instead of my temporary apartment.

[Photo of the delivered air freight boxes]

The right-hand one of these I think has the Muon1 server HDDs inside my home PC.  I need to get these back to the apartment tomorrow, power up the home PC to make sure it's OK and then take the Muon1 HDDs back out and to the office.
[OcUK]diogenese
2013-11-05 17:58:56
They look like they have taken a bit of a beating, I hope things are okay inside
Stephen Brooks
2013-11-06 00:03:58
Yes, not good news.  The PC took a bashing right on the corner where the hard disks were.  Metal case bent and HDDs fallen out of their trays.  A few plastic bits were rattling around the case.

Fortunately I have a backup of the Muon1 database in the UK and I've just e-mailed to ask my group to make an extra copy of that.  Can't power on the PC right now as the cables are in another box, I'll try to get a US power cable tomorrow.

I've photographed the damage and insured the PC for £1050, so at least I get to replace it if it's completely broken.
Stephen Brooks
2013-11-06 00:07:34
Incidentally it's pretty hard for Muon1 to lose the *important* data: that is, the best results and the stats totals.  About 6 different places record the stats scores and the best results are distributed via the sample files to >100 people.  The probability of scientific data loss is small.  The worst plausible case is that the intermediate results (those not good enough to be in the sample files) are lost.  And right now, barring the office in the UK blowing up, the situation isn't even that bad.
RGtx
2013-11-06 01:29:56
And in the case of Linac900Ext6Xc2_nosample ? 
Stephen Brooks
2013-11-06 03:26:57
In that case I just ask the user who had the highest score - currently K`Tetch, weirdly - to send me their results.dat.
Stephen Brooks
2013-11-07 00:49:31
More drama today: the situation was the reverse of what I thought.  The UK backup had been deleted because my backup script saw that the drive wasn't there and decided to "sync" by also removing the backup.  However, when I got home and powered on the dented PC, at least 3 of the 4 HDDs worked, including the system drive and both mirrors of the Muon1 database.
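
The failure mode described above (a mirror script treating a missing source drive as "everything was deleted") can be guarded against with a check before any deletion. This is only a sketch under assumptions: the real backup script is not shown in the thread, the names `mirror` and `SourceMissingError` are hypothetical, and it handles a flat directory of files for brevity.

```python
import os
import shutil


class SourceMissingError(RuntimeError):
    """Raised when the source looks unmounted or empty, so nothing is deleted."""


def mirror(source, backup):
    """One-way mirror of a flat directory that refuses to run on a missing or empty source."""
    if not os.path.isdir(source):
        raise SourceMissingError(f"source not found: {source}")
    src_names = set(os.listdir(source))
    if not src_names:
        raise SourceMissingError(f"source is empty: {source}")

    os.makedirs(backup, exist_ok=True)
    for name in src_names:
        src_path = os.path.join(source, name)
        if os.path.isfile(src_path):
            shutil.copy2(src_path, os.path.join(backup, name))

    # Only after the copy pass succeeds, remove backup files with no source counterpart.
    for name in os.listdir(backup):
        stale = os.path.join(backup, name)
        if name not in src_names and os.path.isfile(stale):
            os.remove(stale)
```

Treating "source absent" as an error rather than as an empty source is the key design choice: the backup is left untouched until the drive is back.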

I've now brought those HDDs to my office but have the problem that Dell (F***ing Dell) have value-engineered my tower to only have 2 HDD bays, both of which are full.  I don't know what the point of a tower workstation is if it's not expandable but there you are.

This is why I'm going to ask everyone for recommendations: what's a good external HDD case?  Fast enough to keep up with a mechanical hard drive.  I believe I have USB 3.0 ports on my workstation.  I'll need two of the things, one for the main drive and the other for the mirror.  Are there two-bay ones?  Not sure.  Anyway I'll probably go and buy them online because IT here takes several months to procure anything and the two HDDs are my personal property too.
Stephen Brooks
2013-11-07 10:16:26
This is pretty much what I'm looking for, though it has mixed reviews.

[edit] For the sake of speed, I've ordered the StarTech enclosure with next day delivery, so it ought to arrive on Friday if the office internal mail also works.  Will also ask about the laboratory's backup services because they do have them and I need a backup not physically in the same place.
K`Tetch
2013-11-08 02:33:18
I'd actually suggest getting a whole new PSU, rather than just a cable.  I've heard of a few incidents of PSUs frying when you switch voltages, mainly after having been on one setting for a long time.  It happened to me when I moved to the US in '03.

It's pretty cheap to avoid any collateral damage to the system as a whole, and you'll still have the old one as a backup in case I'm being paranoid.

And don't sound so surprised I got the top score, I've been focusing this Q6600 on it, and I've been doing some manual tweaking when work's been slow.  I see your evolutionary algorithm, and raise you some intelligent design!
Stephen Brooks
2013-11-08 19:26:20
Got the StarTech enclosure, two drives installed, only the 320GB appears with a filing system.  My first priority is backing that up to the network drive here.  Estimated complete in 9 hours.  Then I'll investigate what's happened to the 2TB disk.
Dave Peachey
2013-11-08 21:35:44
@Stephen, although I've pruned my results.dat files in the last few weeks, I still have a number of reasonably high (>4%) scores in case of need ...

@K'Tetch, my i7-3930K is still full on Muon1 (all 12 cores and 5 Muon1 instances) and I think I may have a surprise waiting for you once the stats collation resumes ... and no "intelligent design" just pure evolution

[TA]Assimilator1
2013-11-09 00:58:33
Ah, that's why the measly output from my part-time dual core Pentium hasn't shown.

What was the move to the US for?
Hope you get your PC fixed, that 4th HDD dead?
Stephen Brooks
2013-11-09 18:03:31
The backup to the network drive has finished.  Bit worried about the 2TB muon results drive because it appeared to be unformatted when I put it in this enclosure.  Had to reformat and am recopying the results from the other drive.  Should probably run a disk check on it (or alternatively it could be something funny about this cheap enclosure not accepting certain HDD formats -- GUID partition tables for instance?)

Anyway I'll wait for the laboratory's backup service (with history they tell me) to be running on this before I activate actual stats scripts.  Monday is a US holiday so late next week there might be some progress.
Stephen Brooks
2013-11-10 03:45:07
Ran Windows disk check (thorough including looking for bad sectors) and can't find anything wrong with the 2TB muon disk.  Also the disk check utility can read the volume label "Muon1 Results 2TB" whereas Explorer just shows "Local Disk" and I can't change it.  Checking the 320GB disk too now.
[DPC]white_panther
2013-11-11 01:59:10
I had some good results retrieving data with 'GetDataBack' from Runtime Software.
Stephen Brooks
2013-11-14 13:59:50
Sent another e-mail yesterday asking about the site's backup service because they'd not got back to me.

[edit] Apparently they should have set this up two days ago, hopefully will get some response tomorrow.
Stephen Brooks
2013-11-15 23:06:53
OK, they didn't come today either.  I think I might start setting the project up anyway, perhaps this weekend.  I do have mirrored drives plus a network backup, albeit on a network drive that's not really meant for backups.
[OcUK]diogenese
2013-11-20 17:36:15
It seems to me that the IT department is a bit understaffed or undertrained.
Stephen Brooks
2013-11-20 23:18:55
They've been good on previous occasions but this request seems to have gone into a black hole.  I might try phoning someone up rather than e-mailing tomorrow.  I asked the local IT to get hold of them but they've not turned up.
K`Tetch
2013-11-21 01:43:08
Have you tried turning it off and on again?
Stephen Brooks
2013-11-21 15:00:30
Bypassed all the systems and just called one of the IT guys' office phone number.  He says the network drive I've chosen to back up to is a pretty good choice because it is itself backed up.  The domain backup utility just backs up my desktop and user documents folder (which I don't use, though I've kept the utility because I sometimes leave important stuff on the desktop and forget to back it up), so he's asked about other options for scientific data.

I'll start adapting the scripts to work with the slightly odd FTP setup here.

[update] Apparently there's a departmental NAS drive I didn't know about and they're trying to get me an account on that.

[update2] Got access to the departmental NAS, currently backing up onto it, though I'm still backing up to the other network system (and the mirrored drive).
Stephen Brooks
2013-11-24 01:55:13
I've just updated and recompiled all the stats server scripts.  Genstats is running, first thing it got was about 100MB of BOINC results over the past month.

[update] Now it's made contact with matrixworld.serveftp.org.

Wow [TN]steinrar sent a lot of results on October 24th.

Didn't make contact with 81.0.198.134 Michal Hajicek but hopefully will next run, since that server is still working.  muon1.dyndns.org ought to come back in early December once I've moved in and have broadband.

Why does matrixworld.serveftp.org's FTP server contain a file called "hax0r.txt" that contains the text "This is just a test"? 

Right, Linac900Ext6Xc2_nosample has about 20k sends to merge with the existing files.  That'll take an hour or three.  I'm going to see if any food places are still open at 11pm.

On the coloured graphs I see 34 results have been returned for Linac900Ext6Xc2_nosample from version "v4.47v". That might be one of my colleagues accidentally running a developer version (4.47_dev) that's had limited distribution.

At 1am the _nosample has finished but there are another 20k uploads on 8Xc2!  I'm going to go home and monitor this from a remote desktop.
Stephen Brooks
2013-11-24 07:51:04
There was a crash on one of the 20k files for 8Xc2 (I'll just temporarily remove the offending file for investigation), but now I'm rerunning it, I've managed to get downloads off white-panther.mine.nu.

Alexey Petrov (Russia) seems to do many uploads in a batch, [TN]haraldr appears to have set his upload size to large (90KB .bin rather than 10KB from everyone else).
Stephen Brooks
2013-11-24 14:27:59
OK, it's updating, though I've still only seen 2 of the 3 FTP servers being contacted (81.0.198.134 may not have been).  Did you get the scores you expected?
Dave Peachey
2013-11-24 14:53:42
Stephen,

I was down by 300 results on the first update (Oct 21 127,000; Nov 24 126,700) and I didn't upload anything whilst you were offline.

My first (manually) uploaded batch of 1,000 results today has been picked up (it went through to white-panther.mine.nu) but not the second batch of 1,000 (which went through to 81.0.198.134).  I'm waiting to see what happens over the next few hours before uploading more.

Dave
Stephen Brooks
2013-11-24 20:07:45
According to my logs, you never had 127000 results until just now! 

2013-Oct-15 08:07:53 Linac900Ext6Xc2_nosample v4.4 Added 200/37700.7/0.308401 BMark 124800/172414420.9/4.014 to 125000/172452121.6/4.014
2013-Oct-15 22:07:12 Linac900Ext6Xc2_nosample v4.4 Added 50/164134.9/3.919462 BMark 125000/172452121.6/4.014 to 125050/172616256.5/4.014
2013-Oct-15 22:07:12 Linac900Ext6Xc2_nosample v4.4 Added 200/40090.6/0.314703 BMark 125050/172616256.5/4.014 to 125250/172656347.1/4.014
2013-Oct-16 14:06:26 Linac900Ext6Xc2_nosample v4.4 Added 50/147614.7/4.01249 BMark 125250/172656347.1/4.014 to 125300/172803961.8/4.014
2013-Oct-16 21:04:44 Linac900Ext6Xc2_nosample v4.4 Added 250/61038.8/0.372824 BMark 125300/172803961.8/4.014 to 125550/172865000.6/4.014
2013-Oct-17 11:03:17 Linac900Ext6Xc2_nosample v4.4 Added 50/134136.0/3.998227 BMark 125550/172865000.6/4.014 to 125600/172999136.6/4.014
2013-Oct-17 18:06:13 Linac900Ext6Xc2_nosample v4.4 Added 200/47598.9/0.372388 BMark 125600/172999136.6/4.014 to 125800/173046735.5/4.014
2013-Oct-18 00:06:58 Linac900Ext6Xc2_nosample v4.4 Added 50/142766.4/4.012364 BMark 125800/173046735.5/4.014 to 125850/173189501.9/4.014
2013-Oct-18 01:05:06 Linac900Ext6Xc2_nosample v4.4 Added 50/150821.6/3.833888 BMark 125850/173189501.9/4.014 to 125900/173340323.5/4.014
2013-Oct-18 17:03:27 Linac900Ext6Xc2_nosample v4.4 Added 200/53344.7/0.379739 BMark 125900/173340323.5/4.014 to 126100/173393668.2/4.014
2013-Oct-18 20:08:28 Linac900Ext6Xc2_nosample v4.4 Added 50/138366.6/3.993496 BMark 126100/173393668.2/4.014 to 126150/173532034.8/4.014
2013-Oct-19 08:03:07 Linac900Ext6Xc2_nosample v4.4 Added 50/135510.2/4.009914 BMark 126150/173532034.8/4.014 to 126200/173667545.0/4.014
2013-Oct-19 19:02:48 Linac900Ext6Xc2_nosample v4.4 Added 200/53416.9/0.392704 BMark 126200/173667545.0/4.014 to 126400/173720961.9/4.014
2013-Oct-20 11:02:41 Linac900Ext6Xc2_nosample v4.4 Added 300/364885.6/4.00738 BMark 126400/173720961.9/4.014 to 126700/174085847.5/4.014
2013-Nov-24 07:14:46 Linac900Ext6Xc2_nosample v4.4 Added 458/1095923.9/4.016113 BMark 126700/174085847.5/4.014 to 127158/175181771.4/4.016113
2013-Nov-24 07:14:46 Linac900Ext6Xc2_nosample v4.4 Added 542/1098594.8/4.01648 BMark 127158/175181771.4/4.016113 to 127700/176280366.2/4.01648


Here the values after "BMark" are "Results/Mpts/Best Yield".
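
The log lines above can be checked mechanically. Here is a small sketch that parses one line and confirms that the "before" totals plus the "Added" batch equal the "after" totals for results and Mpts. It assumes the first two fields of the "Added" triple are also results and Mpts (which the arithmetic in the lines above bears out) and matches the "BMark" token literally. The function names are made up for illustration and this is not the stats server's own code.

```python
import re
from decimal import Decimal

TRIPLE = r"(\d+)/([\d.]+)/([\d.]+)"
LINE_RE = re.compile(
    rf"^(\S+ \S+) (\S+) (\S+) Added {TRIPLE} BMark {TRIPLE} to {TRIPLE}$"
)


def parse_line(line):
    """Split one stats log line into timestamp, lattice, version and three triples."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    groups = match.groups()

    def triple(start):
        # (results, Mpts, best yield); Decimal avoids float rounding in the sums
        return (int(groups[start]), Decimal(groups[start + 1]), Decimal(groups[start + 2]))

    return {
        "stamp": groups[0],
        "lattice": groups[1],
        "version": groups[2],
        "added": triple(3),
        "before": triple(6),
        "after": triple(9),
    }


def totals_add_up(record):
    """True if before + added == after for both the result count and the Mpts."""
    added, before, after = record["added"], record["before"], record["after"]
    return before[0] + added[0] == after[0] and before[1] + added[1] == after[1]
```

For example, the 2013-Oct-15 08:07:53 line parses to a "before" count of 124800, an added batch of 200 and an "after" count of 125000, and the last line of the log ends at 127700 results.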
Dave Peachey
2013-11-24 20:23:12
Stephen,

Interesting!  My last pre-shutdown (21 Oct) snapshot score showed as 127,000 but, hey, I'm not going to worry too much about a few hundred results here or there.

I'm seeing two of my three uploads from today but still haven't seen (what I presume is) today's 1000 result upload from 81.0.198.134 so just waiting on that one.

Cheers
[Edited by Dave Peachey at 2013-11-24 20:25:17]
Dave Peachey
2013-11-24 20:44:27
Just noticed that my last 300 results (up to 127k) went to 81.0.198.134 on 21 Oct at 06:17 GMT so they may be stuck in the jam somewhere on that server ... ?
Stephen Brooks
2013-11-24 21:15:04
Yeah the guy at 81.0.198.134 is saying he's not seeing all the logins (just the signal.dat one, not the results-getting one).  I'm guessing that means a third of the results are still on there.  I'm going to see if he can fix the server over the next couple days (or if I can find another way of logging in) but otherwise will remove it and just get the results on there by e-mail.
Stephen Brooks
2013-11-27 02:16:48
@Dave Peachey: your missing 300 results ought to appear soon because I've managed to flush 81.0.198.134.

So now the network is working pretty much as normal.  I found a few things while recompiling the stats server: I had to increase the timeout from 10 seconds to 30 seconds because 81.0.198.134 seems to respond very, very slowly, but it should work for now.  I also noticed half a dozen BOINC hourly files from throughout 2013 that hadn't been processed because one user had a VERY LONG NAME, which was impossible for the Windows file system to create a file for.  So it was aborting and leaving the files there for later.  I've now made a rule that if a BOINC user's name doesn't fit in a tweet, their results go into a fiery pit.  The other users in those files with sensible-length names will finally get their stats back after a delay of several months.
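
The name-length rule above amounts to filtering each hourly file instead of aborting on it. A minimal sketch follows, assuming records are simple (name, score) pairs and taking "fits in a tweet" to mean 140 characters, the limit at the time. The real stats server code and its record layout are not shown in the thread, so `split_by_name_length` is a hypothetical helper.

```python
TWEET_LIMIT = 140  # tweet length limit when this was posted (2013)


def split_by_name_length(records, limit=TWEET_LIMIT):
    """Separate (name, score) records into those kept and those dropped for over-long names."""
    kept, dropped = [], []
    for name, score in records:
        if len(name) <= limit:
            kept.append((name, score))
        else:
            dropped.append((name, score))
    return kept, dropped
```

Processing the kept records and skipping the rest means one unusable name no longer blocks every other user in the same file.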
Dave Peachey
2013-11-27 14:43:24
Stephen,

I see that all the results I've submitted since the project came back online have been picked up ... just not the missing 300 from right at the point it went offline.  Still, not to worry, I've got some more waiting to upload so I'll give it another shot.

Dave