P4 issues with version 4.31c
[OCAU]GenomeX
2003-05-20 16:23:52
After installing version 4.31c on my Pentium 4 2.4B, I let it run for 24 hours and it produced only 2kb of results!

It had no problems when running 4.31b (which I have now reverted to), so I can only see it being a glitch somewhere.

Other users on the OCAU forums are experiencing the same problem also.  Link: http://forums.overclockers.com.au/showthread.php?s=&threadid=109660&perpage=15&pagenumber=37
(sorry you will need to sign up to the forum to view the thread)

Thanks for any advice.

Aiden.

P.S. Can we please have a P4-optimised client? ;)

Muon Project member for OCAU
[OCAU] badger
2003-05-20 16:37:24
I too am having crashing problems with v4.31c, on my Athlon XP 2000+ (Win XP).  None of my other machines (Win 98) are having problems, but they don't get used for games etc as well.

P.S. My Athlon is beating GenomeX's P4, and I would like it to stay that way ;)

www.BadgerMotorsport.tk Proudly sponsored by GRX-Computers
Stephen Brooks
2003-05-20 16:39:01
One basic thing to check is that you install to a clean directory (no old config.txt hanging around).  The other thing to be aware of is that new results will be rechecked 5 times, so it will be slower than before, but that feature was present in v4.31b and shouldn't reduce the output by as much as you say.

How much CPU is the program taking up, and are you running the graphical, background or commandline version?

Today's weather in %region is Sunny/(null), max.  temperature #NAN°C
[OCAU] badger
2003-05-20 16:48:34
I run the commandline version, usually with muonmonitor overlooking it.  I did the install into a new directory and added the old results.dat, user.txt and sentlog.log files from the previous.

ZeonX[OCAU]
2003-05-20 17:03:53
Yeah, I'm having problems with my machine.

It's a 1.2 T-bird with 512 MB RAM running the commandline client.  It just crashes, normally while I'm sleeping, but I normally have something to take over if it does, as I had a few nights where my comp was doing nothing.

My P2-500 is working great and hasn't had 1 problem (v4.31b), whereas my comp will crash with either 4.31b or 4.31c :(

Hope you get it fixed, and I don't mind running a debugging version to log stuff for you either.

ZeonX - Muoning for OCAU
[OCAU]GenomeX
2003-05-21 03:37:20
I am running the background client with a fresh install, except for user.txt and results.dat.  Task Manager says it was running for 23 hours, then I checked the results.txt file and saw it had only increased by 2kb.

Hopefully we can get this fixed up as some people are getting shitty with it, as muon has always been so stable in the past.

Pollock[Romulus2]
2003-05-21 09:23:41
Not sure why you two are having problems with the Athlons.  I am running an Athlon XP2000+ and a T-Bird 1200 here with no problem. 
Both are running the 4.31c background client on Win 2000 Pro.

I can't help with the Pentiums, though.
Stephen Brooks
2003-05-21 15:49:42
quote:
Originally posted by ZeonX:
Hope you get it fixed, and I don't mind running a debugging version to log stuff for you either.

So is this program crashing, or slow, or does it differ across different people's systems?
A crashing client can be diagnosed with a debug build, which I can have tomorrow.
If yours is just slow, you could try seeing what happens if you switch off autosaving (set the interval to 0 in config.txt), because I changed that between v4.31b and v4.31c.
Also, in the slow clients, I'd like to see what the results that come out of them look like (in case it's a queuing problem).

[OCAU] badger
2003-05-21 16:30:27
Mine has been crashing, usually at the end of a simulation.  It pops up a polite "sorry Muon_cmdline.exe has to go byebye, hope you weren't doing anything important".  Often when I try to restart it, it will just pop up the window for a fraction of a second and then die again.  Killing everything and trying again usually works OK.  It has happened about 5 times now, usually just before I go to bed or after I leave for work, so it is not running for the max possible time!

ZeonX[OCAU]
2003-05-21 19:24:29
Yeah, it's the program crashing, just like badger described it.  It seems to be going at the same speed as v4.31b when it's running.  And even 4.31b is crashing on me now.  Well, it did when I first tried the new version, then one day it started working for about 2 weeks with no probs, and now the problems are back again.

Sounds good to use a debugging version, so you can upload that to my FTP if you really want, or email... or just a link.  Maybe you should have a small beta testing team to make sure some little bugs are gone before you release it to everyone; I wouldn't mind being on it.

EDIT: Also Stephen, if I run muon and I run out of free HD space to save the save file, it crashes muon straight out.  I know I could set it to not save, but I don't want to lose work because it's on a slow comp.
Brian Sogard
2003-05-22 06:52:53
quote:
Originally posted by Stephen Brooks:

So is this program crashing, or slow, or does it differ across different people's systems?
A crashing client can be diagnosed with a debug build, which I can have tomorrow.
If yours is just slow, you could try seeing what happens if you switch off autosaving (set the interval to 0 in config.txt), because I changed that between v4.31b and v4.31c.
Also, in the slow clients, I'd like to see what the results that come out of them look like (in case it's a queuing problem).

v4.31c crashes on the win2kpro and winxp systems I have, not every time, but often enough that I don't run muon on those systems.  It is not specific to processor, as the processors range from a Pentium 233MMX to an AMD XP1700.  It runs fine on my win98se system.  I understand that some of you have muon running on win2k and winxp; it just doesn't run on several of my systems.  I would also be willing to run a debug version on both my win2k and winxp systems.

[edit] I would actually clarify, all of the v4.3x builds have been crashing for me on my win2k/winxp systems, this isn't new to v4.31c [/edit]

[This message was edited by Brian Sogard on 2003-May-23 at 0:16.]
Stephen Brooks
2003-05-22 16:45:03
http://stephenbrooks.org/muon1/v431c_cmdline_debug.exe

I think I heard that you were having trouble with the commandline client, so here's a version of that with debug-tracing activated.  If it crashes, a box ought to come up with line numbers in it, which will tell me where the bug might be.

Brian Sogard
2003-05-22 18:41:53
Duron 850MHz/256MB ram, win2kpro debug screen reads

0 Cell_freeref [5] c:\ral\muon1\muon1.c 154
1 parser_free [3] c:\ral\muon1\muon1.c 33
2 Machinecell [6] c:\ral\muon1\muon1.c 481
3 machinecellfromfile [13] c:\ral\muon1\muon1.c 549
4 SBCLIB_main2 [46] c:\ral\muon1\muon1.c 110
5 main [0] c:\ral\muon1\muon1.c 64

Fault occurred outside a function scope
current instruction 0x77f5215c

Second machine is another Duron, 1.2GHz, 512MB RAM, WinXP Pro; the fault screen is identical.

Hope this helps!
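
For anyone comparing debug screens from several machines, the listing above can be split into fields mechanically.  This is only a rough Python sketch: the field names (depth, function, tag, source, line) are guesses at what each column means, and the thread does not say what the bracketed number stands for.

```python
import re
from typing import NamedTuple

class Frame(NamedTuple):
    depth: int      # leading index in the listing
    function: str   # function name
    tag: int        # bracketed number; its meaning is not stated in the thread
    source: str     # source file path
    line: int       # line number in that file

# One frame per line: "<index> <function> [<n>] <path> <line>"
FRAME_RE = re.compile(r"^(\d+)\s+(\w+)\s+\[(\d+)\]\s+(.+?)\s+(\d+)$")

def parse_trace(text):
    """Return the frames found in a pasted debug screen, ignoring other lines."""
    frames = []
    for raw in text.splitlines():
        m = FRAME_RE.match(raw.strip())
        if m:
            depth, func, tag, source, line = m.groups()
            frames.append(Frame(int(depth), func, int(tag), source, int(line)))
    return frames

TRACE = r"""
0 Cell_freeref [5] c:\ral\muon1\muon1.c 154
1 parser_free [3] c:\ral\muon1\muon1.c 33
2 Machinecell [6] c:\ral\muon1\muon1.c 481
3 machinecellfromfile [13] c:\ral\muon1\muon1.c 549
4 SBCLIB_main2 [46] c:\ral\muon1\muon1.c 110
5 main [0] c:\ral\muon1\muon1.c 64

Fault occurred outside a function scope
current instruction 0x77f5215c
"""

frames = parse_trace(TRACE)
# Print the chain from the last listed frame (main) to the first.
print(" -> ".join(f.function for f in reversed(frames)))
```

Two parsed listings can then be compared with a plain `==` to confirm that the fault screens on different machines really are identical.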
Stephen Brooks
2003-05-23 11:48:30
Right, that has definitely located where the fault is in the code.  Can you tell me if you've ever seen this crash happen on the first run it does, or is it always on a subsequent one?  (Also, is there any correlation with whether it loaded the first one from an autosave file?)

I've been trying to replicate the error here, with no success so far.  Are you using the cmdline version or the background one?


[This message was edited by Stephen Brooks on 2003-May-23 at 22:26.]
Brian Sogard
2003-05-23 15:27:41
I get the crash with either the commandline or the background client.  It crashes at the time a simulation is beginning.  Once a simulation has started it will continue without crashing until that simulation is complete; when the next simulation starts up there is a probability of a crash occurring.  It will sometimes crash on program startup as the program initiates a simulation, or it will execute 2 or 3 simulations before crashing.  If it starts normally and a simulation begins, shutting it down and restarting may or may not cause a crash.  Within 5 restarts it WILL crash on those systems that it crashes on.  I have not shut it down when an autosave file is present to see what happens when it restarts from an autosaved file; if I get time this weekend I can try that.  These PCs run 24/7 headless, so under normal operating conditions I only check them once a week or so from a remote login.
ZeonX[OCAU]
2003-05-23 16:56:51
Hey Stephen, I was running it all last night and then this morning it crashed.  Where is the log file or info on what happened?  All I got was this error on the screen:
Application popup: v431c_cmdline_debug.exe - Application Error : The instruction at "0x77f53207" referenced memory at "0x001e5790". The memory could not be "written".

Click on OK to terminate the program

So yeah... I always run the cmdline version and I've never got a saved file after it crashes, so the crash has to come after it clears the save file.  That means I generally never start by resuming from a saved file either.
Brian Sogard
2003-05-24 04:43:44
Additional information:

I got a simulation running (duron 1.2GHz 512MB WinXP Pro) and stopped it once an autosave file was created.  Restarting the commandline client approx 50 times with an autosave file present the client did not crash on any startup.  Deleted the autosave and it crashed right away.  I've saved a screen capture which represents one of the crashes, perhaps the commandline output will help you interpret what is happening at a crash.



Re-running the client many times I was able to see that the second "TrialType" was at different times all 4 trial types, so you could substitute any of the trial types into the line listing the second trial type.  In the image you can see part of the folder listing, which has the file queue.txt, this file shows up when the client crashes and its presence or absence does not appear to have an impact on future client startups.  And yes, every time it crashes the beamline=117 units.  Observing a series of startups, when the simulation successfully started the beamline varied from 53 to 59 units.
Stephen Brooks
2003-05-24 12:02:13
OK looking at that screenshot I can see a variety of weird things!  One is that you should never get 117 units in a beamline - that's about as many as you'd have if somehow 2 beamlines got loaded over each other.  Another is that the first time the simulation reads for genome scores, it gets 0 of them, but the second time it gets 14. The third thing is that it appears you have no particles in the simulation of 117 units, so it ends immediately (or after 0.01ns).

My general feeling is that this will all trace back to the earliest possible oddity, which would point the finger at the genome-scores loading routine (unless you changed results.dat while that screenshotted run was going).  This is odd because it's not where I originally thought it would be.  I'll have a look in the relevant module.

Incidentally, if you remove results.dat (and put it somewhere else for safe keeping), do you still get the crashes?  This might be a question of something weird getting into the .dat file - I had one bug like that before.

Brian Sogard
2003-05-24 12:35:13
I removed results.dat from the folder; crashes occurred the same as before.
[OCAU] badger
2003-05-25 17:18:49
I have been running the background client for a few days now without crashing (command line before).  I am running on WinXPHome.

[OCAU]DGROMS.com
2003-05-27 01:57:05
Stephen,
I am having the same problems.
Event Type: Information
Event Source: Application Popup
Event Category: None
Event ID: 26
Date: 5/27/2003
Time: 4:51:38 PM
User: N/A
Computer: JFMARO-DC2
Description:
Application popup: Muon1_background.exe - Application Error : The instruction at "0x77fcb261" referenced memory at "0x001e05d0". The memory could not be "read".

Click on OK to terminate the program
I have 13 dual Xeon 1.8GHz machines, all Dell PowerEdge 1600SC with 1024MB RAM, all running either 4.31b or 4.31c, and all getting the same problems.  I have tried fresh installs on all machines and it is still happening.  I have also tried the commandline and background clients of both versions.  I will run the debug version overnight on one machine and see how that goes.
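
The two popups quoted in this thread share the same wording, so the addresses can be pulled out mechanically for side-by-side comparison.  A minimal Python sketch follows; the function name and the dictionary keys are made up for illustration, and the pattern only assumes the exact message layout seen in the two reports above.

```python
import re

# Matches the "Application Error" popup text as quoted in this thread.
POPUP_RE = re.compile(
    r'Application popup: (?P<exe>\S+) - Application Error : '
    r'The instruction at "(?P<instr>0x[0-9a-fA-F]+)" referenced memory at '
    r'"(?P<addr>0x[0-9a-fA-F]+)"\. The memory could not be "(?P<access>\w+)"'
)

def parse_popup(message):
    """Extract program name, addresses and access type, or None if no match."""
    m = POPUP_RE.search(message)
    if m is None:
        return None
    return {
        "exe": m.group("exe"),
        "instruction": int(m.group("instr"), 16),
        "address": int(m.group("addr"), 16),
        "access": m.group("access"),
    }

REPORTS = [
    'Application popup: v431c_cmdline_debug.exe - Application Error : '
    'The instruction at "0x77f53207" referenced memory at "0x001e5790". '
    'The memory could not be "written".',
    'Application popup: Muon1_background.exe - Application Error : '
    'The instruction at "0x77fcb261" referenced memory at "0x001e05d0". '
    'The memory could not be "read".',
]

for report in REPORTS:
    info = parse_popup(report)
    print(info["exe"], hex(info["instruction"]), hex(info["address"]), info["access"])
```

This only tabulates what the popups say; it makes no claim about which module the faulting instruction belongs to.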
Stephen Brooks
2003-05-30 14:30:44
I just had a thought... why not send you the version that has the debugging _I_ use to track memory leaks (instead of the compiler's stack trace)?  Try replicating the crash with this, and then look to see if there are any text files produced in your muon directory.  I think one called "doublefree.txt" will appear if you're in luck.

The above is another commandline version.  The level of debugging I use means it will sometimes appear to freeze when memory operations are being done behind the scenes, as extra calculation has to be done to track the blocks.
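
How the tracking build works internally is not described in this thread.  As a general illustration of the idea (recording every block handed out and flagging a second release of the same block), here is a small sketch in Python rather than C.  The class, method names and site labels are all hypothetical, and it keeps its reports in a list where the real build apparently writes a text file.

```python
import itertools

class BlockTracker:
    """Toy bookkeeping for allocations; not the Muon1 implementation."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.live = {}          # block id -> label of the allocating site
        self.freed = {}         # block id -> label of the site that freed it
        self.double_frees = []  # human-readable reports

    def alloc(self, site):
        """Register a new block and return its id."""
        block = next(self._next_id)
        self.live[block] = site
        return block

    def free(self, block, site):
        """Release a block; return False and record a report if it is not live."""
        if block in self.live:
            del self.live[block]
            self.freed[block] = site
            return True
        if block in self.freed:
            self.double_frees.append(
                f"block {block} freed at {site}, already freed at {self.freed[block]}"
            )
        else:
            self.double_frees.append(
                f"block {block} freed at {site}, never allocated"
            )
        return False

    def leaks(self):
        """Blocks that were allocated and never freed."""
        return dict(self.live)

tracker = BlockTracker()
block = tracker.alloc("site_a")
tracker.free(block, "site_b")   # normal release
tracker.free(block, "site_c")   # second release of the same block is reported
print(tracker.double_frees)
```

The extra dictionary lookups on every allocation and release are the kind of bookkeeping cost that would make a tracked build run noticeably slower than a normal one.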

[OCAU]DGROMS.com
2003-05-30 17:17:27
Stephen,
Thanks for your help.  It seems that dumping my entire Muon directory on all my machines and starting again from scratch has cleared things up.  Just deleting the sav files and results wasn't enough; maybe there was a corrupt build on the website at one stage?  But now that I have re-downloaded and started from scratch, I have 65GHz of machines running Muon perfectly.
ZeonX[OCAU]
2003-06-03 03:31:56
Well, I thought I would give this one a try and it seems good.  Here are the problems I got when I ran it in my problem directory.


debugfree.log file

I hope that helps stephen.

EDIT: Stephen when i put in muonerror.jpg as the file name the forum made it muon_error.jpg and then the link was wrong.  Changing it to just error seems to work but it seems stupid.
[OCAU]GenomeX
2003-06-03 16:44:20
Dumped the muon directory exactly the same as what DG did and all problems have gone.  I believe it was the results.dat mucking it up.

Brian Sogard
2003-06-03 18:37:43
Curious that starting over works for some, I've done clean installs in new directories after dumping the old ones, without success...
[OCAU] badger
2003-06-04 05:23:21
I set autosave interval = 0 in config.txt and it hasn't crashed since...

Stephen Brooks
2003-06-05 15:32:23
This is weird.  You're getting different sorts of crashes to each other, by the looks of it.  It could be 2 or 3 bugs.  Or something weirder.

I've saved the data to disk - but I'm going away for a week now so won't be able to do much.
