Intel or AMD
Helix_Von_Smelix
2002-06-01 23:39:22
Okay, so someone had to start this thread, so I guess it's me.  So which is faster??  I have a 1GHz classic Athlon K7 Slot A, which seems to be about 20% faster than a 1GHz PIII Slot 1; both are 100MHz FSB.  Has anyone found anything else??  Or can't you be arsed to look?  :D  The only DC project where the PIII is faster is the UD cancer project, where the Intel is about 10% quicker.
So if anyone has time on their hands this long weekend (well, in the UK 8) ) have a look.  Seeya all.  Parrots and ducks?????  What is wrong with Mr Chris Johnson?  And Mr Stephen Brooks should be getting more sleep at exam time.  :P :P

If you make instant coffee in to a microwave do you go back in time??
Bellbox
2002-06-02 20:52:59
Seeing as Intel is a sponsor of UD, it's no wonder they most likely coded it to run faster on the Pentium.
kruemi [SwissTeam.NET]
2002-06-04 00:32:27
It depends on the project and what sort of calculation is needed.  But the AMD Athlon processors really are a bit faster in many ways than the ones from Intel.  AMD greatly improved the performance of the integer and floating-point units over the past versions (since the K6 and K6-2).
The P4 has one more problem: its long pipeline, which slows the processor down whenever the brace-prediction fails (which happens quite often... about 5%!).
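
To illustrate the pipeline point: a data-dependent branch is the kind of code where a wrong guess by the branch predictor forces a deep pipeline to be refilled. The sketch below is not taken from any of the clients discussed here, and the function names are made up for illustration. It shows the same sum written with a branch and without one. Whether the second form is actually faster depends on the compiler and the CPU; an optimising compiler may well emit identical code for both.

```c
#include <assert.h>
#include <stddef.h>

/* Branchy version: the if depends on the data, so the CPU has to
   guess the outcome for every element. */
long sum_above_branchy(const int *v, size_t n, int threshold)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        if (v[i] >= threshold)
            sum += v[i];
    return sum;
}

/* Branch-free version: the comparison yields 0 or 1, which is used
   as a multiplier instead of steering control flow. */
long sum_above_branchless(const int *v, size_t n, int threshold)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (long)v[i] * (v[i] >= threshold);
    return sum;
}

/* Sanity check that both versions agree on a small input. */
void check_equivalent(void)
{
    int v[] = {5, -3, 12, 7, 0};
    assert(sum_above_branchy(v, 5, 5) == sum_above_branchless(v, 5, 5));
}
```

Any speed difference would have to be measured on the actual processors in question; the sketch only shows what "removing a branch" means.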

As long as the clients aren't fully optimized for this type of processor (which means many changes in the structure), the P4 will be "slow" in such contests!

bye

kruemi
AlArenal
2002-06-04 02:17:15
I guess you mean "branch prediction". Generally, one would have to test which project runs better on which processor.  Short of hand-coding in assembler, the biggest influence will come from the compiler options.

Modern compilers allow you to optimize code for different processors.  If you use Intel's C or Fortran compiler you'll optimize best for Intel CPUs.  Most other compilers won't be able to compile as well for Intel, but are generally better at optimizing for AMD or other (not Intel compatible) processors.

As we have no influence over the compilers and options used, I guess we just have to try it out.  Maybe Stephen can give us some details on what compiler he uses with which options.  But I guess he tried to tune as well as he (and the compiler) could.
AlArenal
2002-06-04 06:03:01
@stephen:

Have you tried compiling your source code with GCC?  I bet it produces far better code than LCC, considering the fact that the old Linux versions (compiled with gcc) worked a lot faster than the Windows versions.

Maybe someone else out there who might be able to give it a try?
Orbi-tel
2002-06-05 03:09:20
quote:
Originally posted by Bellbox:
Seeing as Intel is a sponsor of UD, it's no wonder they most likely coded it to run faster on the Pentium.


WRONG.  Intel have had no involvement in the development of the UD project.  They have helped in optimising THINK's 1.14 version but not the 1.03 we use for UD.  In fact the UD project runs faster on AMD processors due to the superior FPU they have.  The UD agent and THINK task use none of the latest Intel or AMD optimisations (SSE, SSE2, 3DNow!).

The sponsorship is to help with the infrastructure (servers, net connection costs etc) and not with the development activities.

This has been made clear on many occasions on the UD message boards.

Cheers

Orbi 8)
Orbi-tel
2002-06-05 03:11:53
quote:
Originally posted by AlArenal:
@stephen:

Have you tried compiling your source code with GCC?  I bet it produces far better code than LCC, considering the fact that the old Linux versions (compiled with gcc) worked a lot faster than the Windows versions.

Maybe someone else out there who might be able to give it a try?


I have tried on Linux but Stephen is using some 'features' that LCC allows but GCC does not.  This is being worked through by Stephen and myself as part of the port to Linux.  Stephen has to do more work in separating the Windows Graphics code source so it can be excluded before I can continue.

Cheers

Orbi 8)
Helix_Von_Smelix
2002-06-05 05:17:25
Hi Orbi-tel, I ran the same UD "WU" on both a 1GHz PIII and a 1GHz K7.  The PIII was about 10% faster.  Most strange.  Maybe I should make a longer-term test......... nope, can't be arsed.  I will stick with this one for now.  Keep up the good work.

If you make instant coffee in to a microwave do you go back in time??
AlArenal
2002-06-05 06:54:24
quote:
Originally posted by Orbi-tel:

I have tried on Linux but Stephen is using some 'features' that LCC allows but GCC does not.  This is being worked through by Stephen and myself as part of the port to Linux.  Stephen has to do more work in separating the Windows Graphics code source so it can be excluded before I can continue.




Yeah, I tried yesterday to compile it under Win2k via gcc and the Cygnus environment and all I got was lots of error messages.  :(

But I guess once Stephen has implemented an abstraction layer to separate the calculation part from the graphics and you get it working under Linux, we'll be able to use the same source for a gcc-compiled DOS version, which may result in better code.

It's always good to squeeze some extra performance out of one's given hardware.
Stephen Brooks
2002-06-05 07:26:21
--[But I guess once Stephen has implemented an abstraction layer to separate the calculation part from the graphics and you get it working under Linux,]--

There already is a switch in the code to turn the graphics on and off: this is how I maintain both the Windows graphical version and the background version.  What I have not yet done is attempted a compile completely under DOS to see what I need to change to get that working.  I may add a #define DOS 0/1 switch to make this easier.
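
A compile-time switch of the kind described above might look roughly like the sketch below. This is an illustration only, not the actual Muon1 source: the macro names GRAPHICS and DOS and both functions are hypothetical. The point is that any platform-specific #include sits inside the same guard as the code that needs it, so a background or DOS build never sees it.

```c
#include <assert.h>

/* Hypothetical switches; override with e.g. -DGRAPHICS=1 on the
   compiler command line. */
#ifndef GRAPHICS
#define GRAPHICS 0
#endif
#ifndef DOS
#define DOS 0
#endif

#if GRAPHICS && !DOS
/* Windows-only headers belong inside this guard, so a background or
   DOS build never sees them. Left commented out to keep the sketch
   portable. */
/* #include <windows.h> */
#endif

/* Reports whether the graphical front end was compiled in. */
int graphics_enabled(void)
{
#if GRAPHICS && !DOS
    return 1;
#else
    return 0;
#endif
}

/* The calculation code is identical in every configuration. */
double advance_position(double x, double v, double dt)
{
    assert(dt >= 0.0); /* this sketch only steps forward in time */
    return x + v * dt;
}
```

With the defaults above this builds the background configuration; passing -DGRAPHICS=1 to the compiler would select the graphical one.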

The main problem with compiling under GCC is that it doesn't like pieces of code like


typedef struct
{
    double x,y,z,vx,vy,vz; // Position, velocity vectors
    ParticleSpecies s;
    ParticleState state;
    double m0,wt,t,t_decay,t_start; // Rest mass, weight (from Monte Carlo input), proper time, proper time of decay, starting time (for delayed runs)
} ParticleAttributes;

typedef struct
{
    int ns1,ns2,ns3; // Three nearby solenoids...
    int nbm1,nbm2; // ...and TWO B-magnets
} ThingsNearParticle;

typedef struct
{
    ParticleAttributes;
    ThingsNearParticle;
} Particle;


...where I nest structures inside each other.  This is actually a very powerful feature, and without it I'd have to do something like


typedef struct
{
    ParticleAttributes p;
    ThingsNearParticle n;
} Particle;


...which would mean I'd have to use an extra full-stop to get at fields of the particle structure.  I don't think this makes any impact on speed though, since the structures are all fixed-offset anyway.  A more interesting example is:


typedef struct
{
    enum {Rectangular,Circular,Edge,Output,Finish} type;
    AffineFrame f;
    union
    {
        struct {double x1,y1,x2,y2;}; // Rectangular
        struct {double cx,cy,r;}; // Circular
        struct {double d,nx,ny;}; // Edge
        struct {char *outfile; double radius;}; // Output
        double finish_radius; // Finish
    };
} Aperture;


Where the "type" field determines which of the subsequent elements of the union actually defines the rest of the structure.  Note that x1, cx, d, outfile and finish_radius all have the same offset in the structure.  I think the way to make this GCC-friendly is by the following:


typedef struct {double x1,y1,x2,y2;} ApertureSRectangular;
typedef struct {double cx,cy,r;} ApertureSCircular;
typedef struct {double d,nx,ny;} ApertureSEdge;
typedef struct {char *outfile; double radius;} ApertureSOutput;

typedef union
{
    ApertureSRectangular r;
    ApertureSCircular c;
    ApertureSEdge e;
    ApertureSOutput o;
    double finish_radius; // Finish
} ApertureU;

typedef enum {Rectangular,Circular,Edge,Output,Finish} ApertureType;

typedef struct
{
    ApertureType type;
    AffineFrame f;
    ApertureU u;
} Aperture;


But note that now whenever you have an Aperture object a, you'll have to write a.u.r.x2 instead of just a.x2. Trying to compact everything with literal names is probably possible with clever use of "union", but unfortunately I think it depends rather critically on structure alignment (i.e. 4-byte, 1-byte, 8-byte or whatever you have it set on).
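
For readers trying this with a current toolchain: anonymous structures and unions were later standardised in C11, so the compact a.x2 style of the Aperture example is accepted by any C11 compiler. The sketch below assumes such a compiler. The AffineFrame here is a stand-in (its real definition is not shown in this thread), and aperture_contains with its containment rules is a hypothetical helper added only to show the "type" tag selecting the live union member. The Particle example, which names a typedef as an unnamed member, is a different construct that C11 does not cover; GCC documents it as being accepted only under its -fms-extensions option.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { Rectangular, Circular, Edge, Output, Finish } ApertureType;

/* Stand-in for the real AffineFrame, which is not shown in the thread. */
typedef struct { double origin[3]; } AffineFrame;

typedef struct
{
    ApertureType type;
    AffineFrame f;
    union
    {
        struct { double x1, y1, x2, y2; };        /* Rectangular */
        struct { double cx, cy, r; };             /* Circular */
        struct { double d, nx, ny; };             /* Edge */
        struct { char *outfile; double radius; }; /* Output */
        double finish_radius;                     /* Finish */
    };
} Aperture;

/* Hypothetical helper: returns 1 if (px, py) lies inside the aperture.
   Only two variants are modelled; the rectangle assumes x1 <= x2 and
   y1 <= y2. */
int aperture_contains(const Aperture *a, double px, double py)
{
    switch (a->type)
    {
    case Rectangular:
        return px >= a->x1 && px <= a->x2 && py >= a->y1 && py <= a->y2;
    case Circular:
    {
        double dx = px - a->cx, dy = py - a->cy;
        return dx * dx + dy * dy <= a->r * a->r;
    }
    default:
        return 0; /* Edge, Output and Finish are not modelled here */
    }
}

/* The first field of every variant starts at the same offset. */
void check_shared_offsets(void)
{
    assert(offsetof(Aperture, x1) == offsetof(Aperture, cx));
    assert(offsetof(Aperture, x1) == offsetof(Aperture, d));
    assert(offsetof(Aperture, x1) == offsetof(Aperture, outfile));
    assert(offsetof(Aperture, x1) == offsetof(Aperture, finish_radius));
}
```

Because every variant begins at the start of the union, the shared offset holds whatever structure alignment is in force; alignment only affects where the union itself sits after "type" and "f".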


"As every 11-year-old kid knows, if you concentrate enough Van-der-Graff generators and expensive special effects in one place, you create a spiral space-time whirly thing, AND an interesting plotline"
AlArenal
2002-06-05 07:41:41
Your switches seem to get ignored by gcc, as it complains about code in include files that it shouldn't have included anyway.

For now I'd be happy with a Linux port, so I could use my resources for Muon only.  I just hate to do anything else with my CPU cycles.  :D

How come you first coded a Win app?  Maybe I'm too much into the open source stuff, but it sounds kinda unusual.
kruemi [SwissTeam.NET]
2002-06-06 04:45:41
The statement is working well.  But when I took a look at the 4.0x source code, I saw that many Windows-specific includes (like DirectX) are made even if graphics are switched off!

bye

kruemi