Maker Pro

Re: Intel details future Larrabee graphics chip

Nick Maclaren

Jan 1, 1970
|>
|> > |> IEEE achieved its objective a long time ago: just about all hardware
|> > |> and software uses it.
|> >
|> > That is a political objective, not a technical one; I was referring
|> > to a technical objective.
|>
|> I wouldn't call it political at all - the goals of standards are obvious and
|> non-political (the *process* of establishing a standard is often political,
|> but that is a different matter altogether).

The technical goals of standards are obvious? The mind boggles.
Clearly you haven't been involved with many of their committees.

|> > |> So that is not the problem. However it doesn't
|> > |> explicitly allow a subset of the features to be supported, which is what
|> > |> almost all implementations do.
|> >
|> > In different ways, which means that it hasn't even achieved its
|> > political objective in full.
|>
|> Well the main goal was to create a common binary format which gives
|> identical results on all implementations. That is exactly what we have
|> today, so it is an example of a standard that worked well.

That is factually false, as you yourself stated. Not merely are some
aspects of it left implementation-dependent, you yourself stated that
most implementations use hard underflow (actually, it's not that simple).

Also, you are ignoring the fact that almost all programs use languages
other than assembler nowadays, and the IEEE 754 model is notoriously
incompatible with the arithmetic models used by most programming
languages. That, in turn, means that two compilers (or even options)
on the same hardware usually give different results, and neither are
broken.
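
To see concretely how that can happen, here is a tiny C sketch (an
illustration only, not taken from the posts or the standards):
floating-point addition is not associative, so any compiler, language
rule, or option that permits re-association can legitimately change
the answer.

    #include <stdio.h>

    int main(void)
    {
        double a = 1e16, b = -1e16, c = 1.0;

        /* The two groupings are mathematically equal but round
           differently: (a + b) + c gives 1, a + (b + c) gives 0,
           because b + c rounds back to -1e16. */
        printf("%g\n", (a + b) + c);
        printf("%g\n", a + (b + c));
        return 0;
    }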

|> > |> Flushing denormals to zero is one key feature that is missing for example.
|> > |> Similarly round-to-even is the only rounding mode ever used. It would be
|> > |> easy to fix the standard to make the current defacto situation official.
|> >
|> > Which? I know of dozens of variants.
|>
|> I know a compiler that supports 5 variants, but there aren't any useful
|> variants beyond that. Even that is too much, I think just 1 or 2 commonly
|> used subsets would be sufficient to capture 99% of implementations.

There are a lot more than five variants in use today, even just at
the hardware level. Actually, Intel has at least three, and quite
likely more.


Regards,
Nick Maclaren.
 
JosephKK

Jan 1, 1970
[. IEEE 754 ....]
[...]
Perhaps you have read it. That is one strange format, though.
Essentially it uses the sign bit in combination with the mantissa to
squeeze one more bit of resolution out.

I had seen the implied bit idea elsewhere. The idea of using up codes
to specify things like NaN, +INF and -INF is something that I have found
myself disagreeing with. I have never liked "in band signaling".
That is the only part of the standard I found to be strange.

I had to convert some all positive fixed point numbers to IEEE so that
is the only part I really spent much time on.

Actually they are not really in band nor denormal. They used codes
that do not occur in the standard value encodings. And there are many
more unused encodings than the 5 or so that they have used. Damn, 20
years since I last studied it and I forgot a lot of detail.
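
For anyone who wants to see where those codes live, here is a minimal
C sketch of the standard single-precision layout (1 sign bit, 8
exponent bits, 23 fraction bits, with the all-zeros and all-ones
exponent patterns reserved for zero/denormals and Inf/NaN); it is an
illustration, not anything from the post above.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    /* Pull apart an IEEE 754 single and report which of the reserved
       exponent patterns (if any) it uses.  Normal numbers carry an
       implied leading 1 in the fraction. */
    static void classify(float f)
    {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);

        uint32_t sign = bits >> 31;
        uint32_t exp  = (bits >> 23) & 0xFF;
        uint32_t frac = bits & 0x7FFFFF;

        if (exp == 0xFF)
            printf("%-12g %s\n", f, frac ? "NaN" : (sign ? "-Inf" : "+Inf"));
        else if (exp == 0)
            printf("%-12g %s\n", f, frac ? "denormal" : (sign ? "-0" : "+0"));
        else
            printf("%-12g normal (implied leading 1)\n", f);
    }

    int main(void)
    {
        classify(1.0f);
        classify(0.0f);
        classify(-0.0f);
        classify(1e-45f);          /* smallest denormal */
        classify(1.0f / 0.0f);     /* +Inf */
        classify(0.0f / 0.0f);     /* NaN */
        return 0;
    }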
 
Wilco Dijkstra

Jan 1, 1970
Nick Maclaren said:
|>
|> > |> IEEE achieved its objective a long time ago: just about all hardware
|> > |> and software uses it.
|> >
|> > That is a political objective, not a technical one; I was referring
|> > to a technical objective.
|>
|> I wouldn't call it political at all - the goals of standards are obvious and
|> non-political (the *process* of establishing a standard is often political,
|> but that is a different matter altogether).

The technical goals of standards are obvious? The mind boggles.
Clearly you haven't been involved with many of their committees.

As I said, it's the process that is the problem. Design by committee rarely
leads to something useful due to every member having their own agenda
and axes to grind.
|> > |> So that is not the problem. However it doesn't
|> > |> explicitly allow a subset of the features to be supported, which is what
|> > |> almost all implementations do.
|> >
|> > In different ways, which means that it hasn't even achieved its
|> > political objective in full.
|>
|> Well the main goal was to create a common binary format which gives
|> identical results on all implementations. That is exactly what we have
|> today, so it is an example of a standard that worked well.

That is factually false, as you yourself stated. Not merely are some
aspects of it left implementation-dependent, you yourself stated that
most implementations use hard underflow (actually, it's not that simple).

No, the implementation defined aspects don't lead to different results.
The choice of whether to support denormals or flush to zero doesn't affect
most software, so it hardly matters. In the rare cases where it does matter,
you still have the option to enable denormals and take the performance hit.
Also, you are ignoring the fact that almost all programs use languages
other than assembler nowadays, and the IEEE 754 model is notoriously
incompatible with the arithmetic models used by most programming
languages. That, in turn, means that two compilers (or even options)
on the same hardware usually give different results, and neither are
broken.

That's not true. It's not difficult to optimize based on the chosen floating point
model. So compilers give the same result even with full optimization (if they
don't then they are broken). You might get different results only if you enable
fast floating point options that allow reordering of operations.
|> > |> Flushing denormals to zero is one key feature that is missing for example.
|> > |> Similarly round-to-even is the only rounding mode ever used. It would be
|> > |> easy to fix the standard to make the current defacto situation official.
|> >
|> > Which? I know of dozens of variants.
|>
|> I know a compiler that supports 5 variants, but there aren't any useful
|> variants beyond that. Even that is too much, I think just 1 or 2 commonly
|> used subsets would be sufficient to capture 99% of implementations.

There are a lot more than five variants in use today, even just at
the hardware level. Actually, Intel has at least three, and quite
likely more.

You can create an infinite number of variants, but only a few make sense.
There are only a few choices when flushing a denormal to zero, so defining
the correct way of doing this would reduce the variation that exists today.

Wilco
 
Nick Maclaren

Jan 1, 1970
|>
|> No, the implementation defined aspects don't lead to different results.
|> The choice of whether to support denormals or flush to zero doesn't affect
|> most software, so it hardly matters. In the rare cases where it does matter,
|> you still have the option to enable denormals and take the performance hit.

This is getting ridiculous. Most software doesn't USE floating-point,
and most of the rest has no trouble with IBM 360, VAX, or anything
else. Of the programs which do, a very high proportion have trouble
with such variations. And you don't always have the option, and it's
not just a performance hit.
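
As a concrete example of "not just a performance hit", here is a
hedged, platform-specific C sketch (it assumes x86-64 with SSE
arithmetic and uses the xmmintrin.h control-register macros): with
flush-to-zero enabled, two denormal doubles that compare unequal can
nevertheless have a difference of exactly zero.

    #include <stdio.h>
    #include <xmmintrin.h>   /* x86 SSE control/status register access */

    int main(void)
    {
        /* Both values are denormal doubles (below about 2.2e-308). */
        volatile double x = 3e-310, y = 2e-310;

        printf("gradual underflow: x != y is %d, x - y = %g\n",
               x != y, x - y);                      /* 1, 1e-310 */

        /* Flush denormal results to zero (the SSE FTZ bit), the kind
           of mode many implementations run in by default. */
        _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);

        printf("flush to zero:     x != y is %d, x - y = %g\n",
               x != y, x - y);                      /* 1, 0 */
        return 0;
    }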

|> > Also, you are ignoring the fact that almost all programs use languages
|> > other than assembler nowadays, and the IEEE 754 model is notoriously
|> > incompatible with the arithmetic models used by most programming
|> > languages. That, in turn, means that two compilers (or even options)
|> > on the same hardware usually give different results, and neither are
|> > broken.
|>
|> That's not true. It's not difficult to optimize based on the chosen floating point
|> model. So compilers give the same result even with full optimization (if they
|> don't then they are broken). You might get different results only if you enable
|> fast floating point options that allow reordering of operations.

You are quite simply wrong, again.

Fortran (still THE leading numerical language) doesn't even have the
CONCEPT of a defined execution order in terms of 'basic' operations.
I could explain why you are wrong about C/C++, but lack the energy
to do so yet again. Java is a numerically unimportant language.
I could go on.

I shall not continue this unproductive thread.


Regards,
Nick Maclaren.
 
JosephKK

Jan 1, 1970
What is wrong in your opinion and how would you improve it?


You can't count through zero or past Infinity indeed. But you can count up
from zero all the way to infinity without any checks. I use this fact to first
encode the result and then round by just incrementing it - no special case
checking when rounding denormals, or round up to infinity. Comparisons
take just a few instructions despite the difference between sign-magnitude
and 2-complement.

Wilco

Make sense please. I have built many systems with hardware that can
count through zero. If you have built a system that can count to
infinity you need to publish. If you are talking about obscure
properties of floating point itself or of 754-compliant implementations,
please be more clear.
 
JosephKK

Jan 1, 1970
|>
|> > |> The IEEE format is pretty well thought out.
|> >
|> > That is a matter of opinion. A large number of experts, in many
|> > aspects of numerical computing, disagree.
|>
|> What is wrong in your opinion and how would you improve it?

This has been described ad tedium. I would start by defining a
clear, consistent objective and work from there. It doesn't
matter so much WHAT the objective is, provided that it HAS one.


Regards,
Nick Maclaren.

It has been some time since I have fussed with this. Where do I find
discussion of the improvements you are talking about?
 
JosephKK

Jan 1, 1970
I wouldn't call it political at all - the goals of standards are obvious and
non-political (the *process* of establishing a standard is often political,
but that is a different matter altogether).


It is true standards end up targeting the lowest common denominator,
and as such fall short of technical excellence. However the IEEE format
represents a major advance over other formats on almost all aspects,
so it deservedly killed many badly designed formats. If someone comes
up with an even better format then I am all for it - though I doubt it can be
improved much.


Well the main goal was to create a common binary format which gives
identical results on all implementations. That is exactly what we have
today, so it is an example of a standard that worked well.


I know a compiler that supports 5 variants, but there aren't any useful
variants beyond that. Even that is too much, I think just 1 or 2 commonly
used subsets would be sufficient to capture 99% of implementations.

Wilco

I know of at least 5 different early hardware implementations of
floating point and have written two software implementations myself.
So what. It was a long time ago as well.
 
JosephKK

Jan 1, 1970
|>
|> > |> IEEE achieved its objective a long time ago: just about all hardware
|> > |> and software uses it.
|> >
|> > That is a political objective, not a technical one; I was referring
|> > to a technical objective.
|>
|> I wouldn't call it political at all - the goals of standards are obvious and
|> non-political (the *process* of establishing a standard is often political,
|> but that is a different matter altogether).

The technical goals of standards are obvious? The mind boggles.
Clearly you haven't been involved with many of their committees.

Possibly not any. I did a short stint on IEEE 1219, and the politics
of protecting profits and other reasons for entrenched positions were
glaringly present.
|> > |> So that is not the problem. However it doesn't
|> > |> explicitly allow a subset of the features to be supported, which is what
|> > |> almost all implementations do.
|> >
|> > In different ways, which means that it hasn't even achieved its
|> > political objective in full.
|>
|> Well the main goal was to create a common binary format which gives
|> identical results on all implementations. That is exactly what we have
|> today, so it is an example of a standard that worked well.

That is factually false, as you yourself stated. Not merely are some
aspects of it left implementation-dependent, you yourself stated that
most implementations use hard underflow (actually, it's not that simple).

Also, you are ignoring the fact that almost all programs use languages
other than assembler nowadays, and the IEEE 754 model is notoriously
incompatible with the arithmetic models used by most programming
languages. That, in turn, means that two compilers (or even options)
on the same hardware usually give different results, and neither are
broken.

This one is new to me. Where do I go to find backup? The magnitude
of the claim does sound a bit extreme.
|> > |> Flushing denormals to zero is one key feature that is missing for example.
|> > |> Similarly round-to-even is the only rounding mode ever used. It would be
|> > |> easy to fix the standard to make the current defacto situation official.
|> >
|> > Which? I know of dozens of variants.
|>
|> I know a compiler that supports 5 variants, but there aren't any useful
|> variants beyond that. Even that is too much, I think just 1 or 2 commonly
|> used subsets would be sufficient to capture 99% of implementations.

There are a lot more than five variants in use today, even just at
the hardware level. Actually, Intel has at least three, and quite
likely more.

Some pointers to more details please.
 
Nick Maclaren

Jan 1, 1970
|> >In article <[email protected]>,
|> >|>
|> >|> > |> The IEEE format is pretty well thought out.
|> >|> >
|> >|> > That is a matter of opinion. A large number of experts, in many
|> >|> > aspects of numerical computing, disagree.
|> >|>
|> >|> What is wrong in your opinion and how would you improve it?
|> >
|> >This has been described ad tedium. I would start by defining a
|> >clear, consistent objective and work from there. It doesn't
|> >matter so much WHAT the objective is, provided that it HAS one.
|>
|> It has been some time since I have fussed with this. Where do I find
|> discussion of the improvements you are talking about?

I wasn't describing specific improvements. Anyway, probably on the
archives of this group, or comp.arch.arithmetic. Google groups
seems to be on the blink, as it gets only one hit for "IEEE 754"
on the latter, and for "IEEE" and "floating-point" on the former,
which I have difficulty believing!

There was also an IEEE 754R mailing list, which may be archived
and accessible - see http://www.ucbtest.org.


Regards,
Nick Maclaren.
 
Nick Maclaren

Jan 1, 1970
|>
|> >Also, you are ignoring the fact that almost all programs use languages
|> >other than assembler nowadays, and the IEEE 754 model is notoriously
|> >incompatible with the arithmetic models used by most programming
|> >languages. That, in turn, means that two compilers (or even options)
|> >on the same hardware usually give different results, and neither are
|> >broken.
|>
|> This one is new to me. Where do i go to find backup? The magnitude
|> of the claim does sound a bit extreme.

The relevant language standards? Seriously. For 'clarification',
you will need to read the archives of the SC22 mailing lists.

To save you time: in Fortran, look for the rules on expression and
function evaluation and, in C/C++, the rules on side-effects and
the INCREDIBLY arcane syntax and semantics of preprocessor versus
compile-time versus run-time expressions. And remember that flag
handling is NOT optional in IEEE 754, but a fundamental part of its
design, as Kahan points out.

|> >There are a lot more than five variants in use today, even just at
|> >the hardware level. Actually, Intel has at least three, and quite
|> >likely more.
|>
|> Some pointers to more details please.

Look at the IEEE 754 specification on the handling of underflow
and NaNs, and then study Intel's architecture manuals VERY
carefully, looking for x86 basic FP, MMX, SSE, IA64 and optional
variants. Then look at the MIPS, SPARC, PowerPC and ARM
architecture manuals.

Then laugh, cry or scream, according to taste.


Regards,
Nick Maclaren.
 
Wilco Dijkstra

Jan 1, 1970
Terje Mathisen said:
Not correct.

Every single (int) cast of a fp variable in a C program must truncate, not round, which means that you absolutely have
to have a directed rounding mode on top of the default.

That is true. But does FP->int conversion require dynamic rounding modes?
Many ISAs have integer conversion instructions with a fixed rounding mode.
This is far simpler and faster than reading the current rounding mode, changing
it to round to zero, doing the conversion and then restoring the previous mode.
Fix C at the same time then?

Why? On x86 at least it may be faster to do the conversion via emulation
anyway (SSE2 can only do float/double->int in one instruction).

Wilco
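
As a small aside on the C rule discussed above: the (int) cast is
required to truncate toward zero, while lrint() honours the current
rounding mode, which is round-to-nearest-even unless it has been
changed. A quick sketch (illustration only):

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double x = 2.7, y = -2.7;

        /* (int) must truncate toward zero ... */
        printf("%d %d\n", (int)x, (int)y);        /* 2 -2 */

        /* ... whereas lrint() uses the current rounding mode
           (round-to-nearest-even by default). */
        printf("%ld %ld\n", lrint(x), lrint(y));  /* 3 -3 */
        return 0;
    }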
 
Wilco Dijkstra

Jan 1, 1970
JosephKK said:
Make sense please. I have built many systems with hardware that can
count though zero. If you have built a system that can count to
infinity you need to publish. If you are talking about obscure
properties on floating point itself or 754 compliant implementations
please be more clear.

The context is clear: we were talking about the properties of the IEEE-754
floating point format. The encoding has the property that, from zero up to
infinity, larger values have larger encoded values (the only exceptions are
+0 and -0, which share a value, and NaN, which has no value). This allows
for the implementation tricks I described.

Wilco
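
Here is a sketch of the trick described above, assuming the standard
single-precision layout (the helper name next_up is just for
illustration): because non-negative floats are ordered the same way as
their bit patterns, stepping the bit pattern up by one walks from +0
through the denormals and normals and finally lands on +Inf, and
rounding a packed result can ride the same carry.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <float.h>

    /* For non-negative finite floats, adding 1 to the bit pattern
       gives the next larger representable value; the carry runs from
       the fraction into the exponent, so denormal -> normal -> +Inf
       happens automatically, with no special cases. */
    static float next_up(float f)
    {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);
        bits += 1;
        memcpy(&f, &bits, sizeof bits);
        return f;
    }

    int main(void)
    {
        float f = 0.0f;
        for (int i = 0; i < 3; i++) {        /* smallest denormals */
            f = next_up(f);
            printf("%.10g\n", f);
        }
        printf("%g -> %g\n", FLT_MAX, next_up(FLT_MAX));  /* ... -> inf */
        return 0;
    }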
 
Wilco Dijkstra

Jan 1, 1970
JosephKK said:
I know of at least 5 different early hardware implementations of
floating point and have written two software implementations myself.
So what. It was a long time ago as well.

Were they all different subsets of IEEE (which is what we were discussing)?

Wilco
 
Wilco Dijkstra

Jan 1, 1970
Nick Maclaren said:
|>
|> No, the implementation defined aspects don't lead to different results.
|> The choice of whether to support denormals or flush to zero doesn't affect
|> most software, so it hardly matters. In the rare cases where it does matter,
|> you still have the option to enable denormals and take the performance hit.

This is getting ridiculous. Most software doesn't USE floating-point,
and most of the rest has no trouble with IBM 360, VAX, or anything
else. Of the programs which do, a very high proportion have trouble
with such variations. And you don't always have the option, and it's
not just a performance hit.


|> > Also, you are ignoring the fact that almost all programs use languages
|> > other than assembler nowadays, and the IEEE 754 model is notoriously
|> > incompatible with the arithmetic models used by most programming
|> > languages. That, in turn, means that two compilers (or even options)
|> > on the same hardware usually give different results, and neither are
|> > broken.
|>
|> That's not true. It's not difficult to optimize based on the chosen floating point
|> model. So compilers give the same result even with full optimization (if they
|> don't then they are broken). You might get different results only if you enable
|> fast floating point options that allow reordering of operations.

You are quite simply wrong, again.

Fortran (still THE leading numerical language) doesn't even have the
CONCEPT of a defined execution order in terms of 'basic' operations.
I could explain why you are wrong about C/C++, but lack the energy
to do so yet again. Java is a numerically unimportant language.
I could go on.

You're again falling into the trap of making false generic statements without
any evidence whatsoever. I suggest you read up on this, for example:
http://softwarecommunity.intel.com/isn/downloads/softwareproducts/pdfs/347684.pdf
It clearly states that in the strict IEEE mode it adheres to source order, including
taking parentheses into account. So compilers support well defined ordering for
Fortran despite your claim that it is impossible.

Wilco
 
MooseFET

Jan 1, 1970
[. IEEE 754 ....]
[...]
Perhaps you have read it. That is one strange format though.
Essentially using the sign bit in combination with the mantissa to
squeeze one more bit of resolution out.
I had seen the implied bit idea elsewhere. The idea of using up codes
to specify things like NAN, +INF and -INF are things that I have found
myself disagreeing with. I have never liked "in band signaling".
That is the only part of the standard I found to be strange.
I had to convert some all positive fixed point numbers to IEEE so that
is the only part I really spent much time on.

Actually they are not really in band nor denormal. They used codes
that do not occur in the standard value encodings.

They do not occur in the standard encodings but they could have. This
would have lost those signals but made some more floating point
values. In this way, they are "in band".
 
Nick Maclaren

Jan 1, 1970
|>
|> > >I had seen the implied bit idea elsewhere. The idea of using up codes
|> > >to specify things like NAN, +INF and -INF are things that I have found
|> > >myself disagreeing with. I have never liked "in band signaling".
|> > >That is the only part of the standard I found to be strange.
|> >
|> > >I had to convert some all positive fixed point numbers to IEEE so that
|> > >is the only part I really spent much time on.
|> >
|> > Actually they are not really in band nor denormal. They used codes
|> > that do not occur in the standard value encodings.
|>
|> They do not occur in the standard encodings but they could have. This
|> would have lost those signals but made some more floating point
|> values. In this way, they are "in band".

Not just in that way. The infinities are "in band" because they have
many of the semantics of actual numbers. One can argue whether or not
that is a good idea (I vacillate according to whether they help or
hinder what I am doing!), but there is only one serious consistency
error with infinities in IEEE 754. There are lots more in C99 and
IEEE 754R, of course.

That error is that the sign of zero is unreliable, yet dividing by zero
gives an infinity; if infinities were created only explicitly or by
overflow, there would be no consistency problem.
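
That inconsistency takes only a few lines of C to demonstrate
(assuming IEEE semantics, i.e. Annex F): the two zeros compare equal,
yet dividing by them produces infinities of opposite sign, so any
computation that loses track of the sign of a zero silently changes
which infinity comes out.

    #include <stdio.h>

    int main(void)
    {
        double pz = 0.0, nz = -0.0;

        printf("%d\n", pz == nz);     /* 1: +0 and -0 compare equal */
        printf("%g\n", 1.0 / pz);     /* inf                        */
        printf("%g\n", 1.0 / nz);     /* -inf                       */
        return 0;
    }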

NaNs were a bit of a mess in IEEE 754 and are a disaster area in C99
and IEEE 754R. God alone knows what they mean, because I know for
certain that none of the people working on either of those two
standards did or do.


Regards,
Nick Maclaren.
 
Michel Hack

Jan 1, 1970
The only thing I dislike (from a sw viewpoint) is the fact that
denormals and zero share the same zero exponent, this makes it slightly
slower to separate out regular numbers+zero from the special cases
(Inf/NaN/denorm).

Well, this is actually what I liked best about the format: rounding
overflow from subnormal to normal is automatic, and as somebody else
pointed out already, one can "count" through strictly increasing
magnitudes using simple integer arithmetic (as long as one watches out
for Inf).

I agree though that a test for zero requires a different mask than the
other tests, and sometimes that's awkward.

Michel.
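
To show what "a different mask" means in practice, here is a small
double-precision sketch (an illustration, not from the post above):
one test on the exponent field picks off the special exponent
patterns, but because zero and the denormals share the all-zero
exponent, telling them apart needs a further look at the fraction.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    static const char *kind(double d)
    {
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);

        uint64_t exp  = (bits >> 52) & 0x7FF;          /* 11 exponent bits */
        uint64_t frac = bits & 0xFFFFFFFFFFFFFULL;     /* 52 fraction bits */

        if (exp == 0x7FF) return frac ? "NaN" : "Inf";
        if (exp == 0)     return frac ? "denormal" : "zero";  /* same exponent */
        return "normal";
    }

    int main(void)
    {
        printf("%s %s %s %s\n",
               kind(1.0), kind(0.0), kind(5e-324), kind(1.0 / 0.0));
        return 0;
    }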
 
John Doe

Jan 1, 1970
JosephKK said:
.


Caches or not, memory speed has been more performance limiting
than CPU speed for decades. Multiple CPUs on a single socket only
aggravate this. Multiple memory busses might help.

BWAAAHAHAHAAAAAA!!!!

Sounds like someone who is fishing for the motivation to upgrade.

I'll let you know when my multiple core CPU cannot use all cores at
100%. Multiple core CPUs are the biggest hardware performance leap
in many years. Bet on it.
 
Andrew Reilly

Jan 1, 1970
BWAAAHAHAHAAAAAA!!!!

Sounds like someone who is fishing for the motivation to upgrade.

I'll let you know when my multiple core CPU cannot use all cores at
100%. Multiple core CPUs are the biggest hardware performance leap in
many years. Bet on it.

They may very well show you that they're running at 100% in a CPU use
meter administered by a time-sharing OS, but do you know how much of that
100% is the processor stalled, waiting for off-chip memory? [*1] Is the
throughput on your problem of choice four (or whatever) times what it is
on a single core?

Well, sometimes it is. My own algorithms fit neatly into two categories:
totally contained in cache (for modern values of cache), and totally
memory bandwidth limited, so I am happy to have a couple of extra cores.
I can imagine applications where it makes little difference, though.

[*1] This is the single statistic that I most wish for, in an operating
system performance display, and I don't know how to get it. Is it
possible?

Cheers,
 
John Doe

Jan 1, 1970
Andrew Reilly said:
....


They may very well show you that they're running at 100% in a CPU
use meter administered by a time-sharing OS, but do you know how
much of that 100% is the processor stalled, waiting for off-chip
memory?

If I needed to know, I'd probably use Performance Monitor in Windows
XP.
[*1] Is the throughput on your problem of choice four (or
whatever) times what it is on a single core?

It's close enough for me.
I can imagine applications where it makes little difference,
though.

Some applications don't take advantage of multiple cores, but that's
not necessarily the CPU's fault. A good example is Supreme Commander
and a tiny utility called CoreMaximizer. Without the utility, one
core bounces against 100% and causes a replay to stutter while the
other core is at 50 or 60%. With the utility, both cores are almost
even and there is a noticeable improvement in performance without
stuttering.
 