Maker Pro

Flexible Instruction Set Encoding.

Quadibloc

Jan 1, 1970
Nobody said:
If 32-bit code needs to store 2 pointers at successive addresses, those
addresses will be 4 bytes apart, e.g.:

mov [bp-8], si
mov [bp-4], di

In 32-bit mode, the 32-bit versions of si/di (i.e. esi/edi) will be
stored; in 16-bit mode, the 16-bit versions will be stored (similarly, bp
will be the 32-bit ebp in a 32-bit mode). The actual sequence of
opcodes is identical regardless of mode.

In a 16-bit mode, there would be 2 unused bytes between the two values.
But in a 64-bit mode, the stored values would overlap as they are
only 4 bytes apart.

There's no way that the compiler can handle this aspect automatically.
The code would have to be written to avoid fixed offsets altogether in
favour of "base + multiple * word-size" constructs, where word-size is
determined at run-time. This could have a substantial performance penalty:
using compile-time constants is often faster than using run-time variables.
..
That's true. But in assembly language, of course, you can simply reserve
64 bits of space for numbers that may be 16, 32, or 64 bits long, and
write your subroutine so that it can handle numbers of any of those sizes
once the calling routine has set the mode properly.
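
In C terms, a minimal sketch of that reservation trick (the union and the
names here are only illustrative):

#include <stdint.h>

/* Pad every numeric slot to the largest width (64 bits), so the offset
   between successive slots is the same no matter which width the callee
   actually operates on. */
typedef union {
    uint16_t w16;
    uint32_t w32;
    uint64_t w64;
} num_slot; /* always 8 bytes: slot i lives at base + i*8 */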

Compilers don't generate such code because languages don't allow one
to specify a need for it.

John Savard
 
cr88192

Jan 1, 1970
MooseFET said:
Yes .net is as bad in 64bit as it is in 32bit. Since it isn't really
CPU instructions it doesn't qualify. It is a very slow way to get any
complex task done.

well, .NET can be made faster by employing a more involved translator.
for example, an essentially braindead direct JIT would be fairly slow.

if it reworks it, say, into SSA form and handles the stack and variables
partly through register allocation, then, very likely, you will get faster
results.

can't say exactly what the various .NET VMs do, as I haven't looked too much
into this.


for example, my compiler uses a language I call RIL (or RPNIL), which is
essentially a vaguely PostScript-like language. now, it uses a more or
less stack-machine model. so, I compile C to this language, and this
language to machine code.

and, the spiffy trick:
the RIL stack, increasingly, has less and less to do with the true (x86)
stack. part of the stack is literals (only in certain situations are they
actually stored on the machine stack), part is registers (look like stack,
but really they are registers), and soon, part may be in variables (needed,
for technical reasons, to get my x86-64 target finished).

so, it looks like a stack machine, but a lot goes on that is not seen...
I chose a stack-machine model because it is easy to target (and was
familiar), but this does not strictly mean that the internals, or the
output, have to work this way.

and, odd as it may sound, only a very small portion of the RIL operations
actually generate CPU instructions, and typically then, it is for something
that happened earlier in the codestream...
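
a toy sketch of that trick in C (this is not RIL itself, just an
illustration of a virtual stack that defers code generation):

#include <stdio.h>

typedef enum { V_LIT, V_REG } vkind;
typedef struct { vkind kind; int val; } vitem; /* literal value or register number */

static vitem vstack[64];
static int vsp = 0;
static int next_reg = 0;

/* a push only records where the value lives; no instruction is emitted */
static void vpush_lit(int v) { vstack[vsp].kind = V_LIT; vstack[vsp].val = v; vsp++; }

/* force an item into a register, emitting code only when unavoidable */
static int materialize(vitem v) {
    if (v.kind == V_REG) return v.val;
    printf("mov r%d, %d\n", next_reg, v.val);
    return next_reg++;
}

static void vadd(void) {
    vitem b = vstack[--vsp], a = vstack[--vsp];
    if (a.kind == V_LIT && b.kind == V_LIT) {
        vpush_lit(a.val + b.val); /* constant-folded: still no code emitted */
    } else {
        int ra = materialize(a), rb = materialize(b);
        printf("add r%d, r%d\n", ra, rb); /* only now does an instruction appear */
        vstack[vsp].kind = V_REG; vstack[vsp].val = ra; vsp++;
    }
}

int main(void) { vpush_lit(2); vpush_lit(3); vadd(); return 0; } /* emits no code at all */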

There was an effort a while back to make a CPU that spoke Java
bytecode directly. I don't think they could get good speeds out of
it.

yes, a few possible reasons come up...

A far better way to go would be to standardize on the instruction set
of a real CPU. Ideally it should be a small and simple one so that
anyone who wanted to could add a special section of control store to
their processor to allow it to be run directly.

I don't understand exactly how this differs that much from current
approaches.
another CPU is just another CPU.

x86 and, soon x86-64, are the most common (at least for normal computers).

This would require that a standard way of doing the GUI also be
created. It needs to be a very simple GUI so that even simple
machines could do it.

yes.
for PCs, a standardized GUI would work.
for non-PCs, it is probably a lot harder (given the wide variety of physical
interfaces).

now, a very simple and general UI creates a usability problem on a normal
computer, as these kinds of UIs often end up treating the keyboard like a
paperweight (for many kinds of apps, the keyboard serves its purpose well).


now, in my case, I typically do my GUIs custom via OpenGL...
this gives a good deal of portability at least (stuff still works on linux,
to what extent it works...).


yeah...
 
Alex Colvin

Jan 1, 1970
Look up the Transmeta code-morphing system. It handled the 16/32-bit
problem by hoisting the mode check out of each instruction.
 
Skybuck Flying

Jan 1, 1970
Now see my first post for why this is slow.

Bye,
Skybuck.
 
Skybuck Flying

Jan 1, 1970
Apparently my first post wasn't clear enough for ya.

The idea is to:

Have the CPU implement the IF statement in its logic circuits so that the
IF overhead is avoided.

And the boolean is replaced by some integer for more mode support.

This way the program will use fast 32 bit instructions when possible, or
fall back to slower emulated 64 bit instructions.

The cpu must provide 64 bit emulated instructions as well.

Future processors could then possibly replace those under the hood with
true 64 bit operations, and of course 128 bit in the future, etc.

Well, it's too late for that now I think, we're stuck with the current
stuff, or maybe not, who knows ?! ;)

Bye,
Skybuck.
 
Skybuck Flying

Jan 1, 1970
You completely missed the compatibility requirement.

The application must have 64 bit support even on 32 bit systems. Think NTFS.

Simply recompiling int to int64 won't work !

Bye,
Skybuck.
 
Skybuck Flying

Jan 1, 1970
Hi, thanks very much for your suggestion.

It might be possible to create three libraries:

1. A 32 bit version.

2. A true 64 bit version.

3. An emulated 64 bit version.

Problem 1:

Passing parameters to the routines.

The different versions have different parameters.

Problem 2:

Calling the routines.

^^^ Different parameters require different calls ^^^

Lastly:

This still requires an easy method to generate the 3 libraries from just
one source:

This could be as simple as redeclaring a type:

Type
  // TgenericInteger = int32;
  TgenericInteger = int64;

And then rebuilding the library.
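
A hypothetical C analogue of the same trick, with the width chosen by a
compile-time switch (the macro and the names are made up):

#include <stdint.h>

#ifdef LIB_USE_INT64 /* defined for the true and the emulated 64 bit builds */
typedef int64_t TGenericInteger;
#else /* plain 32 bit build */
typedef int32_t TGenericInteger;
#endif

TGenericInteger AddNumbers(TGenericInteger a, TGenericInteger b)
{
    return a + b; /* a 32-bit target emulates this; a 64-bit target does it natively */
}

Building the library once per configuration (with and without
-DLIB_USE_INT64, plus the compiler's 32/64-bit target flag) then yields
the three versions.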

Now a solution has to be found for problem 1 and problem 2.

Without adding too much overhead.

Bye,
Skybuck.
 
Skybuck Flying

Jan 1, 1970
Quadibloc said:
Now that's something different. I thought you were talking about how
the program can call the same subroutine to work on 16 bits, 32 bits,
64 bits, as long as it *tells* the subroutine which one to use by
setting mode bits.

You're starting to grasp the concept.

It's not so much about telling routines what to do.

It's telling the cpu what to do.

It could be done on a routine-by-routine basis.

Use the BitMode variable in front of the routine to tell the cpu that all
instructions in the routine should be interpreted using the specified bit
mode.

Bye,
Skybuck.
 
Skybuck Flying

Jan 1, 1970
Skybuck Flying said:
Hi, thanks very much for your suggestion.

It might be possible to create three libraries:

1. A 32 bit version.

2. A true 64 bit version.

3. An emulated 64 bit version.

Problem 1:

Passing parameters to the routines.

The different versions have different parameters.

Problem 2:

Calling the routines.

^^^ Different parameters require different calls ^^^

Actually, there is a third problem, kind of:

The program loads a certain library at runtime:

It could be a 32 bit or emulated 64 bit.

This creates a debugging problem.

Suppose the code was last compiled in 64 bit.

And the program chooses 32 bit.

(Or vice versa)

Then the source code does not match the loaded library.

Bye,
Skybuck.
 
Skybuck Flying

Jan 1, 1970
Yes that's another benefit.

Shorter instruction encodings, methinks ;)

Bye,
Skybuck.
 
Skybuck Flying

Jan 1, 1970
Show me an application, source code, asm, anything which:

1. Runs single 32 bit instructions on 32 bit operating system.

2. Runs single 64 bit instructions on 64 bit operating system.

3. Runs multiple 32 bit instructions for 64 bit emulation on 32 bit
operating system.

4. Has the same source code for all three cases.

5. Needs to be compiled once (I'll throw in some slack: three times).

6. Switches to the optimal/necessary instructions mentioned above at
runtime.

Bye,
Skybuck.
 
Stephen Sprunk

Jan 1, 1970
Skybuck Flying said:
You completely missed the compatibility requirement.

The application must have 64 bit support even on 32 bit systems. Think
NTFS.

Simply recompiling int to int64 won't work !

Any decent C compiler has a 64-bit integer type regardless of what CPU it's
targeted at; it's a requirement of ISO C99. If targeted at a 32-bit (or 16-
or 8-bit) CPU, the compiler emulates 64-bit operations. This is not novel;
C89 compilers targeting 8- and 16-bit CPUs provided the same emulation for
32-bit integer types.

I'm not familiar with many other languages, but I believe the same is true.
If you use a 64-bit integer type, the compiler or interpreter does
whatever's needed to provide the illusion of a machine that natively
supports such.
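
For example, this minimal sketch compiles unchanged for 32-bit and 64-bit
targets (the function name is made up):

#include <stdint.h>

uint64_t add64(uint64_t a, uint64_t b)
{
    return a + b; /* on 32-bit x86 this lowers to an add/adc pair; on x86-64, a single add */
}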

S
 
Stephen Sprunk

Jan 1, 1970
Skybuck Flying said:
Show me an application, source code, asm, anything which:

1. Runs single 32 bit instructions on 32 bit operating system.

2. Runs single 64 bit instructions on 64 bit operating system.

3. Runs multiple 32 bit instructions for 64 bit emulation on 32 bit
operating system.

4. Has the same source code for all three cases.

5. Needs to be compiled once (I'll throw in some slack: three times).

6. Switches to the optimal/necessary instructions mentioned above at
runtime.

The cost of determining whether 64-bit emulation is needed on a 32-bit
system and falling back to 32-bit operations when it's not is higher than
the cost of just using it all the time.
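
A rough sketch of what that run-time check would look like (the names are
hypothetical); note that the branch is paid on every operation, even when
the fast path is taken:

#include <stdint.h>

uint64_t add_checked(uint64_t a, uint64_t b, int need64)
{
    if (need64) /* unpredictable branch on every single operation */
        return a + b; /* emulated 64-bit path on a 32-bit CPU */
    return (uint32_t)a + (uint32_t)b; /* "fast" 32-bit path, plus the check */
}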

If you're on a true 64-bit system, run the binary that was recompiled for
that architecture. The source will be identical if it's properly written
(well, except for ASM; I'm talking HLLs).

S
 
Stephen Sprunk

Jan 1, 1970
Skybuck Flying said:
Apparently my first post wasn't clear enough for ya.

The idea is to:

Have the CPU implement the IF statement in its logic circuits so that the
IF overhead is avoided.

That logic is already there; the same opcodes are used for 16-, 32-, and
64-bit operations. The opcode for 8-bit operations is typically only
different by one bit.
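
For example (encodings per the x86 manuals), the same base opcode 0x01
("add r/m, r") serves all three widths; only the prefixes differ:

unsigned char add_ax_bx[]   = { 0x66, 0x01, 0xD8 }; /* add ax, bx   (0x66 operand-size prefix) */
unsigned char add_eax_ebx[] = { 0x01, 0xD8 };       /* add eax, ebx (default in 32-bit mode)   */
unsigned char add_rax_rbx[] = { 0x48, 0x01, 0xD8 }; /* add rax, rbx (0x48 REX.W prefix)        */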

The problem, which you keep refusing to acknowledge, is that you can't load
or store data of indeterminate size because the compiler (or assembly coder)
needs to know how much space to reserve for objects at compile time. And,
of course, loads are by far the biggest consumer of CPU time in a typical
application -- some spend up to 80% of their time waiting on memory.
Worrying about a few extra _nano_seconds to execute emulated 64-bit
operations on a 32-bit machine is pointless when that same machine just
waited tens of _micro_seconds for the data to show up. Any unpredictable
branches, like doing run-time checks on data to see whether or not to use
emulation, will stall the pipeline and cost up to tens of microseconds again
to save a few nanoseconds of execution time.
And the boolean is replaced by some integer for more mode support.

This way the program will use fast 32 bit instructions when possible, or
fall back to slower emulated 64 bit instructions.

The cpu must provide 64 bit emulated instructions as well.

If the CPU provides 64-bit operations, it's not emulation -- it's a 64-bit
CPU. It might take longer to process 64-bit ops than 32-bit ops, but
shipping CPUs show that's not the case: either the CPU doesn't do 64-bit
ops at all, or it does them just as fast as 32-bit ops.
Well, it's too late for that now I think, we're stuck with the current
stuff, or maybe not, who knows ?! ;)

It's obvious you don't even understand the "current stuff". Go read a book
or ten, get some real world experience, and quit wasting others' time with
your inane ideas.

S
 
Skybuck Flying

Jan 1, 1970
I refuse and reject this argument of yours !

Object Oriented languages have objects which are created at runtime !

The data/memory for the objects are reserved at runtime !

During the constructor/create calls, enough memory can be reserved.

If the programmer can somehow tell the object what it needs, the object
can reserve the necessary amount of memory !

The only thing left to do is tell the cpu how much memory it's supposed to
operate on !

The real problem is with the cpu and the instruction encoding it uses.

Each instruction must specify register(s) to operate on.

This implicitly means the instruction encoding is fixed-width, and cannot
be changed.

Some people refuse this explanation and say that this is not true...

They say 16/32 had some kind of bit mode flags.

Some people say: 64 bit has bit mode flags.

I told them:

Show me an example !

Their response:

NOTHING.

I will believe it, when I SEE IT !

(alt.lang.asm included, since they might know something)

Bye,
Skybuck.
 
Stephen Sprunk

Jan 1, 1970
Skybuck Flying said:
I refuse and reject this argument of yours !

Object Oriented languages have objects which are created at runtime !

Most programming languages have that concept, whether they're designed to
make OO easy or not. Dynamic memory allocation has been old news for, what,
30 years now?
The data/memory for the objects are reserved at runtime !

During the constructor/create calls, enough memory can be reserved.

If you're using a constructor, you've already lost the performance war vs. a
compiler's (and CPU's) built-in types, even ones that have to be emulated.
If the programmer can somehow tell the object what it needs, the object
can reserve the necessary amount of memory !

The only thing left to do is tell the cpu how much memory it's supposed to
operate on !

No, there are a lot of other things. For instance, the "flexible" objects
you proposed in your C++ implementation actually result in multiple static
code paths that you select between using incredibly inefficient (for this
purpose) virtual method calls and operator overloading. Simply emulating
64-bit operations on CPUs that don't have them is going to be faster.
The real problem is with the cpu and the instruction encoding it uses.

Each instruction must specify register(s) to operate on.

This implicitly means the instruction encoding is fixed bit, and can not
be changed.

Irrelevant. The actual math operations have the same opcodes for 16-, 32-,
and 64-bit values, and they do the exact same things to all three types of
data.
Some people refuse this explanation and say that this is not true...

They say 16/32 had some kind of bit mode flags.

Some people say: 64 bit has bit mode flags.

Sort of. The mode flags only really control load/store operations. But
they're there.
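
A concrete illustration (bytes per the x86 manuals): the default-size
flag in the code segment descriptor (CS.D) decides how the very same
bytes decode:

unsigned char code[] = { 0x01, 0xD8 };
/* in a 16-bit code segment (CS.D = 0) this decodes as: add ax,  bx  */
/* in a 32-bit code segment (CS.D = 1) this decodes as: add eax, ebx */
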
I told them:

Show me an example !

Their response:

NOTHING.

I will believe it, when I SEE IT !

Actually, people have given you exact references, but you know so little
about the x86 and AMD64 architectures that you didn't understand or even
recognize them.

S
 
SpooK

Jan 1, 1970
Skybuck Flying said:
Some people say: 64 bit has bit mode flags.

I told them:

Show me an example !

Their response:

NOTHING.

I will believe it, when I SEE IT !

Oh gee... lemme crank out some POC here... oh wait... that's right...
it is not our job to spoon-feed you source code, it is your job to
RTFM and hopefully learn from them.
 
cr88192

Jan 1, 1970
Stephen Sprunk said:
Any decent C compiler has a 64-bit integer type regardless of what CPU
it's targeted at; it's a requirement of ISO C99. If targeted at a 32-bit
(or 16- or 8-bit) CPU, the compiler emulates 64-bit operations. This is
not novel; C89 compilers targeting 8- and 16-bit CPUs provided the same
emulation for 32-bit integer types.

I'm not familiar with many other languages, but I believe the same is
true. If you use a 64-bit integer type, the compiler or interpreter does
whatever's needed to provide the illusion of a machine that natively
supports such.

yes.

for example, there is also a non-standard extension (I think gcc may have
it in 32-bit land as well, but I haven't checked) known as '__int128', a
full-on 128-bit integer type.

my compiler has a placeholder for this, but as of yet does not implement
this type (can declare variables, but can't assign them or do arithmetic on
them...).
 
David Brown

Jan 1, 1970
Skybuck said:
Hi, thanks very much for your suggestion.

It might be possible to create three libraries:

1. A 32 bit version.

2. A true 64 bit version.

3. An emulated 64 bit version.

Actually, only 2 and 3 are relevant here, because you first have to
figure out if you need 32-bit or 64-bit data. If you don't know that,
then your problem is either not well enough specified for you to start
coding, or is so vague and general that you are better off using
abstract types rather than fixed sizes (and probably better off using a
language such as Python, which has direct support for arbitrarily long
integers and can be combined with Psyco to generate reasonably good
machine code on the fly).

This solves your problems, since you are only ever passing 64-bit data.

Generating the different DLLs (or other types of libraries or code) is
easy - it's just a compiler flag to target x86 or amd64 code generation.
 
Skybuck Flying

Jan 1, 1970
You should read it yourself because you clearly don't understand it.

Bye,
Skybuck.
 