I dunno. Maybe I have a different view of hardware and programming.
When I started building that Z80 system, it was in my spare time.
In the evenings mainly.
In my daytime job at that time I was designing hardware for small
and huge projects, some involving designing ISA cards for
the IBM PC / XT whatever it was.
So I am familiar with the different platforms, and the X86 asm too,
although I am really rusty with that, as I only use C on those platforms.
So, when making my little Z80 project, something that gave me some ideas
that resulted in million $ projects for my boss, the question was:
What makes this slow?
Answer: disk IO (5 1/4 inch floppy, you must have heard
the rrrrrr crack rrrrrr plonk rrrrr for minutes) while for example compiling something.
There are many solutions, but speeding up a processor by adding more registers and allowed
operations on those, is _not_ one of them.
My solution was: Copy the floppy to RAM, and use the RAM as floppy,
work that way until finished, then copy everything back to floppy.
As the copy is all sequential and needs no seeks, it is fast, like a format: tick tick tick... tick, done.
Now after that speed increase, it compiled a C source faster than gcc does now on a 1GHz Athlon.
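
To make the idea concrete, here is a rough sketch in C of the same scheme. It is not the original Z80 code, just an illustration: the RamDisk type, the ramdisk_* function names and the 128 byte sector size are all my assumptions for the example, and a stdio FILE stands in for the floppy.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 128  /* assumption: 128 byte sectors, for illustration only */

typedef struct {
    unsigned char *data;  /* the whole disk image, held in RAM */
    size_t size;          /* image size in bytes */
} RamDisk;

/* Read the whole image in one sequential pass. Returns 0 on success. */
int ramdisk_load(RamDisk *rd, FILE *floppy, size_t size)
{
    assert(rd != NULL && floppy != NULL);
    rd->data = malloc(size);
    if (rd->data == NULL)
        return -1;
    rd->size = size;
    rewind(floppy);
    if (fread(rd->data, 1, size, floppy) != size) {
        free(rd->data);
        rd->data = NULL;
        rd->size = 0;
        return -1;
    }
    return 0;
}

/* While working, a sector read is just a memory copy: no disk I/O at all. */
int ramdisk_read(const RamDisk *rd, size_t sector, unsigned char *buf)
{
    if (sector >= rd->size / SECTOR_SIZE)
        return -1;
    memcpy(buf, rd->data + sector * SECTOR_SIZE, SECTOR_SIZE);
    return 0;
}

/* Same for a sector write. */
int ramdisk_write(RamDisk *rd, size_t sector, const unsigned char *buf)
{
    if (sector >= rd->size / SECTOR_SIZE)
        return -1;
    memcpy(rd->data + sector * SECTOR_SIZE, buf, SECTOR_SIZE);
    return 0;
}

/* When finished, write everything back in one sequential pass. */
int ramdisk_flush(const RamDisk *rd, FILE *floppy)
{
    rewind(floppy);
    if (fwrite(rd->data, 1, rd->size, floppy) != rd->size)
        return -1;
    return fflush(floppy) == 0 ? 0 : -1;
}

/* Release the RAM copy. */
void ramdisk_free(RamDisk *rd)
{
    free(rd->data);
    rd->data = NULL;
    rd->size = 0;
}
```

The order of use is ramdisk_load once, then any number of ramdisk_read / ramdisk_write calls, then ramdisk_flush and ramdisk_free. The obvious risk is the same as on the real thing: lose power before the flush and the work is gone.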
So, improve where it counts, not in the margin.
I hear J. Larking about ARM, sure RISC processors have a simpler instruction set,
but need many more instructions.
The Z80 was (is) incredibly code efficient.
Just think of the bit set and test instructions. These days in C we all use an integer
(and that may actually be 64 bits long, whoopee) for a single flag, whoaaaa! Yes, you
really need to optimise to work that speed loss away!
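
For comparison, a small sketch of how to keep flags as single bits in C, which is about as close as C gets to the Z80's SET, RES and BIT instructions. The flag names and the flags_t type are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names, for illustration only. */
enum {
    FLAG_MOTOR_ON     = 1u << 0,
    FLAG_DISK_CHANGED = 1u << 1,
    FLAG_WRITE_PROT   = 1u << 2
};

typedef uint8_t flags_t;  /* up to eight flags packed into one byte */

/* Set the bit(s) in mask, comparable to the Z80 SET b,r instruction. */
static inline void flag_set(flags_t *f, uint8_t mask)
{
    *f |= mask;
}

/* Clear the bit(s) in mask, comparable to RES b,r. */
static inline void flag_clear(flags_t *f, uint8_t mask)
{
    *f &= (uint8_t)~mask;
}

/* Return 1 if any bit in mask is set, comparable to BIT b,r. */
static inline int flag_test(flags_t f, uint8_t mask)
{
    return (f & mask) != 0;
}
```

So flag_set(&f, FLAG_DISK_CHANGED) followed by flag_test(f, FLAG_DISK_CHANGED) costs one byte of storage for up to eight flags, instead of one int per flag. Whether the compiler turns it into a single instruction depends on the target.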
The story is much longer of course.
Yesterday I worked some more on the PIC color controller,
added 5 timers, a clock, so it can do color sequencing, set different colors at different times of day,
all is saved in EEPROM, it has an RS232 interface with help menu, and many commands,
it drives 3 PWM color channels, and guess what:
I am still below 2 kB (it is a chip with 4 kB code space), so below where I have to call code in page 2.
Now I am well aware that some here get the hiccups if you mention the 16Fxxx PIC, as they got burned
by bank select, the complicated way to write to external memory, etc, but the point is that
bank select with an assembler like gpasm is extremely simple,
'banksel VRCON' will select the correct bank (1 actually) for VRCON, etc.
In this whole project I only used the scope when it was finished, the PIC never left the noppp programmer,
as that programmer now has an RS232 connection to the PIC, and the 3 LEDs were simply soldered on the socket
of the programmer, now that saves time.
I use no debugger, and no ICE, nothing. It is 3055 lines of code (subtract some empty lines), and it all works.
I still need to add the new menu options to the help menu,
then it will likely be released as the next version (0.6) on my site.
So, it is not the processor, and this circuit has as external parts only 3 resistors!
It is the designer, it is the programmer, and it is about looking at the situation in a clear way.
Sure, that PIC is not suitable for decoding H264 in real time.
OTOH we use hardware to assist us in cases where software makes little sense, the Z80 has the
DMA chip, the CTC (timer) chip, the DART (serial UART), a chip for serial synchronous communication,
all with a special interrupt structure (daisy chain) optimised for speed.
Who cares if you have to move the value of a register, THAT is not what is setting the speed.
Unless you are doing something you should not be doing.
Sure, companies like MS write such outrageous incredible bloat that it will slow anything down.
They do not want better performance, they have no new ideas, nothing to bring, need to sell,
are in bed with the big processor makers, to add ever more speed and memory to get ever less new stuff.
It all ends with video.
Maybe some voice recognition, but that won't be the main driving force.
And let's get this right, 1GHz is already plenty for playback, and encoding H264 in real time is best done
with the aid of, YOU GUESSED IT, extra dedicated hardware.
So much for that.
Sure 4 core 4 GHz, too bad the trend is now towards netbooks.
And _still_ the biggest speed brake is disk access.
Maybe you should do some actual programming to get the hang of it again.
On a PIC or Z80, for example.

Man ain't PIC cool, I love them.