The ASM instruction you always wanted, but never had?
category: code [glöplog]
architect1: why yes, yes it does. But we want it on 6502 :)
sigflup: that makes no logical sense. It must be:
mov cl,3
print_real_edible_pizzas_from_printer
There. NOW it's perfect :)
divz <register> ; divide by zero so I could destroy the world without Devpac throwing a hissy fit!
Ferris>
There's no chip with this instruction! Not even one! How could the developers have fucked up so bad?
sse add-all-components-and-store-in-first (or somewhere else for that matter) for dotproducts et cetera w/o tedious shuffling
yes, i know sse3 had haddps. getting there.
had->has
at the time of the i386 DOS coding, I always missed the
Code:
sdmp
instruction (Set Demo Mode Please) which would
1. initialize the 32bit protected mode
2. set up the 320x240 32bpp linear framebuffer gfx mode
3. initialize the sound system
so we wouldn't have to mess with dos4gw, dpmi, vesa, sb and all that crap. Alternatively, a DOS interrupt would have rocked.
Code:
mov eax, GFX_320_240_32 | SND_SB_220_44100_16
int 21h
but it never happened :(
perlinnoise cof, cof, cof;
Code:
rlwinm rA, rS, SH, MB, ME
but on a 68000.
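For anyone who hasn't met it: rlwinm is PowerPC's "rotate left word immediate then AND with mask". A rough Python model of what it computes is sketched below. The function names (`rotl32`, `ppc_mask`, `rlwinm`) are made up for illustration, and it assumes the usual PowerPC bit numbering where bit 0 is the most significant bit.

```python
def rotl32(value: int, shift: int) -> int:
    """Rotate a 32-bit value left by `shift` bits."""
    value &= 0xFFFFFFFF
    shift &= 31
    return ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF


def ppc_mask(mb: int, me: int) -> int:
    """Mask with ones from bit MB to bit ME, PowerPC numbering (bit 0 = MSB)."""
    from_mb = 0xFFFFFFFF >> mb                          # ones from bit mb to bit 31
    up_to_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF   # ones from bit 0 to bit me
    if mb <= me:
        return from_mb & up_to_me
    return from_mb | up_to_me                           # MB > ME: the mask wraps around


def rlwinm(rs: int, sh: int, mb: int, me: int) -> int:
    """Hypothetical model of rlwinm rA, rS, SH, MB, ME: rotate, then mask."""
    return rotl32(rs, sh) & ppc_mask(mb, me)


# Extract the top byte of a word in one "instruction".
top_byte = rlwinm(0x12345678, 8, 24, 31)
# A plain left shift by 4 is just rotate-by-4 with the low 4 bits masked off.
shifted = rlwinm(0x12345678, 4, 0, 27)
```

That one instruction covers shifts, rotates and bitfield extracts, which is presumably why it is missed on the 68000.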
arm code:
Code:
add r0, r1, r2, lsl #4
which adds r1 and r2 left shifted by 4 bits, and stores the result in r0.
the shift value could be in a register too.
Code:
add r0, r0, r0, lsl #8 ; multiplies r0 by 257 mod 2^32
add r0, r0, #47 ; adds a prime number
each time you execute this two-instruction LCG pseudo-random generator you get a uniformly-distributed 32-bit random number. Ideal for starfields and other noise generation.
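A quick way to convince yourself the two adds really are an LCG is to model them in Python. This is only a sketch; `lcg_step` and `lcg_reference` are names invented here, and the 32-bit wraparound that the ARM register gives for free is emulated with a mask.

```python
MASK32 = 0xFFFFFFFF


def lcg_step(r0: int) -> int:
    """Model of the two ARM instructions, wrapping at 32 bits."""
    r0 = (r0 + (r0 << 8)) & MASK32   # add r0, r0, r0, lsl #8  -> r0 * 257
    r0 = (r0 + 47) & MASK32          # add r0, r0, #47
    return r0


def lcg_reference(x: int) -> int:
    """Textbook LCG form: x' = (a*x + c) mod 2^32 with a = 257, c = 47."""
    return (257 * x + 47) & MASK32


# The shift-and-add form and the multiply form agree for any seed.
seeds = [0, 1, 47, 0xDEADBEEF, MASK32]
agree = all(lcg_step(s) == lcg_reference(s) for s in seeds)
```

With modulus 2^32, an odd increment and a multiplier of the form 4k+1, the generator walks through all 2^32 values before repeating. One caveat: because 257 is 1 mod 256, the low byte simply advances by 47 each step, so the upper bits are the better ones to use for a starfield.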
zerkman:
those are just limited FMAD instructions :p
What about:
ROtateFlagsLeft
LoadMemoryAccessOffset
LoadOverflowLeft
FpuUnconditionalCheckKernel ?
shuffle2:
no, because:
- to use fmad, you need to initialize a destination register with the value to be added, hence at least two instructions and two registers. My solution uses only a single register, and can be called without initialization as many times as needed to generate as many random numbers as needed.
- fmad is floating-point, hence not suitable for an LCG random number generator.
6510 move zero page
I miss a simple swap.w on 68000, a rol.w is much too slow (on 020+ the rol is fast though), and a swap.b wouldn't hurt either.
hmm, yeah. Every 16-bit+ processor should have a swap instruction. Except in the case of SIMD, e.g. x86/SSE, where the 'shuffle' instructions are a result of poor SSE (1.0, 2.0, 3.0?) instruction set design.
clz or div on ARM7tdmi aka GBA.
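Without a hardware clz, the usual workaround is a binary search over the bit positions. The Python below is only a model of that idea (the name `clz32` is made up here, and it assumes the result for zero should be 32, which is what the later ARM CLZ instruction returns); on the GBA each step would become a compare plus conditional shift.

```python
def clz32(x: int) -> int:
    """Count leading zeros of a 32-bit value by halving the search range."""
    x &= 0xFFFFFFFF
    if x == 0:
        return 32
    n = 0
    if x <= 0x0000FFFF:   # top 16 bits are clear
        n += 16
        x <<= 16
    if x <= 0x00FFFFFF:   # top 8 bits are clear
        n += 8
        x <<= 8
    if x <= 0x0FFFFFFF:   # top 4 bits are clear
        n += 4
        x <<= 4
    if x <= 0x3FFFFFFF:   # top 2 bits are clear
        n += 2
        x <<= 2
    if x <= 0x7FFFFFFF:   # top bit is clear
        n += 1
    return n
```

Five compares instead of one instruction, which is the whole complaint.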
Mooooar registersss
or more Brainsss, eh? ;P
In the past, I've always wondered why some processors do not include more registers. I would have killed for more (general purpose) registers on the 8086.
You'd think that adding a few registers wouldn't hurt anyone. However, adding more registers means more bits are needed in the opcode to specify which register is accessed. In addition, most processors have multi-port register banks (multiple registers can be read, or sometimes written, at once). These register banks can get quite complex, especially in a pipelined design. For instance, you need many more multiplexers and must handle write contention if the CPU can write more than one register at a time.
Adding more registers not only adds to the chip area, making it more expensive; the added complexity also increases the access time of the register bank, so the CPU will be slower. With modern IC processes (0.18 µm and smaller), these things are less of a problem, but in the '70s, '80s and '90s these were undoubtedly some of the reasons for register-limited designs.
</rant>
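To put rough numbers on the opcode-bits point, here is a back-of-the-envelope sketch. The helper `register_field_bits` is hypothetical, and it assumes a simple encoding where every register operand gets its own fixed-width field, which real instruction sets do not always follow.

```python
def register_field_bits(num_registers: int, operands: int) -> int:
    """Opcode bits spent on register selection, one field per operand."""
    bits_per_field = (num_registers - 1).bit_length()  # ceil(log2(n)) for n >= 1
    return bits_per_field * operands


# 8 registers, two-operand instructions: 3 bits per field.
two_operand_8 = register_field_bits(8, 2)
# 16 registers, three-operand instructions: 4 bits per field.
three_operand_16 = register_field_bits(16, 3)
# 32 registers, three-operand instructions: 5 bits per field.
three_operand_32 = register_field_bits(32, 3)
```

Every doubling of the register count costs one more bit per operand, which hurts a lot more in a 16-bit opcode than in a 32-bit one.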
I love the auxiliary registers on the z80. Love to do exx, ex af,af' (I rarely use it though), or carefully select my registers so that I can do ex de,hl, especially to add directly to (hl). I play a lot with them. I am wondering why the x86 (the 8086 descends from the 8080, which is the z80's ancestor, I think?) doesn't have them.
Also the z80 has IX and IY, which are slow, but you can use IXh, IXl, IYh and IYl, which proved very useful when I just need a loop counter in outer loops and can't find anything (or am too lazy to write some ugly self-modifying code just before the jump). Recent versions of the Winape32 assembler let you use them directly without having to remember the hex codes as in old assemblers on the CPC.
But yes, the register thing, probably you would need more space in the opcode structure. I was talking with Antitec of Dirty Minds and we always thought, if there were just one more 16bit register, it would be just fine, because there are a lot of routines where you struggle and if you had just one more register it would fit perfectly. That's what we thought. Although today I am better with the Z80 and I love to play with the regs and always find a good solution for my code to fit well with the regs (and a pair of well-placed EXXs is a solution too and only loses 2 NOP cycles).
iq: that's a bit silly don't you think? you're basically asking for a wholly different and complex OS/bios :)
Quote:
I am wondering why the x86 (the 8086 descends from the 8080, which is the z80's ancestor, I think?) doesn't have them.
Some guys (including Federico Faggin) left Intel to form Zilog after a disagreement on the architecture of the 8085. Intel finalized the 8085 and Zilog produced the z80. They were direct competitors, so neither was probably keen to 'borrow' each other's ideas.
Z80 auxiliary registers are a hack. The designers understood the desire for extra registers but needed to keep compatibility with the mainstream-at-the-time 8080. Maybe it seemed like a great idea at the time, but it was a hack nevertheless: very sharply aimed at the present, but without much of a future. From Intel's perspective it was a much better idea to move on to the more advanced 16-bitters instead, which they did.
6502, mov. How I miss it.
A question for those who have experience programming the Amiga or Atari ST..
How often does the CPU wait for the graphics processor? I assume graphics memory is shared and the CPU is denied memory access when the GFX processor is reading data to build the screen. I'm not familiar with the memory architecture on these machines.