216b asm compo - need help
category: general [glöplog]
The demo is nearly finished and everything works well, but...
I need to downsize my program (310b => 256b) and replace some registers with memory variables
(using nasm)
frame:
inc dword [time];; the evil instruction
jmp frame
time dw 0
just adding inc dword [time] makes the program 2x or 3x slower. it is only executed once between frames, not in the rendering process.
Why?
do not use evil instructions. download intel's instruction reference to see which ones are good or evil.
[code]
__asm {
mov edi, Buffer
mov ecx, Len
mov eax, 0x41595321 // AYS!
loop0: test eax, 0x100
je lower
cmp ax, 0
jl mi
mov byte ptr [edi], 0x7f
jmp weida
mi: mov byte ptr [edi], 0x80
jmp weida
lower: mov byte ptr [edi], al
weida: inc edi
ror eax, 5
xor al, 10011010b
mov bx, ax
rol eax, 2
add bx, ax
xor ax, bx
ror eax, 3
dec ecx
jnz loop0
}
[code]
did you intend to write "time dd 0" instead of "time dw 0"? just so you don't corrupt the next variable after [time]... otherwise, inc word [time] is smaller and faster in 16-bit mode, but not that much faster.
just a quick guess for the slowdown, maybe you mess up caching by changing [time]. this might happen if some memory addressing is bound to [time] in your code, eg. you decide which framebuffer to write by mov ax,[time] / and ax,1
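To make the dw/dd point concrete, here is a minimal sketch that mirrors the snippet from the first post. The byte encodings in the comments are the standard 16-bit ones; the layout is only an illustration, not the original poster's actual program.

```nasm
bits 16
org 0x100

frame:
    inc word [time]      ; FF 06 lo hi    -> 4 bytes, touches only the 2 bytes of 'time'
    ; inc dword [time]   ; 66 FF 06 lo hi -> 5 bytes, reads and writes 4 bytes,
                         ;   so with 'time dw 0' it also rewrites whatever follows
    jmp frame            ; endless loop, as in the original snippet

time dw 0                ; 16-bit counter, matches 'inc word'
; time dd 0              ; use this instead if a 32-bit counter is really needed
```

Either pairing (word with dw, dword with dd) is consistent; mixing dword with dw is the case that can corrupt a neighbouring variable.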
Code:
SECTION .data
angle dd 244
time dd 0
SECTION .code
;;;;;;;;;;;;;;;;;;;;;;;;; Init
org 100h
mov al,13h
int 10h
push 0a000h
pop es
;;;;;;;;;;;;;;;;;;;;;;;;; Palette
...
;;;;;;;;;;;;;;;;;;; init
....
main_loop
loop2
loop1
;putpixel
jnz loop1
jnz loop2
;;screen is draw
inc word[time]
;key Esc
in al, 60h
dec al
jnz main_loop
int 20h
as you can see, time is not used anywhere else in the code; simply adding inc [time] slows the whole thing down....
right now, for constants I use this trick:
Code:
push 40
push 255
mov si,sp
fild word[si-2]
..
but this increases the code size a lot
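For reference, a small sketch of where the two pushed constants end up relative to si. It assumes nothing else is pushed between the pushes and the loads, and that ds = ss, which holds in a plain .com file. The elided part of the trick above is not reproduced here.

```nasm
bits 16
org 0x100

    push word 40         ; 6A 28    -> 2 bytes (fits a signed imm8)
    push word 255        ; 68 FF 00 -> 3 bytes (255 does not fit a signed imm8)
    mov si, sp           ; si points at the last pushed word
    fild word [si]       ; loads 255 (pushed last, so it sits at the lowest address)
    fild word [si+2]     ; loads 40
    ; [si-2] lies below sp: free stack space, so its contents are not defined
    add sp, 4            ; drop both constants again
    ret                  ; back to DOS through the int 20h at PSP:0
```

Since the stack grows downward, the last value pushed is the one at [si]; an offset that is off by one word reads a different constant or stale stack contents.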
come on man, inc is not such an evil instruction ;) teh bug lies somewhere else in the code you did not quote. look at this quick test code. whether you uncomment the 'inc word[time]' or not, it runs at the same speed.
Code:
SECTION .code
;;;;;;;;;;;;;;;;;;;;;;;;; Init
org 100h
mov al,13h
int 10h
push 0a000h
pop es
main_loop
dec bl
xor di,di
mov al,200
loop2
mov ah,200
mov bh,bl
add bh,al
loop1
;putpixel
mov [ES:di],bh
inc di
dec ah
jnz loop1
dec al
jnz loop2
;;screen is draw
;inc word[time]
;key Esc
in al, 60h
dec al
jnz main_loop
int 20h
SECTION .data
angle dd 244
time dd 0
maybe you just messed up an offset with the [si] trick and read from [time] when you want to read something else.
did you really put the .data section before .code? then your pc will execute your data as instructions before reaching 'mov al,13h'. check the .com file.
btw why do you use sections at all? just write:
bits 16
org 0x100
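A minimal sketch of that flat layout, with no SECTION directives and the data placed after the last instruction so it can never be executed. The variable names are taken from the code quoted earlier in the thread; the loop body is a placeholder, not the real effect.

```nasm
bits 16
org 0x100

start:
    mov al, 13h          ; 320x200, 256 colours (relies on ah = 0 at entry, as the thread's code does)
    int 10h
    push 0a000h
    pop es               ; es -> VGA memory
main_loop:
    ; ... drawing code would go here ...
    inc word [time]      ; once per frame
    in al, 60h           ; read keyboard scancode
    dec al               ; Esc has scancode 1
    jnz main_loop
    int 20h              ; exit to DOS

; data goes after the code, so execution never runs into it
angle dd 244
time  dw 0
```

Assembling with `nasm -f bin demo.asm -o demo.com` (the file names are placeholders) and looking at the first bytes of the .com file is a quick way to confirm that the code, not the data, sits at offset 100h.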
You can chop a few bytes by quitting with a simple ret, and by initializing the screen address (well, almost) with lds bp,[bx]. Also try to use cx and loop aLabel where you can.
Don't know if the code you pasted above is from your wip, but for your ESC key test you could use dec ax instead of dec al and thus chop 1 more byte.
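Put together, these size tricks might look like the sketch below. It leans on several assumptions that usually hold under DOS but are not guaranteed: bx = 0 and ah = 0 at entry, ah still 0 after the mode set, the word at PSP:2 being 9FFFh, and an untouched stack at exit. The fill loop is only a stand-in for a real effect.

```nasm
bits 16
org 0x100

    mov al, 13h          ; assumes ah = 0 at entry
    int 10h
    lds bp, [bx]         ; C5 2F, 2 bytes: with bx = 0 this loads bp = 20CDh and
                         ; ds = word at PSP:2, often 9FFFh, i.e. 16 bytes below A000h
frame:
    inc dx               ; frame counter kept in a register
    xor di, di
    mov cx, 64000        ; 320 * 200 pixels
fill:
    mov [di+16], dl      ; 9FFF:0010 is the same address as A000:0000
    inc di
    loop fill            ; 2 bytes instead of dec cx / jnz (3 bytes)
    in al, 60h           ; keyboard scancode
    dec ax               ; 1 byte instead of 2 for dec al, only correct while ah = 0
    jnz frame
    ret                  ; 1 byte instead of 2 for int 20h: returns to the int 20h at PSP:0
```

One thing to watch: once lds has changed ds, memory variables such as [time] are no longer reachable through ds, so this trick fits best when the counters stay in registers.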