I have written the following function to check whether a character is a digit or not:
# IsDigit - tests whether a character is a digit or not
# arguments:
# $a0 = character byte
# return value:
# $v0 = 1 - digit
# 0 - not a digit
IsDigit:
lb $t0, ($a0) # obtain the character
li $t1, 48 # '0' - character
li $t2, 57 # '9' - character
bge $t0, $t1, condition1
condition1:
ble $t0, $t2, condition2
li $v0, 0
j return
condition2:
li $v0, 1
return:
# return
jr $ra
Is there any better way to do or write this?
Edit: the following is version 2:
IsDigit:
lb $t0, ($a0) # obtain the character
li $t1, 48 # '0' - character
li $t2, 57 # '9' - character
bge $t0, $t1, condition1
j zero
condition1:
ble $t0, $t2, condition2
zero:
li $v0, 0
j return
condition2:
li $v0, 1
j return
return:
# return
jr $ra
Edit 2: the following is version 3:
IsDigit:
lb $t0, ($a0) # obtain the character
li $t1, 48 # '0' - character
li $t2, 57 # '9' - character
bge $t0, $t1, con1_fulfilled # greater than or equal to '0'
j con1_not_fulfilled
con1_fulfilled:
ble $t0, $t2, con2_fullfilled # less than or equal to '9'
j con2_not_fulfilled
con2_fullfilled:
li $v0, 1
j return
con1_not_fulfilled:
con2_not_fulfilled:
li $v0, 0
return:
# return
jr $ra
In the general case, you use 2 branches that jump past the if() body. If either one is taken, the if body doesn't run. In assembly, you usually want to branch on the negation of the C condition, because you're jumping past the if body so it doesn't run. Your later versions do it backwards, so they also need unconditional j instructions, making your code extra complicated.
The opposite of <= (le) is > (gt). For C written to use inclusive ranges (le and ge), asm using the same numerical values should branch on the opposite conditions using exclusive ranges (that exclude the equal case). Or you can adjust your constants and use bge $t0, '9'+1 or whatever, which can be useful right at the limit of what fits into a 16-bit immediate.
# this does assemble with MARS or clang, handling pseudo-instructions
# and I think it's correct.
IsDigit:
lb $t0, ($a0) # obtain the character
blt $t0, '0', too_low # if( $t0 >= '0'
bgt $t0, '9', too_high # && $t0 <= '9')
# fall through into the if body
li $v0, 1
jr $ra # return 1
too_low:
too_high: # } else {
li $v0, 0
#end_of_else:
jr $ra # return 0
If this wasn't at the end of a function, you could j end_of_else from the end of the if body to jump over the else block. Or in this case, we could have put the li $v0, 0 ahead of the first blt, to fill the load delay slot instead of stalling the pipeline. (Of course a real MIPS also has branch-delay slots, and you can't have back-to-back branches. But bgt is a pseudo-instruction anyway, so there wouldn't really be back-to-back branches.)
Also, instead of jumping to a common jr $ra, I simply duplicated the jr $ra into the other return path. If you had more cleanup to do, you might jump to one common return path. Otherwise tail duplication is a good way to simplify the branching.
In this specific case, your conditions are related: you're doing a range-check, so you only need 1 sub and then 1 unsigned compare against the length of the range. See "What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?" for more about range-checks on ASCII characters.
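In C terms, the range check might look like the sketch below. is_digit_range is a hypothetical name; the point is that doing the subtraction in unsigned arithmetic makes anything below '0' wrap around to a very large value, so a single compare covers both ends of the range.

```c
/* Hypothetical sketch of the single-compare range check.
 * (unsigned)c - '0' is in 0..9 exactly when c is in '0'..'9'.
 * For smaller c the unsigned subtraction wraps around to a huge value,
 * which fails the < 10 test just like values above '9' do. */
int is_digit_range(int c)
{
    return ((unsigned)c - '0') < 10u;
}
```

The comparison already yields 0 or 1, so no branch is needed to build the return value.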
And since you're returning a boolean 0/1, you don't want to branch at all, but rather use sltu to turn a condition into a 0 or 1 in a register. (This is what MIPS uses instead of a FLAGS register like x86 or ARM.) Instructions like ble between two registers are pseudo-instructions for slt plus a bne or beq anyway; MIPS does have blez and bltz in hardware, as well as bne and beq between two registers.
And BTW, the comments on your IsDigit don't match the code: they say that $a0 is a character, but actually you're using $a0 as a pointer to load a character. So you're passing a char by reference for no apparent reason, or passing a string and taking the first character.
# IsDigit - tests whether a character is a digit or not
# arguments:
# $a0 = character byte (must be zero-extended, or sign-extended which is the same thing for low ASCII bytes like '0'..'9')
# return value:
# $v0 = boolean: 1 -> it is an ASCII decimal digit in [0-9]
IsDigit:
addiu $v0, $a0, -'0' # wraps to a large unsigned value if below '0'
sltiu $v0, $v0, 10 # $v0 = bool($v0 < 10U) (unsigned compare)
jr $ra
MARS's assembler refuses to assemble -'0' as an immediate; you have to write it as -48 or -0x30. clang's assembler has no problem with addiu $v0, $a0, -'0'.
If you write subiu $v0, $a0, '0', MARS constructs '0' using a braindead lui+ori, because it's very simplistic about extended pseudo-instructions that most assemblers don't support. (MIPS doesn't have a subi instruction, only addi/addiu, both of which take sign-extended immediates.)
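As a rough model of why the two-instruction IsDigit works with sign-extended immediates, here is a small C sketch. The helper names addiu_model, sltiu_model and is_digit_model are invented for this illustration, and they only model the arithmetic, not anything else about the instructions.

```c
#include <stdint.h>

/* Hypothetical models of the two instructions, for illustration only. */

/* addiu: the 16-bit immediate is sign-extended, then added with wraparound. */
uint32_t addiu_model(uint32_t rs, int16_t imm)
{
    return rs + (uint32_t)(int32_t)imm;
}

/* sltiu: the immediate is also sign-extended, but the compare is unsigned. */
uint32_t sltiu_model(uint32_t rs, int16_t imm)
{
    return rs < (uint32_t)(int32_t)imm;
}

/* The branchless IsDigit: addiu $v0, $a0, -48 ; sltiu $v0, $v0, 10 */
uint32_t is_digit_model(uint32_t a0)
{
    return sltiu_model(addiu_model(a0, -48), 10);
}
```

Adding the sign-extended -48 is the same as subtracting '0', which is why no separate subtract-immediate instruction is needed.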