Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maths doesn't promote to destination type #119

Open
oziphantom opened this issue Jul 4, 2021 · 6 comments
Open

Maths doesn't promote to destination type #119

oziphantom opened this issue Jul 4, 2021 · 6 comments

Comments

@oziphantom
Copy link

So Given

word CurrentNumMovesLeft

CurrentNumMovesLeft = 148 << 6

You then get

    LDA #0
    STA CurrentNumMovesLeft
    LDA #0
    STA CurrentNumMovesLeft + 1

Which is wrong. You have to do
; CurrentNumMovesLeft = word(148) << 6
Which then gets you

    LDA #0
    STA CurrentNumMovesLeft
    LDA #$25
    STA CurrentNumMovesLeft + 1

It should detect that the destination is a word and limit the value to the size of the destination without having everything cast. It should also perform all maths at the highest needed and then only truncate at the end.

@KarolS
Copy link
Owner

KarolS commented Jul 4, 2021

All math has the same rules: the following three pieces of code should all succeed to compile and should all have the same effect:

CurrentNumMovesLeft = 148 << 6

byte tmp
tmp = 148
CurrentNumMovesLeft = tmp << 6

byte tmp, tmp2
tmp = 148
tmp2 = tmp << 6
CurrentNumMovesLeft = tmp2

You cannot have an introduction of a variable of a correct type change the overall result. Most programming languages do something similar. For this reason, the correct value of 148 << 6 has to be 0.

However, this is a good place to introduce a compile warning.

@oziphantom
Copy link
Author

No.
Automatic Promotion is the standard. For example in JAVA

I quote from Herbert Schildt : JAVA, The Complete Reference 8th Edition.

Chapter 4 : Operators

Java’s automatic type promotions produce unexpected results when you are shifting byte and short values. As you know, byte and short values are promoted to int when an expression is evaluated. Furthermore, the result of such an expression is also an int. This means that the outcome of a left shift on a byte or short value will be an int, and the bits shifted left will not be lost until they shift past bit position 31.

From the C11 6.3.1.1 Boolean, characters and integers standard

659 Every integer type has an integer conversion rank defined as follows:
660 — No two signed integer types shall have the same rank, even if they have the same representation.
661 — The rank of a signed integer type shall be greater than the rank of any signed integer type with less precision.
662 — The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
663 — The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
664 — The rank of any standard integer type shall be greater than the rank of any extended integer type with the same width.
665 — The rank of char shall equal the rank of signed char and unsigned char.
666 — The rank of _Bool shall be less than the rank of all other standard integer types.
667 — The rank of any enumerated type shall equal the rank of the compatible integer type (see 6.7.2.2).
668 — The rank of any extended signed integer type relative to another extended signed integer type with the same precision is implementation-defined, but still subject to the other rules for determining the integer conversion rank.
669 — For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has greater rank than T3, then T1 has greater rank than T3.
670 The following may be used in an expression wherever an int or unsigned int may be used:
671 — An object or expression with an integer type whose integer conversion rank is less than or equal to the rank of int and unsigned int.
672 — A bit-field of type _Bool, int, signed int, or unsigned int.
673 If an int can represent all values of the original type, the value is converted to an int;
674 otherwise, it is converted to an unsigned int.
675 These are called the integer promotions.48)
676 All other types are unchanged by the integer promotions.
677 The integer promotions preserve value including sign.
678 As discussed earlier, whether a “plain” char is treated as signed is implementation-defined.

Of which C++, Obj-C, C# and Swift also follow suite.

64tass does int or float and promotes to either as needed but doesn't do maths internally as byte or word, you can't cast mid operation to a smaller type.

If you know of any languages that don't follow such rules, I am very interested is seeing them.

To which

CurrentNumMovesLeft = 148 << 6 since CurrentNumMovesLeft is a word and there are no typed variables present the result is 9472 and CurrentNumMovesLeft is assigned to 9472

byte tmp
tmp = 148
CurrentNumMovesLeft = tmp << 6

While the tmp is a byte and a variable the code generated should promote it to a word and perform a 16 bit operation and again the result is 9472 as calculated by the 6502/6809/Z80 or if you detect this is a constant case and optimize it away with a single immediate load.

byte tmp, tmp2
tmp = 148
tmp2 = tmp << 6
CurrentNumMovesLeft = tmp2

Since we are shifting a byte and storing it in a byte, in 6502 code tmp2 will be 0 and thus CurrentNumMovesLeft when assigned the byte variable tmp2 will also be 0, with both bytes being set to 0 as the byte is promoted to a word on assignment. If you detect this as a constant case and optimize it then the result is still 0 as we have requested that result of tmp << 6 to be stored in a byte sized container.

like wise

byte tmp
tmp = ((129 << 2) + (4<<3)) >> 4

tmp is assigned the value 131, not 3
also

byte tmp
tmp = ((7*2.5)+4)*2

tmp is assigned the value 43 not 42 as all sub parts of the expression should be promoted to a floating point number before being converted back to a byte.

@KarolS
Copy link
Owner

KarolS commented Jul 5, 2021

Those languages were designed for 16-bit or 32-bit machines, and that's why they're incapable of doing arithmetic on anything smaller. Smaller types exist in them only as a storage mechanism and an overload disambiguation hint. In each of those languages, the int type was created with the purpose of being the fastest and easiest to implement integer type, which is why all the smaller ones are cast to it.

In none of the languages you mentioned (on any common implementation at least) long x = 4 << 30; (or long long x = 4u << 30;) will give you something else than 0. There are no autopromotions there. The type on the left of the assignment operator has no influence on what happens during the evaluation of the right part. You usually don't even get a warning.

Millfork does exactly the same. It is a fully 8-bit language, that's why its "int" is 8-bit. Any arithmetic on 8-bit operands yields 8-bit results. If Millfork ever gets a 4-bit or a 1-bit integer type (quite usable, some languages like Batari Basic have those), it could promote them to 8-bits, but right now 8-bit is both the smallest type and the default type. Just like in all the other languages you mentioned, if you want your operation to be done at a larger size, you need to provide at least one of the arguments of that larger size.

Going back to my example: long x = (long)4 << 30; (or long long x = (long long)4u << 30;) will give you ~4 billion as you want. Just like what you need to do in Millfork: explicit cast to a larger type.

If byte << byte yielded a word, then either you couldn't assign it to a byte (which would be very annoying, just see any bit-twiddling code in Java), or you could, but then you could assign any word to a byte, yielding tons of bugs (like C, I mean int i = 346346346; char c = i; compiles in C without any warnings). This would also beg the question: what with other operators? Should they also promote to 16-bit? What if the optimizer doesn't catch it and you're stuck with 16-bit math you didn't need? For both performance and simplicity, Millfork does not do any of that, it's an 8-bit language for 8-bit machines, and if you only use bytes, everything is bytes. 148 is a byte. 6 is a byte. A byte shifted by a byte is a byte. Therefore 148 << 6 is a byte. 9472 is not a byte. Therefore 148 << 6 cannot be 9472.

I know that it is sometimes annoying, but it was a trade-off. Other options would have been much worse (inconsistent, slow or error-prone), with a negative impact over the entire language. I'll add warnings to catch some of such overflow cases, but the behaviour won't change, as it is working as intended.

As for 64tass, the expressions you write have no chance of being compiled into executable code, they are all preevaluated constants, so the assembler is free to use whatever.

@oziphantom
Copy link
Author

indeed long x = 4 << 30 does yield 0. really would have hoped that got fixed somewhere along the lines, but I'm sure the code that uses ??( would also break.

GCC will give you

<source>:6:15: warning: result of '(4 << 30)' requires 34 bits to represent, but 'int' only has 32 bits [-Wshift-overflow=]
    6 |    long i = 4 << 30;
      |             ~~^~~~~

while VS will give you nothing. Thanks VS.
although the case of char = int will give you a

Severity	Code	Description
Warning	C4267 '=': conversion from 'int' to 'unsigned char', possible loss of data

Don't you just love standards, there are so many to choose form 😄

But if we only have bytes then how does word temp = $4000 work? This would assign 0 as you can't hold $4000 in a byte and I've not cast it to be a word. word temp = word($4000) would yield the correct result?

word temp
temp = $4000
i = 4
temp = i

since i is a byte and we only assign a byte, temp now equals $4004 and to get it to perform the operation as intended I need to to temp = word(i) so the byte is promoted to word and thus $0004 is assigned to temp? How about temp = i + 0 as this addition between two bytes and the result is a byte?

There is a missing distinction, there is code and there is compile time constants as you noted.

word tmp
tmp = 148 <<6 

in this case 148 << 6 is not code, at no point should the compiler make

LDA #148
ASL A
ASL A
ASL A
ASL A
ASL A
ASL A

it is a constant I'm getting the compiler to generate for me, the same as when I do in 64Tass or any other compiler/assembler.

If I write tmp = 9472 or tmp = 148 << 6 the outcome should be the same, the second may just be a more convenient way for me to express the value. Like wise when positioning something on a screen with 40 chars per line I would be inclined to write it `offset = (10*40)+5 to put it at (5,10) however this will fail as all 3 are bytes.

While

byte tmp, tmp2
tmp = 148
tmp2 = tmp << 6
CurrentNumMovesLeft = tmp2

is not compile time constant as it involves variables that are not const defines. Your optimiser may resolve it to be a const value, but it is an expression that then has a byte cast in it. i.e
CurrentNumMovesLeft =byte(148 <<6)

To make it more explicit perhaps say a [form] could be introduced so one does CurrentNumMovesLeft = [148<<6] and the [] says this is compile time and as high as it needs. If we go back to the offset = (10*40)+5 case where does one put the word for example if one does offset = word(10*40)+5 well the 10*40 are still bytes and hence you get word(144) + 5 = word(149)? or do you do offset = (word(10)*40)+5 so you get word(405)? Do you need to do offset = (word(10)*word(40))+5?
vs having the all constants are evaluated at the highest they need, i.e invoke 64 bit maths internally, and then the resulting operation will be determined by the destination size, will generate working code in the vast majority of the time as the user expects. If said constant wont fit in target size i.e >255 but assigned to a byte, >65536 but assigned to a word, then throw a warning telling the user "this is probably what you mean, please cast if it is". Given Millfork is being used by people who don't know how to code ASM, and they can't read the asm output like I can every time I write some code to double check that it has done what I need. If there is an optimisation within said range, for example I know that a byte addition in part of the equation will not overflow into a word, then I can manually cast that portion into a byte for the code generator to optimise and then if it blows up, be it upon my head.
Under this case

word screenOffset = (10*40)+5 //works 
void func(byte x, byte y) {
    word screenOffset = (x*40) + y
}
//this will resolve to a 
//__mul_u8u8_u16 then a word_add_byte to return word 
// and thus works
word getNESScreenOffset(byte x, byte y) {
    word temp
    temp = (y << 5) +x
} 
//this will do a 16 bit shift of y 5 times into temp and word_add_byte
// and thus works
word getSomeFunc(byte x, byte y, byte z) {
    word temp
    temp = (x + z ) << 2 + (y <<5 )
    //this is a waste but doing it all at 16 bits "works" I can then do
    temp = byte(x + z << 2) + y << 5
    //this now does byte_add_byte and then a 8bit << twice then promotes it to word 
   // does a word_add_byte and then 16bit shift 5 times
}

byte someFunc(byte x, byte y) {
    return (x*y)-24
}
//this is a byte * byte + byte to return a byte and hence does what it does now as defined.

And all of those(except the last one) will fail at the moment. And if you add a warning saying "word screenOffset = (x*40) + y" can't fit result for x > 6 it will either be noise as I never want x to be over 6, or some new person staring at the code going "but why? its a word, how do I make it work?"
making the prototype void func(word x, byte y) { fixes it cleanly but wastes overhead on passing variables. With the real answer being word screenOffset = (word(x)*40) + y but then does that make a u8u8_u16 or a u16u8_u16?

@retrac0
Copy link
Contributor

retrac0 commented Jul 6, 2021

If you know of any languages that don't follow such rules, I am very interested is seeing them.

Assembly? 😄 I mention it because Millfork is targeted somewhere between Assembly and C. Yes, it is a rather unusual decision, but like with avoiding recursion (which is also unusual in most modern languages) it was chosen to help fit the language to the target platform. It's intentionally lower-level than C, and C was already barely a high level language.

Integer promotion is costly. Almost any operation that compares a byte to a 16-bit value ends up promoted in C, for example. A simple byte comparison that would be a CMP b / BNE sequence for 6502 becomes a subroutine to promote the byte to 16 bits and then do a 16 bit comparison. This is one of the main reasons that compliant C compliers for 8-bit machines are so bad at generating code.

@oziphantom
Copy link
Author

So if you compare a byte to a word you want it to do this?

LDA #5
STA $2
LDA #1 
STA $3
LDA #7
STA $4

LDA $4
CMP $2
BCS HIGHER

because that is wrong and is going bug quite a lot. Any byte being compared to word has to convert the byte to a word which it can do by checking the high byte and if it is not zero then it fails. If zero then it compares the low byte.

I figured this might actually need to be tested. So

        word temp
        byte j
        temp = $4000
	j = 5
	if j > temp {
		asm {
			lda #1
			sta $d020
		}
	}

yields

; 
;line:203:HITMill65.mfk
;       temp = $4000
    LDA #0
    STA main$temp
    LDA #$40
    STA main$temp + 1
; 
;line:204:HITMill65.mfk
;       j = 5
    LDA #5
    STA main$j
; 
;line:205:HITMill65.mfk
;       if j > temp {
    LDA #0

so the compiler has worked out the constant values are always false, and it has auto promoted j to a word to do the comparison.

        word temp
        byte j
	temp = $4000
	for j,0,to,5 {
		if j > temp {
			asm {
				lda #1
				sta $d020
			}
		}
	}

yields

;line:203:HITMill65.mfk
;       temp = $4000
    LDA #0
    STA main$temp
    LDA #$40
    STA main$temp + 1
; 
;line:205:HITMill65.mfk
;       for j,0,to,5 {
    LDA #0
    STA main$j
.wh__00188:
; 
;line:206:HITMill65.mfk
;           if j > temp {
    LDA main$temp + 1
    CMP #0
    BCC .cp__00193
    BNE .fi__00192
    LDA main$temp
    CMP main$j
    BCS .fi__00192
.cp__00193:
; 
;line:207:HITMill65.mfk
;               asm {
    LDA #1
    STA $D020
; 
;line:206:HITMill65.mfk
;           if j > temp {
.fi__00192:
; 
;line:205:HITMill65.mfk
;       for j,0,to,5 {
.fp__00190:
    LDA main$j
    CMP #5
    BNE .el__00194
    JMP .ew__00191
    JMP .fi__00195
.el__00194:
    INC main$j
.fi__00195:
    JMP .wh__00188
.ew__00191:

again it does a cmp #0 check on the upper byte, so the byte has been auto promoted to a word for the comparison.

KarolS added a commit that referenced this issue Aug 12, 2021
– Detection of simple byte overflow cases.
– Optimization of 8×8→16 multiplication on 6809.
– Multiplication optimizations on Z80.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants