N64 MIPS assembly introductory tutorial (WIP)
NOTE: This tutorial is far from finished, and as such many explanations are incomplete and potentially misleading to some extent. It will be updated over time, but for now it does at least cover some of the fundamental concepts of MIPS assembly.

You want to learn how to "do ASM" for N64 games? Even though this task seems to be considered nearly impossible by some, it actually isn't very hard once you get the hang of it. Beware though; to actually get the hang of it, a decent amount of discipline and patience will be required.
 
Let's start with the very basics: what is "ASM"? ASM stands for “assembly”, which is a type of programming language. Just like with "normal" programming languages (such as C, Java or C#), there are many different assembly languages. For the N64, the assembly language to code in is called "MIPS r4300i" - we'll get back to this soon. First though: what is the difference between assembly programming languages and "normal" ones?
 
Well, when you code in ASM, you code directly for the processor(s) of the machine in question. This means that you can only use the instructions that the processor itself provides. Assembly is basically the machine code of a processor (which is actually just patterns of binary data; lots of 0s and 1s, in other words), translated into a form readable by humans. When you code in a language like C, however, you often don't have to care much about how the processor you're coding for works; that is handled by the compiler.
 
What this means (in simpler words) is basically that when you code in assembly, you’re “speaking” directly to the machine with your code. When you code in high-level languages like C# or Java, on the other hand, a compiler has to “translate” your code into the processor's own instructions for the machine to understand it.
 
The CPU (Central Processing Unit; the ”core” processor of a machine) of the N64 is the “MIPS r4300i”. Therefore, the specific assembly language to code in for N64 software is also called “MIPS r4300i” (it’s often just referred to as “MIPS” in the context of the N64 though). So, to “do ASM” for any N64 game, you’ll first need to learn how to code in the MIPS r4300i assembly language. This is exactly what this tutorial aims to introduce you to, even if you have no prior experience of programming.
 
It is often said that the first program you should write in any language is a “Hello World” program, which simply prints the words “Hello World” on the screen. Therefore, we’ll explain how that could be done for the N64 in this tutorial. We won’t hack some game to print these words though – instead we’ll use an application of our own. This is because hacking games often requires more than just understanding the MIPS language. You have to have some basic knowledge of the game’s inner workings and structure too, in most cases.
 
Writing a “Hello World” program is very easy to do in most high-level programming languages. Here’s an example of how it usually is written in C:
 
Code:
#include <stdio.h>

int main()
{
    printf("Hello World");
    return 0;
}

As this isn’t a tutorial about learning C programming, you won’t have to worry about most of this. Just know that the program starts reading code from the “int main()” segment, which is a function. 

A function is a piece of code that often has a very specific task; like printing text on the screen or calculating certain values for example. Functions can be executed from anywhere in a program's code, even if they aren’t "close" to the code that "calls" them (when using a function, you say that you "call" it). The “printf” function that is called in the above C code for example, is a function that outputs the text string it has been supplied with.

What you see in the parentheses following the function call (the text string "Hello World") is what is called an "argument" of the function. When calling some functions, you can provide them with arguments. The values of the arguments aren’t pre-determined in the function itself, so they can vary based on the circumstances of the call. In C’s “printf” function, for example, the text string to print is an argument (since you want to be able to print different text strings; if this weren't a variable, the "printf" function would only be able to print one constant string of text).
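To make the idea of an argument a bit more concrete, here is a small sketch in C. The function name "print_line" is made up for this example (it is not part of C itself); it simply passes its argument on to printf.

```c
#include <stdio.h>

/* Hypothetical helper: prints whatever text it is given, followed by a
   line break, and returns the number of characters that were printed. */
int print_line(const char *text)
{
    return printf("%s\n", text);
}
```

Calling print_line("Hello World") and print_line("Goodbye") runs the exact same code both times; only the argument differs, and so does what ends up on the screen.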
 
For the N64 however, it won’t be as easy, as we basically will be designing our application from scratch. In the above C example, “printf” (the function which is called to print the words) is an existing function from C's standard library, which is made available to the program by including the file “stdio.h”. The programmer doesn’t need to write this piece of code themselves, or even understand how it works - they only need to know with what argument(s) to call the function.
 
When coding in MIPS for the N64, we won’t have the privilege of a pre-written piece of code that will print text on the screen for us – we’ll have to make one ourselves. Since this task would be quite complex for beginners however, I have already written a function that works very similarly to printf, but for the N64 instead. And again, you won't be doing anything with this code yourself yet, aside from looking at it and trying to learn from it.

Like printf in C, this function needs to be called with arguments. I designed it to utilize four arguments, but only one is needed to actually print the words (which is the only result we want for now). We’ll discuss the rest of them later. Here is a short overview of the function (what we need to know for now, anyway):
 
Code:
N64 print function:
Argument 1 (A0) = RAM address of text line to print
 
This might seem confusing to you right now; what does “A0” mean?  What is a “RAM address”? Let’s explain it all.
 
“A0” refers to the “register” in which the argument will be stored. A register in MIPS is basically a variable; a little piece of memory that can store any value that fits in it. There are 32 different general-purpose registers in MIPS, and some of them have specific purposes. There are four registers which are specifically intended to store arguments before function calls: “A0”, “A1”, “A2” and “A3”. We’ll discuss the concept of registers further later, but for now this should roughly be what you need to know about them.
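If it helps, you can picture the registers as a small array of 32 numbered slots. The following is only a rough model in C, not how the processor is actually built; the type and function names are made up for this sketch. The slot numbers for A0 to A3 (4 to 7) are the ones MIPS really uses.

```c
#include <stdint.h>

/* Slot numbers of the four argument registers in the MIPS register set. */
enum { REG_A0 = 4, REG_A1 = 5, REG_A2 = 6, REG_A3 = 7, REG_COUNT = 32 };

/* Hypothetical model: the 32 general-purpose registers as a plain array.
   Only the lower 32 bits of each register are modeled here. */
typedef struct {
    uint32_t r[REG_COUNT];
} RegisterFile;

/* Store a value in one register. */
void write_register(RegisterFile *regs, int index, uint32_t value)
{
    regs->r[index] = value;
}

/* Read the value currently held by one register. */
uint32_t read_register(const RegisterFile *regs, int index)
{
    return regs->r[index];
}
```

Before calling a function, the caller would put the first argument in slot REG_A0, the second in REG_A1, and so on.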
 
It is very important to understand what “RAM” is in order to code in almost any assembly language. “RAM” stands for “Random Access Memory”; you might also have heard of “ROM”, which stands for “Read Only Memory”. This might not tell you much right now, but we’ll explore it further.
 
Read Only Memory (ROM) is precisely what its name implies; it is memory which only is read (loaded). In games for the N64 for example, “constant” game data like code, models and textures are always stored in ROM, which in turn is what is stored on the game cartridge. There are other types of game data though; consider Mario’s health in Super Mario 64 for example. This isn’t a constant value, since you can lose and gain health. In other words, it’s a variable (a value which can vary), which cannot be stored in ROM as data stored in ROM can only be read (once again, precisely what its name implies), not written/altered.
 
But where are variables such as Mario’s health stored then? This is where RAM comes in. In RAM, data can be actively and dynamically altered while the application is running. The N64 actually only uses data in RAM to run the game, even if it uses ROM data to set up everything in RAM. This means that all the “constant data” located in ROM (like code, models and textures) can also be found in RAM with N64 games.
 
Aside from this unaltered data directly transferred from ROM, data types like variables are also stored in RAM as they can actively be altered while the game is running there, unlike in ROM. In most games, these include variables such as player health (like we discussed using the example of Mario’s health in Super Mario 64), player position and countless more. 
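The difference between ROM and RAM can be sketched in C like this. Everything here is made up for illustration: the four data bytes, the idea that byte 0 holds the player's health, and the function names. A real game's layout is far more complex.

```c
#include <stdint.h>
#include <string.h>

/* "ROM": constant data that the program can read but never change. */
static const uint8_t rom_data[4] = { 0x08, 0x01, 0x02, 0x03 };

/* "RAM": working memory that can be read and written while running. */
static uint8_t ram[4];

/* Copy the constant data from "ROM" into "RAM", like a game does when
   it sets everything up. */
void load_rom_into_ram(void)
{
    memcpy(ram, rom_data, sizeof rom_data);
}

/* Hypothetical: treat byte 0 of RAM as the player's health and lower it,
   stopping at zero. */
void take_damage(uint8_t amount)
{
    ram[0] = (ram[0] > amount) ? (uint8_t)(ram[0] - amount) : 0;
}
```

After load_rom_into_ram() both copies hold the same bytes. After take_damage(3) the copy in RAM holds 5 in byte 0, while the original in ROM still holds 8.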
 
A “RAM address” refers to a location in RAM. You could visualize RAM like
 
All this means that we can print the words “Hello World” in our own application, by using this relatively simple piece of code:
 
Code:
.ORG xxxx
LUI A0, 0x8000
JAL PRINT
NOP
 
Of course, this piece of code alone won’t print anything on the screen; you need to have the function as well.
 
Now, let’s go over what the code actually does and how it works.
 
Code:
LUI A0, 0x8000
 
You can say that there are three parts to this instruction. The text “LUI” is the “opcode” (operation code), which determines what the instruction will do. It stands for “Load Upper Immediate”, which we’ll get to later. You’ll find that opcode names in MIPS are generally abbreviations of what they do.
 
The next “part” is “A0”. This is a register, as we discussed before. Just to make sure that we’re all on the same page, though: A register is a little piece of memory inside the processor itself (not in RAM) that can store any value that fits in it, a variable in other words. Their values are assigned by opcodes, which also use them in their operations. There are 32 different general-purpose registers in MIPS, and some of them have specific purposes. Register A0 is intended to be used for storing argument values before calling a function, for instance.
 
And now finally for the last component of this instruction: “0x8000”. This is a so-called immediate value. An immediate value is basically a value that is written directly into the instruction itself, instead of being read from a register or from RAM, so it’s only used in the instruction that it’s found in. Immediate values are mainly used to set the values of registers or manipulate them in other ways, such as decreasing or increasing their values. One important thing to remember about them is that immediate values can only be a maximum of 16 bits (2 bytes) long.
 
And by the way, the prefix “0x” before the “8000” simply means that the value after it is stated in hexadecimal. If you’re not familiar with this, I’d advise a quick Google search, as I’m fairly certain that you’ll be able to find far better explanations than I could ever provide you with.
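As a tiny illustration, C uses the same “0x” prefix, so you can see there that a hexadecimal number is just another way of writing an ordinary number. The function name below is made up for this sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes a 16-bit value into "out" the way this tutorial writes
   immediate values, e.g. "0x8000". */
void format_hex16(uint16_t value, char *out, size_t out_size)
{
    snprintf(out, out_size, "0x%04X", (unsigned)value);
}
```

Passing the decimal number 32768 to format_hex16 produces the text "0x8000", because 0x8000 and 32768 are the same value. In the same way, 0xFFFF is 65535.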
 
So, what does this all mean then? What does the instruction “LUI A0, 0x8000” do in the end? Well, it loads the (immediate) value of 0x8000 into the upper bits of register A0. This might be hard for you to understand, but I’ll try to help you visualize it.
 
A MIPS register basically consists of 32 bits, or 4 bytes if you will (actually, they have a capacity of 64 bits, but you won’t need to concern yourself with that unless you need to store a REALLY big value). We could visualize an empty register set to 0 like this:

Register A0:
Byte 4:               Byte 3:               Byte 2:               Byte 1:
00                      00                      00                     00

As we know, an immediate value is at most 2 bytes long. This means that if we want to set A0 to a value that needs more than 2 bytes, the opcodes usually used to assign small values can't do it on their own. Here’s an example:
 
Code:
ORI A0, ZERO, 0xFFFF
 
This instruction combines the immediate value 0xFFFF (the maximum value which can be stored in 2 bytes) with the register “ZERO” (a register that always is set to zero) using a bitwise OR, and then stores the result in the register A0. Since OR-ing with zero changes nothing, A0 simply ends up holding 0xFFFF. ORI stands for “OR Immediate”. You might expect ADDI (“ADD Immediate”) to work just as well here, but ADDI treats its immediate as a signed value, so 0xFFFF would be read as -1 and the upper bytes of A0 would be filled with FF as well. We’ll discuss both opcodes further later.
 
After executing this instruction, the register’s contents will look like this:
 
Register A0:
Byte 4:               Byte 3:               Byte 2:               Byte 1:
00                      00                      FF                      FF
 
So, even when we stored the largest immediate value possible into the register (0xFFFF), it won’t affect “byte 3” and definitely not “4”. To assign an immediate value to these bytes, we’ll need to use another opcode – LUI.
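One detail is worth a sketch here: a 16-bit immediate has to be widened to the size of the register before it is used, and MIPS does this in two different ways depending on the opcode. Logical opcodes such as ORI fill the upper bytes with zeros, while arithmetic opcodes such as ADDI copy the top bit of the immediate into the upper bytes. The C function names below are made up for this illustration, and only the lower 32 bits are modeled.

```c
#include <stdint.h>

/* Zero-extension (used by ORI, ANDI, XORI): the upper 16 bits of the
   result are always zero. */
uint32_t zero_extend16(uint16_t imm)
{
    return (uint32_t)imm;
}

/* Sign-extension (used by ADDI, ADDIU): the upper 16 bits of the result
   are copies of bit 15 of the immediate. */
uint32_t sign_extend16(uint16_t imm)
{
    return (imm & 0x8000u) ? (0xFFFF0000u | imm) : (uint32_t)imm;
}
```

With the immediate 0xFFFF, zero_extend16 gives 0x0000FFFF (bytes 00 00 FF FF), while sign_extend16 gives 0xFFFFFFFF. For immediates up to 0x7FFF both give the same result.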
 
Just like its name (Load Upper Immediate) suggests, this instruction loads an immediate value into the upper “contents” of a register – its upper bits in other words. This means that after executing an instruction such as this one:
 
Code:
LUI A0, 0xFFFF
 
The register’s value will look like this (LUI always sets the two lower bytes to 0, no matter what the register contained before the instruction was performed):

Register A0:
Byte 4:               Byte 3:               Byte 2:               Byte 1:
FF                      FF                      00                      00
 
So after the first instruction in our source code (LUI A0, 0x8000) is executed, register A0 will look like this:
 
Register A0:
Byte 4:               Byte 3:               Byte 2:               Byte 1:
80                      00                      00                      00

So in summary, the result of executing the instruction “LUI A0, 0x8000”, is that the value of register “A0” is set to 0x80000000.
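If you'd like to double-check the byte diagrams above, what LUI does can be modeled in C with a single shift. This is only a sketch of the lower 32 bits of the register, and the function names are made up for this example.

```c
#include <stdint.h>

/* What LUI computes: the immediate is moved into the upper two bytes
   and the lower two bytes become zero. */
uint32_t lui(uint16_t imm)
{
    return (uint32_t)imm << 16;
}

/* Picks one byte out of a register value, numbered like the diagrams:
   byte 1 is the lowest byte, byte 4 the highest. */
uint8_t register_byte(uint32_t value, int byte_number)
{
    return (uint8_t)(value >> (8 * (byte_number - 1)));
}
```

Here lui(0x8000) gives 0x80000000, and register_byte(lui(0x8000), 4) gives 0x80 while bytes 3, 2 and 1 are all 0x00, just like in the last diagram.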
 
Now, let’s move onto the next instruction. This one will be a lot simpler to explain, and hopefully also to understand:
 
Code:
JAL PRINT
 
You could say that this instruction consists of two components: “JAL” and “PRINT”.
 
“JAL” is the opcode; it’s an abbreviation of “Jump And Link”. This instruction is used to call functions in MIPS. The instruction “jumps” to the RAM address given to it by the following value, and begins executing the code found at that address. The “link” part means that it also saves the address to return to in the register “RA” (“Return Address”), so that execution can continue after the call once the function is done. In this case, the address to jump to would be "PRINT"... But wait, isn't "PRINT" a string of text? It's not a value that classifies as a valid RAM address, right?

Well, it's what's called a "label". Labels are very useful for assembly programmers; they allow you to reference a RAM address without having to type out the exact value of it.
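You can think of labels as entries in a lookup table that the assembler keeps while it goes through your code: each name is paired with an address, and wherever the name is used, the address is filled in. Here is a rough sketch of that idea in C. The address given to PRINT is made up for this example (the real one depends on where the function ends up in RAM), and the names of the type and function are hypothetical too.

```c
#include <stdint.h>
#include <string.h>

/* One label: a name and the RAM address it stands for. */
typedef struct {
    const char *name;
    uint32_t    address;
} Label;

/* Hypothetical label table; 0x80001000 is an invented address. */
static const Label labels[] = {
    { "PRINT", 0x80001000u },
};

/* Returns the address that belongs to a label name, or 0 if no label
   with that name exists. */
uint32_t resolve_label(const char *name)
{
    for (size_t i = 0; i < sizeof labels / sizeof labels[0]; i++) {
        if (strcmp(labels[i].name, name) == 0)
            return labels[i].address;
    }
    return 0;
}
```

With this table, "JAL PRINT" would be treated as a jump to the address returned by resolve_label("PRINT").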
  

