It looks like you survived that huge barrage of "redefining" lessons. If you feel like your brain has been fried, I suggest that you take a well-deserved break before starting on this lesson.

About TSC

TSC is *the* scripting language of Cave Story. It's incredibly useful, and fairly straightforward to learn. Right now, we're going to learn how to create new TSC commands, so you can very easily extend any assembly functionality to TSC.

The Cave Story program is able to read data from TSC files and perform certain actions based on what commands those files contain. Pixel knew that writing dialogue and cutscenes directly into the program was not a great idea. So, being the game-creator that he is, Pixel designed his own scripting language for Cave Story to make things easier and more manageable.

We're going to take a look at the TSC parser. What is a parser? Well, I'm glad you asked. The definition of the verb "to parse" is to break down a string of text characters into small parts and then analyze what those parts mean. Essentially, the TSC parser checks a TSC command to see what it is, then it performs the command, then it moves onto the next command.

Also, you should know what ASCII is. ASCII is the "American Standard Code for Information Interchange". ASCII is a system of numbering in which each text character, such as A or B or @, is given a certain hex number, so that computers are able to display text information. Every modern computer in the world, from Canada to Japan to the USA, should be able to understand ASCII.[1]

ASCII Table

Before we begin, you need the ASCII table. An ASCII table tells you which hex code belongs to which text character. For your convenience, I've compiled such a table so that you can easily view it inside this guide: Click Here to View the ASCII Table

The TSC Parser

If you look at the ever-so-useful Assembly Compendium, you'll find that the TSC parser starts at address 422510.
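Because TSC commands are recognized by their ASCII hex codes, it can be handy to print the codes for a command's characters instead of looking each one up in the table. Here is a small Python sketch for that. Python is only being used as a calculator here (it is not part of Cave Story or of this guide's tools), and `hex_codes` is a helper name made up for this example:

```python
def hex_codes(text: str) -> list[str]:
    """Return the two-digit uppercase ASCII hex code of each character."""
    return [format(ord(ch), "02X") for ch in text]


print(hex_codes("<END"))  # ['3C', '45', '4E', '44']
print(hex_codes("<SOU"))  # ['3C', '53', '4F', '55']
```

These are the same values you would find by reading the ASCII table by hand.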
It's important to understand that the parser is CMPing ASCII hex codes to check for a TSC command. Remember to look at the ASCII table I gave you! At the very beginning of the parser, the parser CMPs a single hex code: 3C. Looking at the table, you'll see that 3C is the hex code for <, which is the beginning of every TSC command. If the character isn't <, then the code jumps way down near the end of the parser.

The first command the parser checks is the <END command. The parser CMPs 45, 4E, and then 44, which are the ASCII hex codes for E, N, and D, respectively. If all those letters match up, then the parser performs the <END command. If one or more of those letters don't match up, it jumps to address 422666. Notice that 422666 is the beginning of the next command: the <LI+ command. After checking for <LI+, the parser continues to check for commands until it finds the right one. If the command isn't found (i.e. you made a typo during TSC coding), then the game will display an error message.

Script Positions

    MOV ECX,DWORD [4A5AD8]
    ADD ECX,DWORD [4A5AE0]

These two instructions are very important. [4A5AD8] holds the pointer to where the current script is loaded. [4A5AE0] holds the script position, which is the offset of the current character within that script file. When you add [4A5AE0] to [4A5AD8], you get the pointer to the current character, or the character that is being read by the game. I used ECX as an example, but you could just as easily substitute EDX or EAX.

Let's say that the game is running and you've just opened the door to Arthur's house. After determining that you've got Arthur's Key, the game will jump to the event #0101 and set the script position to the beginning of that event. I've highlighted the < character using a red box because that's the character being read by the game. If you perform the instructions MOV ECX,DWORD [4A5AD8] and then ADD ECX,DWORD [4A5AE0], then ECX will act as a pointer to that character.
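The two instructions above can be modelled outside the game to see what they compute. The following Python sketch is only an illustration under stated assumptions: `script` and `position` are hypothetical stand-ins for the values at [4A5AD8] and [4A5AE0], the script contents are made up, and `current_char` and `matches` are names invented here rather than anything in the game.

```python
# Hypothetical stand-ins for the two game variables:
# [4A5AD8] = pointer to the loaded script, [4A5AE0] = script position.
script = b"#0101\r\n<SOU0011<END"   # made-up script contents for illustration
position = script.index(b"<")        # pretend the event jump left us on the '<'


def current_char(script: bytes, position: int, offset: int = 0) -> int:
    """Model of BYTE [ECX+offset] after the MOV/ADD pair: base plus position."""
    return script[position + offset]


def matches(script: bytes, position: int, name: str) -> bool:
    """Check for '<' and then compare the next three letters with `name`."""
    if current_char(script, position) != 0x3C:   # 3C is the hex code for '<'
        return False
    return all(current_char(script, position, i + 1) == ord(ch)
               for i, ch in enumerate(name))


print(hex(current_char(script, position)))   # 0x3c
print(matches(script, position, "END"))      # False
print(matches(script, position, "SOU"))      # True
```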
Accessing BYTE [ECX] will give you the hex code 3C, which is the hex code of the < character. BYTE [ECX+1] contains the hex code for the S character, BYTE [ECX+2] contains the hex code for O, and BYTE [ECX+3] contains the hex code for U. The parser will first check to see if the command is <END, which isn't the right one, so it'll move on to <LI+, which also isn't the right one. It will keep checking until it finally reaches <SOU.

ASCII to Number Function

Ever wondered what this function does?

ASCII to Number Function:

    PUSH (Script Position)
    CALL 00421900           ;returns number into EAX
    ADD ESP,4

The above function takes the Script Position you PUSH, reads the 4-digit decimal number (inside the TSC file) starting at that Script Position, and returns it in the register EAX. So let's see an example of that. Here is the <SOU command:

Address   Command                        Comments
00424266  MOV EAX,DWORD PTR DS:[4A5AD8]  ;Where the script is loaded
0042426B  ADD EAX,DWORD PTR DS:[4A5AE0]  ;add Script Position
00424271  MOVSX ECX,BYTE PTR DS:[EAX+1]  ;get the character directly after the current character
00424275  CMP ECX,53                     ;check for S
00424278  JNE SHORT 004242DA             ;if not S, then jump to the next command, which is <CMU
0042427A  MOV EDX,DWORD PTR DS:[4A5AD8]  ;Where the script is loaded
00424280  ADD EDX,DWORD PTR DS:[4A5AE0]  ;add Script Position
00424286  MOVSX EAX,BYTE PTR DS:[EDX+2]  ;get the character that's 2 characters after the current character
0042428A  CMP EAX,4F                     ;check for O
0042428D  JNE SHORT 004242DA             ;if not O, then jump to the next command, <CMU
0042428F  MOV ECX,DWORD PTR DS:[4A5AD8]  ;Where the script is loaded
00424295  ADD ECX,DWORD PTR DS:[4A5AE0]  ;add Script Position
0042429B  MOVSX EDX,BYTE PTR DS:[ECX+3]  ;get the character that's 3 characters after the current character
0042429F  CMP EDX,55                     ;check for U
004242A2  JNE SHORT 004242DA             ;if not U, then jump to the next command, <CMU
004242A4  MOV EAX,DWORD PTR DS:[4A5AE0]  ;get Script Position
004242A9  ADD EAX,4                      ;Add 4 to Script Position
004242AC  PUSH EAX                       ;PUSH Script Position
004242AD  CALL 00421900                  ;CALL ASCII to Number Function
004242B2  ADD ESP,4                      ;Fix the stack
004242B5  MOV DWORD PTR SS:[EBP-24],EAX  ;take EAX (which holds the 4-digit TSC number) and store it into variable [EBP-24]
004242B8  PUSH 1                         ;PUSH 1 (this is the Channel #)
004242BA  MOV ECX,DWORD PTR SS:[EBP-24]  ;take variable [EBP-24] and store it into ECX
004242BD  PUSH ECX                       ;PUSH ECX (ECX is still holding that same 4-digit TSC number)
004242BE  CALL 00420640                  ;CALL Play Sound Function
004242C3  ADD ESP,8                      ;Fix the stack
004242C6  MOV EDX,DWORD PTR DS:[4A5AE0]  ;Get Script Position
004242CC  ADD EDX,8                      ;Add 8 to Script Position
004242CF  MOV DWORD PTR DS:[4A5AE0],EDX  ;Store Back to Script Position
004242D5  JMP 004252A7                   ;Jump Back to Beginning of Parser

Hey look, it's the MOVSX instruction. See, those past lessons weren't useless after all. But why are we using a byte? This is because each ASCII character is worth 1 byte. Technically, each ASCII character is written as 1 pair of hex digits, and each pair of hex digits is worth 1 byte. In order to check each letter of a TSC command 1 letter at a time, we have to MOV (or MOVSX) the data 1 byte at a time.

Here's a flowchart that explains what happens when the <SOU command is executed:

[Flowchart: execution of the <SOU command]

Now here's an even bigger chart that explains <SOU in even more detail:

[Chart: the <SOU command in more detail]

Summary

1. The first thing the parser does is check for characters. This is the basic routine for checking three characters of any TSC command:

    MOV ECX,DWORD [4A5AD8]               ;Notice that I've rewritten this code to make
    ADD ECX,DWORD [4A5AE0]               ;it shorter.
    CMP BYTE [ECX+1],(first character)
    JNE (address of next TSC command)
    CMP BYTE [ECX+2],(2nd character)
    JNE (address of next TSC command)
    CMP BYTE [ECX+3],(3rd character)
    JNE (address of next TSC command)

(Yes, you do need to use BYTE, not DWORD, for checking the letters of a TSC command.
Otherwise, the above code will not work and you will get an error message.)

2. After that, you can get the 4-digit number used after the TSC command using the ASCII to Number Function. Make sure you PUSH the right Script Position (for <SOU, that was the current Script Position plus 4, so that it points at the first digit), then just use the function to grab that 4-digit TSC number.

3. You make the command do whatever action you want it to perform.

4. Next, you add the number of characters your TSC command takes up directly to [4A5AE0], also known as the Script Position. (For <SOU, that was 8: four characters for <SOU plus the 4-digit number.) This allows the parser to move onto the next command in the TSC file.

5. Finally, make sure you do a JMP 4252A7 at the end of your command, which will lead the code into a series of other jumps that eventually gets you back to the beginning of the parser.

Jumping to Other TSC Events

The following function is what the <EVE command uses to jump to another event while the script is running:

Jump to TSC Event Function:

    PUSH (event #)      ;remember all numbers must be in hex
    CALL 00421AF0       ;Call this function to Jump to a TSC Event while the script is running.
    ADD ESP,4
    JMP 004252A7        ;this part goes back to the beginning of the parser.

If you use this Jump to TSC Event function, you do not need to do step 4 (see the above Summary section). This is because the Script Position is automatically set to the beginning of whatever event you jumped to. So, there is no need to add anything extra to the Script Position.

[1] Of course, nowadays computers use Unicode for all sorts of multilingual text support. For Cave Story hacking though, understanding ASCII is enough.