Assembly Language

Introduction

Assembly language is a low-level programming language for processors, such as those found in computers and microcontrollers. In short, assembly language is a human-readable representation of the machine code used to program a CPU: each mnemonic stands for an opcode (operation code, a number). There are many different CPU architectures, so the assembly language differs depending on the manufacturer and/or target machine.
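To make the mnemonic-to-opcode idea concrete, here is a minimal sketch in C that maps a handful of one-byte x86 instructions to their opcode values. The table and the lookup_opcode function are illustrative names of my own, not part of any assembler; a real assembler handles far more than a fixed table (operands, prefixes, multi-byte encodings).

```c
#include <stddef.h>
#include <string.h>

/* A mnemonic paired with the single opcode byte it stands for. */
struct opcode_entry {
    const char *mnemonic;
    unsigned char opcode;
};

/* A few x86 instructions that encode as exactly one byte (illustrative subset). */
static const struct opcode_entry ONE_BYTE_OPCODES[] = {
    { "nop",  0x90 },  /* no operation */
    { "ret",  0xC3 },  /* near return */
    { "int3", 0xCC },  /* breakpoint trap */
    { "hlt",  0xF4 },  /* halt the processor */
};

/* Return the opcode for a mnemonic, or -1 if it is not in the table. */
int lookup_opcode(const char *mnemonic)
{
    size_t count = sizeof ONE_BYTE_OPCODES / sizeof ONE_BYTE_OPCODES[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(ONE_BYTE_OPCODES[i].mnemonic, mnemonic) == 0) {
            return ONE_BYTE_OPCODES[i].opcode;
        }
    }
    return -1;
}
```

Calling lookup_opcode("nop") yields 0x90, which is the byte the CPU actually sees; the mnemonic exists only for the programmer's benefit.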

One of the wonderful things about assembly language is that it can be used to directly manipulate processor registers, memory locations and all the low-level functionality of the machine. With high-level languages such as Visual Basic, C, C#, Java, etc. this can often be accomplished too, but not as directly as in assembly language. Not to get off track, but assembly language can also be embedded or injected into these high-level languages with some extra work; yes, even VB6 can use assembly language, as you will see in another article.

 

Where can I get an assembler?

There are many different assemblers out there, including one home-brewed here at Xeno Innovations called Zero Assembler (zASM). zASM is a precursor to the Zero programming language, one of our ongoing in-house creations, which is similar to C++ (for comparison purposes). One of my personal favorites is Flat Assembler (FASM), created by Tomasz Grysztar in 1999. FASM comes in a few different flavors: DOS, Linux, Windows and Unix/libc.

One item to note: x86 assembly language has two main syntax branches, Intel syntax and AT&T syntax. If you are new to assembly language you may notice that some code examples look more foreign than others. With some slight modifications you can convert code from one syntax to the other fairly easily.
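As a rough illustration of how mechanical such a conversion can be, the sketch below rewrites a simple two-operand Intel-syntax instruction into AT&T syntax by swapping the operand order and adding the "$" and "%" prefixes. The function name intel_to_att is hypothetical, and the sketch assumes the destination is a register and the source is either a register or a decimal immediate; memory operands, size suffixes and labels are not handled.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Convert "mnemonic dst, src" (Intel order) into "mnemonic src, dst" (AT&T order).
 * Assumptions: dst is a register; src is a register or a decimal immediate.
 * Returns 0 on success, -1 if the input does not parse or the buffer is too small.
 */
int intel_to_att(const char *intel, char *out, size_t out_size)
{
    char mnemonic[16], dst[32], src[32];

    /* Expect exactly three fields: mnemonic, destination, comma, source. */
    if (sscanf(intel, " %15s %31[^, \t] , %31s", mnemonic, dst, src) != 3) {
        return -1;
    }

    /* AT&T marks immediates with '$' and registers with '%'. */
    int is_immediate = isdigit((unsigned char)src[0]) || src[0] == '-';
    const char *src_prefix = is_immediate ? "$" : "%";

    /* Emit the source first, then the destination register. */
    int written = snprintf(out, out_size, "%s %s%s, %%%s",
                           mnemonic, src_prefix, src, dst);
    if (written < 0 || (size_t)written >= out_size) {
        return -1;
    }
    return 0;
}
```

Given "mov eax, 5" the function produces "mov $5, %eax", and "mov eax, edx" becomes "mov %edx, %eax". Real code needs more care than this, which is why dedicated conversion tools and assembler directives exist.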

  • FASM - Flat Assembler
    • Developed By: Tomasz Grysztar
    • OS: Windows, DOS, Linux, Unix/libc
  • MASM - Microsoft Macro Assembler
    • Syntax: Intel
  • TASM - Turbo Assembler
    • Developed by: Borland
    • OS: DOS, Windows
    • Latest Release: 5.0 (initial release 1996, patched 2002)
  • GAS - GNU Assembler
    • Developer(s): GNU Project
    • OS: Cross-platform
    • Latest Release: 2.21 (2010-12-08)
    • Syntax: AT&T (it can use Intel via directive)
    • Website: https://www.gnu.org/software/binutils/
  • NASM - Netwide Assembler
    • Developed by: Simon Tatham, Julian Hall, H. Peter Anvin.
    • OS: Windows, DOS, Unix-like, OS/2, Mac OS
    • Latest Release:
    • Syntax: variation of Intel Assembly Syntax
    • Website: http://www.nasm.us/

Syntax

A while ago someone posted this comparison of the two major x86 assembly language syntax branches, Intel syntax and AT&T syntax, on the net.

Intel syntax was originally used for documentation of the x86 platform. It is dominant in the MS-DOS and Windows world, while AT&T syntax is dominant in the Unix/Linux world, since Unix was created at AT&T Bell Labs. Here is a summary of the main differences between Intel syntax and AT&T syntax:

  • Parameter order
    • AT&T: Source comes before the destination ("move 5 to eax" becomes mov $5, %eax).
    • Intel: Destination comes before the source, following the form of many program statements ("a = 5" becomes mov eax, 5).
  • Parameter size
    • AT&T: Mnemonics are suffixed with a letter indicating the size of the operands ("q" for qword, "l" for dword, "w" for word, and "b" for byte).
    • Intel: Derived from the name of the register that is used (e.g., rax, eax, ax, al).
  • Sigils
    • AT&T: Immediate values must be prefixed with a "$", and registers must be prefixed with a "%".
    • Intel: The assembler automatically detects the type of each symbol, i.e., whether it is a register, a constant or something else.
  • Effective addresses
    • AT&T: General syntax DISP(BASE,INDEX,SCALE). Example: movl mem_location(%ebx,%ecx,4), %eax
    • Intel: Arithmetic expressions in square brackets; additionally, size keywords like byte, word, or dword have to be used when the size cannot be determined from the operands. Example: mov eax, dword [ebx + ecx*4 + mem_location]
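Both effective-address notations describe the same calculation: displacement plus base plus index times scale. The short C sketch below spells that arithmetic out. The function name and the sample values in the usage note are made up for illustration, and 32-bit wraparound is assumed, as in 32-bit addressing.

```c
#include <stdint.h>

/*
 * Effective address of an x86 memory operand:
 *   AT&T:  disp(base, index, scale)
 *   Intel: [base + index*scale + disp]
 * On real hardware the scale must be 1, 2, 4 or 8; this sketch does not check it.
 */
uint32_t effective_address(uint32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    return disp + base + index * scale;
}
```

For instance, if mem_location were 0x1000, ebx held 0x20 and ecx held 3, then effective_address(0x1000, 0x20, 3, 4) gives 0x102C, the address that both example instructions above would read from.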

Some assemblers are capable of using both types of syntax, or even the syntax of another assembler altogether. For example, TASM has a directive that lets it assemble MASM code. GAS is another cross-syntax assembler: though it is based on AT&T syntax, its .intel_syntax directive allows it to use Intel syntax.

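Example of GAS using Intel syntax (via GCC inline assembly):

__asm__ __volatile__(".intel_syntax noprefix\n\t"  /* switch GAS to Intel syntax */
                     "mov eax, edx\n\t"            /* copy edx into eax */
                     ".att_syntax prefix\n\t"      /* switch back to AT&T for the compiler's own output */
                     : "=a" (temp_var)             /* output: eax is stored in temp_var */
                     : "d" (save_var)              /* input: save_var is loaded into edx */
                     );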

 

Though some may argue that assembly is "dead", it is anything but. The art of using assembly language is alive and well, I assure you.