Programming Languages Exhibit: NO Detours, NO Shortcuts

What is Assembly Language?

Assembly language is a low-level programming language, often regarded as the native language of computers. It is a close, human-readable approximation of binary machine code, and a program written in it is referred to as assembly code.

It expresses the same operations as machine language, except that commands are written as short letter sequences, which are easier to memorize and understand, instead of numbers. It maps human-readable mnemonics to machine instructions, thus allowing machine-level programming without writing in machine language itself.

History of Assembly Language

Early computer systems were programmed literally by hand. Front panel switches served as the input device for entering instructions and data. The switches represented addresses, data, and other significant functions in the system, and specific switches were toggled to operate the machine.

For example, to run a specific program, a switch representing a certain address had to be toggled. After that, another switch representing the data to be used at that address was toggled. Once all of the preparations had been made, the final run switch was toggled to start the program.

Programming back then demanded a good memory and sustained focus, since the programmer had to remember every procedure that a given program needed. The programmer also needed to know the processor's entire instruction set in order to convert those instructions into bit patterns and set the panel switches correctly.

Because everything was manipulated manually, programs were very prone to errors. They were also very slow to get running, because raw manpower was needed before a program could run.

With the advent of new technologies and larger memories, programs were written to take over those manual entries. Small monitor programs that used hex keypads or terminals to enter instructions became popular, as did paper tapes and punch cards, which were used as storage media for programs.

Since programs were still hand-coded, conversion from mnemonics to instructions was still performed manually. Programmers looked for a way to work more efficiently, and the idea of writing one program to translate another was a major breakthrough. Such a program would act as a translator from mnemonics to instructions. The advantages were fewer errors, faster translation, and easier editing.

Who's Who?

Nathaniel Rochester

Nathaniel Rochester wrote the first assembler used on the IBM 701, in 1954.

Stan Poley

Stan Poley is the author of SOAP (Symbolic Optimal Assembly Program). SOAP was written in 1955 and used as the assembly language for the IBM 650.

What does it do?

Assemblers are programs which generate machine code instructions from a source code program written in assembly language. The features provided by an assembler are:

mnemonics can be used when writing source code programs
variables are represented by symbolic names, not by memory locations
symbolic code is easier to read and follow
error checking is provided
changes can be quickly and easily incorporated with a re-assembly
programming aids are included for relocation and expression evaluation

In writing assembly language programs for microcomputers, it is essential that a standardized format be followed. Most manufacturers provide assemblers, which are programs that generate machine code instructions for the actual processor to execute.

The assembler converts the written assembly language source program into a format which runs on the processor. In the source program, each machine code instruction (the binary or hex value) is represented by a mnemonic, and the assembler translates each mnemonic back into that value. A mnemonic is an abbreviation which represents the actual instruction.

Binary	Hex	Mnemonic
01001111	4F	CLRA
00110110	36	PSHA
01001101	4D	TSTA

CLRA - Clears the A accumulator

PSHA - Saves the A accumulator on the stack

TSTA - Tests the A accumulator for 0
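The mapping in the table above is essentially a lookup from mnemonic to opcode byte. As a minimal sketch (written in Python purely for illustration; the `OPCODES` table and the `encode` function are hypothetical names, and only the three rows from the table are included), it could look like this:

```python
# Opcode values copied from the three rows of the table above.
# A real assembler's table would cover the whole instruction set.
OPCODES = {
    "CLRA": 0x4F,  # clears the A accumulator
    "PSHA": 0x36,  # saves the A accumulator on the stack
    "TSTA": 0x4D,  # tests the A accumulator for 0
}


def encode(mnemonic: str) -> int:
    """Return the one-byte opcode for a mnemonic (case-insensitive)."""
    key = mnemonic.strip().upper()
    if key not in OPCODES:
        raise ValueError(f"unknown mnemonic: {mnemonic!r}")
    return OPCODES[key]


if __name__ == "__main__":
    # Print each mnemonic with its opcode in binary and hex.
    for name in ("clra", "PSHA", "tsta"):
        code = encode(name)
        print(f"{name.upper():5} {code:08b} {code:02X}")
```

Looking a mnemonic up in a table like this is the step that the programmer once did by hand when converting instructions to bit patterns.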

Mnemonics are used because they:

· are more meaningful than hex or binary values

· can reduce the risk of committing errors

· are easier to remember than bit values

Assemblers also accept certain characters to represent number bases and addressing modes.

$ prefix or h suffix for hexadecimal

$24 or 24h

D suffix (or no suffix) for decimal numbers

24D 67

B for binary numbers

0101111B

O or Q for octal numbers

377O 232Q

# for immediate addressing

LDAA #$34

,X for indexed addressing

LDAA 01,X
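To make the notation above concrete, here is a rough sketch in Python of how an assembler might interpret these markers. The function names `parse_number` and `addressing_mode` are hypothetical, and the rules are limited to the prefixes and suffixes listed above; real assemblers differ in which of them they accept.

```python
def parse_number(text: str) -> int:
    """Convert a numeric literal such as $24, 24h, 24D, 0101111B or 377O."""
    t = text.strip().upper()
    if not t:
        raise ValueError("empty literal")
    if t.startswith("$"):
        # $ prefix: hexadecimal. Checked first so that a hex digit such
        # as B or D at the end is not mistaken for a base suffix.
        return int(t[1:], 16)
    suffix_bases = {"H": 16, "D": 10, "B": 2, "O": 8, "Q": 8}
    if len(t) > 1 and t[-1] in suffix_bases:
        # The final letter selects the base; the rest are the digits.
        return int(t[:-1], suffix_bases[t[-1]])
    return int(t, 10)  # no marker at all: plain decimal


def addressing_mode(operand: str) -> str:
    """Classify an operand by the markers described above."""
    op = operand.strip().upper()
    if op.startswith("#"):
        return "immediate"
    if op.endswith(",X"):
        return "indexed"
    return "other"


if __name__ == "__main__":
    for literal in ("$24", "24h", "24D", "67", "0101111B", "377O", "232Q"):
        print(literal, "=", parse_number(literal))
    print(addressing_mode("#$34"), addressing_mode("01,X"))
```

With these rules, `$24` and `24h` denote the same value, and an operand such as `#$34` is first classified as immediate before the `$34` part is converted.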

Assembly language statements are written one per line. A program consists of a sequence of assembly language statements, each of which contains a mnemonic. Each line of an assembly language program is split into four fields, as shown below:

“LABEL” “OPCODE” “OPERAND” “COMMENTS”

The label field is optional. A label is an identifier (or text string symbol). Labels are used extensively in programs to reduce reliance upon programmers remembering where data or code is located. A label can be used to refer to:

a memory location
the value of a piece of data
the address of a program, sub-routine, code portion etc.

The maximum length of a label differs between assemblers. Some accept labels up to 32 characters long, while others accept only four characters. A label, when declared, is suffixed by a colon and begins with a valid character (A..Z). Consider the following example.

START: LDAA #24H

Here, the label START is equal to the address of the instruction LDAA #24H. The label is used in the program as a reference, e.g.,

JMP START

This would result in the processor jumping to the location (address) associated with the label START, thus executing the instruction LDAA #24H immediately after the JMP instruction. When a label is referenced later on in the program, it is done so without the colon suffix.

An advantage of using labels is that inserting or rearranging code statements does not necessitate reworking the actual machine instructions. A simple re-assembly is all that is required. In hand-coding, such changes can take hours to perform.

Each instruction consists of an opcode and possibly one or more operands. In the instruction:

JMP START

- the opcode is JMP and the operand is the address of the label START

The opcode field contains a mnemonic. Opcode stands for operation code, i.e., a machine code instruction. The opcode may also require additional information (operands). This additional information is separated from the opcode by a space (or tab stop).

The operand field consists of additional information or data that the opcode requires. In certain types of addressing modes, the operand is used to specify:

constants or labels
immediate data
data contained in another accumulator or register
an address

Examples of operands are:

TAB ; operand specified by opcode
LDAA 0100H ; two byte operand
LDAA START ; label operand
LDAA #0FH ; immediate operand
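As an illustration of the four-field layout, the following Python sketch splits one source line into label, opcode, operand, and comment. It assumes, as in the examples in this post, that a label ends with a colon and that a comment starts at the first semicolon; the `Statement` type and the `split_line` function are hypothetical names, not part of any real assembler.

```python
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    label: Optional[str]
    opcode: Optional[str]
    operand: Optional[str]
    comment: Optional[str]


def split_line(line: str) -> Statement:
    """Split one assembly source line into its four fields."""
    # Everything after the first semicolon is the comment field.
    code, found, comment_text = line.partition(";")
    comment = comment_text.strip() if found else None

    tokens = code.split()  # fields are separated by spaces or tabs
    label = None
    if tokens and tokens[0].endswith(":"):
        label = tokens.pop(0)[:-1]  # drop the colon suffix
    opcode = tokens.pop(0) if tokens else None
    operand = " ".join(tokens) if tokens else None
    return Statement(label, opcode, operand, comment)


if __name__ == "__main__":
    print(split_line("START: LDAA #24H"))
    print(split_line("TAB ; operand specified by opcode"))
    print(split_line("LDAA 0100H ; two byte operand"))
```

Any field that is absent from the line comes back as `None`, which reflects the fact that the label, operand, and comment fields are all optional.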

The comment field is optional and is used by the programmer to explain how the coded program works. Comments are preceded by a semi-colon. The assembler, when generating instructions from the source file, ignores all comments. Consider the following examples:

        ORG  0100H      ;H means hexadecimal values
                        ;This program starts at address 0100 hex
STATUS: DFB  23H        ;This byte is identified as STATUS, and is
                        ;initialized to a value of 23 hex
CODE:   LDAA STATUS     ;The label called CODE is identified as a
                        ;machine code instruction which loads the
                        ;A accumulator with the contents of the
                        ;memory location associated with the label
                        ;STATUS, i.e., the value 23 hex
        JMP  CODE       ;Jump to the address associated with CODE

Note that the programmer does not need to worry about bit patterns, hex values, and the addresses of STATUS or CODE. The assembler, when fed the above program, will generate the correct code. The code output from the assembler will be:

Memory location	Byte value
0100	23
0101	B6
0102	01
0103	00
0104	7E
0105	01
0106	01

Location 0100 holds the value associated with the label STATUS

Locations 0101 to 0103 perform the LDAA STATUS instruction

Locations 0104 to 0106 perform the JMP CODE instruction

The statement ORG 0100H in the above program is not a machine code instruction. It is an instruction to the assembler, which instructs the assembler to generate the code to run at the designated origin address. Instructions to assemblers are called pseudo-ops. These are used for:

reserving memory for data variables, arrays and structures
determining the start address of the program
determining the entry address of the program
initializing variable values

The assembler does not generate any machine code instructions for pseudo-ops or comments. Assemblers scan the source program, generating machine instructions as they go. Sometimes the assembler reaches a reference to a label or variable which has not yet been defined. This is referred to as the forward reference problem. The assembler can tackle this problem in a number of ways. A two-pass assembler resolves it as follows:

On the first pass, the assembler simply reads the source file, counts up the number of locations that each instruction will take, and builds a symbol table in memory which lists all the defined variables cross-referenced to their associated memory addresses.

On the second pass, the assembler substitutes opcodes for the mnemonics and replaces variable names with the memory locations obtained from the symbol table.
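The two passes described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: it understands just the statements used in the example program (ORG, DFB, LDAA, and JMP), it takes the opcode bytes B6 and 7E from the output listing shown earlier, and it treats every LDAA and JMP as one opcode byte followed by a two-byte address. The names `assemble`, `tokenize`, `size_of`, and `parse_value` are hypothetical.

```python
# Opcode bytes taken from the assembler output listed earlier in this post.
OPCODES = {"LDAA": 0xB6, "JMP": 0x7E}

PROGRAM = """
        ORG  0100H      ;program starts at address 0100 hex
STATUS: DFB  23H
CODE:   LDAA STATUS
        JMP  CODE
"""


def parse_value(text, symbols):
    """Resolve an operand: a known label, an H-suffixed hex literal, or decimal."""
    text = text.upper()
    if text in symbols:
        return symbols[text]
    if text.endswith("H"):
        return int(text[:-1], 16)
    return int(text)


def tokenize(source):
    """Yield (label, opcode, operand) for every non-empty statement."""
    for line in source.splitlines():
        tokens = line.split(";", 1)[0].split()  # drop the comment field
        if not tokens:
            continue
        label = None
        if tokens[0].endswith(":"):
            label = tokens.pop(0)[:-1].upper()
        opcode = tokens[0].upper() if tokens else None
        operand = tokens[1] if len(tokens) > 1 else None
        yield label, opcode, operand


def size_of(opcode):
    """Number of memory locations a statement occupies."""
    if opcode is None or opcode == "ORG":
        return 0  # label-only lines and ORG emit nothing
    if opcode == "DFB":
        return 1  # one data byte
    return 3      # opcode byte plus two-byte address


def assemble(source):
    statements = list(tokenize(source))

    # Pass 1: count locations and build the symbol table.
    symbols, location = {}, 0
    for label, opcode, operand in statements:
        if opcode == "ORG":
            location = parse_value(operand, {})
        if label:
            symbols[label] = location
        location += size_of(opcode)

    # Pass 2: substitute opcodes for mnemonics and addresses for labels.
    memory, location = {}, 0
    for label, opcode, operand in statements:
        if opcode == "ORG":
            location = parse_value(operand, symbols)
        elif opcode == "DFB":
            memory[location] = parse_value(operand, symbols) & 0xFF
        elif opcode in OPCODES:
            address = parse_value(operand, symbols)
            memory[location] = OPCODES[opcode]
            memory[location + 1] = address >> 8      # high byte
            memory[location + 2] = address & 0xFF    # low byte
        elif opcode is not None:
            raise ValueError(f"unknown mnemonic: {opcode}")
        location += size_of(opcode)
    return symbols, memory


if __name__ == "__main__":
    symbols, memory = assemble(PROGRAM)
    for address in sorted(memory):
        print(f"{address:04X} {memory[address]:02X}")
```

Because the symbol table is complete before any bytes are emitted, a statement such as `JMP DONE` can appear before the line that defines `DONE:` and still be assembled correctly on the second pass.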

What do CS instructors say about Assembly Programming?

What can we say about Assembly?

"Assembly Language is like studying. It takes so much time. There are other programming languages that are easier, faster and more comfortable to use, but we just have to deal with it. We can't settle taking short cuts. "

- Mary Grace R. Lumenario

“Assembly Language is sooooooo tiring… and at the same time mind bugging because of those mnemonics. Several lines of codes in assembly language can just be a single line or even a single built-in function in a certain high level language. I don’t want assembly..:D”

- Angelo Paolo V. Aruta

Sample Hello World

include '%fasminc%/win32ax.inc'

.code
start:
        invoke MessageBox,HWND_DESKTOP,"Hello World!","Win32 Assembly",MB_OK
        invoke ExitProcess,0
.end start

Bloggers

CMSC 124 T-4L

Aruta, Angelo Paolo V. 2007-17678

Lumenario, Mary Grace R. 2007-63926

Programming Languages Exhibit

Tuesday, September 18, 2012

NO Detours, NO Shortcuts - Just Assembly.
