Lesson One. Some Historical Background. The 'C' programming language was designed and developed by Brian Kernighan, and Dennis Ritchie at The Bell Research Labs. 'C' is a Language specifically created in order to allow the programmer access to almost all of the machine's internals - registers, I/O slots and absolute addresses. However, at the same time, 'C' allows for as much data hiding and programme text modularisation as is needed to allow very complex multi-programmer projects to be constructed in an organised and timely fashion. During the early 1960s computer Operating Systems started to become very much more complex with the introduction of multi-terminal and multi-process capabilities. Prior to this time Operating Systems had been carefully and laboriously crafted using assembler codes, and many programming teams realised that in order to have a working o/s in anything like a reasonable time this was now longer economically feasible. This then was the motivation to produce the 'C' Language, which was first implemented in assembler on a Digital Equipment Corporation PDP-7. Of course once a simple assembler version was working it was possible to rewrite the compiler in 'C' itself. This was done in short order and therefore as soon as the PDP-11 was introduced by DEC it was only necessary to change the code generator section of the compiler and the new machine had a compiler in just a few weeks. 'C' was then used to implement the UNIX o/s. This means, that a complete UNIX can be transported, or to use the simple jargon of today; 'ported to a new machine in literally just a few months by a small team of competent programmers. Enough of the past. Lets see the various actions, or compilation phases through which the `C' compilation system has to go in order that your file of `C' program text can be converted working program. Assuming that you are able to work an editor and can enter a script and create a file. Please enter the following tiny program. #ident "@(#) Hello World - my first program" #include char *format = "%s", *hello = "Hello World...\n"; main() { printf ( format, hello ); } Now save it in a file called hello.c. Lower case is allowed - encouraged, no less - under the UNIX operating system. Now type: cc -o hello hello.c The computer will apparently pause for a few moments and then the Shell, or Command Line Interpreter prompt will re-appear. Now type: hello Lo and behold the computer will print Hello World... Let's just look at what the computer did during the little pause. The first action is to activate a preliminary process called the pre-processor. In the case of hello.c all it does is to replace the line #include with the file stdio.h from the include files library. The file stdio.h provides us with a convenient way of telling the compiler that all the i/o functions exist. There are a few other little things in stdio.h but they need not concern us at this stage. In order to see what the pre-processor actually outputs, you might like to issue the command: cc -P hello.c The 'cc' command will activate the 'C' compilation system and the -P option will stop the compilation process after the pre-processing stage, and another file will have appeared in your directory. Have a look, find hello.i and use the editor in view mode to have a look at it. So issue the command: view hello.i You will see that a number of lines of text have been added at the front of the hello.c program. What's all this stuff? Well, have a look in the file called /usr/include/stdio.h again using the view command. view /usr/include/stdio.h Look familiar? Now the next stage of getting from your program text to an executing program is the compilation of your text into an assembler code program. After all that is what a compiler is for - to turn a high level language script into a program. Lets see what happens by issuing the command cc -S hello.c Once again there is another file in your directory - this time with a .s suffix. Lets have a look at it in the same way as the .i file view hello.s You will doubtless notice a few recognizable symbols and what appears to be a pile of gibberish. The gibberish is in fact the nmemonics for the machine instructions which are going to make the computer do what you have programmed it to do. Now this assembler code has to be turned into machine instructions. To do this issue the command. cc -g -c hello.s Now, yet again there is another file in your directory - this time the suffix is ".o". This file is called the object file. It contains the machine instructions corresponding exactly to the nmemonic codes in the .s file. If you wish you can look at these machine codes using one of the commands available to examine object files. dis -L -t .data hello.o >hello.dis The output from these commands won't be very meaningful to you at this stage, the purpose of asking you to use them is merely to register in your mind the fact that an object file is created as a result of the assembly process. The next stage in the compilation process is called by a variety of names - "loading", "linking", "link editing". What happens is that the machine instructions in the object file ( .o ) are joined to many more instructions selected from an enormous collection of functions in a library. This phase of the compilation process is invoked by the command:- cc -o hello hello.o Now, at last, you have a program to execute! So make it do it's thing by putting the name of the executable file as a response to the Shell or Command Line Interpreter prompt. hello Presto, the output from your program appears on the screen. Hello World... You are now allowed to rejoice and have a nice warm fuzzy to hold! You have successfully entered a `C' program, compiled it, linked it, and finally, executed it! Having gone through all the various stages of editing, pre-processing, compiling, assembling, linking, and finally executing, by hand as it were, you can now rest assured that all the stages are automated by the 'cc' command, and you can forget how to invoke them! Just remember that the computer has to do them in order for you to have a program to execute. The single command you use to activate the C Compiler is: cc -o hello hello.c The word after the -o option is the name of the executable file, if you don't provide a name here the compiler dreams up the name "a.out". The source file MUST have the .c extension otherwise the compiler complains and stops working. Notes: The command names used in the above text are those of standard UNIX, Your particular system may well use a different name for the 'C' compiler. bcc - for Borland 'C'. gcc - GNU 'C', which is standard on the Linux operating system. lc - Lattice 'C', available on IBM and clone P.C.s as well as the Amiga. Check in the Documentation which came with your compiler. The same notions apply to the text editor. Differences between 'C' and other languages. In the years since 'C' was developed it has changed remarkable little. This fact is a bouquet to the authors, who had the vision and understanding to create a language which has endured so well. The strengths and weaknesses should be pointed out here. The big plus is that it is possible to do everything ( well at least 99.9% ) in 'C' while other languages compel you to write a procedure, subroutine or function in assembler code. 'C' has very good facilities for creating tables of constant data within the source file. 'C' doesn't do very much to protect you from yourself. This means that the resulting code executes faster than most other high level languages, but a much greater degree of both care and understanding is demanded from the programmer. 'C' is not a closely typed language, although the newer compilers are offering type checking as part of the language itself as opposed to having to use a separate program for mechanised debugging. 'C' is a small language with very few intrinsic operations. All the heavy work is done by explicit library function calls. 'C' allows you to directly and conveniently access most of the internals of the machine ( the memory, input output slots, and CPU registers ) from the language without having to resort to assembler code. 'C' compilers have an optimisation phase which can be invoked if desired. The output code can be optimised for either speed or memory usage. The code will be just as good as that produced by an assembly code programmer of normal skill - real guru programmers can do only slightly better. Copyright notice:- (c) 1993 Christopher Sawtell. I assert the right to be known as the author, and owner of the intellectual property rights of all the files in this material, except for the quoted examples which have their individual copyright notices. Permission is granted for onward copying, but not modification, of this course and its use for personal study only, provided all the copyright notices are left in the text and are printed in full on any subsequent paper reproduction. -- +----------------------------------------------------------------------+ | NAME Christopher Sawtell | | SMAIL 215 Ollivier's Road, Linwood, Christchurch, 8001. New Zealand.| | EMAIL chris@gerty.equinox.gen.nz | | PHONE +64-3-389-3200 ( gmt +13 - your discretion is requested ) | +----------------------------------------------------------------------+