Lesson 7. De-bugging Strategies. >>>>>>>> Proper Preparation Prevents Piss-Poor Performance. <<<<<<<< This lesson is really a essay about how to go about writing programs. I know that by far the best way to greatly reduce the amount of effort required to get a program going properly is to avoid making mistakes in the first palace! Now this might seem to be stating the absolute obvious, and it is but after looking at many programs it would seem that there is a very definite need to say it. So how does one go about reducing the probability of making mistakes? There are many strategies, and over the years I have evolved my own set. I have found that some of the most important are: 1) Document what you are going to do before yes BEFORE you write any code. Set up the source files for the section of the program you are going to write and put some lines of explanation as to what you intend to do in this file. Be as precise as you can, but don't go into the detail of explaining in English, or your First Language, exactly what every statement does. 2) Make sure that you keep each file as small as is sensible. Some program authors say that one should put only one function in a file. It's my personal opinion that this is going a little bit over the top, but certainly you should not have more than one logical activity in a source file. It's easier to find a needle in a tiny haystack than in a big one! 3) Always use names for the objects in your program which are fully descriptive, or at the very least are meaningful nmemonics. Put yourself in the position of some poor soul who - a couple of years later, after you have long finished with the project, and left the country - has been given the task of adding a small feature to your exquisite program. Now in the rush to get your masterpiece finished you decided to use variable names like "a4" and "isb51" simply so that you can get the line typed a fraction of a second faster than if you used something like "customer_address[POST_CODE]" and "input_status_block[LOW_FUEL_TANK_#3]. The difference in ease of understanding is obvious, isn't it? However judging by some programs which I have seen published in both magazines and in the public domain program sources, the point has still to be made. 4) ALWAYS take great care with the layout of your code. It's my opinion that the opening brace of ALL program structures should be on a new line. Also if you put them in the leftmost column for structs, enums, and initialised tables, as well as functions, then the 'find function' keystrokes ( "[[" and "]]" ) in vi will find them as well as the functions themselves. Make sure you have the "showmatch" facility in vi turned on. ( And watch the cursor jump when you enter the right hand brace, bracket, or parenthesis. ) 5) Try as hard as you can to have as few global variables as possible. Some people say never have any globals. This is perhaps a bit too severe but global variables are a clearly documented source of programming errors. If it's impossible to perform a logical activity in an efficient way without having a global or two, then confine the scope of the globals to just the one file by marking the defining declaration "static". This stops the compiler producing a symbol which the linking loader will make available to all the files in your source. 6) Never EVER put 'magic numbers' in you source code. Always define constants in a header file with #define lines or enum statements. Here is an example:- /* ----------------------------------------- */ #include enum status_input_names { radiator_temperature, oil_temperature, fuel_pressure, energy_output, revolutions_per_minute }; char *stats[] = { "radiator_temperature", "oil_temperature", "fuel_pressure", "energy_output", "revolutions_per_minute" }; #define NUMBER_OF_INPUTS ( sizeof ( stats ) / sizeof ( stats[0])) main() { enum status_input_names name; printf ( "Number of Inputs is: %d\n", NUMBER_OF_INPUTS ); for ( name = radiator_temperature; name < NUMBER_OF_INPUTS; name++) { printf ( "\n%s", stats[ name ] ); } printf ( "\n\n" ); } /* ----------------------------------------- */ Note that as a side effect we have available the meaningful symbols radiator_temperature etc. as indices into the array of status input names and the symbol NUMBER_OF_INPUTS available for use as a terminator in the 'for' loop. This is quite legal because sizeof is a pseudo-function and the value is evaluated at the time of compilation and not when the program is executed. This means that the result of the division in the macro is calculated at the time of compilation and this result is used as a literal in the 'for' loop. No division takes place each time the loop is executed. To illustrate the point I would like to tell you a little story which is fictitious, but which has a ring of truth about it. Your employer has just landed what seems to be a lucrative contract with an inventor of a completely new type of engine. We are assured that after initial proving trials one of the larger Japanese motor manufactures is going to come across with umpteen millions to complete the development of the design. You are told to write a program which has to be a simple and straightforward exercise in order to do the job as cheaply as possible. Now, the customer - a some-what impulsive type - realises that his engine is not being monitored closely enough when it starts to rapidly dis-assemble itself under high speed and heavy load. You have to add a few extra parameters to the monitoring program by yesterday morning! You just add the extra parameters into the enumand the array of pointers to the character strings. So: enum status_input_names { radiator_temperature, radiator_pressure, fuel_temperature, fuel_pressure, oil_temperature, oil_pressure, exhaust_manifold_temperature }; Let's continue the story about the Japanese purchase. Mr. Honda ( jun ) has come across with the money and the result is that you are now a team leader in the software section of Honda Software ( YourCountry ) Ltd. The project of which you are now leader is to completely rewrite your monitoring program and add a whole lot of extra channels as well as to make the printouts much more readable so that your cheap, cheerful, and aesthetic-free program can be sold as the "Ultimate Engine Monitoring Package" from the now world famous Honda Real-time Software Systems. You set to work, Honda et. al. imagine that there is going to be a complete redesign of the software at a cost of many million Yen. You being an ingenious type have written the code so that it is easy to enhance. The new features required are that the printouts have to be printed with the units of measure appended to the values which have to scaled and processed so that the number printed is a real physical value instead of the previous arrangement where the raw transducer output was just dumped onto a screen. What do you have to do? Thinking along the line of "Get the Data arranged correctly first". You take you old code and expand it so that all the items of information required for each channel are collected into a struct. enum status_input_names { radiator_temperature, radiator_pressure, fuel_temperature, fuel_pressure, oil_temperature, oil_pressure, exhaust_manifold_temperature, power_output, torque }; typedef struct channel { char *name; /* Channel Name to be displayed on screen. */ int nx; /* position of name on screen x co-ordinate. */ int ny; /* ditto for y */ int unit_of_measure; /* index into units of measure array */ char value; /* raw datum value from 8 bit ADC */ char lower_limit; /* For alarms. */ char upper_limit; float processed_value; /* The number to go on screen. */ float offset; float scale_factor; int vx; /* Position of value on screen. */ int vy; }CHANNEL; enum units_of_measure { kPa, degC, kW, rpm, Volts, Amps, Newtons }; char *units { "kPa", "degC", "kW", "rpm", "Volts", "Amps", "Newtons" }; CHANNEL data [] = { { "radiator temperature", { "radiator pressure", { "fuel temperature", { "fuel pressure", { "oil temperature", { "oil pressure", { "exhaust manifold temperature", { "power output", { "torque", }; #define NUMBER_OF_INPUTS sizeof (data ) / sizeof ( data[0] ) Now the lesson preparation is to find the single little bug in the above program fragment, to finish the initialisation of the data array of type CHANNEL and to have a bit of a crack at creating a screen layout program to display its contents. Hint: Use printf(); ( Leave all the values which originate from the real world as zero. ) Here are some more tips for young players. 1) Don't get confused between the logical equality operator, == and the assignment to a variable operator. = This is probably the most frequent mistake made by 'C' beginners, and has the great disadvantage that, under most circumstances, the compiler will quite happily accept your mistake. 2) Make sure that you are aware of the difference between the logical and bit operators. && This is the logical AND function. || This is the logical OR function. The result is ALWAYS either a 0 or a 1. & This is the bitwise AND function used for masks etc. The result is expressed in all the bits of the word. 3) Similarly to 2 be aware of the difference between the logical complementation and the bitwise one's complement operators. ! This is the logical NOT operator. ~ This is the bitwise ones complement op. Some further explanation is required. In deference to machine efficiency a LOGICAL variable is said to be true when it is non-zero. So let's set a variable to be TRUE. 00000000000000000000000000000001 A word representing TRUE. Now let's do a logical NOT !. 00000000000000000000000000000000 There is a all zero word, a FALSE. 00000000000000000000000000000001 That word again. TRUE. Now for a bitwise complement ~. 11111111111111111111111111111110 Now look we've got a word which is non-zero, still TRUE. Is this what you intended? 4) It is very easy to fall into the hole of getting the '{' & '}'; '[' & ']'; '(' & ')'; symbol pairs all messed up and the computer thinks that the block structure is quite different from that which you intend. Make sure that you use an editor which tells you the matching symbol. The UNIX editor vi does this provided that you turn on the option. Also take great care with your layout so that the block structure is absolutely obvious, and whatever style you choose do take care to stick by it throughout the whole of the project. A personal layout paradigm is like this: Example 1. function_type function_name ( a, b ) type a; type b; { type variable_one, variable_two; if ( logical_expression ) { variable_one = A_DEFINED_CONSTANT; if ( !return_value = some_function_or_other ( a, variable_one, &variable_two ) ) { error ( "function_name" ); exit ( FAILURE ); } else { return ( return_value + variable_two ); } } /* End of "if ( logical_expression )" block */ } /* End of function */ This layout is easy to do using vi with this initialisation script in either the environment variable EXINIT or the file ${HOME}/.exrc:- set showmode autoindent autowrite tabstop=2 shiftwidth=2 showmatch wm=1 Example 2. void printUandG() { char *format = "\n\ User is: %s\n\ Group is: %s\n\n\ Effective User is: %s\n\ Effective Group is: %s\n\n"; ( void ) fprintf ( tty, format, passwd_p->pw_name, group_p->gr_name, epasswd_p->pw_name, egroup_p->gr_name ); } Notice how it is possible to split up format statements with a '\' as the last character on the line, and that it is convenient to arrange for a nice output format without having to count the field widths. Note however that when using this technique that the '\' character MUST be the VERY LAST one on the line. Not even a space may follow it! In summary I *ALWAYS* put the opening brace on a new line, set the tabs so that the indentation is just two spaces, ( use more and you very quickly run out of "line", especially on an eighty column screen ). If a statement is too long to fit on a line I break the line up with the arguments set out one to a line and I then the indentation rule to the parentheses "()" as well. Sample immediately above. Probably as a hang-over from a particular pretty printing program which reset the indentation position after the printing of the closing brace "}", I am in the habit of doing it as well. Long "if" and "for" statements get broken up in the same way. This is an example of it all. The fragment of code is taken from a curses oriented data input function. /* ** Put all the cursor positions to zero. */ for ( i = 0; s[i].element_name != ( char *) NULL && s[i].element_value != ( char *) NULL; i = ( s[i].dependent_function == NULL ) ? s[i].next : s[i].dependent_next ) { /* Note that it is the brace and NOT the */ /* "for" which moves the indentation level. */ s[i].cursor_position = 0; } /* ** Go to start of list and hop over any constants. */ for ( i = edit_mode = current_element = 0; s[i].element_value == ( char *) NULL ; current_element = i = s[i].next ) continue; /* Note EMPTY statement. */ /* ** Loop through the elements, stopping at end of table marker, ** which is an element with neither a pointer to an element_name nor ** one to a element_value. */ while ( s[i].element_name != ( char *) NULL && s[i].element_value != ( char *) NULL ) { int c; /* Varable which holds the character from the keyboard. */ /* ** Et Cetera for many lines. */ } Note the commenting style. The lefthand comments provide a general overview of what is happening and the righthand ones a more detailed view. The double stars make a good marker so it is easy to separate the code and the comments at a glance. The null statement. You should be aware that the ";" on its own is translated by the compiler as a no-operation statement. The usefullness of this is that you can do little things, such as counting up a list of objects, or positioning a pointer entirely within a "for" or "while" statement. ( See example above ). There is, as always, a flip side. It is HORRIBLY EASY to put a ";" at the end of the line after the closing right parenthesis - after all you do just that for function calls! The suggestion is to both mark deliberate null statements with a comment and to use the statement "continue;". Using the assert macro will pick up these errors at run time. The assert macro. Refer to the Programmers Reference Manual section 3X and find the documentation on this most useful tool. As usual an example is by far the best wasy to explain it. /* ----------------------------------------- */ #ident "@(#) assert-demo.c" #include #include #define TOP_ROW 10 #define TOP_COL 10 main() { int row, col; for ( row = 1; row <= TOP_ROW; row++); { assert ( row <= TOP_ROW ); for ( col = 1; col <= TOP_COL; col++ ) { assert ( col <= TOP_COL ); printf ( "%4d", row * col ); } printf ( "\n" ); } } /* ----------------------------------------- */ Which produces the output:- Assertion failed: row <= TOP_ROW , file assert-demo.c, line 15 ABORT instruction (core dumped) It does this because the varable "row" is incremented to one greater than The value of TOP_ROW. Note two things: 1) The sense of the logical condition. The assert is asserted as soon as the result of the logical condition is FALSE. Have a look at the file /usr/include/assert. Where is the ";" being used as an empty program statement? 2) The unix operating system has dumped out an image of the executing program for examination using a symbolic debugger. Have a play with "sdb" in preparation for the lesson which deals with it in more detail. Lets remove the errant semi-colon, re-compile and re-run the program. 1 2 3 4 5 6 7 8 9 10 2 4 6 8 10 12 14 16 18 20 3 6 9 12 15 18 21 24 27 30 4 8 12 16 20 24 28 32 36 40 5 10 15 20 25 30 35 40 45 50 6 12 18 24 30 36 42 48 54 60 7 14 21 28 35 42 49 56 63 70 8 16 24 32 40 48 56 64 72 80 9 18 27 36 45 54 63 72 81 90 10 20 30 40 50 60 70 80 90 100 Here's the ten times multiplication table, for you to give to to the nearest primary-school child! I would agree that it is not possible to compare the value of a program layout with a real work of fine art such as a John Constable painting or a Michaelangelo statue, I do think a well laid out and literate example of programming is not only much easier to read and understand, but also it does have a certain aesthetic appeal. Copyright notice:- (c) 1993 Christopher Sawtell. I assert the right to be known as the author, and owner of the intellectual property rights of all the files in this material, except for the quoted examples which have their individual copyright notices. Permission is granted for onward copying, but not modification, of this course and its use for personal study only, provided all the copyright notices are left in the text and are printed in full on any subsequent paper reproduction. -- +----------------------------------------------------------------------+ | NAME Christopher Sawtell | | SMAIL 215 Ollivier's Road, Linwood, Christchurch, 8001. New Zealand.| | EMAIL chris@gerty.equinox.gen.nz | | PHONE +64-3-389-3200 ( gmt +13 - your discretion is requested ) | +----------------------------------------------------------------------+