CPS 202 Class 2 Chapter 2

This chapter discusses the very basics of the C language. Once we are past this chapter, these essentials will be taken for granted. It is imperative that you understand them, to gain a firm foundation.

We saw this simple program last week:

    #include <stdio.h>

    main()
    {
        printf("Hello World!\n");
        return(0);
    }

All of the essential building blocks of C, except for one or two, are included in this tiny program. If your program follows this general model, then it should compile, build, and execute properly. Before we get to the details of the program's structure, though, let's go over those terms: compile, build, execute... and throw in preprocess.

Preprocessing
-------------

The very first step in turning source code into an executable program is to run it through the preprocessor. In most compilers, this step is handled automatically. The preprocessor does several things, which we will discuss in this class one at a time as they come up. Source code, by the way, is the text of a program as you have written it.

Compiling
---------

The next step is to take the preprocessed file and compile it into object code. Object code is machine language: instructions that the processor can execute natively. If your program compiles successfully, then you are nearly finished. Most of the errors in source code are found in the compile phase. Compiling is one half of the build process.

Linking
-------

After your program is preprocessed and compiled, and the object code version of the program exists, it must be combined with other pieces of object code that are supplied with the compiler, such as the code behind printf. This step is called linking, and it is the second half of a build. A much smaller number of errors are found in the linking stage.

Executing
---------

After your file is linked, it will be executable. This is where you get to actually run your program and find out if it worked the way you intended.
In the case of our hello program, we actually see the words "Hello World!" appear on the screen. Since we wrote a C program, and did not include any Windows functionality, we saw the message appear in a DOS box.

The program
-----------

There are a few elements that appear in every program, and some that will appear in every program that we will write in this class. It might be a good idea, then, to save yourself some time by creating a template file to start every program from. You can open the same file each time and immediately do a "Save As", or cut and paste your template. Here's a first version:

    #include <stdio.h>

    main()
    {
        return(0);
    }

This is essentially the Hello World program with the printf call stripped out. Let's look at each of the pieces here.

Directives
----------

The first line, the "include" line, is special. It has a # (pound sign) in front of it. The # marks a line that is to be processed by the preprocessor. Such lines are called "directives." There are several kinds of directives, but we will encounter just two kinds in this class.

The first is the include. It has two possible formats:

    #include <file.h>
    #include "file.h"

You use the angle bracket version when you are including a system-level file, like stdio.h. This is a file that is supplied with the compiler, and which is probably stored in a special place that the compiler knows about. The double quote version is most often used to include files that you create yourself. We will learn about all the different things that an included file might contain later on in the course.

These files are called "header files." That's what the .h stands for. They are called header files because they contain code that should be part of the top of your program. Header files are plain text, and can be viewed by anyone.

When the preprocessor hits an include line, it actually goes out to the file system, finds the specified file, and places that file's entire content into the copy of your program that is passed on to the compiler, in place of the include line.
The other major directive we will concern ourselves with is the define. With a define, we can give a value a name, to make it easier to use in our program. This is the format of a define:

    #define NAME value

For example:

    #define FREEZING_POINT 32.0
    #define MONTHS 12
    #define PI 3.141

We call these constants, or macros. The good thing about a constant is that it cannot be changed by the program. You would not want to accidentally change the freezing point of water to 40 degrees.

The preprocessor literally replaces the name with the value before the compiler ever sees the code.

    printf("The freezing point of water is %f\n",FREEZING_POINT);

becomes

    printf("The freezing point of water is %f\n",32.0);

This may not sound like a big deal - but imagine this. You're creating a program that uses the value of PI a lot, and you type the digits in by hand dozens of times. Sometimes you typed 3.14195, sometimes 3.141, and sometimes 31.4159. Only one of these is right, 3.141, and it may not be precise enough. The others are blatantly wrong. By doing

    #define PI 3.14159

and then using "PI" instead of the number, you ensure that you always use the correct value. Furthermore, if you need even more precision, you can get it very easily by changing one place:

    #define PI 3.14159265

Functions
---------

Functions are one of the main building blocks of the C language. They are available in other languages, especially C derivatives. You may have heard them called methods, subroutines, or calls. In C, they are functions.

Functions are things that we can use in our programs to perform specific actions. We have already used the printf function, and its meaning is pretty clear: it prints data to the screen. It has a lot more power than just that, though, and we will look into that next class.

We will also, eventually, learn to write our own functions. For now, we will stick with the ones supplied with the compiler. The one function of our own that we will use, however, is the main function.
One of the firmest rules in all of C is that every program must have a main. The main function is where the computer goes to start running the program. The main function must be named main, all in lower case. For now, just the form "main()" is necessary - we will learn more about main later.

The other thing you will notice in our template file is the pair of curly braces. The opening curly brace appears directly underneath the main. It indicates that all code following the brace is a part of main, all the way down to the closing curly brace. Generally speaking, anything between two curly braces is called a block. In the specific case of a function, the block is called the body of the function.

Inside the body of the function, you can include all sorts of statements, including mathematical calculations, loops, conditions, and function calls. In the case of our template, there is one statement, the return statement. This is a special kind of statement, one which ends every function. Sometimes the return can be left out, and when it is, its presence is implied. I recommend that you include a return(0) in every program you write - if you don't, Visual C++ will warn you that it is missing.

One point to make now, to avoid confusion. When you read through the text, you will see the return statement used without parentheses. I will never leave them off, except in one specific case we'll see later. The parentheses are, in fact, optional. I always use them because that is the way I learned, and I've stuck with it ever since. I think of return as a function, even though it really isn't, and functions require the parentheses. For others who know that return is not a function, the parentheses are, perhaps, just extra typing, and they leave them off. Either way is fine.

The other thing you may have noticed about the two statements we've seen thus far, the return and the printf, is that they both end with a semicolon.
The semicolon is used to end a statement in C, and especially in these first few weeks, you can assume that every statement you write must end in a semicolon. We'll get to the exceptions later. If you leave off a semicolon, the compiler will complain at you.

    printf("Hello World!\n")
    return(0);

Here, the compiler will say that an error was found on the "return" line, when the actual error is on the printf line. This is because the error was not detected until the word "return" was hit. This is a common problem for novice programmers, and checking the line just above the one the compiler reports is a habit you should get into quickly.

Notice that the include line does not have a semicolon, and neither would a define line - these are not lines for the compiler, they are for the preprocessor, and the semicolon is not used. Note, too, that the main line and the curly brace lines have no semicolons - these lines are for the compiler, but they are not statements.

Let's get back to functions. There are three main kinds of functions.

The first kind is the one we've used so far - the functions supplied with the compiler. There are two types of compiler-supplied functions. One is the standard library. The C standard defines a list of over 100 standard library functions. The second is compiler-specific libraries. These vary, but the compiler might include libraries for fonts, or graphics, or colors, which are not part of the C standard. We will be discussing only standard C library functions in this class.

The second major class of functions is user-defined functions. Here, you create a function to do a task that might be repeated over and over again, or to do discrete tasks, like opening a database file, reading from it, writing to it, and closing it. To create our own functions, we'll combine standard library functions and C statements within our code. This is covered later.

The third major class of functions is purchased function libraries.
These are known as third-party products, things that neither you nor your compiler manufacturer wrote. These libraries might include functions to read and write HTML code, or to create calendars, or to do graphics. These are less common, and we certainly won't be covering them.

Comments
--------

The next subject is comments. Comments are critically important to a successful program. They explain what the program does, who wrote it and when, the logic behind a series of statements, and the tricky bits of code.

Comments are not for the benefit of the compiler. They are for the benefit of the reader. When you write code, chances are someone else will read it at some point. Without comments, the intent of the program may not be immediately clear, but with them, the reader can skip wasting time trying to figure out what the code does and just get into fixing it or modifying it. This person will greatly appreciate a well-commented program.

The other person you might help is yourself. I've often zipped off code so quickly that I did not comment it - and when I come back to that code later, I've lost the train of thought that made that code so easy, and I have to waste time figuring it out again.

The original C standard defines one kind of comment, the block comment. A newer commenting style, the line comment, has come into common usage and is part of the C99 standard.

Block comments, as the name implies, are used to block off part of a program. The size of the block is limitless. The block comment starts with the characters /* and ends with */. Here is a sample:

    /*
    Program: hello.c
    Author: Steve Mount
    */

Comments are actually detected during preprocessing, and are literally stripped out of the source code. The compiler never sees them.

The one thing you have to be careful of is that nested comments are not available. So:

    printf("Hello World!\n");  /* Print "hello world!" */
    return(0);

can be commented out, it seems, by placing the /* before the printf and the */ after the return. Looks reasonable, but the problem is that the block stops on the first */, which is the one that ends the comment on the printf line. The return line is left as live code, followed by a stray */.

This leads us to one of the reasons why the line comment is so widely used. The line comment works only on a single line. The start of a line comment is a double slash:

    //

Everything from the // to the end of the line is a comment. Now:

    printf("Hello World!\n");  // Print "hello world!"
    return(0);

With this code, we can safely comment out both lines by wrapping them in the /* and */ block comment markers.

I suggest the use of the double-slash comment - I think it is cleaner looking. But it is not a part of the C standard before C99, and though most compilers accept it, some do not. Check your compiler with a small test program.

This allows me to add one more thing to our program templates. Something like this:

    /*
    CPS202
    Homework: chapter 2 problem 5
    Name: Steve Mount
    */

I want to see something like this at the top of all of your source code. Though I'm using a new email/web technique for submissions, there may be times that your code and your name become separated. By including this at the top, I'll always know who it belongs to. See page 15 for several different styles, pick one, and stick with it.

Variables
---------

There are several types of data in C. We will start slow by learning about only two: the integer and the floating point.

Variables are an important part of any programming language. They allow us to store and manipulate data. In C, before a variable can be used, it must be declared. To declare a variable, specify its type and its name:

    int length;
    float volume;

Two basic types in C are ints and floats. Once a variable has been declared with a data type, it will always hold that data type.

To set a variable equal to something, we use assignment:

    length=4;
    volume=6.556;

Note that the data type is no longer needed, just the name of the variable.
You can combine declaration and assignment using initialization:

    int length=4;
    float volume=6.556;

Note that the main difference between an int and a float is that an int cannot hold decimal values, and a float can.

In C, you must always be careful not to use uninitialized variables - they can cause trouble.

    int i, j;
    j = i;

Here, j is being set to the value of i, but i has not been given a value. It actually does have a value, but we call it a garbage value because the value is not defined, and can change from one run of the program to another.

One final point about Visual C++: it will often complain about your program, in the form of warnings, when you use float. To get around this, you can use doubles instead, or append an f to a number:

    float f;
    f = 1.5;   // May cause Visual C++ to warn
    f = 1.5f;  // No warning

Constants
---------

We've already discussed constants in our talk about the preprocessor, but here are a few more points. The data in a constant can only be accessed, it can never be changed.

    #define PI 3.141
    PI = 3.14159;  // illegal

I will always use uppercase for my defines, as does the book. This is not required, but it is common practice among C programmers.

Identifiers
-----------

The rules for naming variables, constants, and many other things in C are basically all the same. Names may be of a reasonable length - you'll not want to exceed 10 or 20 characters just due to typing, though longer is probably OK. The characters must be in this set:

    a-z
    A-Z
    0-9
    _ (underscore)

All other characters are illegal in an identifier. So:

    float $amount;
    int #apples;

are both illegal. Also, no identifier can be the same as a C keyword, listed on page 24, and no identifier can start with a digit.

Spacing
-------

Spacing within a program can be a highly personal choice. What is a universal truth, though, is that there must be some spacing.
The key thing to remember is that the C compiler does not care about spacing - it ignores all extra spaces not contained in strings (the spaces inside a string like "Hello World" are always kept). So, this:

    printf("Hello World!\n");

and this:

    printf   (   "Hello World!\n"   )   ;

are the same in the compiler's eyes.

Basic math primer
-----------------

We will get to more detail about math operators later, but to just touch on the basics:

    +  adds
    -  subtracts
    *  multiplies
    /  divides

    int area, h, w;
    h = 10;
    w = 5;
    area = h * w;  // area is 50

If you're going to divide, be sure to use floats. We will get to the whys later.

Printing data
-------------

We can use printf not only to print strings, but to print data as well. We will learn a lot more about it later, but here's a brief overview.

To print a float, use %f in the string, and place the float variable in the printf, with a comma separating the string and the variable:

    printf("The volume is %f\n", volume);

To print an int, use %d:

    printf("The area is %d\n", area);

To print more than one, just continue the list:

    printf("The width is %d. The height is %d\n",w,h);

Reading data
------------

To read data from the user, use the scanf function. You tell scanf what kind of data to read in the same way you tell printf the kind of data to print. To read a single numeric value, use %d or %f as appropriate:

    int height;
    float temperature;

    scanf("%d",&height);
    scanf("%f",&temperature);

The key thing to remember is the & in front of the variable name. The reason for this will not be clear for a while, but leaving it off can crash the program.

Prompting the user
------------------

Very often, you will be prompting the user for data input. To do this, combine a printf with a scanf:

    printf("Please enter the width: ");
    scanf("%d",&width);

printf does not need scanf, and scanf does not need printf, but they can work together as a user interface, telling the user what to type and then waiting for the user to type it.
Note that I left the \n off the printf - this keeps the cursor just following the colon, a perfect place to type.