Saturday, May 23, 2009

How Java Works

Introduction to How Java Works

Have you ever wondered how computer programs work? Have you ever wanted to learn how to write your own computer programs? Whether you are 14 years old and hoping to learn how to write your first game, or you are 70 years old and have been curious about computer programming for 20 years, this article is for you. I'm going to teach you how computer programs work by teaching you how to program in the Java programming language.

In order to teach you about computer programming, I am going to make several assumptions from the start:

#I am going to assume that you know nothing about computer programming now. If you already know something then the first part of this article will seem elementary to you. Please feel free to skip forward until you get to something you don't know.

#I am going to assume you do know something about the computer you are using. That is, I am going to assume you already know how to edit a file, copy and delete files, rename files, find information on your system, etc.

#For simplicity, I am going to assume that you are using a machine running Windows 95, 98, 2000, NT or XP. It should be relatively straightforward for people running other operating systems to map the concepts over to those.

#I am going to assume that you have a desire to learn.

All of the tools you need to start programming in Java are widely available on the Web for free. There is also a huge amount of educational material for Java available on the Web, so once you finish this article you can easily go learn more to advance your skills. You can learn Java programming here without spending any money on compilers, development environments, reading materials, etc. Once you learn Java it is easy to learn other languages, so this is a good place to start.

Having said these things, we are ready to go. Let's get started!

A Little Terminology

Keep in mind that I am assuming that you know nothing about programming. Here are several vocabulary terms that will make things understandable:

#Computer program - A computer program is a set of instructions that tell a computer exactly what to do. The instructions might tell the computer to add up a set of numbers, or compare two numbers and make a decision based on the result, or whatever. But a computer program is simply a set of instructions for the computer, like a recipe is a set of instructions for a cook or musical notes are a set of instructions for a musician. The computer follows your instructions exactly and in the process does something useful -- like balancing a checkbook or displaying a game on the screen or implementing a word processor.

#Programming language - In order for a computer to recognize the instructions you give it, those instructions need to be written in a language the computer understands -- a programming language. There are many computer programming languages -- Fortran, Cobol, Basic, Pascal, C, C++, Java, Perl -- just like there are many spoken languages. They all express approximately the same concepts in different ways.

#Compiler - A compiler translates a computer program written in a human-readable computer language (like Java) into a form that a computer can execute. You have probably seen EXE files on your computer. These EXE files are the output of compilers. They contain executables -- machine-readable programs translated from human-readable programs.
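
To make these terms a little more concrete, here is a minimal sketch of what a set of instructions looks like in Java. It adds up three numbers and then makes a decision based on the result. The class name SumAndCompare, the numbers and the cutoff of 100 are made up for this illustration, and you are not expected to understand every word yet.

```java
// A hypothetical first look at a program: a short list of instructions
// that the computer follows exactly, from top to bottom.
class SumAndCompare
{
    // Add up a set of numbers and hand back the total.
    static int addUp(int a, int b, int c)
    {
        return a + b + c;
    }

    public static void main(String[] args)
    {
        int total = addUp(12, 30, 45);
        System.out.println("Total: " + total);

        // Compare two numbers and make a decision based on the result.
        if (total > 100)
        {
            System.out.println("The total is more than 100.");
        }
        else
        {
            System.out.println("The total is 100 or less.");
        }
    }
}
```

The text above is the human-readable program. The compiler's job is to turn it into a form the computer can run.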

In order for you to start writing computer programs in a programming language called Java, you need a compiler for the Java language. The next section guides you through the process of downloading and installing a compiler. Once you have a compiler, we can get started. This process is going to take several hours, much of that time being download time for several large files. You are also going to need about 40 megabytes of free disk space (make sure you have the space available before you get started).


Downloading the Java Compiler

In order to get a Java development environment set up on your machine -- you "develop" (write) computer programs using a "development environment" -- you will have to complete the following steps:

1. Download a large file containing the Java development environment (the compiler and other tools).

2. Download a large file containing the Java documentation.

3. If you do not already have WinZip (or an equivalent) on your machine, you will need to download a large file containing WinZip and install it.

4. Install the Java development environment.

5. Install the documentation.

6. Adjust several environment variables.

7. Test everything out.

Before getting started, it would make things easier if you create a new directory in your temp directory to hold the files we are about to download. We will call this the download directory.

Step 1 - Download the Java development environment

Go to the page http://java.sun.com/j2se/1.4.2/download.html. Download the SDK software by clicking on the "Download J2SE SDK" link. You will be shown a licensing agreement. Click Accept. Select your operating system and download the file to your download directory. This is a huge file, and it will take several hours to download over a normal phone-line modem. The next two files are also large.

Step 2 - Download the Java documentation

Download the documentation by selecting your operating system and clicking the SDK 1.4.1 documentation link.

Step 3 - Download and install WinZip

If you do not have a version of WinZip or an equivalent on your machine, go to the page http://www.winzip.com/ and download an evaluation copy of WinZip. Run the EXE you get to install it. We will use it in a moment to install the documentation.

Step 4 - Install the development kit

Run the j2sdk-1_4_1-*.exe file that you downloaded in step 1. It will unpack and install the development kit automatically.

Step 5 - Install the documentation

Read the installation instructions for the documentation. They will instruct you to move the documentation file to the same directory as the one containing the development kit you just installed. Unzip the documentation and it will drop into the proper place.

Step 6 - Adjust your environment

As instructed on this page, you need to change your path variable. This is most easily done by opening an MS-DOS prompt and typing PATH to see what the path is set to currently. Then open autoexec.bat in Notepad and make the changes to PATH specified in the instructions.


Step 7 - Test

Now you should be able to open another MS-DOS window and type javac. If everything is set up properly, then you should see a two-line blob of text come out that tells you how to use javac. That means you are ready to go. If you see the message "Bad Command or File Name" it means you are not ready to go. Figure out what you did wrong by rereading the installation instructions. Make sure the PATH is set properly and working. Go back and reread the Programmer's Creed above and be persistent until the problem is resolved.

You are now the proud owner of a machine that can compile Java programs. You are ready to start writing software!

By the way, one of the things you just unpacked is a demo directory full of neat examples. All of the examples are ready to run, so you might want to find the directory and play with some of the samples. Many of them make sounds, so be sure to turn on your speakers. To run the examples, find pages with names like example1.html and load them into your usual Web browser.

Your First Program

Your first program will be short and sweet. It is going to create a drawing area and draw a diagonal line across it. To create this program you will need to:

1. Open Notepad and type in (or cut and paste) the program
2. Save the program
3. Compile the program with the Java compiler to create a Java applet
4. Fix any problems
5. Create an HTML web page to "hold" the Java Applet you created
6. Run the Java applet

Here is the program we will use for this demonstration:

import java.awt.Graphics;

public class FirstApplet extends java.applet.Applet
{

    public void paint(Graphics g)
    {
        g.drawLine(0, 0, 200, 200);
    }
}

Step 1 - Type in the program

Create a new directory to hold your program. Open up Notepad (or any other text editor that can create TXT files). Type or cut and paste the program into the Notepad window. This is important: When you type the program in, case matters. That means that you must type the uppercase and lowercase characters exactly as they appear in the program. Review the programmer's creed above. If you do not type it EXACTLY as shown, it is not going to work.

Step 2 - Save the file

Save the file to the filename FirstApplet.java in the directory that you created in step 1. Case matters in the filename. Make sure the 'F' and 'A' are uppercase and all other characters are lowercase, as shown.

Step 3 - Compile the program

Open an MS-DOS window. Change directory ("cd") to the directory containing FirstApplet.java. Type:


javac FirstApplet.java
Case matters! Either it will work, in which case nothing will be printed to the window, or there will be errors. If there are no errors, a file named FirstApplet.class will be created in the directory right next to FirstApplet.java.

(Make sure that the file is saved to the name FirstApplet.java and not FirstApplet.java.txt. This is most easily done by typing dir in the MS-DOS window and looking at the file name. If it has a .txt extension, remove it by renaming the file. Or run the Windows Explorer and select Options in the View menu. Make sure that the "Hide MS-DOS File Extensions for file types that are registered" box is NOT checked, and then look at the filename with the explorer. Change it if necessary.)

Step 4 - Fix any problems

If there are errors, fix them. Compare your program to the program above and get them to match exactly. Keep recompiling until you see no errors. If javac seems to not be working, look back at the previous section and fix your installation.

Step 5 - Create an HTML Page

Create an HTML page to hold the applet. Open another Notepad window. Type into it the following:

<html>
<body>
<applet code="FirstApplet.class" width="200" height="200">
</applet>
</body>
</html>

Save this file in the same directory with the name applet.htm.

[If you have never worked with HTML before, please read How a Web Page Works. The applet tag is how you access a Java applet from a web page.]

Step 6 - Run the Applet

In your MS-DOS window, type:


appletviewer applet.htm

Pull the applet viewer a little bigger to see the whole line. You should also be able to load the HTML page into any modern browser like Netscape Navigator or Microsoft Internet Explorer and see approximately the same thing.

You have successfully created your first program!!!

Understanding What Just Happened

So what just happened? First, you wrote a piece of code for an extremely simple Java applet. An applet is a Java program that can run within a Web browser, as opposed to a Java application, which is a stand-alone program that runs on your local machine (Java applications are slightly more complicated and somewhat less popular, so we will start with applets). We compiled the applet using javac. We then created an extremely simple Web page to "hold" the applet. We ran the applet using appletviewer, but you can just as easily run it in a browser.

The program itself is about 10 lines long:


import java.awt.Graphics;

public class FirstApplet extends java.applet.Applet
{

    public void paint(Graphics g)
    {
        g.drawLine(0, 0, 200, 200);
    }
}

This is about the simplest Java applet you can create. To fully understand it you will have to learn a fair amount, particularly in the area of object oriented programming techniques. Since I am assuming that you have zero programming experience, what I would like you to do is focus your attention on just one line in this program for the moment:

g.drawLine(0, 0, 200, 200);

This is the line in this program that does the work. It draws the diagonal line. The rest of the program is scaffolding that supports that one line, and we can ignore the scaffolding for the moment. What happened here was that we told the computer to draw one line from the upper left hand corner (0,0) to the bottom right hand corner (200, 200). The computer drew it just like we told it to. That is the essence of computer programming!

(Note also that in the HTML page, we set the size of the applet's window in step 5 above to have a width of 200 and a height of 200.)

In this program, we called a method (a.k.a. function) called drawLine and we passed it four parameters (0, 0, 200, 200). The line ends in a semicolon. The semicolon acts like the period at the end of a sentence. The line begins with g., signifying that we want to call the method named drawLine on the specific object named g (which you can see one line up is of the class Graphics -- we will get into classes and methods of classes in much more detail).

A method is simply a command -- it tells the computer to do something. In this case, drawLine tells the computer to draw a line between the points specified: (0, 0) and (200, 200). You can think of the window as having its 0,0 coordinate in the upper left corner, with positive X and Y axes extending to the right and down. Each dot on the screen (each pixel) is one increment on the scale.
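
If you would like to check the coordinate system without squinting at the screen, here is a small sketch that draws the same diagonal into an off-screen image and then asks which pixels were colored. The class PixelCheck, the method drawDiagonal and the use of BufferedImage in place of the applet window are my own choices for illustration; they are not part of FirstApplet.

```java
import java.awt.Color;
import java.awt.Graphics;
import java.awt.image.BufferedImage;

// Sketch: draw the diagonal into an off-screen image instead of an
// applet window, then look at individual pixels.
// PixelCheck and drawDiagonal are hypothetical names for this example.
class PixelCheck
{
    static BufferedImage drawDiagonal()
    {
        // 201 by 201 so that the end point (200, 200) is inside the image
        BufferedImage image = new BufferedImage(201, 201, BufferedImage.TYPE_INT_RGB);
        Graphics g = image.getGraphics();
        g.setColor(Color.white);
        g.fillRect(0, 0, 201, 201);     // white background
        g.setColor(Color.black);
        g.drawLine(0, 0, 200, 200);     // same call as in FirstApplet
        g.dispose();
        return image;
    }

    public static void main(String[] args)
    {
        BufferedImage image = drawDiagonal();
        int black = Color.black.getRGB();

        // (0, 0) is the upper left corner; X grows to the right, Y grows down
        System.out.println("(0, 0) is black:     " + (image.getRGB(0, 0) == black));
        System.out.println("(100, 100) is black: " + (image.getRGB(100, 100) == black));
        System.out.println("(200, 0) is black:   " + (image.getRGB(200, 0) == black));
    }
}
```

The point (200, 0) is the upper right corner, which the diagonal never touches, so it keeps the background color.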



Try experimenting by using different numbers for the four parameters. Change a number or two, save your changes, recompile with javac and rerun after each change in appletviewer, and see what you discover.

What other functions are available besides drawLine? You find this out by looking at the documentation for the Graphics class. When you installed the Java development kit and unpacked the documentation, one of the files unloaded in the process is called java.awt.Graphics.html, and it is on your machine. This is the file that explains the Graphics class. On my machine, the exact path to this file is D:\jdk1.1.7\docs\api\java.awt.Graphics.html. On your machine the path is likely to be slightly different, but close -- it depends on exactly where you installed things. Find the file and open it. Up toward the top of this file there is a section called "Method Index." This is a list of all of the methods this class supports. The drawLine method is one of them, but you can see many others. You can draw, among other things:

Lines
Arcs
Ovals
Polygons
Rectangles
Strings
Characters

Read about and try experimenting with some of these different methods to discover what is possible. For example, try this code:

g.drawLine(0, 0, 200, 200);
g.drawRect(0, 0, 200, 200);
g.drawLine(200, 0, 0, 200);

It will draw a box with two diagonals (be sure to pull the window big enough to see the whole thing). Try drawing other shapes. Read about and try changing the color with the setColor method. For example:


import java.awt.Graphics;
import java.awt.Color;

public class FirstApplet extends java.applet.Applet
{

    public void paint(Graphics g)
    {
        g.setColor(Color.red);
        g.fillRect(0, 0, 200, 200);
        g.setColor(Color.black);
        g.drawLine(0, 0, 200, 200);
        g.drawLine(200, 0, 0, 200);
    }
}

Note the addition of the new import line in the second line of the program. The output of this program is a red square with two black diagonal lines crossing it.



One thing that might be going through your head right now is, "How did he know to use Color.red rather than simply red, and how did he know to add the second import line?" You learn things like that by example. Because I just showed you an example of how to call the setColor method, you now know that whenever you want to change the color you will use Color. followed by a color name as a parameter to the setColor method, and you will add the appropriate import line at the top of the program. If you look up setColor, it has a link that will tell you about the Color class, and in it is a list of all the valid color names along with techniques for creating new (unnamed) colors. You read that information, you store it in your head and now you know how to change colors in Java. That is the essence of becoming a computer programmer -- you learn techniques and remember them for the next program you write. You learn the techniques either by reading an example (as you did here) or by reading through the documentation or by looking at example code (as in the demo directory). If you have a brain that likes exploring and learning and remembering things, then you will love programming!
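
As a small sketch of what the Color documentation describes, the following program compares a named color with one built from its red, green and blue parts, and then creates a new unnamed color. The class ColorNames, the method makeCustomColor and the values 255, 128, 0 are arbitrary picks for this example.

```java
import java.awt.Color;

// Sketch: named colors versus colors you build yourself.
// ColorNames and makeCustomColor are hypothetical names.
class ColorNames
{
    // Build an unnamed color from red, green and blue parts (each 0 to 255).
    static Color makeCustomColor()
    {
        return new Color(255, 128, 0);
    }

    public static void main(String[] args)
    {
        Color named = Color.red;
        Color built = new Color(255, 0, 0);

        // A named color is just a ready-made Color object
        System.out.println("Color.red parts: " + named.getRed() + ", "
            + named.getGreen() + ", " + named.getBlue());
        System.out.println("Equal to new Color(255, 0, 0): " + named.equals(built));

        Color custom = makeCustomColor();
        System.out.println("Custom color parts: " + custom.getRed() + ", "
            + custom.getGreen() + ", " + custom.getBlue());
    }
}
```

Inside a paint method you could pass either kind of color to setColor, for example g.setColor(new Color(255, 128, 0)).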

In this section, you have learned how to write linear, sequential code -- blocks of code that consist of method calls starting at the top and working toward the bottom (try drawing one of the lines before you draw the red rectangle and watch what happens -- it will be covered over by the rectangle and made invisible. The order of lines in the code sequence is important). Sequential lines of code form the basic core of any computer program. Experiment with all the different drawing methods and see what you can discover.


Bugs and Debugging

One thing that you are going to notice as you learn about programming is that you tend to make a fair number of mistakes and assumptions that cause your program to either: 1) not compile, or 2) produce output that you don't expect when it executes. These problems are referred to as bugs, and the act of removing them is called debugging. About half of the time of any programmer is spent debugging.

You will have plenty of time and opportunity to create your own bugs, but to get more familiar with the possibilities let's create a few. In your program, try erasing one of the semicolons at the end of a line and try compiling the program with javac. The compiler will give you an error message. This is called a compiler error, and you have to eliminate all of them before you can execute your program. Try misspelling a function name, leaving out a "{" or eliminating one of the import lines to get used to different compiler errors. The first time you see a certain type of compiler error it can be frustrating, but by experimenting like this -- with known errors that you create on purpose -- you can get familiar with many of the common errors.

A bug, also known as an execution (or run-time) error, occurs when the program compiles fine and runs, but then does not produce the output you planned on it producing. For example, this code produces a red rectangle with two diagonal lines across it:


g.setColor(Color.red);
g.fillRect(0, 0, 200, 200);
g.setColor(Color.black);
g.drawLine(0, 0, 200, 200);
g.drawLine(200, 0, 0, 200);

The following code, on the other hand, produces just the red rectangle (which covers over the two lines):


g.setColor(Color.black);
g.drawLine(0, 0, 200, 200);
g.drawLine(200, 0, 0, 200);
g.setColor(Color.red);
g.fillRect(0, 0, 200, 200);

The code is almost exactly the same but looks completely different when it executes. If you are expecting to see two diagonal lines, then the code in the second case contains a bug.
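
One way to convince yourself of the difference is to draw both versions into an off-screen image and look at a single pixel that lies on the diagonal. This is only a sketch: the class DrawOrder, the method centerPixel and the use of BufferedImage in place of the applet window are my own choices for illustration, and the if statement it uses has not been covered yet.

```java
import java.awt.Color;
import java.awt.Graphics;
import java.awt.image.BufferedImage;

// Sketch: the same drawing calls in two different orders.
// DrawOrder and centerPixel are hypothetical names for this example.
class DrawOrder
{
    // Returns the color of pixel (100, 100), which lies on the diagonal.
    static int centerPixel(boolean rectangleFirst)
    {
        BufferedImage image = new BufferedImage(201, 201, BufferedImage.TYPE_INT_RGB);
        Graphics g = image.getGraphics();
        if (rectangleFirst)
        {
            // Rectangle first, line second: the line ends up on top
            g.setColor(Color.red);
            g.fillRect(0, 0, 200, 200);
            g.setColor(Color.black);
            g.drawLine(0, 0, 200, 200);
        }
        else
        {
            // Line first, rectangle second: the rectangle covers the line
            g.setColor(Color.black);
            g.drawLine(0, 0, 200, 200);
            g.setColor(Color.red);
            g.fillRect(0, 0, 200, 200);
        }
        g.dispose();
        return image.getRGB(100, 100);
    }

    public static void main(String[] args)
    {
        int black = Color.black.getRGB();
        System.out.println("Rectangle first, line visible: " + (centerPixel(true) == black));
        System.out.println("Line first, line visible:      " + (centerPixel(false) == black));
    }
}
```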

Here's another example:


g.drawLine(0, 0, 200, 200);
g.drawRect(0, 0, 200, 200);
g.drawLine(200, 0, 0, 200);

This code produces a black outlined box and two diagonals. This next piece of code produces only one diagonal:


g.drawLine(0, 0, 200, 200);
g.drawRect(0, 0, 200, 200);
g.drawLine(0, 200, 0, 200);

Again, if you expected to see two diagonals, then the second piece of code contains a bug (look at the second piece of code until you understand what went wrong). This sort of bug can take a long time to find because it is subtle.

You will have plenty of time to practice finding your own bugs. The average programmer spends about half of his or her time tracking down, finding and eliminating bugs. Try not to get frustrated when they occur -- they are a normal part of programming life.

Variables

All programs use variables to hold pieces of data temporarily. For example, if at some point in a program you ask a user for a number, you will store it in a variable so that you can use it later.

Variables must be defined (or declared) in a program before you can use them, and you must give each variable a specific type. For example, you might declare one variable to have a type that allows it to hold numbers, and another variable to have a type that allows it to hold a person's name. (Because Java requires you to specifically define variables before you use them and state the type of value you plan to store in a variable, Java is called a strongly typed language. Certain languages don't have these requirements. In general, when creating large programs, strong typing tends to reduce the number of programming errors that you make.)


import java.awt.Graphics;
import java.awt.Color;

public class FirstApplet extends java.applet.Applet
{

    public void paint(Graphics g)
    {
        int width = 200;
        int height = 200;
        g.drawRect(0, 0, width, height);
        g.drawLine(0, 0, width, height);
        g.drawLine(width, 0, 0, height);
    }
}

In this program, we have declared two variables named width and height. We have declared their type to be int. An int variable can hold an integer (a whole number such as 1, 2, 3). We have initialized both variables to 200. We could just as easily have said:


int width;
width = 200;
int height;
height = 200;

The first form is simply a bit quicker to type.

The act of setting a variable to its first value is called initializing the variable. A common programming bug occurs when you forget to initialize a variable. To see that bug, try eliminating the initialization part of the code (the "= 200" part) and recompile the program to see what happens. What you will find is that the compiler complains about this problem. That's a very nice feature, by the way. It will save you lots of wasted time.
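
Here is a small sketch of the two ways to initialize a variable, along with the mistake the compiler catches. The class InitDemo and the variable depth are made up for this example. The two commented-out lines show the bug; if you remove the comment marks, javac should refuse to compile the file and report that the variable depth might not have been initialized.

```java
// Sketch: declaring and initializing variables.
// InitDemo, area and depth are hypothetical names for this example.
class InitDemo
{
    static int area()
    {
        int width = 200;    // declared and initialized in one step
        int height;         // declared only; it holds no value yet
        height = 200;       // the first assignment initializes it

        // int depth;
        // int bad = depth + 1;   // bug: depth is used before it has a value

        return width * height;
    }

    public static void main(String[] args)
    {
        System.out.println("Area: " + area());
    }
}
```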

There are two types of variables in Java -- simple (primitive) variables and classes.

The int type is simple. The variable can hold a number. That is all that it can do. You declare an int , set it to a value and use it. Classes, on the other hand, can contain multiple parts and have methods that make them easier to use. A good example of a straightforward class is the Rectangle class, so let's start with it.

One of the limitations of the program we have been working on so far is the fact that it assumes the window is 200 by 200 pixels. What if we wanted to ask the window, "How big are you?" and then size our rectangle and diagonals to fit? If you go back and look on the documentation page for the Graphics class (java.awt.Graphics.html -- the file that lists all the available drawing functions), you will see that one of the functions is called getClipBounds. Click on this function name to see the full description. This function accepts no parameters but instead returns a value of type Rectangle. The rectangle it returns contains the width and height of the available drawing area.

If you click on Rectangle in this documentation page you will be taken to the documentation page for the Rectangle class (java.awt.Rectangle.html). Looking in the Variable Index section at the top of the page, you find that this class contains four variables named x, y, width and height, respectively. What we want to do, therefore, is get the clip boundary rectangle using getClipBounds and then extract the width and height from that rectangle and save the values in the width and height variables we created in the previous example, like this:


import java.awt.Graphics;
import java.awt.Color;
import java.awt.Rectangle;

public class FirstApplet extends java.applet.Applet
{

    public void paint(Graphics g)
    {
        int width;
        int height;
        Rectangle r;

        r = g.getClipBounds();
        width = r.width - 1;
        height = r.height - 1;

        g.drawRect(0, 0, width, height);
        g.drawLine(0, 0, width, height);
        g.drawLine(width, 0, 0, height);
    }
}

When you run this example, what you will notice is that the rectangle and diagonals exactly fit the drawing area. Plus, when you change the size of the window, the rectangle and diagonals redraw themselves at the new size automatically. There are five new concepts introduced in this code, so let's look at them:

First, because we are using the Rectangle class we need to import java.awt.Rectangle on the third line of the program.

We have declared three variables in this program. Two (width and height) are of type int and one (r) is of type Rectangle.

We used the getClipBounds function to get the size of the drawing area. It accepts no parameters so we passed it none ("()"), but it returns a Rectangle. We wrote the line, "r = g.getClipBounds();" to say, "Please place the returned rectangle into the variable r."

The variable r, being of the class Rectangle, actually contains four variables -- x, y, width, and height (you learn these names by reading the documentation for the Rectangle class). To access them you use the "." (dot) operator. So the phrase "r.width" says, "Inside the variable r retrieve the value named width." That value is placed into our local variable called width. In the process, we subtracted 1. Try leaving the subtraction out and see what happens. Also try subtracting five instead and see what happens.

Finally, we used width and height in the drawing functions.

One question commonly asked at this point is, "Did we really need to declare variables named width and height?" The answer is, "No." We could have typed r.width - 1 directly into the drawing function. Creating the variables simply makes things a little easier to read, and it's therefore a good habit to fall into.
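
To see the dot operator and the reason for the subtraction on their own, here is a small sketch that builds a Rectangle by hand rather than asking a Graphics object for one. The class RectangleFields and the method lastColumn are made-up names, and the 200 by 200 rectangle is just a stand-in for whatever getClipBounds would return in the applet.

```java
import java.awt.Rectangle;

// Sketch: reading the parts of a Rectangle with the dot operator.
// RectangleFields and lastColumn are hypothetical names.
class RectangleFields
{
    // Largest x coordinate that is still inside an area of the given width.
    // An area 200 pixels wide has columns 0 through 199, not 0 through 200.
    static int lastColumn(Rectangle r)
    {
        return r.x + r.width - 1;
    }

    public static void main(String[] args)
    {
        Rectangle r = new Rectangle(0, 0, 200, 200);   // x, y, width, height

        int width = r.width - 1;
        int height = r.height - 1;

        System.out.println("The area is " + r.width + " by " + r.height + " pixels");
        System.out.println("Pixels run from 0 to " + width + " across and 0 to " + height + " down");
        System.out.println("Last column: " + lastColumn(r));
    }
}
```

The drawRect method places the right edge of the rectangle at x + width and the bottom edge at y + height, which is why the applet subtracts 1: without it, those two edges would land one pixel outside the drawing area.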

Java supports several simple variable types. Here are three of the most common:

int - integer (whole number) values (1, 2, 3...)
float - decimal values (3.14159, for example)
char - character values (a, b, c...)

You can perform math operations on simple types. Java understands + (addition), - (subtraction), * (multiplication), / (division) and several others. Here's an example of how you might use these operations in a program. Let's say that you want to calculate the volume of a sphere with a diameter of 10 feet. The following code would handle it:

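float diameter = 10;
float radius;
float volume;

// The f suffix marks a float value; a plain 2.0 is a double and will not fit in a float without a cast
radius = diameter / 2.0f;
volume = 4.0f / 3.0f * 3.14159f * radius * radius * radius;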

The first calculation says, "Divide the value in the variable named diameter by 2.0 and place the result in the variable named radius." You can see that the "=" sign here means, "Place the result from the calculation on the right into the variable named on the left."
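
As a runnable sketch of this calculation, the program below wraps the math in a method and prints the result. The class SphereVolume and the method volume are made-up names. Note the f after each decimal number: in Java a number like 2.0 is a double, and the f suffix marks it as a float so that it can be stored in a float variable. The last two lines show a related pitfall that is easy to miss: dividing two whole numbers throws away the fraction.

```java
// Sketch: the sphere volume calculation as a complete program.
// SphereVolume and volume are hypothetical names for this example.
class SphereVolume
{
    static float volume(float diameter)
    {
        float radius = diameter / 2.0f;
        return 4.0f / 3.0f * 3.14159f * radius * radius * radius;
    }

    public static void main(String[] args)
    {
        System.out.println("Volume: " + volume(10) + " cubic feet");

        // Pitfall: 4 / 3 uses whole numbers, so the fraction is dropped
        System.out.println("4 / 3 = " + (4 / 3));
        System.out.println("4.0f / 3.0f = " + (4.0f / 3.0f));
    }
}
```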

Looping

One of the things that computers do very well is perform repetitive calculations or operations. In the previous sections, we have seen how to write "sequential blocks of code," so the next thing we should discuss is the techniques for causing a sequential block of code to occur repeatedly.

For example, let's say that I ask you to draw a grid of evenly spaced horizontal and vertical lines.

A good place to start would be to draw the horizontal lines.

One way to draw the lines would be to create a sequential block of code:


import java.awt.Graphics;

public class FirstApplet extends java.applet.Applet
{

    public void paint(Graphics g)
    {
        int y;
        y = 10;
        g.drawLine(10, y, 210, y);
        y = y + 25;
        g.drawLine(10, y, 210, y);
        y = y + 25;
        g.drawLine(10, y, 210, y);
        y = y + 25;
        g.drawLine(10, y, 210, y);
        y = y + 25;
        g.drawLine(10, y, 210, y);
        y = y + 25;
        g.drawLine(10, y, 210, y);
        y = y + 25;
        g.drawLine(10, y, 210, y);
        y = y + 25;
        g.drawLine(10, y, 210, y);
        y = y + 25;
        g.drawLine(10, y, 210, y);
    }
}

(For some new programmers, the statement "y = y + 25;" looks odd the first time they see it. What it means is, "Take the value currently in the variable y, add 25 to it and place the result back into the variable y." So if y contains 10 before the line is executed, it will contain 35 immediately after the line is executed.)

Most people who look at this code immediately notice that it contains the same two lines repeated over and over. In this particular case the repetition is not so bad, but you can imagine that if you wanted to create a grid with thousands of rows and columns, this approach would make program-writing very tiring. The solution to this problem is a loop, as shown below:


import java.awt.Graphics;

public class FirstApplet extends java.applet.Applet
{

    public void paint(Graphics g)
    {
        int y;
        y = 10;
        while (y <= 210)
        {
            g.drawLine(10, y, 210, y);
            y = y + 25;
        }
    }
}

When you run this program, you will see that it draws nine horizontal lines 200 pixels long.

The while statement is a looping statement in Java. The statement tells Java to behave in the following way: At the while statement, Java looks at the expression in the parentheses and asks, "Is y less than or equal to 210?"

If the answer is yes, then Java enters the block of code bracketed by braces -- "{" and "}". The looping part occurs at the end of the block of code. When Java reaches the ending brace, it loops back up to the while statement and asks the question again. This looping sequence may occur many times.

If the answer is no, it skips over the code bracketed by braces and continues.

So you can see that when you run this program, initially y is 10. Ten is less than 210, so Java enters the block in braces, draws a line from (10,10) to (210, 10), sets y to 35 and then goes back up to the while statement. Thirty-five is less than 210, so Java enters the block in braces, draws a line from (10,35) to (210, 35), sets y to 60 and then goes back up to the while statement. This sequence repeats until y eventually gets to be greater than 210. Then the loop ends and the paint method is finished.
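
If you want to watch the loop take each step, here is a sketch that runs the same while statement but prints a line of text on each pass instead of drawing. The class LoopTrace, the method countLines and the count variable are my own additions for illustration.

```java
// Sketch: the same loop, but printing what it would draw on each pass.
// LoopTrace, countLines and count are hypothetical names.
class LoopTrace
{
    static int countLines()
    {
        int count = 0;
        int y = 10;
        while (y <= 210)
        {
            System.out.println("draw line from (10, " + y + ") to (210, " + y + ")");
            count = count + 1;   // one more pass through the loop
            y = y + 25;          // move down to the next line
        }
        return count;
    }

    public static void main(String[] args)
    {
        System.out.println("Lines drawn: " + countLines());
    }
}
```

The values of y it passes through are 10, 35, 60, 85, 110, 135, 160, 185 and 210, which accounts for the nine lines.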

We can complete our grid by adding a second loop to the program, like this:


import java.awt.Graphics;

public class FirstApplet extends java.applet.Applet
{

    public void paint(Graphics g)
    {
        int x, y;
        y = 10;
        while (y <= 210)
        {
            g.drawLine(10, y, 210, y);
            y = y + 25;
        }
        x = 10;
        while (x <= 210)
        {
            g.drawLine(x, 10, x, 210);
            x = x + 25;
        }
    }
}

You can see that a while statement has three parts:

There is an initialization step that sets y to 10.
Then there is an evaluation step inside the parentheses of the while statement.
Then, somewhere in the while statement there is an increment step that increases the value of y.

Java supports another way of doing the same thing that is a little more compact than a while statement. It is called a for statement. If you have a while statement that looks like this:


y = 10;
while (y <= 210)
{
    g.drawLine(10, y, 210, y);
    y = y + 25;
}

then the equivalent for statement looks like this:


for (y = 10; y <= 210; y = y + 25)
{
    g.drawLine(10, y, 210, y);
}

You can see that all the for statement does is condense the initialization, evaluation and incrementing lines into a short, single line. It simply shortens the programs you write, nothing more.
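
A quick way to check that the two forms really are equivalent is to have each one collect the y values it visits and then compare the results. This is a sketch under my own naming: the class ForVersusWhile and the methods withWhile and withFor are not part of the applet.

```java
// Sketch: a while loop and a for loop that visit the same y values.
// ForVersusWhile, withWhile and withFor are hypothetical names.
class ForVersusWhile
{
    static String withWhile()
    {
        String values = "";
        int y = 10;                   // initialization
        while (y <= 210)              // evaluation
        {
            values = values + y + " ";
            y = y + 25;               // increment
        }
        return values;
    }

    static String withFor()
    {
        String values = "";
        // initialization; evaluation; increment, all on one line
        for (int y = 10; y <= 210; y = y + 25)
        {
            values = values + y + " ";
        }
        return values;
    }

    public static void main(String[] args)
    {
        System.out.println("while: " + withWhile());
        System.out.println("for:   " + withFor());
        System.out.println("Same values: " + withWhile().equals(withFor()));
    }
}
```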

While we are here, two quick points about loops:

In many cases, it would be just as easy to initialize y to 210 and then decrement it by 25 each time through the loop. The evaluation would ask, "Is y greater than or equal to 10?" The choice is yours. Most people find it easier to add than subtract in their heads, but you might be different.

The increment step is very important. Let's say you were to accidentally leave out the part that says "y = y + 25;" inside the loop. What would happen is that the value of y would never change -- it would always be 10. So it would never become greater than 210 and the loop would continue forever (or until you stop it by turning off the computer or closing the window). This condition is called an infinite loop. It is a bug that is pretty common.
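
Here is a sketch of the first point: a loop that starts at 210 and counts down to 10. The class CountDown and the method yValues are made-up names. The comment inside the loop marks the spot where an infinite loop would come from if the last part of the for statement were left out.

```java
// Sketch: the same y values, visited from the bottom up.
// CountDown and yValues are hypothetical names for this example.
class CountDown
{
    static String yValues()
    {
        String values = "";
        // Start at 210, keep going while y is at least 10, subtract 25 each pass.
        // Without the "y = y - 25" step, y would stay at 210 forever.
        for (int y = 210; y >= 10; y = y - 25)
        {
            values = values + y + " ";
        }
        return values;
    }

    public static void main(String[] args)
    {
        System.out.println(yValues());
    }
}
```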

To get some practice with looping, try writing programs to draw the following figures:














Tuesday, May 19, 2009

How Perl Works

Introduction to How PERL Works

Perl is a fairly straightforward, widely known and well-respected scripting language. It is used for a variety of tasks (for example, you can use it to create the equivalent of DOS batch files or C shell scripts), but in the context of Web development it is used to develop CGI scripts.

One of the nice things about Perl is that, because it is a scripting language, people give away source code for their programs. This gives you the opportunity to learn Perl by example, and you can also download and modify thousands of Perl scripts for your own use. One of the bad things about Perl is that much of this free code is impossible to understand. Perl lends itself to an unbelievably cryptic style!

If you know the C programming language, this will be especially easy for you. Perl is easy to use once you know the basics. In this article, we're going to start at the beginning and show you how to do the most common programming tasks using Perl. By the end of this article, you will be able to write your own Perl scripts with relative ease, and read cryptic scripts written by others with somewhat less ease, but this will be a good starting point.

Getting Started

To start with Perl you need the Perl interpreter. On any UNIX machine there is a 99.99-percent probability that it's already there. On a Windows machine or a Mac, you need to download the latest release of the language and install it on your machine. (See the links at the end of this article for more information.) Perl is widely available on the Web and is free.

Next, make sure you look in the DOCS directory that comes with Perl -- there will be user-manual-type stuff in there. At some point, it would not hurt to read through all of the documentation, or at least scan it. Initially it will be cumbersome, but after reading this article it will make much more sense.

Hello World

Once you have Perl loaded, make sure you have your path properly set to include the Perl executable. Then, open a text editor and create a text file. In the file, place the following line:

print "Hello World!\n";

Name the file "test1.pl". At the command prompt, type:
perl test1.pl

Perl will run and execute the code in the text file. You should see the words "Hello World!" printed to stdout (standard out). As you can see, it is extremely easy to create and run programs in Perl. (If you are using UNIX, you can place a comment like #! /usr/bin/perl on the first line, and then you will not have to type the word "perl" at the command line.)

The print command prints things to stdout. The \n notation is a line feed. That would be more clear if you modified the test program to look like this (# denotes a comment):


# Print on two lines
print "Hello\nWorld!\n";

Note that the print command understood that it should interpret the "\n" as a line feed and not as the literal characters. The interpretation occurred not because of the print command, but because of the use of double quotes (a practice called quoting in Perl). If you were to use single quotes instead, as in:


print 'Hello\nWorld!\n';
the \n character would not be interpreted but instead would be used literally.

There is also the backquote character: `. A pair of backquotes means that the text between them is run as an operating system command, and the output of that command is returned to your script (here, it is handed to print). For example, on Windows NT you can say:


print `cmd /c dir`;
to run the DIR command and see a list of files from the current directory.

You will also see the / character used for quoting regular expressions.

The print command understands commas as separators. For example:


print 'hello', "\n", 'world!';
However, you will also see a period:


print 'hello'. "\n". 'world!';
The period is actually a string concatenation operator.

There is also a printf operator for C folks.

Variables

Variables are interesting in Perl. You do not declare them, and a simple (scalar) variable, one that holds a single number or string, always starts with a $. They come into existence at first use. For example:

$s = "Hello\nWorld\n";
$t = 'Hello\nWorld\n';
print $s, "\n", $t;
Or:


$i = 5;
$j = $i + 5;
print $i, "\t", $i + 1, "\t", $j; # \t = tab
Or:


$a = "Hello ";
$b = "World\n";
$c = $a . $b; # note use of . to concat strings
print $c;
Since . is string concatenation, .= has the expected meaning in the same way that "+=" does in C. Therefore, you can say:


$a = "Hello ";
$b = "World\n";
$a .= $b;
print $a;
You can also create arrays:


@a = ('cat', 'dog', 'eel');
print @a, "\n";
print $#a, "\n"; # The value of the highest index, zero based
print $a[0], "\n";
print $a[0], $a[1], $a[2], "\n";
The $# notation gets the highest index in the array, equivalent to the number of elements in the array minus 1. As in C, all arrays start indexing at zero.

You can also create hashes:


%h = ('dog', 'bark', 'cat', 'meow', 'eel', 'zap');
print "The dog says ", $h{'dog'};
Here, 'bark' is associated with the word 'dog', 'meow' with 'cat', and so on. A more expressive syntax for the same declaration is:


%h = (
dog => 'bark',
cat => 'meow',
eel => 'zap'
);
The => operator quotes the left string and acts as a comma.


Loops and Ifs

You can create a simple for loop like you do in C:

for ($i = 0; $i < 10; $i++)
{
print $i, "\n";
}

While statements are easy:


$i = 0;
while ( $i < 10 )
{
print $i, "\n";
$i++;
}
If statements are similarly easy:


for ($i = 0; $i < 10; $i++)
{
if ($i != 5)
{
print $i, "\n";
}
}

The boolean operators work like they do in C:

&& and
|| or
! not

For numbers:

== equal
!= not equal
<, <=, >, >= (as expected)

For strings:

eq equal
ne not equal
lt less than
le less than or equal
gt greater than
ge greater than or equal

If you have an array, you can loop through it easily with foreach:


@a = ('dog', 'cat', 'eel');
foreach $b (@a)
{
print $b, "\n";
}

Foreach takes each element of the array @a and places it in $b until @a is exhausted.

Functions

You create a subroutine with the word sub. All variables passed to the subroutine arrive in an array called @_ (its individual elements are $_[0], $_[1] and so on). Therefore, the following code works:

show ('cat', 'dog', 'eel');

sub show
{
for ($i = 0; $i <= $#_; $i++)
{
print $_[$i], "\n";
}
}

Remember that $# returns the highest index in the array (the number of elements minus 1), so $#_ is the number of parameters minus 1. If you like that sort of obtuseness, then you will love Perl.

You can declare local variables in a subroutine with the word local, as in:


sub xxx
{
local ($a, $b, $c);
...
}
You can also call a function using &, as in:


&show ('a', 'b', 'c');
The & symbol is required only when there is ambiguity, but some programmers use it all the time.

To return a value from a subroutine, use the keyword return.

Reading

Reading from STDIN
To read in data from the stdin (standard in), use the STDIN handle. For example:


print "Enter high number: ";
$i = <STDIN>;
for ($j = 0; $j <= $i; $j++)
{
print $j, "\n";
}

As long as you enter an integer number, this program will work as expected. <STDIN> reads a line at a time. You can also use getc to read one character, as in:


$i = getc(STDIN);
Or use read:


read(STDIN, $i, 1);
The 1 in the third parameter to the read command is the length of the input to read.

Reading Environment Variables
Perl defines a global hash named %ENV, and you can use it to retrieve the values of environment variables. For example:


print $ENV{'PATH'};

Perl Note
The name of the environment variable must be upper case.



Reading Command Line Arguments

Perl defines a global array @ARGV, which contains any command line arguments passed to the script. $#ARGV is the number of arguments passed minus 1, $ARGV[0] is the first argument passed, $ARGV[1] is the second, and so on.

You should now be able to read and write simple Perl scripts.

Sunday, May 17, 2009

How C Programming Works

Introduction to How C Programming Works

The C programming language is a popular and widely used programming language for creating computer programs. Programmers around the world embrace C because it gives maximum control and efficiency to the programmer.

If you are a programmer, or if you are interested in becoming a programmer, there are a couple of benefits you gain from learning C:

#You will be able to read and write code for a large number of platforms -- everything from microcontrollers to the most advanced scientific systems can be written in C, and many modern operating systems are written in C.

#The jump to the object oriented C++ language becomes much easier. C++ is an extension of C, and it is nearly impossible to learn C++ without learning C first.

What is C?

C is a computer programming language. That means that you can use C to create lists of instructions for a computer to follow. C is one of thousands of programming languages currently in use. C has been around for several decades and has won widespread acceptance because it gives programmers maximum control and efficiency. C is an easy language to learn. It is a bit more cryptic in its style than some other languages, but you get beyond that fairly quickly.



C is what is called a compiled language. This means that once you write your C program, you must run it through a C compiler to turn your program into an executable that the computer can run (execute). The C program is the human-readable form, while the executable that comes out of the compiler is the machine-readable and executable form. What this means is that to write and run a C program, you must have access to a C compiler. If you are using a UNIX machine (for example, if you are writing CGI scripts in C on your host's UNIX computer, or if you are a student working on a lab's UNIX machine), the C compiler is available for free. It is called either "cc" or "gcc" and is available on the command line. If you are a student, then the school will likely provide you with a compiler -- find out what the school is using and learn about it. If you are working at home on a Windows machine, you are going to need to download a free C compiler or purchase a commercial compiler. A widely used commercial compiler is Microsoft's Visual C++ environment (it compiles both C and C++ programs). Unfortunately, this program costs several hundred dollars. If you do not have hundreds of dollars to spend on a commercial compiler, then you can use one of the free compilers available on the Web.

We will start at the beginning with an extremely simple C program and build up from there. I will assume that you are using the UNIX command line and gcc as your environment for these examples; if you are not, all of the code will still work fine -- you will simply need to understand and use whatever compiler you have available.

The Simplest C Program

Let's start with the simplest possible C program and use it both to understand the basics of C and the C compilation process. Type the following program into a standard text editor (vi or emacs on UNIX, Notepad on Windows or TeachText on a Macintosh). Then save the program to a file named samp.c. If you leave off .c, you will probably get some sort of error when you compile it, so make sure you remember the .c. Also, make sure that your editor does not automatically append some extra characters (such as .txt) to the name of the file.

Here's the first program:

#include <stdio.h>

int main()
{
printf("This is output from my first program!\n");
return 0;
}

When executed, this program instructs the computer to print out the line "This is output from my first program!" -- then the program quits. You can't get much simpler than that!


To compile this code, take the following steps:

#On a UNIX machine, type gcc samp.c -o samp (if gcc does not work, try cc). This line invokes the C compiler called gcc, asks it to compile samp.c and asks it to place the executable file it creates under the name samp. To run the program, type samp (or, on some UNIX machines, ./samp).

#On a DOS or Windows machine using DJGPP, at an MS-DOS prompt type gcc samp.c -o samp.exe. This line invokes the C compiler called gcc, asks it to compile samp.c and asks it to place the executable file it creates under the name samp.exe. To run the program, type samp.

#If you are working with some other compiler or development system, read and follow the directions for the compiler you are using to compile and execute the program.

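You should see the output "This is output from my first program!" when you run the program. When you compiled the program, the compiler read the human-readable source file samp.c and produced the machine-readable executable file (samp or samp.exe).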




If you mistype the program, it either will not compile or it will not run. If the program does not compile or does not run correctly, edit it again and see where you went wrong in your typing. Fix the error and try again.

The Simplest C Program: What's Happening?

Let's walk through this program and start to see what the different lines are doing:

#This C program starts with #include <stdio.h>. This line includes the "standard I/O library" into your program. The standard I/O library lets you read input from the keyboard (called "standard in"), write output to the screen (called "standard out"), process text files stored on the disk, and so on. It is an extremely useful library. C has a large number of standard libraries like stdio, including string, time and math libraries. A library is simply a package of code that someone else has written to make your life easier (we'll discuss libraries a bit later).

#The line int main() declares the main function. Every C program must have a function named main somewhere in the code. We will learn more about functions shortly. At run time, program execution starts at the first line of the main function.

#In C, the { and } symbols mark the beginning and end of a block of code. In this case, the block of code making up the main function contains two lines.

#The printf statement in C allows you to send output to standard out (for us, the screen). The portion in quotes is called the format string and describes how the data is to be formatted when printed. The format string can contain string literals such as "This is output from my first program!", symbols for newlines (\n), and operators as placeholders for variables (see below). If you are using UNIX, you can type man 3 printf to get complete documentation for the printf function. If not, see the documentation included with your compiler for details about the printf function.

#The return 0; line causes the function to return an error code of 0 (no error) to the shell that started execution. More on this capability a bit later.

Variables

As a programmer, you will frequently want your program to "remember" a value. For example, if your program requests a value from the user, or if it calculates a value, you will want to remember it somewhere so you can use it later. The way your program remembers things is by using variables. For example:

int b;

This line says, "I want to create a space called b that is able to hold one integer value." A variable has a name (in this case, b) and a type (in this case, int, an integer). You can store a value in b by saying something like:


b = 5;

You can use the value in b by saying something like:

printf("%d", b);

In C, there are several standard types for variables:

#int - integer (whole number) values

#float - floating point values

#char - single character values (such as 'm' or 'Z')


Printf

The printf statement allows you to send output to standard out. For us, standard out is generally the screen (although you can redirect standard out into a text file or another command).

Here is another program that will help you learn more about printf:

#include <stdio.h>

int main()
{
int a, b, c;
a = 5;
b = 7;
c = a + b;
printf("%d + %d = %d\n", a, b, c);
return 0;
}

Type this program into a file and save it as add.c. Compile it with the line gcc add.c -o add and then run it by typing add (or ./add). You will see the line "5 + 7 = 12" as output.

Here is an explanation of the different lines in this program:

#The line int a, b, c; declares three integer variables named a, b and c. Integer variables hold whole numbers.

#The next line initializes the variable named a to the value 5.

#The next line sets b to 7.

#The next line adds a and b and "assigns" the result to c.
The computer adds the value in a (5) to the value in b (7) to form the result 12, and then places that new value (12) into the variable c. The variable c is assigned the value 12. For this reason, the = in this line is called "the assignment operator."

#The printf statement then prints the line "5 + 7 = 12." The %d symbols in the printf statement act as placeholders for values. There are three %d placeholders, and at the end of the printf line there are the three variable names: a, b and c. C matches up the first %d with a and substitutes 5 there. It matches the second %d with b and substitutes 7. It matches the third %d with c and substitutes 12. Then it prints the completed line to the screen: 5 + 7 = 12. The +, the = and the spacing are a part of the format line and get embedded automatically between the %d operators as specified by the programmer.

Printf: Reading User Values

The previous program is good, but it would be better if it read in the values 5 and 7 from the user instead of using constants. Try this program instead:


#include <stdio.h>

int main()
{
int a, b, c;
printf("Enter the first value:");
scanf("%d", &a);
printf("Enter the second value:");
scanf("%d", &b);
c = a + b;
printf("%d + %d = %d\n", a, b, c);
return 0;
}

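Here's how this program works when you execute it: it prints the first prompt and waits for you to type a number and press Enter, does the same for the second prompt, and then prints the sum in the same format as before.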

Make the changes, then compile and run the program to make sure it works. Note that scanf uses the same sort of format string as printf (type man scanf for more info). Also note the & in front of a and b. This is the address operator in C: It returns the address of the variable (this will not make sense until we discuss pointers). You must use the & operator in scanf on any variable of type char, int, or float, as well as structure types (which we will get to shortly). If you leave out the & operator, you will get an error when you run the program. Try it so that you can see what that sort of run-time error looks like.

Let's look at some variations to understand printf completely. Here is the simplest printf statement:


printf("Hello");

This call to printf has a format string that tells printf to send the word "Hello" to standard out. Contrast it with this:


printf("Hello\n");

The difference between the two is that the second version sends the word "Hello" followed by a newline to standard out.

The following line shows how to output the value of a variable using printf.


printf("%d", b);

The %d is a placeholder that will be replaced by the value of the variable b when the printf statement is executed. Often, you will want to embed the value within some other words. One way to accomplish that is like this:

printf("The temperature is ");
printf("%d", b);
printf(" degrees\n");

An easier way is to say this:


printf("The temperature is %d degrees\n", b);

You can also use multiple %d placeholders in one printf statement:


printf("%d + %d = %d\n", a, b, c);

In the printf statement, it is extremely important that the number of operators in the format string corresponds exactly with the number and type of the variables following it. For example, if the format string contains three %d operators, then it must be followed by exactly three parameters and they must have the same types in the same order as those specified by the operators.

You can print all of the normal C types with printf by using different placeholders:

#int (integer values) uses %d

#float (floating point values) uses %f

#char (single character values) uses %c

#character strings (arrays of characters, discussed later) use %s

You can learn more about the nuances of printf on a UNIX machine by typing man 3 printf. Any other C compiler you are using will probably come with a manual or a help file that contains a description of printf.

Scanf

The scanf function allows you to accept input from standard in, which for us is generally the keyboard. The scanf function can do a lot of different things, but it is generally unreliable unless used in the simplest ways. It is unreliable because it does not handle human errors very well. But for simple programs it is good enough and easy to use.

The simplest application of scanf looks like this:

scanf("%d", &b);

The program will read in an integer value that the user enters on the keyboard (%d is for integers, as in printf, so b must be declared as an int) and place that value into b.

The scanf function uses the same placeholders as printf:

int uses %d
float uses %f
char uses %c
character strings (discussed later) use %s

You MUST put & in front of the variable used in scanf. The reason why will become clear once you learn about pointers. It is easy to forget the & sign, and when you forget it your program will almost always crash when you run it.

In general, it is best to use scanf as shown here -- to read a single value from the keyboard. Use multiple calls to scanf to read multiple values. In any real program, you will use the fgets function instead to read text a line at a time (avoid the older gets function, which cannot limit how much it reads and can overrun the space you give it). Then you will "parse" the line to read its values. The reason that you do that is so you can detect errors in the input and handle them as you see fit.

The printf and scanf functions will take a bit of practice to be completely understood, but once mastered they are extremely useful.

Try This!

#Modify this program so that it accepts three values instead of two and adds all three together:

#include <stdio.h>

int main()
{
int a, b, c;
printf("Enter the first value:");
scanf("%d", &a);
printf("Enter the second value:");
scanf("%d", &b);
c = a + b;
printf("%d + %d = %d\n", a, b, c);
return 0;
}

#Try deleting or adding random characters or words in one of the previous programs and watch how the compiler reacts to these errors.
For example, delete the b variable in the first line of the above program and see what the compiler does when you forget to declare a variable. Delete a semicolon and see what happens. Leave out one of the braces. Remove one of the parentheses next to the main function. Make each error by itself and then run the program through the compiler to see what happens. By simulating errors like these, you can learn about different compiler errors, and that will make your typos easier to find when you make them for real.

C Errors to Avoid

#Using the wrong character case - Case matters in C, so you cannot type Printf or PRINTF. It must be printf.

#Forgetting to use the & in scanf

#Too many or too few parameters following the format statement in printf or scanf

#Forgetting to declare a variable name before using it


Branching and Looping

In C, both if statements and while loops rely on the idea of Boolean expressions. Here is a simple C program demonstrating an if statement:

#include <stdio.h>

int main()
{
int b;
printf("Enter a value:");
scanf("%d", &b);
if (b < 0)
printf("The value is negative\n");
return 0;
}

This program accepts a number from the user. It then tests the number using an if statement to see if it is less than 0. If it is, the program prints a message. Otherwise, the program is silent. The (b < 0) portion of the program is the Boolean expression. C evaluates this expression to decide whether or not to print the message. If the Boolean expression evaluates to True, then C executes the single line immediately following the if statement (or a block of lines within braces immediately following the if statement). If the Boolean expression is False, then C skips the line or block of lines immediately following the if statement.

Here's a slightly more complex example:


#include <stdio.h>

int main()
{
int b;
printf("Enter a value:");
scanf("%d", &b);
if (b < 0)
printf("The value is negative\n");
else if (b == 0)
printf("The value is zero\n");
else
printf("The value is positive\n");
return 0;
}

In this example, the else if and else sections evaluate for zero and positive values as well.

Here is a more complicated Boolean expression:


if ((x==y) && (j>k))
z=1;
else
q=10;

This statement says, "If the value in variable x equals the value in variable y, and if the value in variable j is greater than the value in variable k, then set the variable z to 1, otherwise set the variable q to 10." You will use if statements like this throughout your C programs to make decisions. In general, most of the decisions you make will be simple ones like the first example; but on occasion, things get more complicated.

Notice that C uses == to test for equality, while it uses = to assign a value to a variable. The && in C represents a Boolean AND operation.

Here are all of the Boolean operators in C:

equality ==
less than <
greater than >
less than or equal to <=
greater than or equal to >=
inequality !=
and &&
or ||
not !

You'll find that while statements are just as easy to use as if statements. For example:


while (a < b)
{
printf("%d\n", a);
a = a + 1;
}

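This causes the two lines within the braces to be executed repeatedly until a is greater than or equal to b. The while statement in general works like this: C evaluates the Boolean expression; if it is True, C executes the block and then evaluates the expression again, and if it is False, execution continues with the first statement after the block.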


C also provides a do-while structure:


do
{
printf("%d\n", a);
a = a + 1;
}
while (a < b);

The for loop in C is simply a shorthand way of expressing a while statement. For example, suppose you have the following code in C:


x=1;
while (x<10)
{
/* statements to repeat go here */
x++; /* x++ is the same as saying x=x+1 */
}

You can convert this into a for loop as follows:


for(x=1; x<10; x++)
{
/* statements to repeat go here */
}

Note that the while loop contains an initialization step (x=1), a test step (x<10), and an increment step (x++). The for loop lets you put all three parts onto one line, but you can put anything into those three parts. For example, suppose you have the following loop:


a=1;
b=6;
while (a < b)
{
a++;
printf("%d\n",a);
}

You can place this into a for statement as well:


for (a=1,b=6; a < b; a++,printf("%d\n",a));
It is slightly confusing, but it is possible. The comma operator lets you separate several different statements in the initialization and increment sections of the for loop (but not in the test section). Many C programmers like to pack a lot of information into a single line of C code; but a lot of people think it makes the code harder to understand, so they break it up.

= vs. == in Boolean expressions

The == sign is a problem in C because every now and then you may forget and type just = in a Boolean expression. This is an easy mistake to make, but to the compiler there is a very important difference. C will accept either = or == in a Boolean expression -- the behavior of the program changes remarkably between the two, however.
Boolean expressions evaluate to integers in C, and integers can be used inside of Boolean expressions. The integer value 0 in C is False, while any other integer value is True. The following is legal in C:


#include <stdio.h>

int main()
{
int a;

printf("Enter a number:");
scanf("%d", &a);
if (a)
{
printf("The value is True\n");
}
return 0;
}
If a is anything other than 0, the printf statement gets executed.

In C, a statement like if (a=b) means, "Assign b to a, and then test a for its Boolean value." So if a becomes 0, the if statement is False; otherwise, it is True. The value of a changes in the process. This is not the intended behavior if you meant to type == (although this feature is useful when used correctly), so be careful with your = and == usage.

Looping: A Real Example

Let's say that you would like to create a program that prints a Fahrenheit-to-Celsius conversion table. This is easily accomplished with a for loop or a while loop:

#include <stdio.h>

int main()
{
int a;
a = 0;
while (a <= 100)
{
printf("%4d degrees F = %4d degrees C\n",
a, (a - 32) * 5 / 9);
a = a + 10;
}
return 0;
}

If you run this program, it will produce a table of values starting at 0 degrees F and ending at 100 degrees F. The output will look like this:


   0 degrees F =  -17 degrees C
  10 degrees F =  -12 degrees C
  20 degrees F =   -6 degrees C
  30 degrees F =   -1 degrees C
  40 degrees F =    4 degrees C
  50 degrees F =   10 degrees C
  60 degrees F =   15 degrees C
  70 degrees F =   21 degrees C
  80 degrees F =   26 degrees C
  90 degrees F =   32 degrees C
 100 degrees F =   37 degrees C

The table's values are in increments of 10 degrees. You can see that you can easily change the starting, ending or increment values of the table that the program produces.

If you wanted your values to be more accurate, you could use floating point values instead:


#include <stdio.h>

int main()
{
float a;
a = 0;
while (a <= 100)
{
printf("%6.2f degrees F = %6.2f degrees C\n",
a, (a - 32.0) * 5.0 / 9.0);
a = a + 10;
}
return 0;
}

You can see that the declaration for a has been changed to a float, and the %f symbol replaces the %d symbol in the printf statement. In addition, the %f symbol has some formatting applied to it: %6.2f prints the value in a field at least six characters wide (counting the decimal point and any minus sign) with two digits following the decimal point.

Now let's say that we wanted to modify the program so that the temperature 98.6 is inserted in the table at the proper position. That is, we want the table to increment every 10 degrees, but we also want the table to include an extra line for 98.6 degrees F because that is the normal body temperature for a human being. The following program accomplishes the goal:


#include <stdio.h>

int main()
{
float a;
a = 0;
while (a <= 100)
{
if (a > 98.6)
{
printf("%6.2f degrees F = %6.2f degrees C\n",
98.6, (98.6 - 32.0) * 5.0 / 9.0);
}
printf("%6.2f degrees F = %6.2f degrees C\n",
a, (a - 32.0) * 5.0 / 9.0);
a = a + 10;
}
return 0;
}

This program works if the ending value is 100, but if you change the ending value to 200 you will find that the program has a bug. It prints the line for 98.6 degrees too many times. We can fix that problem in several different ways. Here is one way:


#include <stdio.h>

int main()
{
float a, b;
a = 0;
b = -1;
while (a <= 100)
{
if ((a > 98.6) && (b < 98.6))
{
printf("%6.2f degrees F = %6.2f degrees C\n",
98.6, (98.6 - 32.0) * 5.0 / 9.0);
}
printf("%6.2f degrees F = %6.2f degrees C\n",
a, (a - 32.0) * 5.0 / 9.0);
b = a;
a = a + 10;
}
return 0;
}


Arrays

In this section, we will create a small C program that generates 10 random numbers and sorts them. To do that, we will use a new variable arrangement called an array.


An array lets you declare and work with a collection of values of the same type. For example, you might want to create a collection of five integers. One way to do it would be to declare five integers directly:


int a, b, c, d, e;

This is okay, but what if you needed a thousand integers? An easier way is to declare an array of five integers:


int a[5];

The five separate integers inside this array are accessed by an index. All arrays start at index zero and go to n-1 in C. Thus, int a[5]; contains five elements. For example:


int a[5];

a[0] = 12;
a[1] = 9;
a[2] = 14;
a[3] = 5;
a[4] = 1;

One of the nice things about array indexing is that you can use a loop to manipulate the index. For example, the following code initializes all of the values in the array to 0:


int a[5];
int i;

for (i=0; i<5; i++)
a[i] = 0;

The following code initializes the values in the array sequentially and then prints them out:


#include <stdio.h>

int main()
{
int a[5];
int i;

for (i=0; i<5; i++)
a[i] = i;
for (i=0; i<5; i++)
printf("a[%d] = %d\n", i, a[i]);
return 0;
}

Arrays are used all the time in C. To understand a common usage, start an editor and enter the following code:


#include <stdio.h>

#define MAX 10

int a[MAX];
int rand_seed=10;

/* from K&R
- returns random number between 0 and 32767.*/
int rand()
{
rand_seed = rand_seed * 1103515245 +12345;
return (unsigned int)(rand_seed / 65536) % 32768;
}

int main()
{
int i,t,x,y;

/* fill array */
for (i=0; i < MAX; i++)
{
a[i]=rand();
printf("%d\n",a[i]);
}

/* more stuff will go here in a minute */

return 0;
}

This code contains several new concepts. The #define line declares a constant named MAX and sets it to 10. Constant names are traditionally written in all caps to make them obvious in the code. The line int a[MAX]; shows you how to declare an array of integers in C. Note that because of the position of the array's declaration, it is global to the entire program.

The line int rand_seed=10 also declares a global variable, this time named rand_seed, that is initialized to 10 each time the program begins. This value is the starting seed for the random number code that follows. In a real random number generator, the seed should initialize as a random value, such as the system time. Here, the rand function will produce the same values each time you run the program.

The line int rand() is a function declaration. The rand function accepts no parameters and returns an integer value. We will learn more about functions later. The four lines that follow implement the rand function. We will ignore them for now.

The main function is normal. Four local integers are declared, and the array is filled with 10 random values using a for loop. Note that the array a contains 10 individual integers. You point to a specific integer in the array using square brackets. So a[0] refers to the first integer in the array, a[1] refers to the second, and so on. The line starting with /* and ending with */ is called a comment. The compiler completely ignores the line. You can place notes to yourself or other programmers in comments.

Now add the following code in place of the more stuff ... comment:


/* bubble sort the array */

for (x=0; x < MAX-1; x++)
for (y=0; y < MAX-x-1; y++)
if (a[y] > a[y+1])
{
t=a[y];
a[y]=a[y+1];
a[y+1]=t;
}
/* print sorted array */
printf("--------------------\n");
for (i=0; i < MAX; i++)
printf("%d\n",a[i]);

This code sorts the random values and prints them in sorted order. Each time you run it, you will get the same values. If you would like to change the values that are sorted, change the value of rand_seed each time you run the program.

The only easy way to truly understand what this code is doing is to execute it "by hand." That is, assume MAX is 4 to make it a little more manageable, take out a sheet of paper and pretend you are the computer. Draw the array on your paper and put four random, unsorted values into the array. Execute each line of the sorting section of the code and draw out exactly what happens. You will find that, each time through the inner loop, the larger values in the array are pushed toward the bottom of the array and the smaller values bubble up toward the top.
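If you would like to check your paper trace against the computer, the sketch below may help. It is the same bubble sort, written as a function that prints the array after every pass of the outer loop. The function names print_array and bubble_sort_trace, and the idea of counting swaps, are additions made for this illustration only.

```c
#include <stdio.h>

/* Print the n values of a on one line. */
void print_array(const int a[], int n)
{
    int i;

    for (i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
}

/* Bubble sort that prints the array after each pass of the outer loop.
   Returns the number of swaps performed. */
int bubble_sort_trace(int a[], int n)
{
    int x, y, t;
    int swaps = 0;

    for (x = 0; x < n - 1; x++)
    {
        for (y = 0; y < n - x - 1; y++)
            if (a[y] > a[y + 1])
            {
                t = a[y];
                a[y] = a[y + 1];
                a[y + 1] = t;
                swaps++;
            }
        print_array(a, n);   /* state of the array after this pass */
    }
    return swaps;
}
```

With an array holding 9, 4, 7, 1, the first pass leaves 4 7 1 9: the largest value has been pushed to the end, which is what you should see on paper as well.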

More on Arrays

Variable Types

There are three standard variable types in C:

Integer: int
Floating point: float
Character: char


An int is typically a 4-byte integer value (the exact size depends on the compiler and the machine). A float is typically a 4-byte floating point value. A char is a 1-byte single character (like "a" or "3"). A string is declared as an array of characters.

There are a number of derivative types:

double (8-byte floating point value)
short (2-byte integer)
unsigned short or unsigned int (non-negative integers, no sign bit)
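Because the sizes of these types can differ from one compiler to another, it can be useful to ask the compiler directly. The sketch below uses the sizeof operator for that; the function name print_type_sizes is hypothetical.

```c
#include <stdio.h>

/* Print the size in bytes of each basic type on the current machine. */
void print_type_sizes(void)
{
    printf("char:   %lu\n", (unsigned long)sizeof(char));
    printf("short:  %lu\n", (unsigned long)sizeof(short));
    printf("int:    %lu\n", (unsigned long)sizeof(int));
    printf("float:  %lu\n", (unsigned long)sizeof(float));
    printf("double: %lu\n", (unsigned long)sizeof(double));
}
```

Only the size of char is fixed at 1; the other numbers printed depend on your system.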


Operators and Operator Precedence

The operators in C are similar to the operators in most languages:

+ - addition
- - subtraction
/ - division
* - multiplication
% - mod

The / operator performs integer division if both operands are integers, and performs floating point division otherwise. For example:


#include <stdio.h>

int main()
{
    float a;
    a=10/3;
    printf("%f\n",a);
    return 0;
}

This code prints out a floating point value since a is declared as type float, but a will be 3.0 because the code performed an integer division.

Operator precedence in C is also similar to that in most other languages. Division and multiplication occur first, then addition and subtraction. The result of the calculation 5+3*4 is 17, not 32, because the * operator has higher precedence than + in C. You can use parentheses to change the normal precedence ordering: (5+3)*4 is 32. The 5+3 is evaluated first because it is in parentheses. We'll get into precedence later -- it becomes somewhat complicated in C once pointers are introduced.

Typecasting

C allows you to perform type conversions on the fly. You do this especially often when using pointers. Typecasting also occurs during the assignment operation for certain types. For example, in the code above, the integer value was automatically converted to a float.

You do typecasting in C by placing the type name in parentheses and putting it in front of the value you want to change. Thus, in the above code, replacing the line a=10/3; with a=(float)10/3; produces 3.333333 as the result because 10 is converted to a floating point value before the division.
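The difference between the two divisions can be sketched as a pair of small functions. The names int_quotient and real_quotient are hypothetical and exist only for this comparison.

```c
/* Integer division: both operands are ints, so the fraction is
   discarded before the result is converted to float. */
float int_quotient(int a, int b)
{
    return (float)(a / b);
}

/* Casting one operand to float first forces floating point division. */
float real_quotient(int a, int b)
{
    return (float)a / b;
}
```

Note where the cast sits: in int_quotient it is applied to the finished integer result, so it is too late to recover the fraction.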

Typedef

You declare named, user-defined types in C with the typedef statement. The following example shows a type that appears often in C code:


#define TRUE 1
#define FALSE 0
typedef int boolean;

void main()
{
boolean b;

b=FALSE;
/* ... the rest of the program goes here ... */
}

This code allows you to declare Boolean types in C programs.

If you do not like the word "float" for real numbers, you can say:


typedef float real;
and then later say:


real r1,r2,r3;
You can place typedef statements anywhere in a C program as long as they come prior to their first use in the code.

Structures

Structures in C allow you to group variables into a package. Here's an example:


struct rec
{
int a,b,c;
float d,e,f;
};

struct rec r;

As shown here, whenever you want to declare structures of the type rec, you have to say struct rec. The struct keyword is very easy to forget, and leaving it out absent-mindedly produces many compiler errors. You can compress the code into the form:


struct rec
{
int a,b,c;
float d,e,f;
} r;

where the type declaration for rec and the variable r are declared in the same statement. Or you can create a typedef statement for the structure name. For example, if you do not like saying struct rec r every time you want to declare a record, you can say:


typedef struct rec rec_type;
and then declare records of type rec_type by saying:


rec_type r;
You access fields of a structure using a period, for example, r.a=5;.
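Putting the pieces together, a short sketch might look like the following. It reuses the rec structure and the rec_type typedef from above; the helper functions make_rec and sum_ints are hypothetical and are there only to show fields being written and read with the period.

```c
struct rec
{
    int a, b, c;
    float d, e, f;
};

typedef struct rec rec_type;

/* Hypothetical helper: build a record with the given integer fields
   and the float fields set to zero. */
rec_type make_rec(int a, int b, int c)
{
    rec_type r;

    r.a = a;
    r.b = b;
    r.c = c;
    r.d = 0.0f;
    r.e = 0.0f;
    r.f = 0.0f;
    return r;
}

/* Hypothetical helper: add up the three integer fields of a record. */
int sum_ints(rec_type r)
{
    return r.a + r.b + r.c;
}
```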

Arrays

You declare arrays by inserting an array size after a normal declaration, as shown below:

int a[10]; /* array of integers */
char s[100]; /* array of characters
(a C string) */
float f[20]; /* array of reals */
struct rec r[50]; /* array of records */

Incrementing

Long Way      Short Way
i=i+1;        i++;
i=i-1;        i--;
i=i+3;        i += 3;
i=i*j;        i *= j;

Functions

Most languages allow you to create functions of some sort. Functions let you chop up a long program into named sections so that the sections can be reused throughout the program. Functions accept parameters and return a result. C functions can accept an unlimited number of parameters. In general, C does not care in what order you put your functions in the program, so long as the function name is known to the compiler before it is called.

We have already talked a little about functions. The rand function seen previously is about as simple as a function can get. It accepts no parameters and returns an integer result:


int rand()
/* from K&R
- produces a random number between 0 and 32767.*/
{
rand_seed = rand_seed * 1103515245 +12345;
return (unsigned int)(rand_seed / 65536) % 32768;
}

The int rand() line declares the function rand to the rest of the program and specifies that rand will accept no parameters and return an integer result. This function has no local variables, but if it needed locals, they would go right below the opening { (C allows you to declare variables after any { -- they exist until the program reaches the matching } and then they disappear. A function's local variables therefore vanish as soon as the matching } is reached in the function. While they exist, local variables live on the system stack.) Note that there is no ; after the () in the first line. If you accidentally put one in, you will get a huge cascade of error messages from the compiler that make no sense. Also note that even though there are no parameters, you must use the (). They tell the compiler that you are declaring a function rather than simply declaring an int.

The return statement is important to any function that returns a result. It specifies the value that the function will return and causes the function to exit immediately. This means that you can place multiple return statements in the function to give it multiple exit points. If you do not place a return statement in a function that is declared to return a value, the function returns when it reaches } and the value it hands back is undefined (many compilers will warn you if you fail to return a specific value). In C, a function can return values of any type: int, float, char, struct, etc.
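As a small illustration of multiple exit points, consider the sketch below. The function sign is a hypothetical example, not part of the programs in this article.

```c
/* Hypothetical example: returns -1, 0 or 1 depending on the sign of x. */
int sign(int x)
{
    if (x < 0)
        return -1;   /* first exit point */
    if (x > 0)
        return 1;    /* second exit point */
    return 0;        /* reached only when x is zero */
}
```

Each return ends the function at once, so the final line runs only when neither test succeeded.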

There are several correct ways to call the rand function. For example: x=rand();. The variable x is assigned the value returned by rand in this statement. Note that you must use () in the function call, even though no parameter is passed. Otherwise, x is given the memory address of the rand function, which is generally not what you intended.

You might also call rand this way:


if (rand() > 100)
Or this way:


rand();

In the latter case, the function is called but the value returned by rand is discarded. You may never want to do this with rand, but many functions return some kind of error code through the function name, and if you are not concerned with the error code (for example, because you know that an error is impossible) you can discard it in this way.

Functions can use a void return type if you intend to return nothing. For example:


void print_header()
{
printf("Program Number 1\n");
printf("by Marshall Brain\n");
printf("Version 1.0, released 12/26/91\n");
}

This function returns no value. You can call it with the following statement:


print_header();

You must include () in the call. If you do not, the function is not called, even though it will compile correctly on many systems.

C functions can accept parameters of any type. For example:


int fact(int i)
{
int j,k;

j=1;
for (k=2; k<=i; k++)
j=j*k;
return j;
}

returns the factorial of i, which is passed in as an integer parameter. Separate multiple parameters with commas:


int add (int i, int j)
{
return i+j;
}

C has evolved over the years. You will sometimes see functions such as add written in the "old style," as shown below:


int add(i,j)
int i;
int j;
{
return i+j;
}

It is important to be able to read code written in the older style. There is no difference in the way it executes; it is just a different notation. You should use the "new style" (known as ANSI C), with the type declared as part of the parameter list, unless you know you will be shipping the code to someone who has access only to an "old style" (non-ANSI) compiler.

Functions: Function Prototypes

It is now considered good form to use function prototypes for all functions in your program. A prototype declares the function name, its parameters, and its return type to the rest of the program prior to the function's actual declaration. To understand why function prototypes are useful, enter the following code and run it:

#include <stdio.h>

void main()
{
printf("%d\n",add(3));
}

int add(int i, int j)
{
return i+j;
}

This code compiles on many compilers without giving you a warning, even though add expects two parameters but receives only one. It works because many C compilers do not check for parameter matching either in type or count. You can waste an enormous amount of time debugging code in which you are simply passing one too many or too few parameters by mistake. The above code compiles properly, but it produces the wrong answer.

To solve this problem, C lets you place function prototypes at the beginning of (actually, anywhere in) a program. If you do so, C checks the types and counts of all parameter lists. Try compiling the following:


#include <stdio.h>

int add (int,int); /* function prototype for add */

void main()
{
printf("%d\n",add(3));
}

int add(int i, int j)
{
return i+j;
}

The prototype causes the compiler to flag an error on the printf statement.

Place one prototype for each function at the beginning of your program. They can save you a great deal of debugging time, and they also solve the problem you get when you compile with functions that you use before they are declared. For example, the following code will not compile:


#include <stdio.h>

void main()
{
printf("%d\n",add(3));
}

float add(int i, int j)
{
return i+j;
}

Why, you might ask, will it compile when add returns an int but not when it returns a float? Because older C compilers default to an int return value. Using a prototype will solve this problem. "Old style" (non-ANSI) compilers allow prototypes, but the parameter list for the prototype must be empty. Old style compilers do no error checking on parameter lists.
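A minimal sketch of the fix follows. The prototype tells the compiler, ahead of time, that add returns a float, so a caller that appears before the definition is compiled correctly. The function half_sum is a hypothetical caller added for this illustration.

```c
/* Prototype: tells the compiler that add returns a float and
   takes two ints, before add is actually defined. */
float add(int i, int j);

/* Hypothetical caller placed ahead of the definition of add. */
float half_sum(int i, int j)
{
    return add(i, j) / 2;
}

float add(int i, int j)
{
    return i + j;
}
```

Without the first line, an older compiler would assume at the call inside half_sum that add returns an int, which conflicts with the float definition that follows.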

Libraries

Libraries are very important in C because the C language supports only the most basic features that it needs. C does not even contain I/O functions to read from the keyboard and write to the screen. Anything that extends beyond the basic language must be written by a programmer. The resulting chunks of code are often placed in libraries to make them easily reusable. We have seen the standard I/O, or stdio, library already. Standard libraries exist for standard I/O, math functions, string handling, time manipulation, and so on. You can use libraries in your own programs to split up your programs into modules. This makes them easier to understand, test, and debug, and also makes it possible to reuse code from other programs that you write.

You can create your own libraries easily. As an example, we will take some code from a previous article in this series and make a library out of two of its functions. Here's the code we will start with:


#include <stdio.h>

#define MAX 10

int a[MAX];
int rand_seed=10;

int rand()
/* from K&R
- produces a random number between 0 and 32767.*/
{
rand_seed = rand_seed * 1103515245 +12345;
return (unsigned int)(rand_seed / 65536) % 32768;
}

void main()
{
int i,t,x,y;

/* fill array */
for (i=0; i < MAX; i++)
{
a[i]=rand();
printf("%d\n",a[i]);
}

/* bubble sort the array */
for (x=0; x < MAX-1; x++)
for (y=0; y < MAX-x-1; y++)
if (a[y] > a[y+1])
{
t=a[y];
a[y]=a[y+1];
a[y+1]=t;
}

/* print sorted array */
printf("--------------------\n");
for (i=0; i < MAX; i++)
printf("%d\n",a[i]);
}

This code fills an array with random numbers, sorts them using a bubble sort, and then displays the sorted list.

Take the bubble sort code, and use what you learned in the previous article to make a function from it. Since both the array a and the constant MAX are known globally, the function you create needs no parameters, nor does it need to return a result. However, you should use local variables for x, y, and t.

Once you have tested the function to make sure it is working, pass in the number of elements as a parameter rather than using MAX:


#include <stdio.h>

#define MAX 10

int a[MAX];
int rand_seed=10;

/* from K&R
- returns random number between 0 and 32767.*/
int rand()
{
rand_seed = rand_seed * 1103515245 +12345;
return (unsigned int)(rand_seed / 65536) % 32768;
}

void bubble_sort(int m)
{
int x,y,t;
for (x=0; x < m-1; x++)
for (y=0; y < m-x-1; y++)
if (a[y] > a[y+1])
{
t=a[y];
a[y]=a[y+1];
a[y+1]=t;
}
}

void main()
{
int i;
/* fill array */
for (i=0; i < MAX; i++)
{
a[i]=rand();
printf("%d\n",a[i]);
}
bubble_sort(MAX);
/* print sorted array */
printf("--------------------\n");
for (i=0; i < MAX; i++)
printf("%d\n",a[i]);
}

You can also generalize the bubble_sort function even more by passing in a as a parameter:


void bubble_sort(int m, int a[])

This line says, "Accept the integer array a of any size as a parameter." Nothing in the body of the bubble_sort function needs to change. To call bubble_sort, change the call to:


bubble_sort(MAX, a);

Note that &a has not been used in the function call even though the sort will change a. The reason for this will become clear once you understand pointers.


Making a Library

Since the rand and bubble_sort functions in the previous program are useful, you will probably want to reuse them in other programs you write. You can put them into a utility library to make their reuse easier.

Every library consists of two parts: a header file and the actual code file. The header file, normally denoted by a .h suffix, contains information about the library that programs using it need to know. In general, the header file contains constants and types, along with prototypes for functions available in the library. Enter the following header file and save it to a file named util.h.


/* util.h */
extern int rand();
extern void bubble_sort(int, int []);

These two lines are function prototypes. The word "extern" in C represents functions that will be linked in later. If you are using an old-style compiler, remove the parameters from the parameter list of bubble_sort.

Enter the following code into a file named util.c.


/* util.c */
#include "util.h"

int rand_seed=10;

/* from K&R
- produces a random number between 0 and 32767.*/
int rand()
{
rand_seed = rand_seed * 1103515245 +12345;
return (unsigned int)(rand_seed / 65536) % 32768;
}

void bubble_sort(int m,int a[])
{
int x,y,t;
for (x=0; x < m-1; x++)
for (y=0; y < m-x-1; y++)
if (a[y] > a[y+1])
{
t=a[y];
a[y]=a[y+1];
a[y+1]=t;
}
}

Note that the file includes its own header file (util.h) and that it uses quotes instead of the symbols < and >, which are used only for system libraries. As you can see, this looks like normal C code. Note that the variable rand_seed, because it is not in the header file, cannot be seen or modified by a program using this library. This is called information hiding. Adding the word static in front of int enforces the hiding completely.

Enter the following main program in a file named main.c.


#include <stdio.h>
#include "util.h"

#define MAX 10

int a[MAX];

void main()
{
int i;
/* fill array */
for (i=0; i < MAX; i++)
{
a[i]=rand();
printf("%d\n",a[i]);
}

bubble_sort(MAX,a);

/* print sorted array */
printf("--------------------\n");
for (i=0; i < MAX; i++)
printf("%d\n",a[i]);
}

This code includes the utility library. The main benefit of using a library is that the code in the main program is much shorter.

Compiling and Running with a Library

To compile the library, type the following at the command line (this assumes you are using UNIX; replace gcc with cc if your system uses cc):


gcc -c -g util.c

The -c causes the compiler to produce an object file for the library. The object file contains the library's machine code. It cannot be executed until it is linked to a program file that contains a main function. The machine code resides in a separate file named util.o.

To compile the main program, type the following:


gcc -c -g main.c
This line creates a file named main.o that contains the machine code for the main program. To create the final executable that contains the machine code for the entire program, link the two object files by typing the following:


gcc -o main main.o util.o
This links main.o and util.o to form an executable named main. To run it, type main (or ./main if the current directory is not in your search path).

Makefiles make working with libraries a bit easier.

Makefiles

It can be cumbersome to type all of the gcc lines over and over again, especially if you are making a lot of changes to the code and it has several libraries. The make facility solves this problem. You can use the following makefile to replace the compilation sequence above:

main: main.o util.o
	gcc -o main main.o util.o
main.o: main.c util.h
	gcc -c -g main.c
util.o: util.c util.h
	gcc -c -g util.c

Enter this into a file named makefile, and type make to build the executable. Note that you must precede all gcc lines with a tab. (Eight spaces will not suffice -- it must be a tab. All other lines must be flush left.)

This makefile contains two types of lines. The lines appearing flush left are dependency lines. The lines preceded by a tab are executable lines, which can contain any valid UNIX command. A dependency line says that some file is dependent on some other set of files. For example, main.o: main.c util.h says that the file main.o is dependent on the files main.c and util.h. If either of these two files changes, the following executable line(s) should be executed to recreate main.o.

Note that the final executable produced by the whole makefile is main, on line 1 in the makefile. The final result of the makefile should always go on line 1, which in this makefile says that the file main is dependent on main.o and util.o. If either of these changes, execute the line gcc -o main main.o util.o to recreate main.

It is possible to put multiple lines to be executed below a dependency line -- they must all start with a tab. A large program may have several libraries and a main program. The makefile automatically recompiles everything that needs to be recompiled because of a change.

If you are not working on a UNIX machine, your compiler almost certainly has functionality equivalent to makefiles. Read the documentation for your compiler to learn how to use it.

Now you understand why you have been including stdio.h in earlier programs. It is simply a standard library that someone created long ago and made available to other programmers to make their lives easier.


Text Files

Text files in C are straightforward and easy to understand. All text file functions and types in C come from the stdio library.

When you need text I/O in a C program, and you need only one source for input information and one sink for output information, you can rely on stdin (standard in) and stdout (standard out). You can then use input and output redirection at the command line to move different information streams through the program. There are six different I/O commands in <stdio.h> that you can use with stdin and stdout:

printf - prints formatted output to stdout
scanf - reads formatted input from stdin
puts - prints a string to stdout
gets - reads a string from stdin
putchar - prints a character to stdout (putc does the same for any stream)
getchar - reads a character from stdin (getc does the same for any stream)

The advantage of stdin and stdout is that they are easy to use. Likewise, the ability to redirect I/O is very powerful. For example, maybe you want to create a program that reads from stdin and counts the number of characters:

#include <stdio.h>
#include <string.h>

void main()
{
char s[1000];
int count=0;
while (gets(s))
count += strlen(s);
printf("%d\n",count);
}

Enter this code and run it. It waits for input from stdin, so type a few lines. When you are done, press CTRL-D to signal end-of-file (eof). The gets function reads a line until it detects eof, then returns a 0 so that the while loop ends. When you press CTRL-D, you see a count of the number of characters in stdout (the screen). (Use man gets or your compiler's documentation to learn more about the gets function.)
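One caution about gets: it has no way of knowing how large the array s is, so a very long input line can write past the end of the array. For that reason gets was removed from the C language in the 2011 standard, and newer compilers may refuse to compile it. The sketch below shows one possible replacement that uses fgets instead. The function name count_chars is hypothetical, and the newline is subtracted by hand because fgets, unlike gets, keeps it in the string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the characters read from in,
   not counting the newline at the end of each line. */
long count_chars(FILE *in)
{
    char s[1000];
    long count = 0;
    size_t len;

    while (fgets(s, sizeof(s), in) != NULL)
    {
        len = strlen(s);
        if (len > 0 && s[len - 1] == '\n')
            len--;              /* gets would have dropped this newline */
        count += (long)len;
    }
    return count;
}
```

Calling count_chars(stdin) from main and printing the result gives a program that behaves like the one above, without the risk of overflowing s.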

Now, suppose you want to count the characters in a file. If you compiled the program to an executable named xxx, you can type the following:


xxx < filename
Instead of accepting input from the keyboard, the contents of the file named filename will be used instead. You can achieve the same result using pipes:


cat < filename | xxx
You can also redirect the output to a file:


xxx < filename > out

This command places the character count produced by the program in a text file named out.

Sometimes, you need to use a text file directly. For example, you might need to open a specific file and read from or write to it. You might want to manage several streams of input or output or create a program like a text editor that can save and recall data or configuration files on command. In that case, use the text file functions in stdio:

fopen - opens a text file
fclose - closes a text file
feof - detects end-of-file marker in a file
fprintf - prints formatted output to a file
fscanf - reads formatted input from a file
fputs - prints a string to a file
fgets - reads a string from a file
fputc - prints a character to a file
fgetc - reads a character from a file


Text Files: Opening

You use fopen to open a file. It opens a file for a specified mode (the three most common are r, w, and a, for read, write, and append). It then returns a file pointer that you use to access the file. For example, suppose you want to open a file and write the numbers 1 to 10 in it. You could use the following code:

#include <stdio.h>
#define MAX 10

int main()
{
FILE *f;
int x;
f=fopen("out","w");
if (!f)
return 1;
for(x=1; x<=MAX; x++)
fprintf(f,"%d\n",x);
fclose(f);
return 0;
}

The fopen statement here opens a file named out with the w mode. This is a destructive write mode, which means that if out does not exist it is created, but if it does exist it is destroyed and a new file is created in its place. The fopen command returns a pointer to the file, which is stored in the variable f. This variable is used to refer to the file. If the file cannot be opened for some reason, f will contain NULL.

The fprintf statement should look very familiar: It is just like printf but uses the file pointer as its first parameter. The fclose statement closes the file when you are done.

Text Files: Reading

To read a file, open it with r mode. In general, it is not a good idea to use fscanf for reading: Unless the file is perfectly formatted, fscanf will not handle it correctly. Instead, use fgets to read in each line and then parse out the pieces you need.
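As a rough sketch of the "parse out the pieces" step, the function below takes a line that has already been read with fgets and pulls two integers out of it with sscanf, which works like scanf but reads from a string. The function name parse_pair and the "two integers per line" format are assumptions made for this example. The * and & notation is explained in the sections on pointers.

```c
#include <stdio.h>

/* Hypothetical helper: expects a line such as "12 34" and stores the
   two integers in *x and *y. Returns 1 on success, or 0 if the line
   does not contain two integers. */
int parse_pair(const char *line, int *x, int *y)
{
    return sscanf(line, "%d %d", x, y) == 2;
}
```

Because the whole line is already in memory, a badly formatted line simply makes parse_pair return 0, and the program can skip it or report it and move on to the next line.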

The following code demonstrates the process of reading a file and dumping its contents to the screen:


#include <stdio.h>

int main()
{
FILE *f;
char s[1000];

f=fopen("infile","r");
if (!f)
return 1;
while (fgets(s,1000,f)!=NULL)
printf("%s",s);
fclose(f);
return 0;
}

The fgets statement returns a NULL value at the end-of-file marker. It reads a line (up to 999 characters in this case, which leaves room for the terminating null character) and the printf statement then prints it to stdout. Notice that the printf statement does not include \n in the format string, because fgets keeps the \n at the end of each line it reads. Thus, you can tell that a line is not complete, because it overflowed the maximum line length specified in the second parameter to fgets, when the string does not end in \n.
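The newline check can be sketched as a small wrapper around fgets. The function name read_line and its return codes are hypothetical choices for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper built on fgets. Reads one line into buf and
   removes the trailing newline if there is one.
   Returns  1 if a complete line was read,
            0 if the line was cut off (or the file ended without \n),
           -1 at end-of-file. */
int read_line(FILE *f, char *buf, int size)
{
    size_t len;

    if (fgets(buf, size, f) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
    {
        buf[len - 1] = '\0';
        return 1;
    }
    return 0;
}
```

When read_line returns 0, the rest of the long line is still waiting in the file, and the next call picks up where the previous one stopped.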

Pointers

Pointers are used everywhere in C, so if you want to use the C language fully you have to have a very good understanding of pointers. They have to become comfortable for you. The goal of this section and the next several that follow is to help you build a complete understanding of pointers and how C uses them. For most people it takes a little time and some practice to become fully comfortable with pointers, but once you master them you are a full-fledged C programmer.

C uses pointers in three different ways:

C uses pointers to create dynamic data structures -- data structures built up from blocks of memory allocated from the heap at run-time.

C uses pointers to handle variable parameters passed to functions.

Pointers in C provide an alternative way to access information stored in arrays. Pointer techniques are especially valuable when you work with strings. There is an intimate link between arrays and pointers in C.

In some cases, C programmers also use pointers because they make the code slightly more efficient. What you will find is that, once you are completely comfortable with pointers, you tend to use them all the time.

We will start this discussion with a basic introduction to pointers and the concepts surrounding pointers, and then move on to the three techniques described above. Especially with this article, you will want to read things twice. The first time through you can learn all the concepts. The second time through you can work on binding the concepts together into an integrated whole in your mind. After you make your way through the material the second time, it will make a lot of sense.

Pointers: Why?

Imagine that you would like to create a text editor -- a program that lets you edit normal ASCII text files, like "vi" on UNIX or "Notepad" on Windows. A text editor is a fairly common thing for someone to create because, if you think about it, a text editor is probably a programmer's most commonly used piece of software. The text editor is a programmer's intimate link to the computer -- it is where you enter all of your thoughts and then manipulate them. Obviously, with anything you use that often and work with that closely, you want it to be just right. Therefore many programmers create their own editors and customize them to suit their individual working styles and preferences.

So one day you sit down to begin working on your editor. After thinking about the features you want, you begin to think about the "data structure" for your editor. That is, you begin thinking about how you will store the document you are editing in memory so that you can manipulate it in your program. What you need is a way to store the information you are entering in a form that can be manipulated quickly and easily. You believe that one way to do that is to organize the data on the basis of lines of characters. Given what we have discussed so far, the only thing you have at your disposal at this point is an array. You think, "Well, a typical line is 80 characters long, and a typical file is no more than 1,000 lines long." You therefore declare a two-dimensional array, like this:


char doc[1000][80];

This declaration requests an array of 1,000 80-character lines. This array has a total size of 80,000 characters.

As you think about your editor and its data structure some more, however, you might realize three things:

Some documents are long lists. Every line is short, but there are thousands of lines.

Some special-purpose text files have very long lines. For example, a certain data file might have lines containing 542 characters, with each character representing the amino acid pairs in segments of DNA.

In most modern editors, you can open multiple files at one time.

Let's say you set a maximum of 10 open files at once, a maximum line length of 1,000 characters and a maximum file size of 50,000 lines. Your declaration now looks like this:

char doc[50000][1000][10];

That doesn't seem like an unreasonable thing, until you pull out your calculator, multiply 50,000 by 1,000 by 10 and realize the array contains 500 million characters! Most computers today are going to have a problem with an array that size. They simply do not have the RAM, or even the virtual memory space, to support an array that large. If users were to try to run three or four copies of this program simultaneously on even the largest multi-user system, it would put a severe strain on the facilities.

Even if the computer would accept a request for such a large array, you can see that it is an extravagant waste of space. It seems strange to declare a 500 million character array when, in the vast majority of cases, you will run this editor to look at 100 line files that consume at most 4,000 or 5,000 bytes. The problem with an array is the fact that you have to declare it to have its maximum size in every dimension from the beginning. Those maximum sizes often multiply together to form very large numbers. Also, if you happen to need to be able to edit an odd file with a 2,000 character line in it, you are out of luck. There is really no way for you to predict and handle the maximum line length of a text file, because, technically, that number is infinite.

Pointers are designed to solve this problem. With pointers, you can create dynamic data structures. Instead of declaring your worst-case memory consumption up-front in an array, you instead allocate memory from the heap while the program is running. That way you can use the exact amount of memory a document needs, with no waste. In addition, when you close a document you can return the memory to the heap so that other parts of the program can use it. With pointers, memory can be recycled while the program is running.
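As a brief preview, and only a sketch, here is what "use the exact amount of memory a line needs" could look like. It relies on the standard library functions malloc and free from stdlib.h, which this article has not introduced yet, and the function name copy_line is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of line that uses exactly
   strlen(line) + 1 bytes, or NULL if no memory is available.
   The caller returns the memory to the heap with free(). */
char *copy_line(const char *line)
{
    char *copy = malloc(strlen(line) + 1);   /* +1 for the '\0' */

    if (copy != NULL)
        strcpy(copy, line);
    return copy;
}
```

A 5-character line consumes 6 bytes and a 2,000-character line consumes 2,001, so neither a fixed maximum nor the waste that comes with it is needed.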

By the way, if you read the previous discussion and one of the big questions you have is, "What IS a byte, really?," then the article How Bits and Bytes Work will help you understand the concepts, as well as things like "mega," "giga" and "tera." Go take a look and then come back.


Pointer Basics

To understand pointers, it helps to compare them to normal variables.

A "normal variable" is a location in memory that can hold a value. For example, when you declare a variable i as an integer, four bytes of memory are set aside for it. In your program, you refer to that location in memory by the name i. At the machine level that location has a memory address. The four bytes at that address are known to you, the programmer, as i, and the four bytes can hold one integer value.

A pointer is different. A pointer is a variable that points to another variable. This means that a pointer holds the memory address of another variable. Put another way, the pointer does not hold a value in the traditional sense; instead, it holds the address of another variable. A pointer "points to" that other variable by holding a copy of its address.

Because a pointer holds an address rather than a value, it has two parts. The pointer itself holds the address. That address points to a value. There is the pointer and the value pointed to. This fact can be a little confusing until you get comfortable with it, but once you get comfortable it becomes extremely powerful.

The following example code shows a typical pointer:


#include <stdio.h>

int main()
{
    int i,j;
    int *p;   /* a pointer to an integer */
    p = &i;
    *p=5;
    j=i;
    printf("%d %d %d\n", i, j, *p);
    return 0;
}

The first declaration in this program declares two normal integer variables named i and j. The line int *p declares a pointer named p. This line asks the compiler to declare a variable p that is a pointer to an integer. The * indicates that a pointer is being declared rather than a normal variable. You can create a pointer to anything: a float, a structure, a char, and so on. Just use a * to indicate that you want a pointer rather than a normal variable.

The line p = &i; will definitely be new to you. In C, & is called the address operator. The expression &i means, "The memory address of the variable i." Thus, the expression p = &i; means, "Assign to p the address of i." Once you execute this statement, p "points to" i. Before you do so, p contains a random, unknown address, and its use will likely cause a segmentation fault or similar program crash.

One good way to visualize what is happening is to draw a picture. After i, j and p are declared, the world looks like this:

In this drawing the three variables i, j and p have been declared, but none of the three has been initialized. The two integer variables are therefore drawn as boxes containing question marks -- they could contain any value at this point in the program's execution. The pointer is drawn as a circle to distinguish it from a normal variable that holds a value, and the random arrows indicate that it can be pointing anywhere at this moment.

After the line p = &i;, p is initialized and it points to i, like this:

Once p points to i, the memory location i has two names. It is still known as i, but now it is known as *p as well. This is how C talks about the two parts of a pointer variable: p is the location holding the address, while *p is the location pointed to by that address. Therefore *p=5 means that the location pointed to by p should be set to 5, like this:

Because the location *p is also i, i also takes on the value 5. Consequently, j=i; sets j to 5, and the printf statement produces 5 5 5.

The main feature of a pointer is its two-part nature. The pointer itself holds an address. The pointer also points to a value of a specific type - the value at the address the pointer holds. The pointer itself, in this case, is p. The value pointed to is *p.

Pointers: Understanding Memory Addresses

The previous discussion becomes a little clearer if you understand how memory addresses work in a computer's hardware. If you have not read it already, now would be a good time to read How Bits and Bytes Work to fully understand bits, bytes and words.

All computers have memory, also known as RAM (random access memory). For example, your computer might have 16 or 32 or 64 megabytes of RAM installed right now. RAM holds the programs that your computer is currently running along with the data they are currently manipulating (their variables and data structures). Memory can be thought of simply as an array of bytes. In this array, every memory location has its own address -- the address of the first byte is 0, followed by 1, 2, 3, and so on. Memory addresses act just like the indexes of a normal array. The computer can access any address in memory at any time (hence the name "random access memory"). It can also group bytes together as needed to form larger variables, arrays, and structures. For example, a floating point variable consumes 4 contiguous bytes in memory. You might make the following global declaration in a program:


float f;
This statement says, "Declare a location named f that can hold one floating point value." When the program runs, the computer reserves space for the variable f somewhere in memory. That location has a fixed address in the memory space, like this:



While you think of the variable f, the computer thinks of a specific address in memory (for example, 248,440). Therefore, when you create a statement like this:


f = 3.14;
The compiler might translate that into, "Load the value 3.14 into memory location 248,440." The computer is always thinking of memory in terms of addresses and values at those addresses.

There are, by the way, several interesting side effects to the way your computer treats memory. For example, say that you include the following code in one of your programs:


int i, s[4], t[4], u=0;

for (i=0; i<=4; i++)
{
    s[i] = i;
    t[i] = i;
}
printf("s:t\n");
for (i=0; i<=4; i++)
    printf("%d:%d\n", s[i], t[i]);
printf("u = %d\n", u);
The output that you see from the program will probably look like this:


s:t
0:4
1:1
2:2
3:3
4:4
u = 4

Why are t[0] and u incorrect? If you look carefully at the code, you can see that the for loops are writing one element past the end of each array. In memory, the arrays are placed adjacent to one another, as shown here:



Therefore, when you try to write to s[4], which does not exist, the system writes into t[0] instead because t[0] is where s[4] ought to be. When you write into t[4], you are really writing into u. As far as the computer is concerned, s[4] is simply an address, and it can write into it. As you can see however, even though the computer executes the program, it is not correct or valid. The program corrupts the array t in the process of running. If you execute the following statement, more severe consequences result:


s[1000000] = 5;
The location s[1000000] is more than likely outside of your program's memory space. In other words, you are writing into memory that your program does not own. On a system with protected memory spaces (UNIX, Windows 98/NT), this sort of statement will cause the system to terminate execution of the program. On other systems (Windows 3.1, the Mac), however, the system is not aware of what you are doing. You end up damaging the code or variables in another application. The effect of the violation can range from nothing at all to a complete system crash. In memory, i, s, t and u are all placed next to one another at specific addresses. Therefore, if you write past the boundaries of a variable, the computer will do what you say but it will end up corrupting another memory location.

Because C and C++ do not perform any sort of range checking when you access an element of an array, it is essential that you, as a programmer, pay careful attention to array ranges yourself and keep within the array's appropriate boundaries. Unintentionally reading or writing outside of array boundaries always leads to faulty program behavior.

As another example, try the following:


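#include <stdio.h>

int main()
{
    int i,j;
    int *p;   /* a pointer to an integer */
    /* %d expects an int, so each address is cast to unsigned long and
       printed with %lu. Reading p before it is initialized is done here
       only to show that it holds garbage. */
    printf("%lu %lu\n", (unsigned long) p, (unsigned long) &i);
    p = &i;
    printf("%lu %lu\n", (unsigned long) p, (unsigned long) &i);
    return 0;
}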

This code tells the compiler to print out the address held in p, along with the address of i. The variable p starts off with some crazy value or with 0. The address of i is generally a large value. For example, when I ran this code, I received the following output:


0 2147478276
2147478276 2147478276
which means that the address of i is 2147478276. Once the statement p = &i; has been executed, p contains the address of i. Try this as well:


#include <stdio.h>

int main()
{
    int *p;   /* a pointer to an integer */

    printf("%d\n",*p);   /* p was never initialized: expect a crash */
    return 0;
}

This code tells the compiler to print the value that p points to. However, p has not been initialized yet; it contains the address 0 or some random address. In most cases, a segmentation fault (or some other run-time error) results, which means that you have used a pointer that points to an invalid area of memory. Almost always, an uninitialized pointer or a bad pointer address is the cause of segmentation faults.

Having said all of this, we can now look at pointers in a whole new light. Take this program, for example:


#include <stdio.h>

int main()
{
    int i;
    int *p;   /* a pointer to an integer */
    p = &i;
    *p=5;
    printf("%d %d\n", i, *p);
    return 0;
}

Here is what's happening:




The variable i consumes 4 bytes of memory. The pointer p also consumes 4 bytes: on most machines in use today a pointer occupies 4 bytes because memory addresses are 32 bits long on most CPUs, although there is an increasing trend toward 64-bit addressing, where a pointer occupies 8 bytes. The location of i has a specific address, in this case 248,440. The pointer p holds that address once you say p = &i;. The expressions *p and i are therefore equivalent.

The pointer p literally holds the address of i. When you say something like this in a program:


printf("%p", (void *) p);
what comes out is the actual address of the variable i.

Pointers: Pointing to the Same Address

Here is a cool aspect of C: Any number of pointers can point to the same address. For example, you could declare p, q, and r as integer pointers and set all of them to point to i, as shown here:

int i;
int *p, *q, *r;

p = &i;
q = &i;
r = p;

Note that in this code, r points to the same thing that p points to, which is i. You can assign pointers to one another, and the address is copied from the right-hand side to the left-hand side during the assignment. After executing the above code, this is how things would look:




The variable i now has four names: i, *p, *q and *r. There is no limit on the number of pointers that can hold (and therefore point to) the same address.

Pointers: Common Bugs

Bug #1 - Uninitialized pointers
One of the easiest ways to create a pointer bug is to try to reference the value of a pointer even though the pointer is uninitialized and does not yet point to a valid address. For example:


int *p;

*p = 12;

The pointer p is uninitialized and points to a random location in memory when you declare it. It could be pointing into the system stack, or the global variables, or into the program's code space, or into the operating system. When you say *p=12;, the program will simply try to write a 12 to whatever random location p points to. The program may explode immediately, or may wait half an hour and then explode, or it may subtly corrupt data in another part of your program and you may never realize it. This can make this error very hard to track down. Make sure you initialize all pointers to a valid address before dereferencing them.

Bug #2 - Invalid Pointer References
An invalid pointer reference occurs when a pointer's value is referenced even though the pointer doesn't point to a valid block.

One way to create this error is to say p=q;, when q is uninitialized. The pointer p will then become uninitialized as well, and any reference to *p is an invalid pointer reference.

The only way to avoid this bug is to draw pictures of each step of the program and make sure that all pointers point somewhere. Invalid pointer references cause a program to crash inexplicably for the same reasons given in Bug #1.

Bug #3 - Zero Pointer Reference
A zero pointer reference occurs whenever a pointer pointing to zero is used in a statement that attempts to reference a block. For example, if p is a pointer to an integer, the following code is invalid:


p = 0;
*p = 12;

There is no block pointed to by p. Therefore, trying to read or write anything from or to that block is an invalid zero pointer reference. There are good, valid reasons to point a pointer to zero, as we will see in later articles. Dereferencing such a pointer, however, is invalid.

All of these bugs are fatal to a program that contains them. You must watch your code so that these bugs do not occur. The best way to do that is to draw pictures of the code's execution step by step.

Using Pointers for Function Parameters

Most C programmers first use pointers to implement something called variable parameters in functions. You have actually been using variable parameters in the scanf function -- that's why you've had to use the & (the address operator) on variables used with scanf. Now that you understand pointers you can see what has really been going on.

To understand how variable parameters work, let's see how we might go about implementing a swap function in C. To implement a swap function, what you would like to do is pass in two variables and have the function swap their values. Here's one attempt at an implementation -- enter and execute the following code and see what happens:


#include <stdio.h>

void swap(int i, int j)
{
    int t;

    t=i;
    i=j;
    j=t;
}

int main()
{
    int a,b;

    a=5;
    b=10;
    printf("%d %d\n", a, b);
    swap(a,b);
    printf("%d %d\n", a, b);
    return 0;
}

When you execute this program, you will find that no swapping takes place. C passes the values of a and b to swap, so i and j are copies. The swap function does swap those copies, but they are discarded when the function returns, and a and b are left unchanged.

To make this function work correctly you can use pointers, as shown below:


#include <stdio.h>

void swap(int *i, int *j)
{
    int t;
    t = *i;
    *i = *j;
    *j = t;
}

int main()
{
    int a,b;
    a=5;
    b=10;
    printf("%d %d\n",a,b);
    swap(&a,&b);
    printf("%d %d\n",a,b);
    return 0;
}

To get an idea of what this code does, print it out, draw the two integers a and b, and enter 5 and 10 in them. Now draw the two pointers i and j, along with the integer t. When swap is called, it is passed the addresses of a and b. Thus, i points to a (draw an arrow from i to a) and j points to b (draw another arrow from j to b). Once the pointers are initialized by the function call, *i is another name for a, and *j is another name for b. Now run the code in swap. When the code uses *i and *j, it really means a and b. When the function completes, a and b have been swapped.



Suppose you accidentally forget the & when the swap function is called, so that the swap line looks like this: swap(a, b);. Most compilers will warn about this line, and if you run the program anyway it will typically fail with a segmentation fault. When you leave out the &, the value of a is passed instead of its address. Therefore, i holds the number 5 as if it were an address, which is an invalid location in memory, and the program crashes when *i is used.

This is also why scanf can crash if you forget the & on variables passed to it. The scanf function uses pointers to put the value it reads back into the variable you have passed. Without the &, scanf is passed a bad address and will usually crash.

Variable parameters are one of the most common uses of pointers in C. Now you understand what's happening!

Dynamic Data Structures

Dynamic data structures are data structures that grow and shrink as you need them to by allocating and deallocating memory from a place called the heap. They are extremely important in C because they allow the programmer to exactly control memory consumption.

Dynamic data structures allocate blocks of memory from the heap as required, and link those blocks together into some kind of data structure using pointers. When the data structure no longer needs a block of memory, it will return the block to the heap for reuse. This recycling makes very efficient use of memory.