C Tutorial: avoid mallocs and string copies

Avoid excessive string copying

Much of the work of reading user input or configuration files involves parsing that data to extract strings of interest. In these cases, it is tempting to allocate new buffers to hold these strings and copy the appropriate characters into them.

This is a perfectly valid approach, but it is not always necessary to allocate space and copy strings. If all the data that you need is in the original string, the buffer is writable (an array, not a string literal), and you will not overwrite it before you make use of the data, you can simply terminate the strings you need in place (store a 0 at the end of each one) and keep track of pointers to the strings instead of copying the data.
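As a minimal sketch of the idea, assume a writable buffer that holds a single key=value pair. The helper below (split_pair is a hypothetical name, not part of the original tutorial) overwrites the '=' with a 0 and hands back two pointers into the same buffer, so nothing is allocated or copied.

```c
#include <string.h>

/*
 * Hypothetical helper: split "key=value" in place.
 * The first '=' in line is overwritten with 0, so line must be
 * writable (a char array, not a string literal).
 * Returns 0 on success, -1 if line contains no '='.
 */
int split_pair(char *line, char **key, char **value)
{
    char *eq = strchr(line, '=');   /* find the separator */

    if (eq == NULL)
        return -1;
    *eq = 0;                /* terminate the key where the '=' was */
    *key = line;            /* key starts at the front of the buffer */
    *value = eq + 1;        /* value starts right after the new terminator */
    return 0;
}
```

Both pointers stay valid only as long as the original buffer is alive and unmodified, which is the trade-off for skipping malloc and free.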

The advantages of doing this are better performance and fewer bugs: better performance because there is less copying and less memory management, and fewer bugs because there is no allocated memory that you can forget to free.

Example

Below is a simple example that shows how we can parse a string that contains tokens separated with / characters into those individual tokens.

This particular example could have been done with the strtok library function, but the same technique, with a bit of extra code, can handle things that strtok cannot, such as scanning through quoted strings or handling escape characters. For more complex tasks, you're better off implementing a state machine or using a formal lexical analyzer.
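To illustrate the "bit of extra code" for quoting and escapes, here is one possible sketch. The function name split_quoted and its rules are assumptions made for this illustration: double quotes protect separators inside a token, and a backslash makes the next character literal. Because dropping quotes and backslashes only ever shortens a token, the result can be written back into the same buffer behind the read position.

```c
#include <stddef.h>

/*
 * Hypothetical helper: split buf in place on sep (sep must not be 0).
 * Inside double quotes, sep is an ordinary character. A backslash
 * makes the next character literal. Quotes and escaping backslashes
 * are removed by compacting the token within the same buffer.
 * Returns the number of tokens stored in item[] (at most max).
 */
size_t split_quoted(char *buf, char sep, char **item, size_t max)
{
    char *r = buf;      /* read position */
    char *w = buf;      /* write position; never passes unread input */
    size_t n = 0;

    while (n < max) {
        int quoted = 0;

        while (*r == sep)       /* skip separators between tokens */
            r++;
        if (*r == 0)
            break;

        item[n++] = w;          /* token begins at the write position */
        while (*r && (quoted || *r != sep)) {
            if (*r == '"') {    /* toggle quoting and drop the quote */
                quoted = !quoted;
                r++;
            } else {
                if (*r == '\\' && r[1] != 0)
                    r++;        /* drop the backslash, keep what follows */
                *w++ = *r++;
            }
        }
        if (*r)                 /* step past the separator first ... */
            r++;
        *w++ = 0;               /* ... then terminate the token */
    }
    return n;
}
```

With the buffer `abc/"d/e"/f\/g` and '/' as the separator, this yields the three tokens abc, d/e and f/g. An unterminated quote simply runs to the end of the buffer; a real parser would probably want to report that as an error.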

/*
	parse tokens without allocating buffers and copying strings

	For the parsing used here, you can use strtok() to accomplish
	the same thing. For other types of parsing, strtok() may not
	be sufficient.

	Paul Krzyzanowski
*/

#include <stdio.h>

#define MAXTOKENS 256

int
main(int argc, char **argv)
{
	char name[] = "//abc/def////ghi/jkl";	/* test string */
	char *item[MAXTOKENS];	/* this holds our array of tokens */
	char separator = '/';
	char *s;	/* where we're scanning */
	int i=0;	/* current item */

	for (s=name; *s == separator; s++)
		;	/* skip initial separators */

	item[i] = s;	/* first token */
	for (; *s; s++) {
		if (*s == separator) {	/* end of item */
			*s = 0;	/* mark end of string */
			while (*(s+1) == separator)
				s++;	/* skip all separators */
			item[++i] = s+1;
		}
	}

	/* print our list of tokens */
	int j;
	for (j=0; j <= i; j++)
		printf("item %d: \"%s\"\n", j, item[j]);
}
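The program above is tuned to its test string: it never checks i against MAXTOKENS, and an input that ends in a separator (or is empty) produces an empty token. A possible reusable variant, sketched under the assumption that empty tokens should be dropped, is shown below. The name split_in_place and its signature are hypothetical and not part of the original program.

```c
#include <stddef.h>

/*
 * Hypothetical helper: split s in place on sep (sep must not be 0).
 * Runs of separators count as one, leading and trailing separators
 * are ignored, and at most max token pointers are stored in item[].
 * Returns the number of tokens found.
 */
size_t split_in_place(char *s, char sep, char **item, size_t max)
{
    size_t n = 0;

    while (n < max) {
        while (*s == sep)           /* skip separators before a token */
            s++;
        if (*s == 0)                /* nothing left but the terminator */
            break;

        item[n++] = s;              /* token starts here */
        while (*s && *s != sep)     /* scan to the end of the token */
            s++;
        if (*s == 0)                /* last token: already terminated */
            break;
        *s++ = 0;                   /* overwrite the separator with 0 */
    }
    return n;
}
```

Called on a writable copy of "//abc/def////ghi/jkl" with '/' as the separator, it returns 4 and leaves item[0] through item[3] pointing at abc, def, ghi and jkl inside the caller's buffer.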

Save the program above as nomalloc.c.

Compile this program via:

gcc -o nomalloc nomalloc.c

If you don't have gcc, you may need to replace the gcc command with cc or the name of your compiler.

Run the program:

./nomalloc
