Strings & Character Arrays
Why Strings Matter
Many programs spend much of their time processing text: reading configuration files, parsing JSON from web APIs, handling user input, formatting log messages, and building command-line output. In higher-level languages like Python or JavaScript, strings "just work." In C, you build them yourself out of raw character arrays. This is both a burden (you must manage memory manually) and a superpower (you control every byte and can avoid copies that higher-level languages often make behind your back).
String handling is also one of the most common sources of security vulnerabilities in C programs. Buffer overflows have enabled some of the most famous exploits in computing history: the Morris Worm (1988), Code Red (2001), and Slammer (2003) all exploited buffer overflows in C code that handled input unsafely. Learning to handle strings safely isn't just about writing correct code; it's about writing secure code.
1. C Strings Are Just Character Arrays with a Sentinel
C has no built-in string type. Instead, a "string" is simply a contiguous sequence of char values in memory, terminated by a special sentinel byte: the null character, written as '\0', with ASCII value 0.
Memory layout of "Hello" (6 bytes, not 5!):
Index: [0] [1] [2] [3] [4] [5]
Char: 'H' 'e' 'l' 'l' 'o' '\0'
Hex: 0x48 0x65 0x6C 0x6C 0x6F 0x00
The null terminator is not a letter you see; it's the invisible stop sign that tells every string function where the text ends. Without it, printf("%s", str) would keep reading past the end of the array, printing garbage until it happens to hit a zero byte or the program crashes.
This design has profound implications:
- A string of N visible characters always requires N+1 bytes of storage.
- The null terminator is the string's "length" marker—there is no separate length field.
- To find the length of a string, you must scan every character until you find '\0'. This is an O(n) operation.
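The scan described in the last bullet can be sketched as follows. This is only an illustration of the idea, not how any particular C library implements strlen, and the name my_strlen is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: counts the characters before the '\0' terminator,
   which is the same contract as the standard strlen. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {   /* one step per character, so the cost is O(n) */
        n++;
    }
    return n;
}
```

For char greeting[] = "Hello";, my_strlen(greeting) returns 5 while sizeof(greeting) is 6, which is the N versus N+1 difference from the first bullet.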
2. Three Ways to Declare Strings
// Method 1: String literal (compiler auto-sizes and null-terminates)
char greeting[] = "Hello"; // allocates 6 bytes: H e l l o \0
// Method 2: Character array with explicit initializer
char name[] = {'J', 'o', 'h', 'n', '\0'}; // you MUST include '\0' yourself
// Method 3: Pointer to a string literal (READ-ONLY!)
char *msg = "Hello"; // msg points to read-only memory; DO NOT modify!
Critical distinction: char greeting[] = "Hello"; creates a writable array on the stack (you can change greeting[0] = 'J';). char *msg = "Hello"; points to a string literal stored in read-only memory. Attempting msg[0] = 'J'; causes undefined behavior (usually a crash).
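A small sketch of that distinction, assuming a hypothetical function name array_vs_pointer_demo. It declares the pointer as const char *, which is a suggestion rather than something the text above requires; with const in place, the compiler rejects msg[0] = 'J' instead of letting it fail at run time.

```c
#include <stdio.h>

/* Hypothetical demo: returns 1 if the array edit worked and the literal
   is still intact. */
int array_vs_pointer_demo(void)
{
    char greeting[] = "Hello";   /* writable 6-byte copy of the literal */
    const char *msg = "Hello";   /* const: msg[0] = 'J' no longer compiles */

    greeting[0] = 'J';           /* fine: modifies the local array only */

    /* sizeof greeting is the array size (6); sizeof msg is the size of a
       pointer, which depends on the platform. */
    printf("greeting = %s (sizeof = %zu)\n", greeting, sizeof greeting);
    printf("msg      = %s (sizeof = %zu)\n", msg, sizeof msg);

    return greeting[0] == 'J' && msg[0] == 'H';
}
```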
3. The DANGER of scanf and the Safety of fgets
char name[20];
scanf("%s", name); // DANGEROUS! No size check, stops at whitespace
scanf("%s", name) has two fatal flaws: it doesn't know your buffer is only 20 bytes, so a user typing 200 characters will overflow it (buffer overflow), and it stops reading at the first space, so "John Smith" becomes just "John".
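Both behaviors can be observed without typing anything by using sscanf, which applies the same format rules to a string instead of stdin. The sketch below also shows a field width (%19s), which caps how many characters the %s conversion stores. The helper name first_word is hypothetical, and the width only addresses the overflow; the stop-at-whitespace behavior remains.

```c
#include <stdio.h>

/* Hypothetical helper: copies the first whitespace-delimited word of
   input into out, which must hold at least 20 bytes.
   Returns 1 on success, 0 if input contains no word. */
int first_word(const char *input, char out[20])
{
    /* %19s: store at most 19 characters, then the '\0' */
    return sscanf(input, "%19s", out) == 1;
}
```

Given "John Smith" the helper stores "John", and given a 26-letter word it stores only the first 19 letters instead of writing past the buffer.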
char name[20];
fgets(name, sizeof(name), stdin); // SAFE: max 19 chars + '\0'
fgets is the safe alternative. It reads up to sizeof(name) - 1 characters, always null-terminates, and includes spaces. The only quirk: fgets preserves the trailing newline character ('\n') if there's room. You may want to strip it:
#include <stdio.h>
#include <string.h>
int main(void) {
    char username[20];
    printf("Enter name: ");
    // fgets returns NULL on end-of-file or a read error
    if (fgets(username, sizeof(username), stdin) == NULL) {
        return 1;
    }
    // Strip the trailing newline if present
    size_t len = strlen(username);
    if (len > 0 && username[len-1] == '\n') {
        username[len-1] = '\0';
    }
    printf("Hello, %s!\n", username);
    return 0;
}
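A common alternative for the newline-stripping step uses strcspn from <string.h>, which returns the index of the first character that appears in a given set. The helper name strip_newline is hypothetical; this is a sketch of the idiom, not a replacement for checking the return value of fgets.

```c
#include <string.h>

/* Hypothetical helper: cuts the string at the first '\n', if there is one.
   If there is none, strcspn returns the string's length, so the write
   lands on the existing '\0' and changes nothing. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}
```

Because the no-newline case is harmless, this version needs no length check and also works on an empty string.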
4. The Essential string.h Functions
The <string.h> header provides the standard toolkit for string manipulation. Here are the four most important functions, with their common pitfalls:
strlen — String Length
size_t strlen(const char *s);
// Returns the number of characters BEFORE the null terminator.
// strlen("Hello") = 5, even though "Hello" occupies 6 bytes.
strcpy — String Copy
char *strcpy(char *dest, const char *src);
// Copies src into dest, INCLUDING the null terminator.
// DANGER: No size check! dest must be big enough or you get buffer overflow.
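// Safer alternative: snprintf(dest, dest_size, "%s", src), which always null-terminates.
// strncpy(dest, src, n) bounds the copy too, but leaves dest WITHOUT a '\0' if src has n or more characters.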
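One way to make a bounded copy with snprintf is sketched below. The helper name copy_string and its return convention are assumptions made for this example; the relevant standard behavior is that snprintf writes at most dest_size bytes including the '\0' and returns the length the full result would have had.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: copies src into dest, whose capacity is dest_size
   bytes. Returns 1 if the whole string fit, 0 if it was truncated or if
   dest_size is 0. */
int copy_string(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0) {
        return 0;                       /* no room even for the '\0' */
    }
    int needed = snprintf(dest, dest_size, "%s", src);
    /* needed is the length src wanted; it fit only if that is < dest_size */
    return needed >= 0 && (size_t)needed < dest_size;
}
```

Copying "Hello, World!" into a 5-byte buffer this way leaves "Hell" plus the terminator and returns 0, so the caller can decide how to handle the truncation.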
strcat — String Concatenation
char *strcat(char *dest, const char *src);
// Appends src to the end of dest. dest must have enough space for BOTH!
// DANGER: Same buffer overflow risk as strcpy.
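A bounded append can be sketched with strncat, whose third argument limits how many characters are taken from src (it then always adds a '\0'). The helper name append_string and its return convention are assumptions for this example, and it requires dest to already hold a valid null-terminated string.

```c
#include <string.h>

/* Hypothetical helper: appends src to dest, whose total capacity is
   dest_size bytes, truncating src if necessary.
   Returns 1 if all of src fit, 0 otherwise. */
int append_string(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);          /* dest must already be terminated */
    if (used >= dest_size) {
        return 0;                        /* capacity claim is inconsistent */
    }
    size_t room = dest_size - used - 1;  /* keep one byte for the '\0' */
    strncat(dest, src, room);            /* copies at most room chars + '\0' */
    return strlen(src) <= room;
}
```

The key point is that the limit passed to strncat is the space remaining, not the total size of dest.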
strcmp — String Comparison
int strcmp(const char *s1, const char *s2);
// Returns 0 if strings are EQUAL.
// Returns <0 if s1 comes before s2 lexicographically.
// Returns >0 if s1 comes after s2 lexicographically.
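Only the sign of the result is specified, so code should test for < 0, == 0, or > 0 rather than for particular values such as -1 or 1. The sketch below normalizes the result to make that explicit; the helper name compare_sign is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: reduces strcmp's result to exactly -1, 0, or 1. */
int compare_sign(const char *s1, const char *s2)
{
    int r = strcmp(s1, s2);
    return (r > 0) - (r < 0);
}
```

With this helper, "apple" versus "banana" gives -1, "hello" versus "hello" gives 0, and "car" versus "cab" gives 1. A string that is a prefix of another, such as "app" versus "apple", sorts first and gives -1.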
The #1 strcmp mistake: using == to compare strings:
char a[] = "hello", b[] = "hello"; // two separate arrays holding the same text
if (a == b) { /* never runs */ } // WRONG! Compares ADDRESSES, not content
if (strcmp(a, b) == 0) { /* runs */ } // RIGHT! Compares the actual text
5. Complete String Toolkit Demo
#include <stdio.h>
#include <string.h>
int main(void) {
    char first[20] = "Hello";
    char last[20] = "World";
    char full[40];
    // Copy first into full
    strcpy(full, first); // full = "Hello"
    printf("After strcpy: %s\n", full);
    // Append a space and last name
    strcat(full, " "); // full = "Hello "
    strcat(full, last); // full = "Hello World"
    printf("After strcat: %s\n", full);
    // Get length
    printf("Length: %zu\n", strlen(full)); // 11 (not counting '\0')
    // Compare strings
    char pwd[] = "secret";
    if (strcmp(pwd, "secret") == 0) {
        printf("Access granted.\n");
    }
    return 0;
}
6. Memory Diagram: String Concatenation in Action
char first[20] = "Hello";
char last[20] = "World";
char full[40];
After strcpy(full, first):
full: [H][e][l][l][o][\0][?][?]...[?]
      ^---- copied -----^
After strcat(full, " "):
full: [H][e][l][l][o][ ][\0][?]...[?]
After strcat(full, last):
full: [H][e][l][l][o][ ][W][o][r][l][d][\0]...[?]
      ^---------- concatenated -----------^
7. Common String Mistakes
Mistake #1: Forgetting the null terminator space
char word[5] = "hello"; // BUG: "hello" needs 6 bytes (h,e,l,l,o,\0)
// A C compiler may accept this without a warning and store only the five
// letters, leaving an unterminated string that will cause chaos.
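When it is unclear whether a fixed-size buffer actually contains a terminator, memchr from <string.h> can search it without reading past the end. The helper name is_terminated is hypothetical; this is a defensive check for debugging, not something the standard string functions do for you.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a '\0' occurs within the first
   size bytes of buf, 0 otherwise. Never reads beyond buf[size - 1]. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

For a 5-byte array holding only the letters h, e, l, l, o the check returns 0, while a 6-byte array initialized with "hello" returns 1.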
Mistake #2: Buffer overflow with strcpy/strcat
char dest[5];
strcpy(dest, "Hello, World!"); // 14 bytes into a 5-byte buffer!
// This is the classic buffer overflow vulnerability.
Mistake #3: Using == to compare strings
if (userInput == "quit") // BUG: compares pointer addresses!
// Use strcmp(userInput, "quit") == 0 instead.
Mistake #4: Modifying a string literal
char *msg = "Hello";
msg[0] = 'J'; // UNDEFINED BEHAVIOR! String literals are read-only.
8. Key Takeaways
- C strings are null-terminated character arrays. Every string of N visible characters needs N+1 bytes.
- Use fgets, not scanf("%s"), for safe string input that prevents buffer overflow and handles spaces.
- strcmp returns 0 for equality, not 1. This is backwards from what many beginners expect.
- strcpy and strcat have no bounds checking. Always verify your destination buffer is large enough, or use their safer variants (strncpy, strncat, or snprintf).
- String literals ("like this") are stored in read-only memory. Declare them as const char * or as char[] if you need to modify them.
9. Practice Exercises
Exercise 1: Safe Input and Output
Write a program that asks the user for their first name and last name separately (using fgets), strips the trailing newlines, concatenates them with a space, and prints the full name. Handle the case where the user types more characters than your buffer size.
Exercise 2: Implement your own strcmp
Write a function int my_strcmp(const char *s1, const char *s2) that behaves exactly like the standard strcmp. Walk through both strings character by character, comparing until you find a difference or hit the null terminator. Test it with identical strings, different strings, and strings where one is a prefix of the other.
Exercise 3: Count Vowels
Write a function int countVowels(const char *str) that returns the number of vowels (a, e, i, o, u, both uppercase and lowercase) in a string. Test it with several strings including an empty string and a string with no vowels.
Exercise 4: Palindrome Checker
Write a function int isPalindrome(const char *str) that returns 1 if the string is a palindrome (reads the same forward and backward), ignoring case. Use two pointers: one at the start, one at the end, walking toward each other. Test with "Racecar", "hello", "A man a plan a canal Panama" (challenge: ignore spaces too).