CS50 Lecture 2 — Arrays

Understand how data is stored in memory, how compiling really works, and why arrays are the foundation of so many other data structures.

How Compiling Actually Works

Last week you learned that C code gets compiled into machine code. But what really happens during compilation? It's a 4-step process:

  1. Preprocessing: Lines starting with # (like #include <stdio.h>) get processed. The preprocessor effectively pastes the contents of stdio.h into your file in place of the #include line.

  2. Compiling: Your C code gets translated into assembly language — a low-level language that's close to machine code but still human-readable.

  3. Assembling: Assembly code gets converted into machine code (binary — the 0s and 1s from Week 1).

  4. Linking: If your program calls library functions (like printf, which is declared in stdio.h), the library's already-compiled code gets combined with your code into one executable.

This is relevant to your world: when you run npm run build on your Next.js projects, a similar multi-step process happens:

  1. TypeScript → JavaScript (compilation)
  2. JSX → function calls (transformation)
  3. Bundling multiple files together (linking)
  4. Optimization and minification

Understanding compilation helps you debug build errors instead of just guessing.

Arrays — Contiguous Memory

An array is a sequence of values stored next to each other in memory. Think of a row of mailboxes — each one is numbered and contains one item.

int scores[3];        // Declare array of 3 integers
scores[0] = 72;       // First element
scores[1] = 85;       // Second element
scores[2] = 91;       // Third element

// Or, instead of the four lines above, declare and initialize all at once:
int scores[] = {72, 85, 91};

Why does this matter? Because arrays are the foundation of many of the data structures you use every day. Python lists and JavaScript arrays are typically built on contiguous blocks of memory like this one, and the same idea shows up in how databases lay out rows.

Memory Layout

When you create int scores[3], the computer sets aside 12 bytes of memory (3 × 4 bytes per int, assuming the usual 4-byte int), all next to each other:

Memory address:  100   104   108
Value:           72    85    91
Index:           [0]   [1]   [2]

This is why array access is fast — to find scores[2], the computer calculates: start address + (2 × 4 bytes) = address 108. It jumps directly there. No searching needed.

Strings Are Arrays

Here's a key insight: in C, a string is just an array of characters:

char name[] = "Lia";

In memory:

name[0] = 'L'
name[1] = 'i'
name[2] = 'a'
name[3] = '\0'   ← null terminator (marks end of string)

That \0 at the end is called the null terminator. It tells C where the string ends. Without it, functions like printf would keep reading past the end of the array, printing garbage or crashing. Python tracks each string's length for you, so you never have to think about null terminators there.

String Functions

#include <stdio.h>
#include <string.h>

int main(void)
{
    char name[] = "Hello";
    size_t length = strlen(name);     // → 5 (doesn't count \0)
    printf("Length: %zu\n", length);

    // Comparing strings
    if (strcmp(name, "Hello") == 0)   // strcmp returns 0 if equal
    {
        printf("Match!\n");
    }
}

In Python, you just write len("Hello") and "Hello" == "Hello". C makes you use library functions because strings are arrays, not a built-in type: using == on two strings compares their memory addresses, not their characters.

Command-Line Arguments

Programs can receive input when you run them:

#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc == 2)
    {
        printf("Hello, %s\n", argv[1]);
    }
    else
    {
        printf("Usage: ./greet name\n");
    }
}

Running ./greet Lia prints "Hello, Lia".

  • argc = argument count (how many words were typed, including the program name)
  • argv = argument vector (the actual words, as an array of strings)
  • argv[0] is, by convention, the program name itself

This is exactly how command-line tools work. When you type git push origin main, git receives argc=4 and argv = ["git", "push", "origin", "main"].

The Bigger Picture

This lecture teaches you that data has a physical reality in memory. It takes up space, has an address, and is organized in specific patterns. When your web app is slow, it might be because data is organized inefficiently. When a bug causes "undefined" values, it might be because you're accessing memory that wasn't properly initialized.

You don't need to manage memory yourself in Python or JavaScript. But knowing that memory management exists underneath helps you understand why certain patterns are fast and others are slow.

What to Do This Week

  1. Watch CS50 Lecture 2 (link above)
  2. Focus on understanding how arrays work in memory, why strings are arrays, and what compilation actually does
  3. Take the quiz below.

Quiz

Question 1/5

What are the 4 steps of compilation in C?