04 - Data Structures

📌 What You'll Learn

  • Lists - ordered, mutable sequences
  • Tuples - ordered, immutable sequences
  • Dictionaries - key-value pairs
  • Sets - unordered, unique collections
  • Nested data structures
  • When to use which structure

📋 Lists

Lists are ordered, mutable collections that can hold items of any type.

Creating Lists

# Empty list
empty = []
empty = list()

# List with items
numbers = [1, 2, 3, 4, 5]
mixed = [1, "hello", 3.14, True, None]
nested = [[1, 2], [3, 4], [5, 6]]

# From other iterables
from_string = list("Python")  # ['P', 'y', 't', 'h', 'o', 'n']
from_range = list(range(5))   # [0, 1, 2, 3, 4]

Accessing Elements

fruits = ["apple", "banana", "cherry", "date"]

# Indexing
print(fruits[0])     # "apple" (first)
print(fruits[-1])    # "date" (last)
print(fruits[1:3])   # ["banana", "cherry"] (slice)

# Check if item exists
if "banana" in fruits:
    print("Found banana!")

Modifying Lists

fruits = ["apple", "banana", "cherry"]

# Add items
fruits.append("date")           # Add to end
fruits.insert(1, "blueberry")   # Insert at position
fruits.extend(["elder", "fig"]) # Add multiple items
fruits += ["grape"]             # Concatenate

# Remove items
fruits.remove("banana")   # Remove by value (ValueError if missing)
popped = fruits.pop()     # Remove and return last
popped = fruits.pop(0)    # Remove and return at index
del fruits[0]             # Delete by index

# Modify items
fruits[0] = "apricot"     # Replace by index
fruits[1:3] = ["b", "c"]  # Replace slice

# Empty the list (do this last: indexing an empty list raises IndexError)
fruits.clear()            # Remove all items

List Methods

numbers = [3, 1, 4, 1, 5, 9, 2, 6]

# Sorting
numbers.sort()              # Sort in place (ascending)
numbers.sort(reverse=True)  # Sort descending
sorted_nums = sorted(numbers)  # Return new sorted list

# Reversing
numbers.reverse()           # Reverse in place
reversed_nums = numbers[::-1]  # Return new reversed list

# Searching
index = numbers.index(4)    # Find index of value
count = numbers.count(1)    # Count occurrences

# Copying
copy1 = numbers.copy()      # Shallow copy
copy2 = numbers[:]          # Slice copy
copy3 = list(numbers)       # Constructor copy
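
All three copy techniques above are shallow: they create a new outer list but reuse the same inner objects. The sketch below illustrates the difference with a nested list, using copy.deepcopy from the standard library. The variable names (grid, shallow, deep) are made up for this example.

```python
import copy

grid = [[1, 2], [3, 4]]

shallow = grid.copy()        # New outer list, same inner lists
deep = copy.deepcopy(grid)   # New outer list and new inner lists

grid[0].append(99)           # Mutate an inner list in place

print(shallow[0])  # [1, 2, 99] (inner list is shared with grid)
print(deep[0])     # [1, 2]     (fully independent)
```

For flat lists of immutable items such as numbers or strings, a shallow copy is usually all you need.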

List Comprehensions

# Basic
squares = [x**2 for x in range(10)]

# With condition
evens = [x for x in range(20) if x % 2 == 0]

# With transformation
words = ["hello", "world"]
upper = [w.upper() for w in words]

# Nested
matrix = [[i*j for j in range(3)] for i in range(3)]
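
A comprehension is shorthand for a loop that appends to a list. As a rough illustration, the sketch below writes the "with condition" example as an explicit loop, then shows a comprehension with two for clauses that flattens a nested list. The names evens_loop and flat are illustrative only.

```python
# Loop form of: [x for x in range(20) if x % 2 == 0]
evens_loop = []
for x in range(20):
    if x % 2 == 0:
        evens_loop.append(x)

evens = [x for x in range(20) if x % 2 == 0]
print(evens == evens_loop)  # True

# Two for clauses read left to right, like nested loops
matrix = [[1, 2], [3, 4], [5, 6]]
flat = [item for row in matrix for item in row]
print(flat)  # [1, 2, 3, 4, 5, 6]
```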

📦 Tuples

Tuples are ordered, immutable sequences. Once created, they cannot be modified.

Creating Tuples

# Empty tuple
empty = ()
empty = tuple()

# Single item (note the comma!)
single = (1,)     # This is a tuple
not_tuple = (1)   # This is just an integer!

# Multiple items
coords = (3, 4)
mixed = (1, "hello", 3.14)
nested = ((1, 2), (3, 4))

# Without parentheses (tuple packing)
point = 10, 20, 30

Accessing Elements

coords = (10, 20, 30, 40, 50)

# Indexing (same as lists)
print(coords[0])     # 10
print(coords[-1])    # 50
print(coords[1:4])   # (20, 30, 40)

# Unpacking
x, y, z, *rest = coords
print(x, y, z)       # 10 20 30
print(rest)          # [40, 50]

# Named unpacking
point = (3, 4)
x, y = point

Tuple Methods

numbers = (1, 2, 3, 2, 4, 2, 5)

# Only two methods (immutable!)
print(numbers.count(2))   # 3 (occurrences)
print(numbers.index(3))   # 2 (first index)

Why Use Tuples?

# 1. Dictionary keys (lists can't be keys)
locations = {
    (40.7128, -74.0060): "New York",
    (34.0522, -118.2437): "Los Angeles"
}

# 2. Multiple return values
def get_dimensions():
    return 1920, 1080  # Returns tuple

width, height = get_dimensions()

# 3. Data integrity (can't be accidentally modified)
DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# 4. Memory efficiency (a tuple uses less memory than a list of the same items)
# 5. Slightly faster to create and iterate than lists (in CPython)
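
Points 1 and 4 can be checked directly. The sketch below assumes CPython; sys.getsizeof reports implementation-specific byte counts, so treat the size comparison as indicative rather than guaranteed. The names point_tuple, point_list and lookup are made up for this example.

```python
import sys

point_tuple = (1, 2, 3)
point_list = [1, 2, 3]

# A tuple of hashable items is hashable, so it works as a dict key
lookup = {point_tuple: "ok"}

# A list is unhashable: using it as a key raises TypeError
try:
    lookup[point_list] = "fails"
    list_key_error = None
except TypeError as exc:
    list_key_error = exc

print(type(list_key_error).__name__)  # TypeError

# Container sizes in bytes (CPython implementation detail)
tuple_size = sys.getsizeof(point_tuple)
list_size = sys.getsizeof(point_list)
print(tuple_size, list_size)
```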

📖 Dictionaries

Dictionaries are mutable collections of key-value pairs. Keys must be unique and hashable. As of Python 3.7, dictionaries preserve insertion order.

Creating Dictionaries

# Empty dict
empty = {}
empty = dict()

# With items
person = {
    "name": "Alice",
    "age": 25,
    "city": "New York"
}

# From sequences
keys = ["a", "b", "c"]
values = [1, 2, 3]
from_zip = dict(zip(keys, values))

# Dict comprehension
squares = {x: x**2 for x in range(5)}

# fromkeys (same value for all keys)
defaults = dict.fromkeys(["a", "b", "c"], 0)

Accessing Values

person = {"name": "Alice", "age": 25, "city": "NYC"}

# By key
print(person["name"])        # "Alice"
# print(person["country"])   # KeyError!

# Safe access with get()
print(person.get("name"))           # "Alice"
print(person.get("country"))        # None
print(person.get("country", "USA")) # "USA" (default)

# Get all keys, values, items
print(person.keys())    # dict_keys(['name', 'age', 'city'])
print(person.values())  # dict_values(['Alice', 25, 'NYC'])
print(person.items())   # dict_items([('name', 'Alice'), ...])

# Check if key exists
if "name" in person:
    print("Name exists")

Modifying Dictionaries

person = {"name": "Alice", "age": 25}

# Add/update single item
person["city"] = "NYC"        # Add new key
person["age"] = 26            # Update existing

# Update multiple items
person.update({"age": 27, "country": "USA"})

# Remove items
del person["city"]            # Delete by key
age = person.pop("age")       # Remove and return
last = person.popitem()       # Remove and return last item

# Set default (add if not exists)
person.setdefault("name", "Unknown")  # Won't change
person.setdefault("email", "n/a")     # Will add

Dictionary Methods

person = {"name": "Alice", "age": 25}

# Copy
copy = person.copy()

# Clear
person.clear()  # Empty the dict

# Merge (Python 3.9+)
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
merged = dict1 | dict2  # {"a": 1, "b": 3, "c": 4}
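
On Python versions before 3.9 the | operator is not available for dicts. Two common alternatives are sketched below; both leave the original dictionaries untouched, and on duplicate keys the right-hand value wins. The names merged_unpack and merged_update are illustrative only.

```python
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}

# Option 1: dictionary unpacking (Python 3.5+)
merged_unpack = {**dict1, **dict2}

# Option 2: copy, then update the copy
merged_update = dict1.copy()
merged_update.update(dict2)

print(merged_unpack)  # {'a': 1, 'b': 3, 'c': 4}
print(merged_update)  # {'a': 1, 'b': 3, 'c': 4}
print(dict1)          # {'a': 1, 'b': 2} (unchanged)
```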

Iterating Dictionaries

person = {"name": "Alice", "age": 25, "city": "NYC"}

# Keys only
for key in person:
    print(key)

# Values only
for value in person.values():
    print(value)

# Keys and values
for key, value in person.items():
    print(f"{key}: {value}")
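
Iteration follows insertion order on Python 3.7+. The short sketch below shows what that means when keys are updated or re-added; the settings dictionary is a made-up example.

```python
settings = {}
settings["theme"] = "dark"
settings["font"] = "mono"
settings["size"] = 12
print(list(settings))  # ['theme', 'font', 'size']

# Updating an existing key keeps its position
settings["theme"] = "light"
print(list(settings))  # ['theme', 'font', 'size']

# Deleting and re-adding a key moves it to the end
del settings["theme"]
settings["theme"] = "dark"
print(list(settings))  # ['font', 'size', 'theme']

# Order does not affect equality
same = {"a": 1, "b": 2} == {"b": 2, "a": 1}
print(same)  # True
```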

🎯 Sets

Sets are unordered collections of unique items.

Creating Sets

# Empty set (not {} - that's an empty dict!)
empty = set()

# With items
numbers = {1, 2, 3, 4, 5}
mixed = {1, "hello", 3.14}  # Items must be hashable; duplicates are dropped

# From iterable (removes duplicates!)
from_list = set([1, 2, 2, 3, 3, 3])  # {1, 2, 3}
from_string = set("hello")  # {'h', 'e', 'l', 'o'}

# Frozen set (immutable)
frozen = frozenset([1, 2, 3])
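
Because a frozenset is immutable, it is hashable and can be used where a regular set cannot: as a dictionary key or as a member of another set. A minimal sketch, with made-up names (team_a, team_b, scores, groups):

```python
team_a = frozenset(["alice", "bob"])
team_b = frozenset(["bob", "alice"])  # Same members, different order

# Equal frozensets hash the same, so either one finds the entry
scores = {team_a: 10}
print(scores[team_b])  # 10

# A set of sets needs frozenset members
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2

# Mutating methods such as add() do not exist
print(hasattr(team_a, "add"))  # False

# Set operations still work and return a new frozenset
combined = team_a | {"carol"}
print(type(combined).__name__)  # frozenset
```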

Set Operations

a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

# Union (all elements from both)
print(a | b)          # {1, 2, 3, 4, 5, 6}
print(a.union(b))

# Intersection (common elements)
print(a & b)          # {3, 4}
print(a.intersection(b))

# Difference (in a but not in b)
print(a - b)          # {1, 2}
print(a.difference(b))

# Symmetric difference (in either, but not both)
print(a ^ b)          # {1, 2, 5, 6}
print(a.symmetric_difference(b))

# Subset and superset
print({1, 2}.issubset({1, 2, 3}))     # True
print({1, 2, 3}.issuperset({1, 2}))  # True
print({1, 2}.isdisjoint({3, 4}))      # True (no common elements)

Modifying Sets

numbers = {1, 2, 3}

# Add items
numbers.add(4)           # Add single item
numbers.update([5, 6])   # Add multiple items

# Remove items
numbers.remove(3)        # Remove (KeyError if not found)
numbers.discard(10)      # Remove (no error if not found)
popped = numbers.pop()   # Remove and return arbitrary item
numbers.clear()          # Remove all

Set Use Cases

# Remove duplicates from list
my_list = [1, 2, 2, 3, 3, 3]
unique = list(set(my_list))  # e.g. [1, 2, 3] (order is not guaranteed)

# Membership testing (faster than list)
allowed = {"admin", "moderator", "user"}
if "admin" in allowed:
    print("Access granted")

# Find common/different items
list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7, 8]
common = set(list1) & set(list2)  # {4, 5}
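
Converting to a set discards the original order. If order matters, one common approach (sketched here, not the only option) is dict.fromkeys, which keeps the first occurrence of each item on Python 3.7+. The example also shows the "different items" case with set difference, using sorted() to get a predictable list back.

```python
items = ["pear", "apple", "pear", "fig", "apple"]

# Order-preserving de-duplication (items must be hashable)
unique_ordered = list(dict.fromkeys(items))
print(unique_ordered)  # ['pear', 'apple', 'fig']

list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7, 8]

# Items only in list1, and items in exactly one of the lists
only_in_list1 = sorted(set(list1) - set(list2))
in_one_only = sorted(set(list1) ^ set(list2))
print(only_in_list1)  # [1, 2, 3]
print(in_one_only)    # [1, 2, 3, 6, 7, 8]
```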

🔗 Nested Data Structures

Nested Lists

# 2D matrix
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

# Access
print(matrix[0][0])    # 1
print(matrix[1][2])    # 6

# Iterate
for row in matrix:
    for item in row:
        print(item, end=" ")
    print()
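
A common pitfall when building a matrix is list multiplication, which repeats references to the same inner list. The sketch below contrasts it with a comprehension, then shows two typical operations on the matrix above: transposing with zip and extracting a column. Variable names are illustrative only.

```python
# Pitfall: both rows are the SAME list object
shared = [[0] * 3] * 2
shared[0][0] = 1
print(shared)  # [[1, 0, 0], [1, 0, 0]]

# A comprehension builds a separate list for each row
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 1
print(independent)  # [[1, 0, 0], [0, 0, 0]]

matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

# Transpose: zip(*matrix) pairs up the items of each column
transposed = [list(col) for col in zip(*matrix)]
print(transposed)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

# Extract the second column
column = [row[1] for row in matrix]
print(column)  # [2, 5, 8]
```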

Nested Dictionaries

users = {
    "user1": {
        "name": "Alice",
        "age": 25,
        "hobbies": ["reading", "coding"]
    },
    "user2": {
        "name": "Bob",
        "age": 30,
        "hobbies": ["gaming", "music"]
    }
}

# Access
print(users["user1"]["name"])           # Alice
print(users["user2"]["hobbies"][0])     # gaming

# Iterate
for user_id, info in users.items():
    print(f"{user_id}: {info['name']}")
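
Chained square-bracket access raises KeyError as soon as any level is missing. One way to handle that, sketched below, is to chain get() with an empty dict as the default, and to use setdefault() when adding nested entries. The "user3" key and its values are made up for this example.

```python
users = {
    "user1": {"name": "Alice", "age": 25, "hobbies": ["reading", "coding"]}
}

# Safe read: a missing user falls back to {} and then to "unknown"
name = users.get("user3", {}).get("name", "unknown")
print(name)  # unknown

# Safe write: create the missing levels on demand
users.setdefault("user3", {})["name"] = "Carol"
users["user3"].setdefault("hobbies", []).append("chess")
print(users["user3"])  # {'name': 'Carol', 'hobbies': ['chess']}
```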

List of Dictionaries

products = [
    {"name": "Laptop", "price": 999, "stock": 50},
    {"name": "Phone", "price": 699, "stock": 100},
    {"name": "Tablet", "price": 499, "stock": 75}
]

# Access
print(products[0]["name"])  # Laptop

# Filter
expensive = [p for p in products if p["price"] > 500]

# Sort
by_price = sorted(products, key=lambda x: x["price"])
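
A few more operations that come up often with a list of dictionaries are sketched below: aggregating with sum(), picking one record with min(), and building a lookup dict keyed by name. The variable names are illustrative, and the lookup assumes product names are unique.

```python
products = [
    {"name": "Laptop", "price": 999, "stock": 50},
    {"name": "Phone", "price": 699, "stock": 100},
    {"name": "Tablet", "price": 499, "stock": 75}
]

# Aggregate: total inventory value
total_value = sum(p["price"] * p["stock"] for p in products)
print(total_value)  # 157275

# Pick one record by a key function
cheapest = min(products, key=lambda p: p["price"])
print(cheapest["name"])  # Tablet

# Index by name for fast lookup (assumes names are unique)
by_name = {p["name"]: p for p in products}
print(by_name["Phone"]["stock"])  # 100
```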

🧰 Collections Module

The collections module provides specialized container types.

Counter - Count Occurrences

from collections import Counter

# Count items
words = ['apple', 'banana', 'apple', 'cherry', 'apple', 'banana']
word_count = Counter(words)
print(word_count)  # Counter({'apple': 3, 'banana': 2, 'cherry': 1})

# Most common
print(word_count.most_common(2))  # [('apple', 3), ('banana', 2)]

# Arithmetic operations
counter1 = Counter(['a', 'b', 'b'])
counter2 = Counter(['b', 'c', 'c'])
print(counter1 + counter2)  # Counter({'b': 3, 'c': 2, 'a': 1})
print(counter1 - counter2)  # Counter({'a': 1, 'b': 1})

defaultdict - Dictionary with Default Values

from collections import defaultdict

# List as default
groups = defaultdict(list)
groups['fruits'].append('apple')
groups['fruits'].append('banana')
groups['vegetables'].append('carrot')
print(dict(groups))  # {'fruits': ['apple', 'banana'], 'vegetables': ['carrot']}

# Int as default (good for counting)
counts = defaultdict(int)
for word in ['a', 'b', 'a', 'c', 'a']:
    counts[word] += 1
print(dict(counts))  # {'a': 3, 'b': 1, 'c': 1}

# Set as default (unique values)
index = defaultdict(set)
index['category1'].add('item1')
index['category1'].add('item2')
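
A typical use of defaultdict(list) is grouping items by some key. The sketch below groups words by first letter and also shows a behaviour worth knowing: reading a missing key with square brackets creates it, while get() does not. The word list is made up for this example.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Square-bracket access on a missing key inserts the default
print("z" in by_letter)  # False
_ = by_letter["z"]
print("z" in by_letter)  # True

# get() returns None without inserting anything
print(by_letter.get("q"))  # None
print("q" in by_letter)    # False
```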

namedtuple - Tuple with Named Fields

from collections import namedtuple

# Define a named tuple
Point = namedtuple('Point', ['x', 'y'])
Person = namedtuple('Person', 'name age city')

# Create instances
p = Point(3, 4)
print(p.x, p.y)  # 3 4

person = Person('Alice', 25, 'NYC')
print(person.name)  # Alice
print(person[0])    # Alice (still works like a tuple)

# Convert to dict
print(person._asdict())  # {'name': 'Alice', 'age': 25, 'city': 'NYC'}

# Replace fields (returns new namedtuple)
person2 = person._replace(age=26)

deque - Double-Ended Queue

from collections import deque

# Create deque
d = deque([1, 2, 3])

# Add items
d.append(4)        # Add right: [1, 2, 3, 4]
d.appendleft(0)    # Add left: [0, 1, 2, 3, 4]

# Remove items
d.pop()            # Remove right: [0, 1, 2, 3]
d.popleft()        # Remove left: [1, 2, 3]

# Rotate
d = deque([1, 2, 3, 4, 5])
d.rotate(2)        # Rotate right: [4, 5, 1, 2, 3]
d.rotate(-2)       # Rotate left: [1, 2, 3, 4, 5]

# Fixed size (oldest items removed)
d = deque(maxlen=3)
d.extend([1, 2, 3, 4, 5])
print(d)  # deque([3, 4, 5], maxlen=3)
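
A bounded deque is a natural fit for "last N items" problems. As a minimal sketch, the hypothetical helper moving_averages below keeps the most recent values in a deque with maxlen and averages whatever is currently in the window.

```python
from collections import deque

def moving_averages(values, window):
    """Return the average of the last `window` values at each step."""
    recent = deque(maxlen=window)  # Oldest value drops off automatically
    averages = []
    for value in values:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages

result = moving_averages([10, 20, 30, 40, 50], window=3)
print(result)  # [10.0, 15.0, 20.0, 30.0, 40.0]
```

The first two averages cover fewer than three values because the window is still filling up.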

📊 Comparison Table

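Feature    | List | Tuple | Dict     | Set
-----------|------|-------|----------|-------
Ordered    | ✅   | ✅    | ✅*      | ❌
Mutable    | ✅   | ❌    | ✅       | ✅
Duplicates | ✅   | ✅    | Keys: ❌ | ❌
Indexing   | ✅   | ✅    | By Key   | ❌
Syntax     | []   | ()    | {k: v}   | {a, b}

*Python 3.7+ maintains insertion order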

When to Use What?

# LIST: Ordered collection, need to modify
shopping_cart = ["apple", "banana", "milk"]

# TUPLE: Fixed data, function returns, dict keys
coordinates = (40.7128, -74.0060)
def get_user():
    return "Alice", 25  # Returns tuple

# DICT: Key-value mapping, fast lookup by key
user = {"name": "Alice", "age": 25}

# SET: Unique items, membership testing, math operations
unique_visitors = {"user1", "user2", "user3"}

🎯 Next Steps

After mastering data structures, proceed to 05_functions to learn about defining and using functions!
