Implementing Adjacency Lists for Large Sparse Graphs in C++: A Comprehensive Guide

by | C++, DSA, Graphs

In this guide, we’ll explore how to implement efficient adjacency lists in C++ for large sparse graphs. We’ll cover both custom implementations and integration with the Eigen library, focusing on practical examples and performance considerations.

Introduction

Adjacency lists are a space-efficient way to represent graphs, particularly when dealing with sparse graphs where the number of edges is much smaller than the maximum possible number of edges. In this guide, we’ll explore how to implement them effectively in C++, with a focus on handling large-scale sparse graphs.

Key Concepts

  • Space efficiency for sparse graphs (where |E| ≪ |V|²)
  • Fast iteration over neighbors of a vertex
  • Efficient memory usage for dynamic graphs
  • Integration with modern C++ containers and the Eigen library
📚 Key Terms: Adjacency List Concepts
Adjacency List
A collection of unordered lists used to represent a finite graph. Each list describes the set of neighbors of a vertex in the graph.
Sparse Graph
A graph where |E| ≪ |V|². Most real-world graphs are sparse, making adjacency lists an efficient choice for representation.
Edge
A connection between two vertices in the graph. In adjacency lists, edges are stored in the lists of their source vertices.
Vertex
A node in the graph. Each vertex maintains a list of its adjacent vertices (neighbors).
Space Complexity
O(|V| + |E|) for adjacency lists, making them more efficient than matrices for sparse graphs.
Dynamic Graph
A graph that changes over time with vertex/edge additions and removals. Adjacency lists handle these changes efficiently.

Mathematical Background

An adjacency list represents a graph G = (V, E) by maintaining a list of neighbors for each vertex v ∈ V. This representation is particularly efficient for sparse graphs, where |E| ≪ |V|².

0 1 2 3 4 Adjacency List Representation: 0 → 1, 2 1 → 0, 4 2 → 0, 3 3 → 2, 4 4 → 1, 3
Figure 1: A sparse graph with 5 vertices and 6 edges, and its adjacency list representation

Key Properties

  1. Space Complexity:

    For a graph with |V| vertices and |E| edges: \[ \text{Space Required} = O(|V| + |E|) \] This is more efficient than the \(O(|V|^2)\) required by adjacency matrices for sparse graphs.

  2. Time Complexities:
    • Edge Query: \(O(\text{deg}(v))\) where deg(v) is the degree of vertex v
    • Adding an Edge: \(O(1)\) amortized with proper data structures
    • Removing an Edge: \(O(\text{deg}(v))\)
    • Iterating Over Neighbors: \(O(\text{deg}(v))\)
  3. Memory Usage:

    For weighted graphs with edge weight type T: \[ \text{Memory} \approx |V| \times \text{sizeof(pointer)} + |E| \times (\text{sizeof(vertex_id)} + \text{sizeof(T)}) \]

Trade-offs

  • ✅ Excellent space efficiency for sparse graphs
  • ✅ Fast iteration over neighbors
  • ✅ Efficient edge insertion
  • ❌ Slower edge existence queries compared to adjacency matrices
  • ❌ More complex implementation

Note: For graphs where \(|E| \approx |V|^2\) (dense graphs), an adjacency matrix might be more appropriate. However, most real-world graphs are sparse, making adjacency lists the preferred choice.

Custom Implementation

Let’s implement a flexible and efficient adjacency list structure in C++. We’ll use modern C++ features and STL containers to create a robust implementation that can handle both weighted and unweighted graphs.

Basic Structure

Our implementation will use templates to support different types of weights and vertex IDs. We’ll start with the basic structure and gradually add more sophisticated features.

Core Adjacency List Implementation
// Required standard libraries
#include <vector>       // For storing lists of edges
#include <unordered_map> // For efficient edge lookup
#include <utility>      // For std::pair
#include <stdexcept>    // For exception handling
#include <algorithm>    // For find_if and other algorithms
#include <iostream> // For input/output

/**
 * @brief A templated adjacency list implementation for sparse graphs
 * @tparam V Type of vertex IDs (usually int or size_t)
 * @tparam W Type of edge weights (use bool for unweighted graphs)
 * @tparam Directed Whether the graph is directed (default: false)
 */
template<typename V = int,
         typename W = bool,
         bool Directed = false>
class AdjacencyList {
private:
    // Forward declaration of the Edge structure
    struct Edge;

    // Type definitions for cleaner code
    using EdgeList = std::vector<Edge>;
    using AdjList = std::unordered_map<V, EdgeList>;

    /**
     * @brief Internal edge structure
     * Represents a single edge in the graph
     */
    struct Edge {
        V target;      // ID of the target vertex
        W weight;      // Weight of the edge

        // Constructor for weighted edges
        Edge(V t, W w) : target(t), weight(w) {}

        // Constructor for unweighted edges
        explicit Edge(V t) : target(t), weight(W{}) {}

        // Equality comparison for finding edges
        bool operator==(const Edge& other) const {
            return target == other.target;
        }
    };

    AdjList adjacencyLists;  // Main container for the adjacency lists
    size_t edgeCount = 0;    // Tracks total number of edges

    /**
     * @brief Validates a vertex exists in the graph
     * @param vertex Vertex ID to check
     * @throws std::out_of_range if vertex doesn't exist
     */
    void validateVertex(V vertex) const {
        if (adjacencyLists.find(vertex) == adjacencyLists.end()) {
            throw std::out_of_range("Vertex not found in graph");
        }
    }

public:
    // Default constructor
    AdjacencyList() = default;

    /**
     * @brief Adds a vertex to the graph
     * @param vertex ID of the vertex to add
     * @return true if vertex was added, false if it already existed
     */
    bool addVertex(V vertex) {
        return adjacencyLists.emplace(vertex, EdgeList{}).second;
    }

    /**
     * @brief Adds an edge to the graph
     * @param from Source vertex
     * @param to Target vertex
     * @param weight Edge weight (optional)
     * @throws std::out_of_range if either vertex doesn't exist
     */
    void addEdge(V from, V to, W weight = W{}) {
        // Validate both vertices exist
        validateVertex(from);
        validateVertex(to);

        // Check if edge already exists
        auto& fromList = adjacencyLists[from];
        auto it = std::find_if(fromList.begin(), fromList.end(),
            [to](const Edge& e) { return e.target == to; });

        if (it == fromList.end()) {
            // Add edge from source to target
            fromList.emplace_back(to, weight);
            edgeCount++;

            // For undirected graphs, add the reverse edge
            if (!Directed) {
                adjacencyLists[to].emplace_back(from, weight);
            }
        }
    }

    /**
     * @brief Checks if an edge exists between two vertices
     * @param from Source vertex
     * @param to Target vertex
     * @return true if edge exists, false otherwise
     */
    bool hasEdge(V from, V to) const {
        try {
            validateVertex(from);
            validateVertex(to);

            const auto& fromList = adjacencyLists.at(from);
            return std::find_if(fromList.begin(), fromList.end(),
                [to](const Edge& e) { return e.target == to; }) != fromList.end();
        } catch (const std::out_of_range&) {
            return false;
        }
    }

    /**
     * @brief Gets the neighbors of a vertex
     * @param vertex The vertex to get neighbors for
     * @return Vector of neighbor vertex IDs
     * @throws std::out_of_range if vertex doesn't exist
     */
    std::vector<V> getNeighbors(V vertex) const {
        validateVertex(vertex);

        std::vector<V> neighbors;
        const auto& edgeList = adjacencyLists.at(vertex);
        neighbors.reserve(edgeList.size());  // Pre-allocate space

        // Extract just the target vertices
        std::transform(edgeList.begin(), edgeList.end(),
                      std::back_inserter(neighbors),
                      [](const Edge& e) { return e.target; });

        return neighbors;
    }

    /**
     * @brief Gets the weight of an edge
     * @param from Source vertex
     * @param to Target vertex
     * @return Edge weight
     * @throws std::out_of_range if edge doesn't exist
     */
    W getWeight(V from, V to) const {
        validateVertex(from);
        const auto& fromList = adjacencyLists.at(from);

        auto it = std::find_if(fromList.begin(), fromList.end(),
            [to](const Edge& e) { return e.target == to; });

        if (it == fromList.end()) {
            throw std::out_of_range("Edge not found");
        }
        return it->weight;
    }

    /**
     * @brief Gets the number of vertices in the graph
     */
    size_t vertexCount() const {
        return adjacencyLists.size();
    }

    /**
     * @brief Gets the number of edges in the graph
     * Note: For undirected graphs, each edge is counted only once
     */
    size_t getEdgeCount() const {
        return Directed ? edgeCount : edgeCount / 2;
    }
};

Let’s examine how to use this implementation with a practical example:

Usage Example
int main() {
    // Create a weighted, undirected graph
    AdjacencyList<int, double, false> graph;

    // Add vertices
    for (int i = 0; i < 5; ++i) {
        graph.addVertex(i);
    }

    // Add edges with weights (representing distances)
    graph.addEdge(0, 1, 2.5);  // Edge from 0 to 1 with weight 2.5
    graph.addEdge(0, 2, 1.0);  // Edge from 0 to 2 with weight 1.0
    graph.addEdge(1, 3, 3.5);  // Edge from 1 to 3 with weight 3.5
    graph.addEdge(2, 3, 2.0);  // Edge from 2 to 3 with weight 2.0
    graph.addEdge(3, 4, 1.5);  // Edge from 3 to 4 with weight 1.5

    // Print information about vertex 3
    std::cout << "Neighbors of vertex 3:\n";
    for (int neighbor : graph.getNeighbors(3)) {
        double weight = graph.getWeight(3, neighbor);
        std::cout << "  → " << neighbor
                  << " (weight: " << weight << ")\n";
    }

    // Print graph statistics
    std::cout << "\nGraph Statistics:\n";
    std::cout << "Vertices: " << graph.vertexCount() << "\n";
    std::cout << "Edges: " << graph.getEdgeCount() << "\n";

    return 0;
}
Neighbors of vertex 3:
  → 1 (weight: 3.5)
  → 2 (weight: 2)
  → 4 (weight: 1.5)

Graph Statistics:
Vertices: 5
Edges: 2

Implementation Features

Our implementation includes several important features:

  • Template-based Design: Supports different vertex ID and weight types
  • Efficient Edge Lookup: Uses unordered_map for O(1) average case vertex access
  • Memory Efficient: Only stores existing edges
  • Exception Safety: Proper validation and error handling
  • Modern C++: Uses STL containers and algorithms

Performance Tips:

  • Use reserve() when adding many edges to avoid reallocations
  • Consider using a custom allocator for very large graphs
  • Profile your specific use case to choose between vector and unordered_set for edge storage
  • For extremely large graphs, consider using compressed storage formats

Eigen Implementation

While our custom implementation is efficient for many use cases, the Eigen library provides highly optimized sparse matrix capabilities that we can leverage for graph representation. Let's explore how to implement an adjacency list using Eigen's sparse matrix features.

Eigen-based Adjacency List Implementation
// Required Eigen headers
#include <Eigen/Sparse>  // For sparse matrix functionality
#include <vector>       // For storing vertex data
#include <unordered_map> // For vertex mapping
#include <stdexcept>    // For exception handling
#include <iostream> // For Input/Output
/**
 * @brief Graph implementation using Eigen's sparse matrix capabilities
 * @tparam T Type of edge weights (double, float, etc.)
 * @tparam StorageOrder Matrix storage order (ColMajor for better column slicing,
 *                      RowMajor for better row access)
 */
template<typename T = double,
         int StorageOrder = Eigen::ColMajor>
class EigenGraph {
public:
    // Type definitions for cleaner code
    using SparseMatrix = Eigen::SparseMatrix<T, StorageOrder>;
    using Triplet = Eigen::Triplet<T>;
    using Index = typename SparseMatrix::Index;

private:
    SparseMatrix adjacencyMatrix;    // Sparse matrix storing edges
    std::vector<bool> vertexExists;   // Track existing vertices
    size_t numEdges = 0;            // Track number of edges
    bool isDirected;                // Whether graph is directed

    /**
     * @brief Validates a vertex index
     * @param vertex Index to validate
     * @throws std::out_of_range if index is invalid
     */
    void validateVertex(Index vertex) const {
        if (vertex >= adjacencyMatrix.rows() || !vertexExists[vertex]) {
            throw std::out_of_range("Invalid vertex index");
        }
    }

public:
    /**
     * @brief Constructs a graph with initial capacity
     * @param initialCapacity Initial number of vertices to allocate
     * @param directed Whether the graph is directed
     */
    explicit EigenGraph(Index initialCapacity = 10, bool directed = false)
        : adjacencyMatrix(initialCapacity, initialCapacity)
        , vertexExists(initialCapacity, false)
        , isDirected(directed) {
        adjacencyMatrix.reserve(Eigen::VectorXi::Constant(initialCapacity, 5));
    }

    /**
     * @brief Adds a new vertex to the graph
     * @return Index of the new vertex
     */
    Index addVertex() {
        Index newIndex = 0;
        for (; newIndex < vertexExists.size(); ++newIndex) {
            if (!vertexExists[newIndex]) break;
        }

        if (newIndex == vertexExists.size()) {
            Index newSize = std::max<Index>(1, vertexExists.size() * 2);
            vertexExists.resize(newSize, false);

            SparseMatrix newMatrix(newSize, newSize);
            newMatrix.reserve(Eigen::VectorXi::Constant(newSize, 5));

            // Copy existing elements
            if (adjacencyMatrix.nonZeros() > 0) {
                newMatrix = adjacencyMatrix;
            }
            adjacencyMatrix.swap(newMatrix);
        }

        vertexExists[newIndex] = true;
        return newIndex;
    }

    /**
     * @brief Adds an edge between vertices
     * @param from Source vertex index
     * @param to Target vertex index
     * @param weight Edge weight
     * @throws std::out_of_range if either vertex is invalid
     */
    void addEdge(Index from, Index to, T weight = T(1)) {
        validateVertex(from);
        validateVertex(to);

        if (adjacencyMatrix.coeff(from, to) == T(0)) {
            adjacencyMatrix.coeffRef(from, to) = weight;
            if (!isDirected && from != to) {
                adjacencyMatrix.coeffRef(to, from) = weight;
            }
            numEdges += isDirected ? 1 : 2;
        }
    }

    /**
     * @brief Gets neighbors of a vertex using Eigen's sparse matrix features
     * @param vertex Vertex to get neighbors for
     * @return Vector of neighbor indices and their weights
     * @throws std::out_of_range if vertex is invalid
     */
    std::vector<std::pair<Index, T>> getNeighbors(Index vertex) const {
        validateVertex(vertex);
        std::vector<std::pair<Index, T>> neighbors;

        for (typename SparseMatrix::InnerIterator it(adjacencyMatrix, vertex);
             it; ++it) {
            if (it.row() != vertex) {  // Skip self-loops
                neighbors.emplace_back(it.row(), it.value());
            }
        }

        return neighbors;
    }

    /**
     * @brief Gets the weight of an edge
     * @param from Source vertex
     * @param to Target vertex
     * @return Edge weight, or 0 if edge doesn't exist
     */
    T getWeight(Index from, Index to) const {
        validateVertex(from);
        validateVertex(to);
        return adjacencyMatrix.coeff(from, to);
    }

    /**
     * @brief Performs efficient batch edge insertion
     * @param edges Vector of (from, to, weight) tuples
     */
    template<typename IndexType>
    void addEdgesBatch(const std::vector<std::tuple<IndexType, IndexType, T>>& edges) {
        std::vector<Triplet> triplets;
        triplets.reserve(isDirected ? edges.size() : edges.size() * 2);

        for (const auto& [from, to, weight] : edges) {
            validateVertex(static_cast<Index>(from));
            validateVertex(static_cast<Index>(to));

            triplets.emplace_back(static_cast<Index>(from),
                                static_cast<Index>(to), weight);
            if (!isDirected && from != to) {
                triplets.emplace_back(static_cast<Index>(to),
                                    static_cast<Index>(from), weight);
            }
        }

        adjacencyMatrix.setFromTriplets(triplets.begin(), triplets.end());
    }
};

Let's see how to use this updated Eigen-based implementation:

Eigen Implementation Usage Example
int main() {
    // Create a directed graph with initial capacity of 5 vertices
    EigenGraph<double> graph(5, true);

    // Add vertices
    std::vector<int> vertices;
    for (int i = 0; i < 5; ++i) {
        vertices.push_back(graph.addVertex());
    }

    // Add edges using batch insertion for better performance
    std::vector<std::tuple<int, int, double>> edges = {
        {0, 1, 2.5}, {0, 2, 1.0},
        {1, 3, 3.5}, {2, 3, 2.0},
        {3, 4, 1.5}
    };
    graph.addEdgesBatch(edges);

    // Print neighbors and weights for vertex 3
    std::cout << "Neighbors of vertex 3:\n";
    for (const auto& [neighbor, weight] : graph.getNeighbors(3)) {
        std::cout << "  → " << neighbor
                  << " (weight: " << weight << ")\n";
    }

    return 0;
}
Neighbors of vertex 3:
  → 1 (weight: 3.5)
  → 2 (weight: 2)

Eigen Implementation Benefits

  • Optimized Storage: Eigen's sparse matrix storage is highly optimized for memory usage
  • SIMD Operations: Eigen can leverage SIMD instructions for better performance
  • Matrix Operations: Easy integration with Eigen's rich set of matrix operations
  • Batch Processing: Efficient batch edge insertion using triplets
  • Memory Management: Automatic memory handling and optimization

Implementation Notes:

  • Choose RowMajor for graphs where you frequently access outgoing edges
  • Use ColMajor for graphs where you frequently need incoming edges
  • Batch edge insertions whenever possible for better performance
  • Consider using Eigen's pruning functions to remove deleted edges

Performance Considerations

When implementing adjacency lists for large sparse graphs, several key optimization strategies can significantly improve performance. Here are practical considerations for both custom and Eigen-based implementations.

Memory Management

  • Container Pre-allocation:
    • Estimate average vertex degree and pre-allocate edge lists
    • Use reserve() on vectors to prevent frequent reallocations
    • Consider shrink_to_fit() after bulk insertions to reclaim memory
  • Memory Layout:
    • Keep frequently accessed data (vertex IDs, weights) together
    • Consider struct-of-arrays vs array-of-structs based on access patterns
    • Align data structures to cache line boundaries for better access
Memory-Optimized Adjacency List Implementation
// Required standard library headers
#include <vector>       // For dynamic array containers
#include <cstdint>      // For fixed-width integer types (uint32_t)
#include <tuple>        // For std::tuple in batch operations
#include <stdexcept>    // For exceptions
#include <iostream>     // For output in examples
#include <algorithm>    // For std::fill, std::find_if
#include <new>          // For aligned new operator
#include <memory>       // For smart pointers and memory utilities
#include <optional>     // For optional values in error handling
#include <system_error> // For error codes and conditions

/**
 * Custom exceptions for graph-specific errors
 */
class graph_error : public std::runtime_error {
    using std::runtime_error::runtime_error;
};

class duplicate_edge_error : public graph_error {
public:
    duplicate_edge_error() : graph_error("Edge already exists") {}
};

class invalid_edge_error : public graph_error {
public:
    invalid_edge_error(const std::string& msg) : graph_error(msg) {}
};

/**
 * Edge structure optimized for cache access
 * - Aligned to 64-byte cache line boundary for optimal memory access
 * - Frequently accessed data kept together
 * - Compact representation for better cache utilization
 */
struct alignas(64) Edge {
    uint32_t targetVertex;       // Target vertex ID
    float weight;               // Edge weight (use std::optional if weights are optional)
    bool valid = true;          // Flag for soft deletion

    Edge(uint32_t target, float w) : targetVertex(target), weight(w) {}

    // Equality comparison for finding duplicate edges
    bool operator==(const Edge& other) const {
        return targetVertex == other.targetVertex;
    }
};

/**
 * Adjacency list implementation with memory optimizations and robust error handling
 * Demonstrates best practices for memory management in graph data structures
 */
class OptimizedAdjacencyList {
private:
    // Configuration constants
    static constexpr size_t AVG_DEGREE = 8;      // Typical average degree for pre-allocation
    static constexpr size_t MAX_VERTICES = 1e6;  // Maximum allowed vertices

    /**
     * Vertex data using Struct-of-Arrays pattern
     * Groups related data together for better cache utilization
     */
    struct VertexData {
        std::vector<Edge> edges;     // Edge list with pre-allocated capacity
        uint32_t inDegree = 0;      // Track incoming edges for quick access
        uint32_t outDegree = 0;     // Track outgoing edges for quick access
        bool active = false;        // Vertex state without deletion overhead

        // Track memory allocation status
        std::error_code lastError;
    };

    std::vector<VertexData> vertices;    // Main container, pre-allocated
    size_t activeVertexCount = 0;        // Track active vertex count
    bool allowSelfLoops;                 // Configuration flag for self-loops

    /**
     * Validates a vertex exists and is active
     * @throws std::out_of_range if vertex index is invalid
     * @throws graph_error if vertex is inactive
     */
    void validateVertex(uint32_t vertex) const {
        if (vertex >= vertices.size()) {
            throw std::out_of_range("Vertex index out of bounds");
        }
        if (!vertices[vertex].active) {
            throw graph_error("Vertex is not active");
        }
    }

    /**
     * Checks if an edge already exists
     * @return true if edge exists, false otherwise
     */
    bool hasEdge(uint32_t from, uint32_t to) const {
        const auto& fromEdges = vertices[from].edges;
        return std::find_if(fromEdges.begin(), fromEdges.end(),
            [to](const Edge& e) { return e.valid && e.targetVertex == to; }
        ) != fromEdges.end();
    }

public:
    /**
     * Constructor with memory pre-allocation and configuration
     * @param initialVertexCapacity Expected number of vertices
     * @param allowLoops Whether self-loops are allowed (default: false)
     * @throws std::invalid_argument if capacity exceeds maximum
     * @throws std::bad_alloc if memory allocation fails
     */
    explicit OptimizedAdjacencyList(size_t initialVertexCapacity = 1000,
                                  bool allowLoops = false)
        : allowSelfLoops(allowLoops) {
        if (initialVertexCapacity > MAX_VERTICES) {
            throw std::invalid_argument("Vertex capacity exceeds maximum limit");
        }
        try {
            // Pre-allocate vertex storage to prevent reallocation
            vertices.reserve(initialVertexCapacity);
            vertices.resize(initialVertexCapacity);

            // Pre-allocate edge storage for each vertex
            for (auto& vertex : vertices) {
                vertex.edges.reserve(AVG_DEGREE);
            }
        } catch (const std::bad_alloc& e) {
            throw std::bad_alloc();
        }
    }

    /**
     * Add vertex with custom edge capacity estimation
     * @param expectedDegree Expected number of edges for this vertex
     * @return Index of the new vertex
     * @throws std::bad_alloc if memory allocation fails
     */
    uint32_t addVertex(size_t expectedDegree = AVG_DEGREE) {
        try {
            // Reuse inactive vertex slots when available
            for (size_t i = 0; i < vertices.size(); ++i) {
                if (!vertices[i].active) {
                    vertices[i].active = true;
                    vertices[i].edges.reserve(expectedDegree);
                    activeVertexCount++;
                    return i;
                }
            }

            // Create new vertex when no inactive slots available
            if (vertices.size() >= MAX_VERTICES) {
                throw graph_error("Maximum vertex limit reached");
            }

            vertices.push_back(VertexData{});
            vertices.back().edges.reserve(expectedDegree);
            vertices.back().active = true;
            activeVertexCount++;
            return vertices.size() - 1;
        } catch (const std::bad_alloc& e) {
            throw std::bad_alloc();
        }
    }

    /**
     * Add single edge with validation
     * @param from Source vertex
     * @param to Target vertex
     * @param weight Edge weight
     * @throws various exceptions for invalid operations
     */
    void addEdge(uint32_t from, uint32_t to, float weight) {
        // Validate vertices
        validateVertex(from);
        validateVertex(to);

        // Check self-loops
        if (from == to && !allowSelfLoops) {
            throw invalid_edge_error("Self-loops are not allowed");
        }

        // Check for duplicate edge
        if (hasEdge(from, to)) {
            throw duplicate_edge_error();
        }

        try {
            vertices[from].edges.emplace_back(to, weight);
            vertices[from].outDegree++;
            vertices[to].inDegree++;
        } catch (const std::bad_alloc& e) {
            throw std::bad_alloc();
        }
    }

    /**
     * Memory-efficient batch edge addition with validation
     * @param edges Vector of (from, to, weight) tuples
     * @throws various exceptions for invalid operations
     */
    void addEdgesBatch(const std::vector<std::tuple<uint32_t, uint32_t, float>>& edges) {
        if (edges.empty()) return;

        try {
            // Validate all edges first
            for (const auto& [from, to, _] : edges) {
                validateVertex(from);
                validateVertex(to);
                if (from == to && !allowSelfLoops) {
                    throw invalid_edge_error("Self-loops are not allowed");
                }
                if (hasEdge(from, to)) {
                    throw duplicate_edge_error();
                }
            }

            // Pre-calculate capacity requirements
            std::vector<size_t> newEdgeCounts(vertices.size(), 0);
            for (const auto& [from, to, _] : edges) {
                newEdgeCounts[from]++;
            }

            // Pre-allocate required capacity
            for (size_t i = 0; i < vertices.size(); ++i) {
                if (newEdgeCounts[i] > 0) {
                    vertices[i].edges.reserve(
                        vertices[i].edges.size() + newEdgeCounts[i]
                    );
                }
            }

            // Add all edges
            for (const auto& [from, to, weight] : edges) {
                vertices[from].edges.emplace_back(to, weight);
                vertices[from].outDegree++;
                vertices[to].inDegree++;
            }
        } catch (const std::bad_alloc& e) {
            throw std::bad_alloc();
        }
    }

    /**
     * Optimize memory usage by releasing excess capacity
     * - Compacts edge lists that are significantly over-allocated
     * - Updates capacity tracking
     * - Handles empty vertices
     * @return true if optimization was performed, false if no optimization needed
     */
    bool optimizeMemory() {
        bool optimized = false;
        try {
            for (auto& vertex : vertices) {
                if (!vertex.active) continue;

                // Only shrink if capacity is significantly larger than size
                if (vertex.edges.capacity() > vertex.edges.size() * 2) {
                    vertex.edges.shrink_to_fit();
                    optimized = true;
                }

                // Remove invalidated edges if any
                if (std::any_of(vertex.edges.begin(), vertex.edges.end(),
                              [](const Edge& e) { return !e.valid; })) {
                    auto newEnd = std::remove_if(vertex.edges.begin(),
                                               vertex.edges.end(),
                                               [](const Edge& e) { return !e.valid; });
                    vertex.edges.erase(newEnd, vertex.edges.end());
                    optimized = true;
                }
            }
        } catch (const std::bad_alloc& e) {
            // If optimization fails, original data remains intact
            return false;
        }
        return optimized;
    }

    /**
     * Structure for monitoring memory usage
     */
    struct MemoryStats {
        size_t totalVertices;    // Total vertex slots
        size_t activeVertices;   // Currently active vertices
        size_t totalEdges;       // Total valid edges in graph
        size_t totalCapacity;    // Total edge capacity
        float memoryEfficiency;  // Ratio of used to allocated space

        // Constructor ensures safe calculation of memory efficiency
        MemoryStats(size_t tv, size_t av, size_t te, size_t tc)
            : totalVertices(tv)
            , activeVertices(av)
            , totalEdges(te)
            , totalCapacity(tc)
            , memoryEfficiency(tc > 0 ? (float)te / tc * 100 : 0.0f)
        {}
    };

    /**
     * Get current memory usage statistics
     * @return MemoryStats structure with usage information
     */
    MemoryStats getMemoryStats() const {
        size_t totalEdges = 0;
        size_t totalCapacity = 0;

        for (const auto& vertex : vertices) {
            if (vertex.active) {
                totalEdges += std::count_if(vertex.edges.begin(),
                                          vertex.edges.end(),
                                          [](const Edge& e) { return e.valid; });
                totalCapacity += vertex.edges.capacity();
            }
        }

        return MemoryStats(
            vertices.size(),
            activeVertexCount,
            totalEdges,
            totalCapacity
        );
    }
};
Usage Example with Error Handling
int main() {
    try {
        // Create graph with initial capacity and no self-loops
        OptimizedAdjacencyList graph(1000, false);

        // Add vertices with custom expected degrees
        uint32_t highDegreeVertex = graph.addVertex(100);
        uint32_t normalVertex = graph.addVertex();

        try {
            // Attempt to add a self-loop (will throw)
            graph.addEdge(highDegreeVertex, highDegreeVertex, 1.0f);
        } catch (const invalid_edge_error& e) {
            std::cerr << "Error: " << e.what() << '\n';
        }

        // Prepare batch of edges
        std::vector<std::tuple<uint32_t, uint32_t, float>> edges;
        edges.reserve(50);

        // Add 50 unique edges
        for (uint32_t i = 0; i < 50; ++i) {
            uint32_t newVertex = graph.addVertex();
            edges.emplace_back(highDegreeVertex, newVertex, 1.0f);
        }

        // Add edges in batch
        try {
            graph.addEdgesBatch(edges);
        } catch (const duplicate_edge_error& e) {
            std::cerr << "Error: " << e.what() << '\n';
        }

        // Check and display memory usage
        auto stats = graph.getMemoryStats();
        std::cout << "Memory Statistics:\n"
                  << "Active vertices: " << stats.activeVertices << '\n'
                  << "Total edges: " << stats.totalEdges << '\n'
                  << "Memory efficiency: " << stats.memoryEfficiency << "%\n";

        // Optimize memory and check if optimization occurred
        if (graph.optimizeMemory()) {
            std::cout << "Memory optimization performed\n";
            // Get updated stats
            stats = graph.getMemoryStats();
            std::cout << "New memory efficiency: "
                      << stats.memoryEfficiency << "%\n";
        }

    } catch (const std::bad_alloc& e) {
        std::cerr << "Memory allocation failed\n";
        return 1;
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << '\n';
        return 1;
    }

    return 0;
}

Implementation Features:

  • Comprehensive error handling with custom exceptions
  • Safe memory efficiency calculations
  • Validation for self-loops and duplicate edges
  • Proper memory management with allocation failure handling
  • Detailed memory statistics and optimization tracking
Error: Self-loops are not allowed
Memory Statistics:
Active vertices: 52
Total edges: 50
Memory efficiency: 9.84252%
Memory optimization performed
New memory efficiency: 50%

Understanding the Output

  • Initial Error Message
    • Shows our error handling working correctly
    • Prevented an invalid self-loop operation as configured
  • Memory Statistics Before Optimization
    • Active vertices: 52
      • 2 initial vertices (high-degree and normal)
      • 50 additional vertices created for edge connections
    • Total edges: 50
      • Represents successful batch addition of edges
      • One edge to each new vertex
    • Initial efficiency: 9.84252%
      • Low efficiency due to pre-allocated space across all vertices
      • Each vertex has default capacity even if minimally used
  • After Memory Optimization
    • Efficiency improved to 50%
      • Result of shrinking excess capacity
      • Better alignment between allocated space and actual usage

Key Observations: The dramatic improvement in memory efficiency (from ~10% to 50%) demonstrates the effectiveness of the optimizeMemory() operation in reclaiming over-allocated space. This is particularly important in sparse graphs where many vertices may have fewer edges than the default pre-allocation.

Memory Optimization Tips:

  • Pre-allocation: Reserve memory upfront based on expected graph size and connectivity patterns to minimize reallocation overhead.
  • Batch Operations: Group multiple edge additions together to reduce memory allocations and improve cache utilization.
  • Memory Optimization: Call optimizeMemory() after large operations to release unused capacity and consolidate memory usage.
  • Usage Monitoring: Regularly track memory statistics to identify inefficiencies and guide optimization decisions.

Performance Testing Tips:

  • Profiling: Use performance profiling tools to identify actual bottlenecks before optimizing.
  • Cache Analysis: Monitor cache miss rates and memory access patterns to guide optimization efforts.
  • Realistic Testing: Benchmark with production-like data sizes and access patterns.
  • Performance Metrics: Consider both throughput and latency requirements in your optimization strategy.

Common Pitfalls to Avoid

  • Premature Optimization: Always measure and profile before implementing complex optimizations.
  • Early Parallelization: Ensure sequential performance is optimized before adding concurrency.
  • Memory Churn: Minimize dynamic allocations during graph traversal and modification operations.
  • Cache Ignorance: Consider cache behavior when designing data structures and access patterns.
  • Container Complexity: Use simple vectors when possible instead of more complex container types.

Real-World Use Cases

Adjacency lists for sparse graphs find applications across numerous domains. Here are some prominent real-world use cases where this data structure excels:

Social Network Analysis

  • User Modeling: Represent each user as a vertex in the graph, enabling efficient social connection analysis and storage.
  • Friendship Mapping: Model relationships as edges between vertices, providing natural representation of social connections.
  • Community Detection: Efficiently identify and analyze groups of closely connected users through neighborhood analysis.
  • Influence Analysis: Track and measure user influence by analyzing connection patterns and interaction weights.
  • Professional Networking: Model career connections and endorsements to analyze professional relationship patterns.
  • Career Tracking: Analyze career progression paths by examining temporal patterns in professional connections.

Transportation Networks

  • Road Mapping: Represent intersections as vertices and roads as weighted edges for efficient navigation systems.
  • Traffic Analysis: Use edge weights to model travel times and traffic conditions for real-time routing.
  • Route Planning: Enable efficient pathfinding algorithms through sparse graph representation of road networks.
  • Transit Systems: Model stations as vertices and routes as edges with schedule information for journey planning.
  • Multi-modal Transport: Combine different transportation types in a single graph for comprehensive route planning.
  • Service Optimization: Analyze usage patterns to optimize transit schedules and resource allocation.

Computer Networks

  • Network Structure: Model devices as vertices and connections as edges with bandwidth and latency weights.
  • Routing Optimization: Enable efficient path finding for network packet routing and traffic management.
  • Capacity Planning: Analyze network topology to optimize resource allocation and predict bottlenecks.
  • Data Center Design: Model server interconnections for efficient load balancing and resource distribution.
  • Fault Tolerance: Analyze network redundancy and identify critical paths for reliability planning.
  • Performance Analysis: Track and optimize network performance through graph-based metrics.

Scientific Computing

  • Molecular Analysis: Represent atoms as vertices and chemical bonds as edges for structure analysis.
  • Protein Networks: Model protein interactions for drug discovery and biochemical research.
  • Neural Modeling: Represent neurons as vertices and synapses as weighted edges for network analysis.
  • Sparse Networks: Efficiently implement sparse neural networks with optimized memory usage.
  • Research Applications: Support complex scientific computations with efficient graph operations.
  • Data Analysis: Enable efficient processing of large-scale scientific datasets.

Domain-Specific Implementation Tips:

  • Social Analysis: Optimize for rapid neighborhood queries and community structure detection in large user networks.
  • Transport Systems: Focus on real-time pathfinding algorithms and dynamic edge weight updates.
  • Network Infrastructure: Emphasize low-latency operations and robust fault tolerance mechanisms.
  • Scientific Applications: Prioritize numerical stability and efficient parallel processing capabilities.

Common Requirements Across Domains

  • Scale Management: Handle large-scale graphs with millions of vertices while maintaining performance.
  • Dynamic Updates: Support efficient modification of graph structure without compromising performance.
  • Memory Efficiency: Optimize memory usage through appropriate data structures and algorithms.
  • Query Performance: Ensure fast neighborhood queries and efficient graph traversal operations.
  • Update Speed: Maintain quick edge modification capabilities for dynamic graph changes.
  • Operational Stability: Provide consistent performance under varying load conditions.
  • Data Integrity: Ensure reliable updates and proper error handling during modifications.
  • Recovery Mechanisms: Implement robust error recovery and data consistency measures.

Conclusion

Adjacency lists provide a powerful and memory-efficient way to represent sparse graphs in C++. We've covered essential concepts and implementations, from basic custom implementations to advanced Eigen-based solutions. While this representation excels in sparse graphs and offers memory efficiency, it's important to consider your specific use case when choosing between adjacency lists and other graph representations like adjacency matrices.

Key takeaways from this guide:

  • Adjacency lists offer O(|V| + |E|) space complexity, making them ideal for sparse graphs
  • They're particularly effective for dynamic graphs and neighborhood queries
  • Both custom and Eigen implementations can be optimized for specific requirements
  • Memory patterns and cache utilization are crucial for performance

Congratulations on reading to the end of this comprehensive guide! If you found this guide valuable for your C++ journey, please consider citing or sharing it with fellow developers. Your support helps us continue creating comprehensive C++ resources for the development community.

Be sure to explore the Further Reading section for additional resources on graph theory, adjacency lists, and modern C++ practices.

Have fun and happy coding!

Further Reading

C++ Learning Resources

  • C++ Solutions

    Master modern C++ with comprehensive tutorials and practical solutions. From core concepts to advanced techniques, explore clear examples and best practices for efficient, high-performance programming.

  • Online C++ Compiler

    Write, compile, and run your C++ code directly in your browser. Perfect for experimenting with adjacency lists and testing code snippets without setting up a local development environment.

Graph Theory Implementation Guides

Theoretical Foundations

Library Documentation and Tools

Performance and Optimization

Attribution and Citation

If you found this guide helpful, please consider citing it in your work!

Profile Picture
Senior Advisor, Data Science | [email protected] |  + posts

Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.

Buy Me a Coffee ✨