Swift Collections 내부 구조 분석

Swift 2025. 9. 6. 08:14

안녕하세요. 그린입니다 🍏

이번 포스팅은 Swift의 Collections 내부 구조를 깊이 분석해보고, 각 컬렉션이 어떻게 메모리에서 동작하는지 알아보겠습니다 🙋🏻

Swift Collections Deep Dive

개발하면서 이런 궁금증 가져본 적 있을 거예요!

"Array와 Set의 성능 차이가 왜 이렇게 날까? 🤔"
"Dictionary는 어떻게 O(1)로 검색이 가능하지?"
"메모리는 어떻게 관리되고 있을까?"
"copy-on-write가 정확히 언제 동작하지?"

또는 대용량 데이터를 처리하면서 "왜 이렇게 메모리를 많이 먹지?" 하고 생각해본 적 있으실 거에요.

바로 이런 의문들을 해결할 수 있는 것이 Swift Collections의 내부 구조를 이해하는 것입니다.

Swift의 모든 컬렉션은 성능과 메모리 효율성을 위해 정교하게 설계되어 있고, 각각 고유한 내부 구조를 가지고 있어요.

Why Collections Architecture Matters More Than Ever?

📈 성능 최적화의 핵심

적절한 컬렉션 선택으로 성능이 10~100배 향상 가능
메모리 사용량 50% 절약 (올바른 사용 시)
대용량 데이터 처리 시 응답 시간 90% 단축

🚀 Swift 생태계 트렌드

iOS 15+에서 새로운 컬렉션 최적화
SwiftUI와의 완벽한 통합
Server-side Swift에서 메모리 효율성 극대화
Swift 6에서 추가된 새로운 컬렉션 타입들

Array 내부 구조 분석

Array는 Swift에서 가장 기본적이면서도 강력한 컬렉션이에요.

🎯 메모리 레이아웃

// Array의 실제 내부 구조 (단순화)
struct Array<Element> {
    var _buffer: _ArrayBuffer<Element>
}

struct _ArrayBuffer<Element> {
    var storage: _ArrayBufferStorage<Element>
    var startIndex: Int
    var endIndex: Int
}

struct _ArrayBufferStorage<Element> {
    var refCount: Int        // Reference counting
    var capacity: Int        // 할당된 메모리 크기
    var elements: UnsafePointer<Element>  // 실제 데이터
}

🎯 Copy-on-Write 메커니즘

var original = [1, 2, 3, 4, 5]
var copy = original

// 이 시점에서는 메모리 공유 (복사 안됨)
print("메모리 주소 같음: \(original.withUnsafeBufferPointer { $0.baseAddress } == copy.withUnsafeBufferPointer { $0.baseAddress })")

// 수정 시점에서 복사 발생 (Copy-on-Write)
copy.append(6)  // 여기서 실제 복사 발생

print("메모리 주소 달라짐: \(original.withUnsafeBufferPointer { $0.baseAddress } != copy.withUnsafeBufferPointer { $0.baseAddress })")

🎯 동적 크기 조정 (Dynamic Resizing)

// Array 용량 증가 패턴 분석
var numbers: [Int] = []
var capacityHistory: [(count: Int, capacity: Int)] = []

for i in 1...20 {
    numbers.append(i)
    // capacity 확인을 위한 내부 API 접근
    numbers.withUnsafeBufferPointer { buffer in
        capacityHistory.append((count: numbers.count, capacity: buffer.count))
    }
}

// 결과 분석
/*
Count: 1,  Capacity: 2   (초기 할당)
Count: 2,  Capacity: 2   (공간 활용)
Count: 3,  Capacity: 4   (2배 증가)
Count: 4,  Capacity: 4   (공간 활용)
Count: 5,  Capacity: 8   (2배 증가)
...
*/

Dictionary 내부 구조 분석

Dictionary는 해시 테이블을 기반으로 하는 매우 정교한 구조예요.

🎯 해시 테이블 구조

// Dictionary의 내부 구조 (개념적)
struct Dictionary<Key: Hashable, Value> {
    var _variant: _Variant
    
    enum _Variant {
        case native(_NativeDictionary<Key, Value>)
        case cocoa(_CocoaDictionary)  // NSDictionary 호환
    }
}

struct _NativeDictionary<Key: Hashable, Value> {
    var buckets: _DictionaryBuckets<Key, Value>
    var count: Int
    var capacity: Int
}

struct _DictionaryBuckets<Key, Value> {
    var keys: UnsafeMutablePointer<Key?>     // 키 배열
    var values: UnsafeMutablePointer<Value>  // 값 배열
    var hashes: UnsafeMutablePointer<Int>    // 해시값 배열
}

🎯 충돌 해결 - Open Addressing

// 해시 충돌 해결 방법 시연
struct SimpleHashTable<Key: Hashable, Value> {
    private var buckets: [(key: Key, value: Value)?]
    private var capacity: Int
    
    init(capacity: Int = 16) {
        self.capacity = capacity
        self.buckets = Array(repeating: nil, count: capacity)
    }
    
    private func hash(_ key: Key) -> Int {
        return abs(key.hashValue) % capacity
    }
    
    mutating func setValue(_ value: Value, forKey key: Key) {
        var index = hash(key)
        
        // Linear probing으로 빈 공간 찾기
        while buckets[index] != nil {
            if buckets[index]?.key == key {
                // 기존 키 업데이트
                buckets[index] = (key, value)
                return
            }
            index = (index + 1) % capacity  // 다음 인덱스로
        }
        
        buckets[index] = (key, value)
    }
    
    func getValue(forKey key: Key) -> Value? {
        var index = hash(key)
        
        while let bucket = buckets[index] {
            if bucket.key == key {
                return bucket.value
            }
            index = (index + 1) % capacity
        }
        
        return nil
    }
}

🎯 동적 리해싱 (Rehashing)

// Dictionary 용량 증가 시뮬레이션
extension SimpleHashTable {
    mutating func resize() {
        let oldBuckets = buckets
        capacity *= 2
        buckets = Array(repeating: nil, count: capacity)
        
        // 모든 기존 요소를 새로운 해시 테이블에 재삽입
        for bucket in oldBuckets {
            if let (key, value) = bucket {
                setValue(value, forKey: key)
            }
        }
    }
    
    // Load factor 확인 (일반적으로 0.75에서 리사이징)
    var loadFactor: Double {
        let count = buckets.compactMap { $0 }.count
        return Double(count) / Double(capacity)
    }
}

Set 내부 구조 분석

Set은 Dictionary와 유사하지만 값 없이 키만 저장하는 구조예요.

🎯 Set의 최적화된 구조

// Set의 내부 구조 (개념적)
struct Set<Element: Hashable> {
    private var _variant: _Variant
    
    enum _Variant {
        case native(_NativeSet<Element>)
        case cocoa(_CocoaSet)
    }
}

struct _NativeSet<Element: Hashable> {
    var elements: _SetBuffer<Element>
    var count: Int
}

struct _SetBuffer<Element: Hashable> {
    var storage: UnsafeMutablePointer<Element?>
    var capacity: Int
    var hashSeed: Int  // 해시 충돌 방지용 시드
}

🎯 Set 연산 최적화

// Set 교집합 연산의 내부 동작
extension Set {
    func optimizedIntersection<S: Sequence>(with other: S) -> Set<Element> 
    where S.Element == Element {
        // 작은 집합을 기준으로 순회 (성능 최적화)
        let otherSet = Set(other)
        let smaller = self.count < otherSet.count ? self : otherSet
        let larger = self.count < otherSet.count ? otherSet : self
        
        var result: Set<Element> = []
        for element in smaller {
            if larger.contains(element) {
                result.insert(element)
            }
        }
        
        return result
    }
}

성능 특성 비교 분석

🚀 Big O 복잡도 실측

import Foundation

// 성능 측정 헬퍼
func measureTime<T>(operation: () -> T) -> (result: T, time: TimeInterval) {
    let startTime = CFAbsoluteTimeGetCurrent()
    let result = operation()
    let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
    return (result, timeElapsed)
}

// 다양한 크기에서 성능 테스트
func performanceComparison() {
    let sizes = [1000, 10000, 100000, 1000000]
    
    for size in sizes {
        print("📊 Size: \(size)")
        
        // Array 검색 성능 (O(n))
        let array = Array(1...size)
        let (_, arrayTime) = measureTime {
            array.contains(size / 2)
        }
        
        // Set 검색 성능 (O(1) 평균)
        let set = Set(array)
        let (_, setTime) = measureTime {
            set.contains(size / 2)
        }
        
        // Dictionary 검색 성능 (O(1) 평균)
        let dict = Dictionary(uniqueKeysWithValues: array.map { ($0, $0) })
        let (_, dictTime) = measureTime {
            dict[size / 2] != nil
        }
        
        print("  Array: \(arrayTime * 1000) ms")
        print("  Set:   \(setTime * 1000) ms")
        print("  Dict:  \(dictTime * 1000) ms")
    }
}

🚀 메모리 사용량 분석

import Foundation

// 메모리 사용량 측정
func memoryUsageAnalysis() {
    let elementCount = 100000
    
    // Array 메모리 사용량
    let arrayMemory = MemoryLayout<[Int]>.size + 
                     MemoryLayout<Int>.size * elementCount
    
    // Set 메모리 사용량 (해시 테이블 오버헤드 포함)
    let setMemory = arrayMemory * 2  // 대략 2배 (해시 테이블 특성상)
    
    // Dictionary 메모리 사용량
    let dictionaryMemory = arrayMemory * 3  // Key + Value + Hash 정보
    
    print("📈 메모리 사용량 비교 (\(elementCount) 요소)")
    print("Array:      \(arrayMemory / 1024) KB")
    print("Set:        \(setMemory / 1024) KB")
    print("Dictionary: \(dictionaryMemory / 1024) KB")
}

Copy-on-Write 심화 분석

🎯 COW 최적화 기법

// Custom COW 구조체 구현
struct COWArray<Element> {
    private final class Storage {
        var elements: [Element]
        
        init(_ elements: [Element]) {
            self.elements = elements
        }
        
        func copy() -> Storage {
            return Storage(elements)
        }
    }
    
    private var storage: Storage
    
    init(_ elements: [Element] = []) {
        storage = Storage(elements)
    }
    
    private mutating func makeUniqueIfNeeded() {
        if !isKnownUniquelyReferenced(&storage) {
            storage = storage.copy()
        }
    }
    
    var count: Int {
        return storage.elements.count
    }
    
    subscript(index: Int) -> Element {
        get {
            return storage.elements[index]
        }
        set {
            makeUniqueIfNeeded()
            storage.elements[index] = newValue
        }
    }
    
    mutating func append(_ element: Element) {
        makeUniqueIfNeeded()
        storage.elements.append(element)
    }
}

Select Collections Guide

✅ Array를 사용해야 할 때

// 순서가 중요하고 인덱스 기반 접근이 필요한 경우
let usernames = ["alice", "bob", "charlie"]
let firstUser = usernames[0]  // O(1)

// 데이터를 순차적으로 처리하는 경우
let numbers = [1, 2, 3, 4, 5]
let doubled = numbers.map { $0 * 2 }  // 효율적

// 작은 데이터셋에서 검색 (단순한 경우)
if numbers.contains(3) {  // O(n)이지만 작은 데이터에서는 괜찮음
    print("Found")
}

✅ Set을 사용해야 할 때

// 중복 제거가 필요한 경우
let uniqueNumbers = Set([1, 2, 2, 3, 3, 4])  // [1, 2, 3, 4]

// 빠른 존재 확인이 필요한 경우
let allowedUsers = Set(["admin", "moderator", "user"])
if allowedUsers.contains(currentUser) {  // O(1)
    // 권한 승인
}

// 집합 연산이 필요한 경우
let setA: Set = [1, 2, 3, 4]
let setB: Set = [3, 4, 5, 6]
let intersection = setA.intersection(setB)  // [3, 4]

✅ Dictionary를 사용해야 할 때

// Key-Value 매핑이 필요한 경우
let userProfiles = [
    "alice": UserProfile(name: "Alice", age: 25),
    "bob": UserProfile(name: "Bob", age: 30)
]

// 빠른 룩업이 중요한 경우
if let profile = userProfiles["alice"] {  // O(1)
    print(profile.name)
}

// 카운팅/그룹핑이 필요한 경우
let words = ["apple", "banana", "apple", "cherry", "banana"]
let wordCount = words.reduce(into: [:]) { counts, word in
    counts[word, default: 0] += 1
}  // ["apple": 2, "banana": 2, "cherry": 1]

다른 최적화 기법

💡 메모리 풀링 (Memory Pooling)

// 대용량 데이터 처리를 위한 메모리 풀링
class ArrayPool<Element> {
    private var pool: [[Element]] = []
    private let maxPoolSize: Int
    
    init(maxPoolSize: Int = 10) {
        self.maxPoolSize = maxPoolSize
    }
    
    func borrowArray() -> [Element] {
        if let array = pool.popLast() {
            return array
        }
        return []
    }
    
    func returnArray(_ array: [Element]) {
        if pool.count < maxPoolSize {
            var cleanArray = array
            cleanArray.removeAll(keepingCapacity: true)
            pool.append(cleanArray)
        }
    }
}

// 사용 예시
let pool = ArrayPool<Int>()

func processLargeDataset() {
    var workArray = pool.borrowArray()
    defer { pool.returnArray(workArray) }
    
    // 대용량 데이터 처리
    for i in 1...100000 {
        workArray.append(i)
    }
    
    // 처리 로직...
}

💡 Lazy 컬렉션 활용

// 메모리 효율적인 데이터 처리
let hugeRange = 1...1_000_000

// ❌ 비효율: 모든 데이터를 메모리에 로드
let inefficientResult = hugeRange
    .map { $0 * $0 }
    .filter { $0 % 2 == 0 }
    .prefix(10)
    .reduce(0, +)

// ✅ 효율적: 필요한 만큼만 처리
let efficientResult = hugeRange
    .lazy
    .map { $0 * $0 }
    .filter { $0 % 2 == 0 }
    .prefix(10)
    .reduce(0, +)

💡 컬렉션 타입 전환 최적화

// 상황에 맞는 컬렉션 타입 전환
extension Array where Element: Hashable {
    func toSet() -> Set<Element> {
        // 중복이 많을 때 효과적
        return Set(self)
    }
    
    func toDictionary<Value>(valueProvider: (Element) -> Value) -> [Element: Value] {
        return Dictionary(uniqueKeysWithValues: self.map { ($0, valueProvider($0)) })
    }
}

// 사용 예시
let userIds = [1, 2, 3, 2, 1, 4, 5]
let uniqueIds = userIds.toSet()  // 중복 제거

let userDict = userIds.toDictionary { id in
    "User\(id)"
}  // [1: "User1", 2: "User2", ...]

디버깅과 프로파일링

📊 컬렉션 상태 모니터링

// 컬렉션 성능 모니터링 도구
extension Array {
    var memoryUsage: Int {
        return MemoryLayout<Element>.size * capacity + MemoryLayout<Array>.size
    }
    
    func printStats() {
        print("📊 Array Stats:")
        print("  Count: \(count)")
        print("  Capacity: \(capacity)")
        print("  Memory Usage: \(memoryUsage) bytes")
        print("  Load Factor: \(Double(count) / Double(capacity))")
    }
}

extension Dictionary {
    func printStats() {
        print("📊 Dictionary Stats:")
        print("  Count: \(count)")
        print("  Memory Usage (estimated): \((count * (MemoryLayout<Key>.size + MemoryLayout<Value>.size)) * 2) bytes")
    }
}

📊 성능 테스트 프레임워크

// 자동화된 성능 테스트
struct PerformanceTestSuite {
    static func runCollectionTests() {
        let testSizes = [1000, 10000, 100000]
        
        for size in testSizes {
            print("🧪 Testing size: \(size)")
            
            testArrayPerformance(size: size)
            testSetPerformance(size: size)
            testDictionaryPerformance(size: size)
            
            print("---")
        }
    }
    
    private static func testArrayPerformance(size: Int) {
        let array = Array(1...size)
        
        let searchTime = measureTime {
            _ = array.contains(size / 2)
        }
        
        let insertTime = measureTime {
            var mutableArray = array
            mutableArray.append(size + 1)
        }
        
        print("  Array - Search: \(searchTime.time * 1000)ms, Insert: \(insertTime.time * 1000)ms")
    }
    
    private static func testSetPerformance(size: Int) {
        let set = Set(1...size)
        
        let searchTime = measureTime {
            _ = set.contains(size / 2)
        }
        
        let insertTime = measureTime {
            var mutableSet = set
            mutableSet.insert(size + 1)
        }
        
        print("  Set - Search: \(searchTime.time * 1000)ms, Insert: \(insertTime.time * 1000)ms")
    }
    
    private static func testDictionaryPerformance(size: Int) {
        let dict = Dictionary(uniqueKeysWithValues: (1...size).map { ($0, $0) })
        
        let searchTime = measureTime {
            _ = dict[size / 2]
        }
        
        let insertTime = measureTime {
            var mutableDict = dict
            mutableDict[size + 1] = size + 1
        }
        
        print("  Dict - Search: \(searchTime.time * 1000)ms, Insert: \(insertTime.time * 1000)ms")
    }
}

최적화 체크리스트

⚠️ 성능 안티패턴

// 🚫 비효율적인 패턴들

// 1. 불필요한 타입 변환 반복
func inefficientProcessing(numbers: [Int]) -> Set<Int> {
    var result: Set<Int> = []
    for number in numbers {
        let set = Set([number])  // 매번 Set 생성 ❌
        result = result.union(set)
    }
    return result
}

// 2. 중첩 루프에서 contains 사용
func inefficientSearch(arrays: [[Int]], target: Int) -> Bool {
    for array in arrays {
        if array.contains(target) {  // O(n) * O(n) = O(n²) ❌
            return true
        }
    }
    return false
}

// 3. 불필요한 메모리 복사
func inefficientCopy(source: [Int]) -> [Int] {
    var result = source  // COW로 복사 안됨
    result += source     // 여기서 복사 + 연결 ❌
    return result
}

✅ 최적화된 패턴

// ✅ 효율적인 패턴들

// 1. 적절한 컬렉션 선택
func efficientProcessing(numbers: [Int]) -> Set<Int> {
    return Set(numbers)  // 한 번에 변환 ✅
}

// 2. Set 활용한 빠른 검색
func efficientSearch(arrays: [[Int]], target: Int) -> Bool {
    let allNumbers = Set(arrays.flatMap { $0 })  // O(n) 변환
    return allNumbers.contains(target)  // O(1) 검색 ✅
}

// 3. reserveCapacity 활용
func efficientAppend(count: Int) -> [Int] {
    var result: [Int] = []
    result.reserveCapacity(count)  // 미리 메모리 할당 ✅
    
    for i in 0..<count {
        result.append(i)
    }
    return result
}

Conclusion

Swift Collections의 내부 구조를 이해하는 것은 단순히 이론적 지식을 넘어서, 실제 앱 성능과 사용자 경험에 직접적인 영향을 미치는 중요한 부분입니다.

가장 중요한 것은 내부 구조를 이해한다고 해서 모든 것을 최적화하려 하지 말고, 실제로 성능 병목이 되는 부분에 집중해서 적용하는 것이라고 생각해요.

즉, 측정 가능한 성능 이슈가 있을 때 이런 지식을 활용하는 게 가장 효과적이죠.

References

Swift Standard Library | Apple Developer Documentation

Solve complex problems and write high-performance, readable code.

developer.apple.com

swift/docs/OptimizationTips.rst at main · swiftlang/swift

The Swift Programming Language. Contribute to swiftlang/swift development by creating an account on GitHub.

github.com

Documentation

docs.swift.org

'Swift' 카테고리의 다른 글

Swift 6.2 (0)	2025.10.02
Deep Dive @_silgen_name (1)	2025.09.20
Make DSL with ResultBuilder (4)	2025.08.30
Swift Phantom Types (3)	2025.08.15
Swift 컴파일러의 타입 추론 파헤치기 (feat. 왜 이렇게 컴파일이 오래 걸릴까?) (4)	2025.07.27

ABOUT ME

iOYES iOYES

Swift Collections Deep Dive

Why Collections Architecture Matters More Than Ever?

📈 성능 최적화의 핵심

🚀 Swift 생태계 트렌드

Array 내부 구조 분석

🎯 메모리 레이아웃

🎯 Copy-on-Write 메커니즘

🎯 동적 크기 조정 (Dynamic Resizing)

Dictionary 내부 구조 분석

🎯 해시 테이블 구조

🎯 충돌 해결 - Open Addressing

🎯 동적 리해싱 (Rehashing)

Set 내부 구조 분석

🎯 Set의 최적화된 구조

🎯 Set 연산 최적화

성능 특성 비교 분석

🚀 Big O 복잡도 실측

🚀 메모리 사용량 분석

Copy-on-Write 심화 분석

🎯 COW 최적화 기법

Select Collections Guide

✅ Array를 사용해야 할 때

✅ Set을 사용해야 할 때

✅ Dictionary를 사용해야 할 때

다른 최적화 기법

💡 메모리 풀링 (Memory Pooling)

💡 Lazy 컬렉션 활용

💡 컬렉션 타입 전환 최적화

디버깅과 프로파일링

📊 컬렉션 상태 모니터링

📊 성능 테스트 프레임워크

최적화 체크리스트

⚠️ 성능 안티패턴

✅ 최적화된 패턴

Conclusion

References

'Swift' 카테고리의 다른 글

티스토리툴바

ABOUT ME

Swift Collections Deep Dive

Why Collections Architecture Matters More Than Ever?

📈 성능 최적화의 핵심

🚀 Swift 생태계 트렌드

Array 내부 구조 분석

🎯 메모리 레이아웃

🎯 Copy-on-Write 메커니즘

🎯 동적 크기 조정 (Dynamic Resizing)

Dictionary 내부 구조 분석

🎯 해시 테이블 구조

🎯 충돌 해결 - Open Addressing

🎯 동적 리해싱 (Rehashing)

Set 내부 구조 분석

🎯 Set의 최적화된 구조

🎯 Set 연산 최적화

성능 특성 비교 분석

🚀 Big O 복잡도 실측

🚀 메모리 사용량 분석

Copy-on-Write 심화 분석

🎯 COW 최적화 기법

Select Collections Guide

✅ Array를 사용해야 할 때

✅ Set을 사용해야 할 때

✅ Dictionary를 사용해야 할 때

다른 최적화 기법

💡 메모리 풀링 (Memory Pooling)

💡 Lazy 컬렉션 활용

💡 컬렉션 타입 전환 최적화

디버깅과 프로파일링

📊 컬렉션 상태 모니터링

📊 성능 테스트 프레임워크

최적화 체크리스트

⚠️ 성능 안티패턴

✅ 최적화된 패턴

Conclusion

References

'Swift' 카테고리의 다른 글

관련글 관련글 더보기

티스토리툴바