<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
  <title>Sirin's Note</title>
  <link>https://sirin.dev</link>
  <description>Personal notes on Computer Science, Machine Learning, and related topics.</description>
  <language>en</language>
  <lastBuildDate>Mon, 01 Jun 2026 12:22:37 GMT</lastBuildDate>
  <atom:link href="https://sirin.dev/feed.xml" rel="self" type="application/rss+xml"/>
    <item>
      <title><![CDATA[Designing Data-Intensive Applications - In-Depth Summary]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-00-introduction</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-00-introduction</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[About This Summary Book Overview The Three Pillars 1. Reliability 2. Scalability 3. Maintainability Book Structure How to Use This Summary Key Takeaway About This Summary This is a comprehensive, chapter-by-chapter summary of Designing Data-Intensive Applications by Martin Kleppmann. The summaries a]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 1: Reliable, Scalable, and Maintainable Applications]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-01-reliable-scalable-maintainable</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-01-reliable-scalable-maintainable</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Data-Intensive vs Compute-Intensive Thinking About Data Systems Three Fundamental Properties 1. Reliability 2. Scalability 3. Maintainability Operability Simplicity Evolvability Key Takeaways Core Concepts Data-Intensive vs Compute-Intensive Data-intensive: The limiting factor is the ]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 2: Data Models and Query Languages]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-02-data-models-and-query-languages</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-02-data-models-and-query-languages</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Data Models Matter Relational Model vs Document Model Relational Model (SQL) Document Model (NoSQL) Object-Relational Mismatch Many-to-One and Many-to-Many Relationships Historical Context: Document DBs Repeating History Schema Flexibility Data Locality Convergence Query Languages Impe]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 3: Storage and Retrieval]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-03-storage-and-retrieval</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-03-storage-and-retrieval</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Database Fundamentals Why Storage Engine Knowledge Matters Data Structures That Power Your Database Simple Key-Value Store (Bash Example) Hash Indexes SSTables and LSM-Trees Sorted String Tables (SSTables) Constructing SSTables LSM-Tree (Log-Structured Merge-Tree) B-Trees Concept Struc]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 4: Encoding and Evolution]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-04-encoding-and-evolution</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-04-encoding-and-evolution</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts The Challenge Modes of Dataflow Formats for Encoding Data Language-Specific Formats JSON, XML, and Binary Variants Thrift and Protocol Buffers Avro The Merits of Schemas Modes of Dataflow Revisited Dataflow Through Databases Dataflow Through Services (REST/RPC) Dataflow Through Message]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 5: Replication]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-05-replication</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-05-replication</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Why Replication? Challenges Leader-Based Replication How It Works Synchronous vs Asynchronous Replication Setting Up New Followers Handling Node Failures Replication Logs Multi-Leader Replication Use Cases Conflict Resolution Topologies Leaderless Replication Concept Quorum Reads/Write]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 6: Partitioning]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-06-partitioning</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-06-partitioning</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Why Partitioning? Two Main Approaches Partitioning Strategies Partitioning by Hash of Key Partitioning by Key Range Rebalancing Secondary Indexes Partitioning Secondary Indexes by Document Partitioning Secondary Indexes by Term Request Routing Approaches Service Discovery Transactions ]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 7: Transactions]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-07-transactions</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-07-transactions</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts What is a Transaction? ACID Properties BASE vs ACID Single-Object vs Multi-Object Operations Single-Object Operations Multi-Object Operations Isolation Levels Read Committed Snapshot Isolation (Repeatable Read) Read Skew Write Skew Serializable Isolation Implementing Snapshot Isolation]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 8: The Trouble with Distributed Systems]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-08-the-trouble-with-distributed-systems</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-08-the-trouble-with-distributed-systems</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Fundamental Challenges System Model Unreliable Networks Common Failures Timeout Problems Unbounded Delays Unreliable Clocks Problems with Clocks Timestamps for Ordering Clock Synchronization Limitations Process Pauses The Problem Solutions Truth and Lies Fencing Tokens Byzantine Faults]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 9: Consistency and Consensus]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-09-consistency-and-consensus</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-09-consistency-and-consensus</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Why Consensus Matters Consistency Models Linearizability Definition How to Achieve Linearizability and Quorums Performance Cost Causal Consistency Definition Implementations Consistent Prefix Reads Total Order Broadcast Definition Properties Implementations Relationship to Consensus Us]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 10: Batch Processing]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-10-batch-processing</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-10-batch-processing</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Batch Processing Definition Unix Philosophy for Batch Processing Batch Processing with Unix Tools Simple Log Analysis Limitations of Unix Tools MapReduce Concept Map Function Reduce Function Execution Model Fault Tolerance Joins in MapReduce Limitations Beyond MapReduce Dataflow Engine]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 11: Stream Processing]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-11-stream-processing</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-11-stream-processing</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Batch vs Stream Processing Messaging Systems Log-Based Messaging How It Works Advantages Consumer Groups Change Data Capture (CDC) Concept Implementations Log Compaction Event Sourcing Concept Commands vs Events Event Sourcing Benefits Stream Processing Stream Tables Stream-Stream Join]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 12: The Future of Data Systems]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-12-the-future-of-data-systems</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-12-the-future-of-data-systems</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Core Concepts Unbundling Databases Data Integration Designing Applications Around Dataflow Event-Driven Architecture Dataflow Through Services Derived Data Observability and Auditability Monitoring Auditability Data Lineage Trust, but Verify Data Integrity Checking Data Integrity Auditable Data Syst]]></description>
    </item>
    <item>
      <title><![CDATA[Key Concepts Glossary]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-99-key-concepts-glossary</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-99-key-concepts-glossary</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[A ACID - Atomicity, Consistency, Isolation, Durability. Properties guaranteeing database transaction correctness. Anti-entropy - Process of comparing replicas and synchronizing data to resolve inconsistencies. Atomicity - Property that all operations in a tr]]></description>
    </item>
    <item>
      <title><![CDATA[Designing Data-Intensive Applications]]></title>
      <link>https://sirin.dev/notes/books-designing-data-intensive-applications-index</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-designing-data-intensive-applications-index</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Part I: Foundations of Data Systems Part II: Distributed Data Part III: Derived Data Designing Data-Intensive Applications Author: Martin Kleppmann Chapter-by-chapter summary covering distributed systems, storage, replication, partitioning, transactions, consistency, batch and stream processing, and]]></description>
    </item>
    <item>
      <title><![CDATA[📘 Summary of Fundamentals of Data Engineering]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-00-overview</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-00-overview</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Authors: Joe Reis, Matt Housley. Core idea: view data engineering through the Data Engineering Lifecycle: not just tools, but managing data from its creation through to delivering value.]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 1: Data Engineering Described (What Is Data Engineering?)]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-01-data-engineering-described</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-01-data-engineering-described</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Definitions from Various Experts The Evolution of Data Engineering Skills a Data Engineer Needs Differences Between Roles Types of Data Engineers Data Maturity Levels Definitions from Various Experts AltexSoft: a set of operations that creates interfaces and mechanisms for the flow of and access to data Jesse Anderson: divides it into 2 types]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 2: The Data Engineering Lifecycle (The Heart of the Whole Book)]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-02-the-data-engineering-lifecycle</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-02-the-data-engineering-lifecycle</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[5 Stages of the Data Engineering Lifecycle 1. Generation (Source Systems) 2. Storage 3. Ingestion 4. Transformation 5. Serving Data 6 Undercurrents (Currents Running Through Every Stage) 1. Security 2. Data Management 3. DataOps 4. Data Architecture 5. Orchestration 6. Software Engineering 5 Stages of the Data Engineeri]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 3: Designing Good Data Architecture]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-03-designing-good-data-architecture</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-03-designing-good-data-architecture</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[What Is Data Architecture? 9 Principles of Good Data Architecture Major Architecture Concepts Batch Architectures Streaming Architectures Data Mesh (Zhamak Dehghani) What Is Data Architecture? Designing systems to support evolving data needs, using flexible and reversible decisions reached through ]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 4: Choosing Technologies Across the Data Engineering Lifecycle]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-04-choosing-technologies</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-04-choosing-technologies</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[The Embarrassment of Riches Technology Evaluation Criteria Open Source vs Managed vs Proprietary The Golden Rule The Embarrassment of Riches There are too many data technologies; engineers get so caught up in bleeding-edge tools that they forget the core purpose. One question must be answered: does it add value to the data product and the business? ]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 5: Data Generation in Source Systems]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-05-data-generation-in-source-systems</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-05-data-generation-in-source-systems</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Types of Source Systems State vs Events CDC (Change Data Capture) Consistency Models Communication with Source System Owners Types of Source Systems Type Examples Characteristics Files CSV, JSON, XML, Excel Used to exchange data between systems APIs REST (REpresentational State Transfer), GraphQL, gRPC, Webhooks R]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 6: Storage]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-06-storage</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-06-storage</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA["It's storage all the way down" Raw Ingredients Storage Systems Taxonomy CAP Theorem & PACELC Data Temperature Separation of Compute from Storage "It's storage all the way down" Storage is the foundation of everything in the data engineering lifecycle: data is stored multiple times as it travels through the pipeline Raw ]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 7: Ingestion]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-07-ingestion</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-07-ingestion</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Batch Ingestion Patterns Streaming Ingestion Patterns Key Challenges Tools & Technologies Batch Ingestion Patterns Pattern Description Full Snapshot Pulls the entire state every time; simple but bandwidth-heavy Incremental Pulls only the changes; reduces network/storage use but is more complex ETL (Extract → Transform → ]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 8: Queries, Modeling, and Transformation]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-08-queries-modeling-and-transformation</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-08-queries-modeling-and-transformation</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[The Purpose of Transformation Data Modeling Approaches Query Optimization Transformation Patterns dbt and Modern Transformation Tools The Purpose of Transformation Turn raw data → data that stakeholders can actually use The difference between a query (read-only retrieval) and a transformation (persisting results for downstream]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 9: Serving Data for Analytics, Machine Learning, and Reverse ETL]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-09-serving-data</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-09-serving-data</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Building Trust in Data 3 Types of Analytics Machine Learning Serving Reverse ETL Undercurrents in Serving Building Trust in Data Before serving data, there must be trust in the data; if people do not trust the data, it has no value Trust → Data products → Self-service → Data-driven culture 3 Types of Analytics Business Inte]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 10: Security and Privacy]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-10-security-and-privacy</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-10-security-and-privacy</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Security First Mindset People: The Weakest Link Processes: Key Principles Technology: Essential Tools Compliance & Regulations Internal Security Research Security First Mindset Security must come before everything else; data engineers handle sensitive data every day "Security is the first thing a data enginee]]></description>
    </item>
    <item>
      <title><![CDATA[Chapter 11: The Future of Data Engineering]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-11-the-future-of-data-engineering</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-11-the-future-of-data-engineering</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Key Trends From Modern Data Stack → Live Data Stack What Will Stay the Same (Things Unlikely to Change) Key Trends The Data Engineering Lifecycle will not go away; tools are getting simpler, but the data engineer's work will shift toward higher-value work SaaS (Software as a Service) managed services (BigQuery, Snowflake, EMR: Am]]></description>
    </item>
    <item>
      <title><![CDATA[Summary of the Book's Core Ideas]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-99-conclusion</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-99-conclusion</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Stop thinking in terms of tools and think in terms of the lifecycle: Generation → Storage → Ingestion → Transformation → Serving 6 undercurrents (security, data management, DataOps, architecture, orchestration, software engineering) run through every stage Architecture is strategic, tools are tactical: choose technology that serves the archi]]></description>
    </item>
    <item>
      <title><![CDATA[Appendix A: Serialization and Compression]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-a-serialization-and-compression</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-a-serialization-and-compression</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Row-Based vs Columnar How to Choose a Format Compression Row-Based vs Columnar Format Strengths Weaknesses CSV Simple, universal Error-prone, no real schema; best avoided in pipelines JSON/JSONL API standard, native support in modern DBs Far lower performance than columnar Avro Row-oriented, binary, schema in J]]></description>
    </item>
    <item>
      <title><![CDATA[Appendix B: Cloud Networking]]></title>
      <link>https://sirin.dev/notes/books-fundamentals-of-data-engineering-b-cloud-networking</link>
      <guid isPermaLink="true">https://sirin.dev/notes/books-fundamentals-of-data-engineering-b-cloud-networking</guid>
      <pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Network Topology Data Egress Fees Key Takeaways for Data Engineers Network Topology Availability Zone (AZ): smallest unit, highest bandwidth, lowest latency; traffic within a zone is free (using private IPs) Region: 2+ AZs, used for geo-redundancy, higher latency, egress fees across zones GCP Multiregion: US, EU, ASIA ]]></description>
    </item>
</channel>
</rss>