Understand Row-Oriented vs Column-Oriented Storage

The way we access and analyze data has changed a lot lately. Row-oriented storage, which has been the standard for data storage for a long time, is having trouble keeping up with the demands of modern data analysis. In this article, I will introduce you to column-oriented storage and how it can help analytical queries run faster. OLAP In my previous post, we discussed the differences between Online Analytical Processing (OLAP) and Online Transaction Processing (OLTP)....

Apr 5, 2024 · 5 min

Recursive CTEs and CONNECT BY in SQL to query Hierarchical data

In database design, the idea of hierarchical data represents relationships between entities as a tree-like structure. This type of data model is widely used in many domains, such as file systems, organizational structure, etc. When dealing with hierarchical data, it is crucial to efficiently query and extract information about the relationships between entities. In this post, we will explore two powerful SQL tools for querying hierarchical data: recursive Common Table Expressions (CTEs) and the CONNECT BY clause....

Feb 20, 2024 · 10 min

Snowflake ID - Simplifying uniqueness in distributed systems

Problem description In developing database systems, generating IDs is a crucial task. IDs ensure the uniqueness of data, facilitate queries, and establish relationship constraints in databases. Most modern database management systems (DBMS) can generate auto-increment IDs. We can delegate this task to the DBMS entirely and not worry about the uniqueness. However, there are several reasons why we shouldn’t use auto-increment IDs, especially for distributed systems. The most important reason is that in distributed systems with independent servers, using per-server auto-increment IDs does not guarantee uniqueness and can lead to duplication problems....

Feb 3, 2024 · 3 min