Beyond Rows and Columns: Greenplum’s Polymorphic Data Storage™ -- Part 1
Traditionally, relational data has been stored in rows, i.e. as a sequence of tuples in which all the columns of each tuple are stored together on disk. This approach has a long heritage going back to early OLTP systems, which introduced the ‘slotted page’ layout that is still in common use today. However, analytical databases tend to have different access patterns than OLTP systems. Instead of seeing many single-row reads and writes, analytical databases must process larger, more complex queries that touch much larger volumes of data, i.e. read-mostly workloads with big scanning reads and infrequent batch appends of data. Of course, the world isn’t quite this simple, and more often than not customers want the best of both worlds: the ability to handle mixed workloads containing big queries, small queries, big appends, and trickle streams all at the same time.
Vendors have taken a number of different approaches to meeting these needs. Some have optimized their disk layouts to eliminate the OLTP fat and do smarter disk scans. Others have tried to pull processing into memory to avoid touching disk at all (which of course works great until your dataset exceeds your budget to buy more memory). Others have turned their storage sideways (literally) with column-stores, i.e. the decades-old idea of ‘vertical decomposition’ demonstrated to good success by Sybase IQ and now reimagined by a raft of newer vendors. Each of these approaches has proven to have sweet spots where it shines, and other areas where it does a less admirable job. Unfortunately, the debate over which is superior has taken on a marketing fury that has generated far more heat than light. These are all reasonable choices, and the differences between them matter far less than bigger issues like the ability of the database to leverage multi-core parallelism and scale out effectively across tens or hundreds of commodity servers.
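To make the row-versus-column distinction concrete, here is a minimal in-memory sketch in Python. It is only an illustration of ‘vertical decomposition’, not a description of how Greenplum or any other product lays data out on disk. The sample table, the column names, and every function name below are assumptions invented for the example.

```python
# Hypothetical sample table, invented for illustration only.
COLUMNS = ("id", "name", "amount")
ROWS = [
    (1, "alice", 30.0),
    (2, "bob", 12.5),
    (3, "carol", 7.25),
]


def to_column_store(rows, columns):
    """Vertically decompose row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def sum_amount_row_store(rows):
    """Row layout: every full tuple is visited to read a single field."""
    return sum(row[2] for row in rows)


def sum_amount_column_store(store):
    """Column layout: only the 'amount' values are scanned."""
    return sum(store["amount"])


def fetch_row_column_store(store, columns, index):
    """Column layout: rebuilding one row touches every column."""
    return tuple(store[name][index] for name in columns)


if __name__ == "__main__":
    store = to_column_store(ROWS, COLUMNS)
    print(sum_amount_row_store(ROWS))
    print(sum_amount_column_store(store))
    print(fetch_row_column_store(store, COLUMNS, 1))
```

Both layouts give the same answers. The difference is in how much data each one has to touch: the aggregate favors the column layout, while the single-row lookup favors the row layout, which is the kind of sweet-spot trade-off described above.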
Rather than join the chorus on one side or another, we’ve been hard at work building in the flexibility that lets customers choose the right strategy for the job at hand. In Part 2, we describe Greenplum’s Polymorphic Data Storage™ technology and our newly added support for column-oriented tables.