World’s Most Powerful Analytical Database


Greenplum Database is a software solution built to support the next generation of data warehousing and large-scale analytics processing. Supporting SQL and MapReduce parallel processing, Greenplum Database offers industry-leading performance at a low cost for companies managing terabytes to petabytes of data.

Greenplum Database provides:

  • Economical Petabyte Scaling
    • Making it easy to build a warehouse of any size and grow it over time
    • Linear and cost-effective scaling on commodity hardware
  • Massively Parallel Query Execution
    • Delivering answers faster than ever before
    • Ensuring high-performance analysis as your data grows
  • Unified Analytical Processing
    • Common platform for queries, machine learning, text mining, statistical computing, etc.
    • Enabling parallel analysis on any data, at all levels, with SQL, MapReduce, R, etc.

    Register to Download Greenplum Database 3.2

    Greenplum Database's fundamental breakthrough is its ability to store and analyze terabytes to petabytes of data using clusters of commodity servers. Greenplum Database moves processing power as close as possible to the data, so processing always occurs in parallel, delivering unmatched query and load performance. Greenplum Database makes it easy to incrementally add storage capacity and processing power when needed, avoiding costly appliance upgrades.

    Greenplum Database is simply the industry's fastest and most affordable high-end data warehousing solution. With Greenplum Database, users can answer complex questions in seconds, running analyses that used to take days with traditional solutions.

    Economical Petabyte Scaling

    In today’s business climate, every leading organization finds itself in the data business. Companies are storing increasingly detailed information about customers and business processes, keeping it for longer, and analyzing it more deeply than ever before. It is no surprise that typical data volumes are growing by 1.5x to 2.5x a year.
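To make the growth figure concrete, here is a small arithmetic sketch of how quickly compound growth at those rates moves a warehouse from terabytes to a petabyte. The 10 TB starting size is a hypothetical example, not a figure from Greenplum, and 1 PB is taken as 1000 TB.

```python
def years_to_reach(start_tb: float, target_tb: float, yearly_growth: float) -> int:
    """Count whole years of compound growth until start_tb reaches target_tb."""
    if yearly_growth <= 1.0:
        raise ValueError("yearly_growth must be greater than 1.0")
    years = 0
    size = start_tb
    while size < target_tb:
        size *= yearly_growth  # one more year of growth
        years += 1
    return years


# Hypothetical starting point: a 10 TB warehouse growing toward 1 PB (1000 TB).
for rate in (1.5, 2.5):
    print(f"{rate}x per year: {years_to_reach(10, 1000, rate)} years to reach 1 PB")
```

Under these assumptions the warehouse passes a petabyte in 12 years at 1.5x and in 6 years at 2.5x.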

    Given these new realities, traditional data warehouse solutions are failing to provide the scalability and cost-effectiveness that businesses are demanding. Businesses need a solution that can go from terabytes to petabytes without a hitch, and do it without exorbitantly expensive proprietary hardware. The answer is Greenplum Database.

    Greenplum Database utilizes a shared-nothing, massively parallel processing architecture that is optimized for data warehousing, business intelligence, and analytical processing. Customers can leverage the disruptive price-efficiencies of commodity servers, storage, and networking to economically scale to petabytes and meet the challenges of today and tomorrow.
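As a rough illustration of what "shared-nothing" means, the sketch below spreads rows across independent segments by hashing a distribution key, so that each row lives on exactly one segment and adding segments spreads the data further. This is a generic sketch, not Greenplum's implementation: the function names, the `customer_id` key, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib


def segment_for(key: str, num_segments: int) -> int:
    """Map a distribution key to a segment using a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(rows, key_field, num_segments):
    """Place each row on exactly one segment; segments share no data."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        target = segment_for(str(row[key_field]), num_segments)
        segments[target].append(row)
    return segments


# Hypothetical table of 10,000 rows distributed over 4 segments.
rows = [{"customer_id": i, "amount": i % 7} for i in range(10_000)]
segments = distribute(rows, "customer_id", 4)
print([len(s) for s in segments])  # row count held by each segment
```

Because the hash is stable, all rows with the same key land on the same segment, which lets each segment work on its own slice without coordinating with the others.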

    Massively Parallel Query Execution

    The real test of an analytical database is how quickly it can return answers to complex questions against large volumes of data. It is on these queries that traditional data warehouse solutions show their limitations, demonstrating limited parallelism and bottlenecks that can slow processing to a crawl.

    Greenplum Database utilizes state-of-the-art parallel processing techniques to return answers to queries at unmatched speed -- often 10 to 100 times faster than traditional solutions.

    The key is Greenplum's parallel dataflow engine, which connects tens, hundreds, or thousands of processing cores and disks into a massively parallel query processing supercomputer. Greenplum fully utilizes the power of each core and scales linearly, ensuring that processing can keep up as your data volumes grow.
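The general pattern behind parallel query execution can be sketched as scatter-gather aggregation: each partition computes a partial result next to its own data, and only the small partial results are merged at the end. The code below is a simplified illustration of that pattern, not Greenplum's dataflow engine. The function names are hypothetical, and Python threads stand in for segments purely to show the structure (they do not give true CPU parallelism in CPython).

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_by_key(partition):
    """Runs once per partition: aggregate locally, next to the data."""
    totals = {}
    for key, value in partition:
        totals[key] = totals.get(key, 0) + value
    return totals


def merge(partials):
    """Final step: combine the small per-partition results."""
    merged = {}
    for totals in partials:
        for key, value in totals.items():
            merged[key] = merged.get(key, 0) + value
    return merged


def parallel_sum_by_key(partitions):
    """Scatter the work across one worker per partition, then gather."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        return merge(pool.map(partial_sum_by_key, partitions))


# Hypothetical sales rows (region, amount) split across three partitions.
data = [("east", 1), ("west", 2), ("east", 3), ("north", 4), ("west", 5)]
partitions = [data[i::3] for i in range(3)]
result = parallel_sum_by_key(partitions)
print(result)
```

The result is the same as a single-pass `GROUP BY` over all rows, but the expensive scan is split across workers and only per-key totals travel to the final merge.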

    Unified Analytical Processing

    Most enterprises have a patchwork of platforms and tools for storing and analyzing data. Structured data tends to be stored in databases, while unstructured data often lives in file systems. Each group that may want to analyze data - DBAs/analysts, software developers and statisticians - is likely to have its own languages and specialize in particular types of analysis.

    User | Structured Data (in database) | Unstructured Data (in file system)
    DBA/analyst | SQL queries and in-database UDFs | None
    Software developer | Extract from database into custom application (Perl, Python, Java, C, etc.) | Custom application (Perl, Python, Java, C, etc.)
    Statistician | Extract from database into SAS, R, etc. | Custom application (R, Python, etc.)

    The result is siloed inefficiency and lost opportunities. Developers or statisticians who want to use innovative new analysis algorithms against data in the database must find their own servers and storage on which to run their analysis, spend hours or days copying over slices of the data, and then slowly churn through the data one record at a time.

    Greenplum Database dramatically improves on this status quo by providing the first unified analytical processing platform. This unique architecture allows all users of the system to mix and match data sources (structured in-database, unstructured external) and programming styles (SQL, MapReduce, R, Perl, Python, etc.) and have them all run on a common massively parallel infrastructure. Developers and statisticians can now directly analyze any data in the system, without extracting or moving it, and leverage the full massively parallel processing performance of the Greenplum system.
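For readers unfamiliar with the MapReduce programming style mentioned above, the following is a minimal, single-process sketch of the idea applied to unstructured text: a map function emits key/value pairs, the pairs are grouped by key, and a reduce function folds each group into a result. It is a generic illustration only and does not use or represent Greenplum's MapReduce interface. The function names and the sample log lines are made up for the example.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit (word, 1) for every word in one line of text."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: fold all values emitted for one key into a total."""
    return word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Group mapper output by key, then apply the reducer to each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


# Hypothetical unstructured input: lines from a log file.
log_lines = ["error disk full", "warning disk slow", "error network down"]
word_counts = run_mapreduce(log_lines, map_words, reduce_counts)
print(word_counts)
```

In a parallel system the map and reduce steps would run on many workers at once; here they run in one process so the structure is easy to follow.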

    This innovation makes it easy for companies to apply the latest techniques in machine learning, graph analysis, statistical computing, and text analysis to any of their data.

     



