Greenplum Pivotal HD: Enterprise-Ready Apache Hadoop
Unprecedented Query Processing
Dynamic Pipelining™ technology delivers 100X performance improvement with mature SQL query optimization and powerful analytics.
Industry Leading Data Management
Scatter/Gather™ data loading, polymorphic storage, third-party tools certification, and language support.
Virtualization & Cloud Ready
Leverage VMware and Isilon virtualization and enterprise storage for rapid deployment, elastic scale and high availability.
Expand the Productivity and Possibilities of Hadoop
Apache Hadoop’s reliance on advanced languages and programming frameworks has limited its cost-effectiveness and adoption in many organizations. Pivotal HD Enterprise offers enterprise-hardened Hadoop, as well as advanced SQL query services that make Hadoop more stable and usable. The result: Reduced implementation costs for all but the most sophisticated organizations.
Pivotal Advanced Database Services, powered by HAWQ, add SQL’s expressive power to Hadoop. By adding rich, mature SQL processing, Pivotal HD leverages existing BI and analytics products and your workforce’s SQL skills to simplify development, expand Hadoop’s capabilities, increase productivity, and cut costs.
Pivotal HD’s Advanced Database Services augment Hadoop MapReduce, Pig, HBase and Hive using a proven, distributed SQL query engine extended with analytical algorithms to improve productivity and accelerate data analytics projects.
Deliver Hadoop with Enterprise Capabilities and Flexibility
Pivotal HD Enterprise is based on Apache Hadoop 2.0 and includes components that meet the needs of organizations building applications on top of Hadoop-based data and analytics platforms.
Hadoop users can maximize data availability with Pivotal HD Enterprise, using pluggable storage to choose between traditional Hadoop direct-attach data storage and EMC Isilon OneFS Scale-Out NAS Storage, which offers 100% HDFS compatibility. EMC Isilon OneFS Scale-Out NAS Storage also increases storage efficiency, simplifies data loading, and streamlines data protection with data snapshots and replication.
Hadoop Virtualization Extensions (HVE) included in Pivotal HD Enterprise leverage VMware technology to simplify deployment and management of Hadoop in either public or private clouds. Pivotal HD Enterprise also maximizes the flexibility of on-premises deployment with software versions for deployment on a wide range of available hardware, as well as easy-to-deploy preconfigured Greenplum Data Computing Appliances.
Pivotal HD brings a pluggable storage layer to Hadoop that supports both HDFS and Isilon OneFS Storage
Pivotal HD Enterprise
Pivotal HD Enterprise is a commercially supported distribution of the Apache Hadoop stack, including HDFS, MapReduce, Hive, Pig, HBase, Zookeeper, Sqoop and Flume packages from the Apache Software Foundation. Backed by the world’s largest Hadoop support organization and tested at scale in Greenplum’s 1,000-node Pivotal Analytics Workbench, Pivotal HD Enterprise offers the capabilities of Apache Hadoop in a fully supported, enterprise-ready distribution.
Pivotal Advanced Database Services
Pivotal Advanced Database Services (ADS), powered by HAWQ, extends Pivotal HD Enterprise, adding rich, proven parallel SQL processing facilities. These render Hadoop queries faster than any Hadoop-based query interface on the market today, enhancing productivity. Pivotal ADS enables SQL analysis of data in a variety of Hadoop-based data formats using the Pivotal Xtension Framework, without duplicating or converting HBase files. Alternatively, an optimized format is available for ADS table storage for best performance.
A Fast, Proven SQL Database Engine for Hadoop
Unlike new SQL-on-Hadoop entrants, Pivotal Advanced Database Services bring over 10 years of innovation that has resulted in a rich, powerful SQL query optimizer and processor optimized to run analytical queries and mixed query workloads in massively parallel, distributed environments. Like Pivotal HD Enterprise, ADS can be deployed as software, as an appliance, or virtualized in private or public cloud environments.
Hadoop In The Cloud: Pivotal HD Virtualized by VMware
Hadoop Virtualization Extensions (HVE) facilitate installation and management of Pivotal HD Enterprise in public and private clouds using VMware-designed deployment and management tools. HVE tools speed provisioning and simplify creation of virtualized high availability and fault tolerance with physical infrastructure awareness. With HVE, Pivotal HD Enterprise can deliver truly elastic scalability in the cloud, augmenting on-premises deployment options that include software and appliance deployments.
Isilon Scale Out NAS for Pivotal HD Enterprise
Pivotal HD Enterprise offers pluggable HDFS storage to simplify the delivery of highly available Hadoop data and applications. Users can opt to use traditional HDFS storage on direct-attach disks with Pivotal HD Enterprise, or they can run it on EMC Isilon’s OneFS Scale-Out NAS Storage.
EMC Isilon gives HD users options to increase reliability and cut overall costs, without sacrificing compatibility with Apache HDFS. Isilon improves storage efficiency, scales elastically and simplifies loading. Data availability is improved through built-in redundancy, data mirroring, time-based snapshots, and replication, with simpler administration.
As opposed to scenarios where users must build and manage highly available native HDFS systems, staffing costs are lower using Pivotal HD with EMC Isilon. Your staff can focus on business analysis and data management, rather than squandering time and resources on the complexities of configuring and managing a Hadoop infrastructure.
Rapid Hadoop: Pivotal HD Enterprise and Greenplum Data Computing Appliances
Greenplum Data Computing Appliances (DCAs) provide you with pre-tested, pre-optimized, pre-configured Pivotal HD infrastructure. Delivered as modular systems, DCAs enable Pivotal HD Enterprise to be delivered and at work within days, and can be easily scaled without disruption. DCA-based HD deployments improve information availability, leveraging the DCA’s fully redundant architecture.
Once installed, DCAs are easily managed using the same Command Center console that is used to manage Pivotal HD itself. DCAs can also report detailed system status to your data center management infrastructure using Simple Network Management Protocol (SNMP), as well as to EMC support centers if configured to do so.
Big Data Analytics Capability and Productivity
Analyzing big data efficiently requires massively parallel architectures like Hadoop. To take advantage of the computational capacity of MPP systems, statistical, mathematical, and machine learning algorithms must be refactored to run efficiently in a parallel environment. Pivotal HD’s Advanced Database Services offer these capabilities through MADlib, a library of MPP-capable algorithms that extend the SQL capabilities of Hadoop. In addition, Pivotal HD Enterprise includes Apache Mahout, an open-source parallelized analytical library for MapReduce users.
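To make the idea of refactoring an algorithm for a parallel environment concrete, here is a minimal sketch in Python. It is not MADlib's API or implementation; the names `Stats`, `accumulate`, `merge`, `finalize` and `fit` are hypothetical and chosen only for illustration. The sketch fits a simple linear regression by having each data partition compute a small record of sums, then adding those records together before solving for the coefficients.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Stats:
    """Sufficient statistics for simple linear regression y = a + b*x (hypothetical helper)."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0


def accumulate(rows):
    """Per-partition step: fold local (x, y) rows into one Stats record."""
    s = Stats()
    for x, y in rows:
        s.n += 1
        s.sx += x
        s.sy += y
        s.sxx += x * x
        s.sxy += x * y
    return s


def merge(a, b):
    """Combine step: statistics from two partitions simply add."""
    return Stats(a.n + b.n, a.sx + b.sx, a.sy + b.sy, a.sxx + b.sxx, a.sxy + b.sxy)


def finalize(s):
    """Final step: solve for intercept and slope from the merged statistics."""
    denom = s.n * s.sxx - s.sx * s.sx
    if denom == 0:
        raise ValueError("no rows or constant x values; slope is undefined")
    slope = (s.n * s.sxy - s.sx * s.sy) / denom
    intercept = (s.sy - slope * s.sx) / s.n
    return intercept, slope


def fit(partitions):
    """Each accumulate() call is independent, so it could run on a separate node."""
    return finalize(reduce(merge, map(accumulate, partitions), Stats()))
```

For example, `fit([[(0, 1), (1, 3)], [(2, 5), (3, 7)]])` returns `(1.0, 2.0)`, the same answer as fitting all four rows in one partition. Only the five-number `Stats` records need to move between nodes, which is why this shape of computation suits an MPP system.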
With Pivotal HD, your experienced team is prepared to tackle increasingly complex analytics challenges. To further enhance productivity, Pivotal HD Enterprise supports the Chorus Analytical Productivity Platform. Chorus offers a social-media-inspired environment where your data analysts, data scientists, IT staff, DBAs, executives, and other stakeholders can collaboratively identify and address Big Data opportunities even faster than before. Chorus increases productivity across the data science team at all levels of the Big Data value chain, from searching, exploring, visualizing and importing data, through analysis, developing insights and selecting analytic methods, to creating workflows that harness the value of Big Data.
Data Sheet
The world's most powerful distribution of Apache Hadoop. Pivotal HD Enterprise is a commercially supported distribution of the Apache Hadoop stack, including HDFS, MapReduce, Hive, Pig, HBase, Zookeeper, Sqoop and Flume packages from the Apache Software Foundation.
HAWQ - A true SQL engine for Hadoop
HAWQ is a parallel SQL query engine that combines the key technological advantages of the industry-leading Greenplum Database with the scalability and convenience of Hadoop.
Whitepaper
Lowering the Barriers to Entry for Big Data
For most enterprises, Big Data has moved from being just a buzzword or a science project to a recognized source of tactical and strategic competitive advantage and of value for the business. There are plenty of choices: the underlying hardware technology; a variety of Hadoop distributions and NoSQL databases; and numerous technologies that help enterprises integrate internal datasets with external datasets.
An Industry Take on Pivotal HD
Hear from our customer and partner ecosystem on how Pivotal HD's Hadoop distribution is a Big Data game-changer for organizations of all sizes. Featuring VMware, NYSE Euronext, Cisco, Tableau, GE, Factual, Alpine and more.
Data Sheet
Technical Brief: Data Sharing between Database and the Hadoop Distributed File System
How Greenplum Database and Hadoop work together
Isilon Scale-Out NAS for Greenplum HD
EMC Isilon Big Data Storage and Analytics Solution that can natively integrate with the Hadoop Distributed File System (HDFS) layer.
Big Data Storage & Analytics Solutions with VMware ‘Serengeti’
End-to-end Hadoop solutions powered by EMC Isilon, EMC Greenplum and VMware. EMC® Isilon® scale-out NAS is the first and only Enterprise NAS solution that can natively integrate with the Hadoop Distributed File System (HDFS) layer.
An emerging MapReduce platform layered on a distributed file system—Hadoop and HDFS—is one of the solutions more recently being selected by companies to address their big data analytics needs.
Hadoop, an Apache Foundation Open Source project, represents a way for enterprise IT to take advantage of Cloud and Internet capabilities sooner.
Analyst Report
Hadoop: Revealing Its True Value for Business Intelligence
Despite all the hubbub and hype around Hadoop, few business intelligence (BI) and data warehousing (DW) professionals know much about what Hadoop is, how it does what it does, or in which situations they should deploy it.
Press Release
EMC Announces 1000 Node Analytic Platform To Accelerate Industry’s Hadoop Testing and Development
EMC and industry-leading companies including Intel, VMware, Micron, Seagate, Supermicro, Switch, and Mellanox Technologies Partner To Deliver the Greenplum Analytics Workbench™ analytic computing platform