Greenplum Analytics Workbench: Scale-Out Development Environment for Big Data Innovation and Research

Accelerate Hadoop Technology

The Greenplum Analytics Workbench will enable the Apache Hadoop open source community to validate code at scale on a regular, ongoing basis. With contributions certified at scale, enterprises can run them with confidence.

Big Data Application Innovation

Greenplum will not only use the Analytics Workbench to test the limits of scale-out infrastructure technology but also to re-define the models for applying Big Data analytics.

Training & Certification

A unique aspect of Greenplum’s Hadoop training program is that each course participant will be granted access to the 1,000-node cluster as a sandbox environment after successfully completing Greenplum’s training and certification process.

Greenplum Analytics Workbench

The Greenplum Analytics Workbench, a 1,000-node cluster that will act as a lab environment for accelerating the pace of Big Data innovation, is now live. One of its primary uses will be running scale validation of the Apache Hadoop code base. Greenplum is actively working with the Apache Software Foundation to ensure that all results from the Analytics Workbench are available to the open source community, so that the Workbench's resources further accelerate the development of Hadoop as a revolutionary technology for Big Data. The Analytics Workbench consists of technology from some of the world's leading software and hardware manufacturers to provide the infrastructure needed to fuel the progression of Big Data analytics.

Hadoop innovation and development rely on contributions made by open source developers. However, the Apache Hadoop community has consistently faced the challenge of provisioning the resources required to validate new releases of the open source software. Without access to a large cluster for scale validation, the Apache community and enterprise users must wait for Hadoop user communities to sponsor an effort to run scale validations. Such efforts are infrequent, and a lot of time is spent stabilizing releases for enterprise adoption.

With an aggressive plan for testing on the Apache Hadoop trunk and its continuing releases, EMC is excited to contribute to the Hadoop open source community by providing the testing resources it lacks to quickly identify bugs, stabilize new releases, and optimize hardware configurations, all in an effort to speed up Hadoop innovation. EMC plans to provide test results to the Apache Software Foundation and the open source community, and EMC’s testing will be planned in coordination with the Apache Hadoop project.

The Greenplum Analytics Workbench is the result of a collaboration of several hardware and software vendors, including Intel, VMware, Micron, Seagate, Supermicro, Switch, and Mellanox Technologies.

The test bed cluster, which consists of 1,000-plus hardware nodes or 10,000 nodes with the addition of virtual machines, features 24 petabytes of physical storage. This is the equivalent of nearly half of the entire written works of mankind, from the beginning of recorded history.


Solution Highlights

Scale-Out Hadoop Validation

The Greenplum Analytics Workbench will be used for regular integration tests on Apache Hadoop. The 1,000-plus node test bed cluster incorporates technology from the world’s leading software and hardware manufacturers with the intention of providing the infrastructure needed to facilitate Apache Hadoop innovation. With the availability of a large-scale test bed, developers can have their contributions validated at scale, and enterprises can confidently deploy new releases in a production environment.

Innovative Applications of Big Data Analytics

Greenplum will not only use the Analytics Workbench to test the limits of scale-out infrastructure technology but also to re-define the models for applying Big Data analytics. Whether that involves working with visionary academic institutions on data-intensive research studies, or collaborating with big data application developers, Greenplum has plans to provide the most innovative thinkers in the data space with access to the Analytics Workbench.

Greenplum Training and Certification

The 1,000-node cluster will also be made available to members of Greenplum’s training and certification classes for Hadoop. With the first publicly available courses launching this summer, Greenplum will offer organizations and individuals a set of comprehensive Hadoop training programs designed to provide participants with the knowledge and programming skills required to succeed with Hadoop. A unique aspect of Greenplum’s Hadoop training program is that each course participant will be granted access to the 1,000-node cluster as a sandbox environment after successfully completing Greenplum’s training and certification process.

Whitepaper

Greenplum Analytics Workbench Whitepaper

This whitepaper details how the Greenplum Analytics Workbench was designed and built to validate Apache Hadoop code at scale, and to provide a large-scale experimentation environment for mixed-mode development that includes various SQL and non-SQL execution environments.

Press Release

Greenplum® Analytics Workbench Debuts at EMC World

Greenplum and Partners Launch 1,000 Node Platform to Accelerate Hadoop® Testing and Development

Press Release

EMC Announces 1000 Node Analytic Platform To Accelerate Industry’s Hadoop Testing and Development

EMC and industry-leading companies including Intel, VMware, Micron, Seagate, Supermicro, Switch, and Mellanox Technologies Partner To Deliver the Greenplum Analytics Workbench™ analytic computing platform