gNet Software Interconnect
In a shared-nothing MPP database system, data must often be moved between segments whenever a join or an aggregation requires it to be repartitioned. As a result, the interconnect is one of the most critical components of Greenplum Database. Greenplum’s gNet interconnect optimizes the flow of data so that processing can be pipelined continuously, without blocking, on all nodes of the system. The gNet interconnect is tuned and optimized to scale to tens of thousands of processors and leverages commodity Gigabit Ethernet and 10GigE switch technology.
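The repartitioning step described above can be illustrated with a minimal sketch. It is not Greenplum's implementation: the function name `redistribute`, the use of `zlib.crc32` as the hash, and the dictionary-based rows are all assumptions chosen for illustration. The point is only that rows sharing a join key must end up on the same segment, which is why data has to cross the interconnect.

```python
import zlib
from collections import defaultdict


def redistribute(rows, key, num_segments):
    """Group rows by target segment, chosen by hashing the given key column.

    Hypothetical helper: crc32 is a stand-in for whatever hash a real
    system uses. It is deterministic, so equal keys always map to the
    same segment.
    """
    buckets = defaultdict(list)
    for row in rows:
        key_bytes = str(row[key]).encode("utf-8")
        segment = zlib.crc32(key_bytes) % num_segments
        buckets[segment].append(row)
    return dict(buckets)


if __name__ == "__main__":
    orders = [
        {"cust": 10, "amount": 50},
        {"cust": 11, "amount": 5},
        {"cust": 10, "amount": 70},
    ]
    # Both rows for customer 10 are routed to the same segment.
    for segment, segment_rows in sorted(redistribute(orders, "cust", 4).items()):
        print(segment, segment_rows)
```

In a real system each bucket would be streamed to its destination segment over the network rather than collected in memory.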
At its core, the gNet software interconnect is a supercomputing-based ‘soft-switch’ that is responsible for efficiently pumping streams of data between motion nodes during query-plan execution. It delivers messages, moves data, collects results, and coordinates work among the segments in the system. It is the infrastructure underpinning the execution of the motion nodes that occur within parallel query plans on the Greenplum system.
Within the execution of each node in the query plan, multiple relational operations are processed by pipelining. Pipelining is the ability to begin a task before its predecessor task has completed, and it is key to increasing basic query parallelism. For example, while a table scan is taking place, the rows it selects can be pipelined into a join process. Greenplum Database uses pipelining whenever possible to ensure the highest possible performance.
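The scan-into-join example can be sketched with Python generators, which hand each row downstream as soon as it is produced. This is an illustrative analogy under stated assumptions, not Greenplum code: `scan`, `hash_join`, the `log` list, and the sample column names are all hypothetical. The log records the order of events to show that the join starts consuming rows before the scan has finished.

```python
def scan(table, predicate, log):
    """Yield matching rows one at a time instead of materializing them."""
    for row in table:
        if predicate(row):
            log.append(("scan", row["id"]))
            yield row


def hash_join(outer_rows, inner_table, key, log):
    """Join a stream of outer rows against a small in-memory inner table."""
    # Build phase: index the inner table by the join key.
    index = {}
    for inner in inner_table:
        index.setdefault(inner[key], []).append(inner)
    # Probe phase: each outer row is joined as soon as the scan yields it.
    for outer in outer_rows:
        for inner in index.get(outer[key], []):
            log.append(("join", outer["id"]))
            yield {**outer, **inner}


if __name__ == "__main__":
    orders = [
        {"id": 1, "cust": 10, "amount": 50},
        {"id": 2, "cust": 11, "amount": 5},
        {"id": 3, "cust": 10, "amount": 70},
    ]
    customers = [{"cust": 10, "name": "Ada"}, {"cust": 11, "name": "Bo"}]
    events = []
    selected = scan(orders, lambda r: r["amount"] > 10, events)
    for joined in hash_join(selected, customers, "cust", events):
        print(joined)
    # Scan and join events alternate rather than running back to back.
    print(events)
```

Because the scan is a generator, the join never waits for a complete intermediate result; a non-pipelined version would build the full list of selected rows first and only then begin joining.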