Research Paper
Load-balancing Parallel Relational Algebra
Event Type
Research Paper
TimeTuesday, June 23rd4:20pm - 4:45pm
DescriptionRelational algebra (RA) comprises a basis of important operations, sufficient to power state-of-the-art reasoning engines for Datalog and related logic-programming languages. Parallel RA implementations
can thus play a significant role in extracting parallelism inherent in a wide variety of analytic problems. In general, bottom-up logical inference can be implemented as fixed-point iteration over RA kernels; relations dynamically accumulate new tuples of information according to a set of rules until no new tuples can be discovered from previously inferred tuples and relevant rules (RA kernels). While this strategy has been quite successful in single-node contexts, it poses unique challenges when distributed over many-node, networked clusters—especially regarding how the work-load is balanced across available compute resources. In this paper, we identify two fundamental kinds of load imbalance and present a strategy to address each. We investigate both spacial load imbalance—imbalance across each relation (across compute nodes)—and temporal load imbalance–imbalance in tuples produced across fixed-point iterations. For spacial balancing, we implement refinement and consolidation procedures. For temporal balancing, we implement a technique that permits the residual workload from a busy iteration to roll over to a new iteration. In sum, these techniques permit fully dynamic load-balancing of relational algebra that is robust to changes across time.
This section contains ISC 2020 Digital content.
Please log in with your password to view it.

Not yet registered? Event registration is available here .