Future Generation Computer Systems
Year
2020
Abstract
Temporal graphs capture the development of relationships within data throughout time. This model fits naturally within a streaming architecture, where new events can be inserted directly into the graph upon arrival from a data source and be compared to related entities or historical state. However, the vast majority of graph processing systems only consider traditional graph analysis on static data, with some outliers supporting batched updating and temporal analysis across graph snapshots. In this work we define a temporal graph model which can be updated via event streams and discuss the challenges of distribution and graph maintenance. To solve these challenges, we introduce Raphtory, a distributed temporal graph management system which maintains the full graph history in memory, leveraging this to insert streamed events directly into the model without batching or centralised ordering. Raphtory additionally provides an API to perform both approximative analysis on the most up-to-date version of the graph, as well as temporal analysis throughout its full history; executed in parallel with ingestion.
Description
This is the first journal paper describing the Raphtory distributed streaming system. While the system design has moved on since the paper was written, much of the overall architecture description remains relevant.
Preprint
bibtex
@article{steer_raphtory_2020,
author = {Steer, Benjamin and Cuadrado, Felix and Clegg, Richard G.},
journal = {Future Generation Computer Systems},
volume = {101},
pages = {453--464},
year = {2020}
}
doi
https://doi.org/10.1016/j.future.2019.08.022
Paper type
Subject area