Graph Programming
Bulk Synchronous Parallelism(BSP)
-
Processors perform a set of tasks in a parallel fashion and ;must complete the set of tasks till time t (Barrier) so that next set of tasks can be loaded.
-
BSP proceeds in supersteps
- SuperStep1 : Get required data, Compute - Yes/No, Exchange Messages - Yes/No
- Synchronize
- SuperStep2 : Get required data, Compute - Yes/No, Exchange Messages - Yes/No
-Synchronize
...
-
All computation is in context of vertex of graph. One processor can be thought of as a vertex. A processor only receives or can send data to its neighbours.
-
A vertex can have an id and a value
-
A vertex is aware of the edges it is connected to
-
An edge can also have an id and a value
-
What a vertex can do?
- Get its ID
- Get/set its value
- Get/Count its edges
- Get/Set a specific edge value
- Using edge ID
- Using the ID of the target vertex.
- Gets values of all edges connected to a vertex(multigraph)
- Add/remove specific edge
- Start/Stop computing
-
What can an edge do?
- Get its id
- Get/Set its value
- Get the ID of its target vertex
Pregel : A system for Large-Scale Graph Processing
-
e.g. PageRank -- Node connected to an important node is also important