rishabh-sachdeva/spark-bigData

spark-bigData

graph.tsv is a file in which each line is a triple of the form SOURCE_NODE DESTINATION_NODE WEIGHT. Source nodes, destination nodes, and weights are all integers.

outdegree.py: For each node, compute the outdegree (number of outgoing edges) and output the (node, count) pairs sorted by node.
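
The following is a minimal sketch of how such an outdegree computation could look in PySpark, not necessarily the code in outdegree.py. The names `parse_edge`, `outdegrees`, and `main` are hypothetical, and the sketch assumes the fields in graph.tsv are separated by whitespace (tabs or spaces).

```python
def parse_edge(line):
    """Parse one 'SOURCE DESTINATION WEIGHT' line into a tuple of three ints."""
    src, dst, weight = line.split()
    return int(src), int(dst), int(weight)


def outdegrees(lines):
    """Take an RDD of raw graph.tsv lines; return (node, count) pairs sorted by node."""
    return (
        lines.map(parse_edge)
        .map(lambda edge: (edge[0], 1))      # one count per outgoing edge
        .reduceByKey(lambda a, b: a + b)     # total per source node
        .sortByKey()
    )


def main(path="graph.tsv"):
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="outdegree")
    try:
        for node, count in outdegrees(sc.textFile(path)).collect():
            print(node, count)
    finally:
        sc.stop()
```

Calling `main()` prints one line per node. In this sketch, a node that only ever appears as a destination has no outgoing edges and is therefore absent from the output rather than listed with a count of 0.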

weight.py: For each node, compute the sum of weights of incoming edges and output the (node, weight_sum) pairs sorted by node.
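
A comparable sketch for the incoming-weight sum, again under the assumption of whitespace-separated fields; `parse_edge`, `incoming_weight_sums`, and `main` are illustrative names, not taken from weight.py.

```python
def parse_edge(line):
    """Parse one 'SOURCE DESTINATION WEIGHT' line into a tuple of three ints."""
    src, dst, weight = line.split()
    return int(src), int(dst), int(weight)


def incoming_weight_sums(lines):
    """Take an RDD of raw graph.tsv lines; return (node, weight_sum) pairs sorted by node."""
    return (
        lines.map(parse_edge)
        .map(lambda edge: (edge[1], edge[2]))  # key each weight by its destination node
        .reduceByKey(lambda a, b: a + b)       # sum weights per destination
        .sortByKey()
    )


def main(path="graph.tsv"):
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="weight")
    try:
        for node, weight_sum in incoming_weight_sums(sc.textFile(path)).collect():
            print(node, weight_sum)
    finally:
        sc.stop()
```

As with the outdegree sketch, nodes without any incoming edge do not appear in the result.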

pairs.py: For each node X, find all other nodes Y such that the graph contains both an (X, Y) edge and a (Y, X) edge, and output the (X, [Y1, Y2, ..., Yn]) pairs sorted by X. I solved this by building two RDDs: one in which edge source nodes are keys and destination nodes are values, and one in which edge destination nodes are keys and source nodes are values.
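
One way the two RDDs described above could be combined is sketched below. The README does not say how pairs.py joins them, so the use of `RDD.intersection` here is an assumption, as are the names `parse_edge`, `mutual_neighbors`, and `main`. A pair (X, Y) in the forward RDD means an X to Y edge exists, and the same pair in the reversed RDD means a Y to X edge exists, so pairs present in both are exactly the mutual ones.

```python
def parse_edge(line):
    """Parse one 'SOURCE DESTINATION WEIGHT' line into a tuple of three ints."""
    src, dst, weight = line.split()
    return int(src), int(dst), int(weight)


def mutual_neighbors(lines):
    """Take an RDD of raw graph.tsv lines; return (X, [Y1, ..., Yn]) pairs sorted by X."""
    # Drop self-loops, since the task asks for *other* nodes Y.
    edges = lines.map(parse_edge).filter(lambda e: e[0] != e[1])
    forward = edges.map(lambda e: (e[0], e[1]))   # key = source, value = destination
    backward = edges.map(lambda e: (e[1], e[0]))  # key = destination, value = source
    # Assumed combining step: keep (X, Y) only if it occurs in both RDDs.
    mutual = forward.intersection(backward)
    return mutual.groupByKey().mapValues(sorted).sortByKey()


def main(path="graph.tsv"):
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="pairs")
    try:
        for node, neighbors in mutual_neighbors(sc.textFile(path)).collect():
            print(node, neighbors)
    finally:
        sc.stop()
```

`intersection` also removes duplicate pairs, so repeated edges in the input do not produce repeated neighbors in the output lists.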
