- Extract data from the GitHub API
- Prepare a user-to-user graph relation model
- Use external graph visualisation tools to analyse graph relations and detect communities
The idea is to build User nodes around the repositories they interact with.
GitHub API link: https://developer.github.com/v3/
User resources that are meaningful for community relations (starting from the most important):
- Pull request review
- Pull request comment
- Issue comment
- Review request
- Issue assignees
- Followers
- Following
- Watch
- Stars
- Reactions
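The ordering above can be turned into edge weights. The following is a minimal sketch, assuming a simple lookup table: only the values 20 (review) and 10 (comment) appear in this document's Cypher example, and every other number, as well as the `weights` and `EdgeWeight` names, is a placeholder assumption to be tuned.

```go
package main

import "fmt"

// weights maps each interaction type to an edge weight, most important first.
// Only 20 (review) and 10 (comment) come from the Cypher example in this
// document; all other values are placeholder assumptions.
var weights = map[string]int{
	"pull_request_review":  20,
	"pull_request_comment": 10,
	"issue_comment":        8,
	"review_request":       6,
	"issue_assignee":       5,
	"follower":             4,
	"following":            3,
	"watch":                2,
	"star":                 2,
	"reaction":             1,
}

// EdgeWeight returns the weight for an interaction type, or 0 if it is unknown.
func EdgeWeight(kind string) int {
	return weights[kind]
}

func main() {
	for _, kind := range []string{"pull_request_review", "issue_comment", "reaction"} {
		fmt.Printf("%s: %d\n", kind, EdgeWeight(kind))
	}
}
```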
- Given a repository, fetch its pull requests and create a node for each Pull Request Author
- For a given pull request, fetch related resources such as reviews, pull request comments and comments
- Connect review, pull request comment and comment authors with the Pull Request Author
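The steps above could be sketched as follows. This is an illustration under assumptions, not the project's actual code: `User`, `Edge`, `fetchUsers` and `connect` are hypothetical names, pagination and authentication are left out, and skipping self-interactions is a design choice made for this sketch. `fetchUsers` is meant for GitHub list endpoints such as `/repos/{owner}/{repo}/pulls/{pull_number}/reviews`, whose elements carry a `user` object.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// User is the subset of a GitHub user object used as a graph node.
type User struct {
	ID    int64  `json:"id"`
	Login string `json:"login"`
}

// Edge is a directed relation from an interacting user to the pull request author.
type Edge struct {
	From, To User
	Kind     string // for example "REVIEWED" or "COMMENTED"
	PRID     int
}

// fetchUsers GETs a list endpoint (reviews or comments) and returns the
// "user" field of every element. Only the first page is read.
func fetchUsers(client *http.Client, url string) ([]User, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: %s", url, resp.Status)
	}
	var items []struct {
		User User `json:"user"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&items); err != nil {
		return nil, err
	}
	users := make([]User, 0, len(items))
	for _, item := range items {
		users = append(users, item.User)
	}
	return users, nil
}

// connect links every interacting user to the author, skipping the author's
// own reviews and comments.
func connect(author User, kind string, prID int, users []User) []Edge {
	var edges []Edge
	for _, u := range users {
		if u.ID == author.ID {
			continue
		}
		edges = append(edges, Edge{From: u, To: author, Kind: kind, PRID: prID})
	}
	return edges
}

func main() {
	// In-memory sample based on the reference pull request; a real run would
	// fill reviewers from fetchUsers(http.DefaultClient, reviewsURL).
	author := User{ID: 1, Login: "nojvek"}
	reviewers := []User{{ID: 2, Login: "isidorn"}, {ID: 1, Login: "nojvek"}}
	for _, e := range connect(author, "REVIEWED", 16129, reviewers) {
		fmt.Printf("%s -[%s]-> %s\n", e.From.Login, e.Kind, e.To.Login)
	}
}
```

Keeping the HTTP call separate from `connect` lets the edge-building logic be tested without network access.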
Reference pull request: microsoft/vscode#16129
MERGE (requester:User {ID: 1, Name: "nojvek"})
MERGE (i:User {ID: 2, Name: "isidorn"})
MERGE (i)-[:REVIEWED {PR_ID: 16129, Weight: 20}]->(requester)
MERGE (j:User {ID: 3, Name: "jrieken"})
MERGE (j)-[:COMMENTED {PR_ID: 16129, Weight: 10}]->(requester)
RETURN requester, i, j;
Properties
- Node: User ID, User Name
- Relation: Pull Request ID, Weight
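These properties could be represented in Go and turned into a parameterized version of the Cypher statement shown earlier. The sketch below is an assumption about how that might look: `Relation`, `allowedTypes` and `mergeQuery` are hypothetical names, and no particular driver is assumed, so the function only returns the query text and its parameters. Cypher cannot take the relationship type as a parameter, which is why the type is checked against a fixed list before being formatted into the query.

```go
package main

import "fmt"

// Relation describes one weighted edge between two User nodes.
type Relation struct {
	FromID, ToID     int
	FromName, ToName string
	Type             string // relationship type, for example REVIEWED
	PRID, Weight     int
}

// allowedTypes lists the relationship types that may be formatted into a query.
var allowedTypes = map[string]bool{"REVIEWED": true, "COMMENTED": true}

// mergeQuery returns a Cypher MERGE statement and its parameters for r.
func mergeQuery(r Relation) (string, map[string]interface{}, error) {
	if !allowedTypes[r.Type] {
		return "", nil, fmt.Errorf("unsupported relation type %q", r.Type)
	}
	query := fmt.Sprintf(
		"MERGE (a:User {ID: $fromID, Name: $fromName}) "+
			"MERGE (b:User {ID: $toID, Name: $toName}) "+
			"MERGE (a)-[:%s {PR_ID: $prID, Weight: $weight}]->(b)",
		r.Type)
	params := map[string]interface{}{
		"fromID":   r.FromID,
		"fromName": r.FromName,
		"toID":     r.ToID,
		"toName":   r.ToName,
		"prID":     r.PRID,
		"weight":   r.Weight,
	}
	return query, params, nil
}

func main() {
	query, params, err := mergeQuery(Relation{
		FromID: 2, FromName: "isidorn",
		ToID: 1, ToName: "nojvek",
		Type: "REVIEWED", PRID: 16129, Weight: 20,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(query)
	fmt.Println(params)
}
```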
Neo4j - graph database that provides a user interface and integrates with various graph algorithms.
Supported community detection algorithms:
- Louvain (`algo.louvain`)
- Label Propagation (`algo.labelPropagation`)
- Weakly Connected Components (`algo.unionFind`)
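To give an intuition for what Weakly Connected Components computes on the user graph, here is a small standalone union-find sketch. It is an illustration of the general technique only, not the database's implementation; the `unionFind` type, the `countComponents` helper and the `alice`/`bob` users are hypothetical. Edge direction is ignored, so two users land in the same component whenever any chain of relations connects them.

```go
package main

import "fmt"

// unionFind tracks which component each user belongs to.
type unionFind struct {
	parent map[string]string
}

// find returns the representative of x's component, registering x if needed.
func (u *unionFind) find(x string) string {
	if _, ok := u.parent[x]; !ok {
		u.parent[x] = x
	}
	for u.parent[x] != x {
		u.parent[x] = u.parent[u.parent[x]] // path halving keeps trees shallow
		x = u.parent[x]
	}
	return x
}

// union merges the components of a and b.
func (u *unionFind) union(a, b string) {
	rootA, rootB := u.find(a), u.find(b)
	if rootA != rootB {
		u.parent[rootA] = rootB
	}
}

// countComponents returns the number of weakly connected components
// formed by the given edges, treating every edge as undirected.
func countComponents(edges [][2]string) int {
	u := &unionFind{parent: map[string]string{}}
	for _, e := range edges {
		u.union(e[0], e[1])
	}
	roots := map[string]bool{}
	for user := range u.parent {
		roots[u.find(user)] = true
	}
	return len(roots)
}

func main() {
	edges := [][2]string{
		{"isidorn", "nojvek"},
		{"jrieken", "nojvek"},
		{"alice", "bob"}, // hypothetical users with no link to the others
	}
	fmt.Println("components:", countComponents(edges))
}
```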
Golang programming language
repositories := fetchRepositories()
for _, repository := range repositories {
	pullRequests := fetchPullRequests(repository)
	for _, pullRequest := range pullRequests {
		resource := fetchPRRelatedResource(pullRequest)
		neo.Create(resource)
	}
}
repositories := fetchRepositories()
var waitGroup sync.WaitGroup
waitGroup.Add(len(repositories)) // Create a task for each repository
for _, repository := range repositories {
	repository := repository // Per-iteration copy so each goroutine sees its own repository (needed before Go 1.22)
	go func() {
		defer waitGroup.Done() // Notify waitGroup at the end that the goroutine is done
		// Processing of repository ...
	}()
}
waitGroup.Wait() // Wait for all goroutines to complete
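Starting one goroutine per repository can send many requests at once, and the GitHub API is rate limited. One possible refinement, sketched here under assumptions, caps the number of goroutines running at the same time with a buffered channel used as a semaphore. `processAll`, `limit` and the `process` callback are hypothetical names, and the callback stands in for the real per-repository work.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll calls process for every repository, running at most limit calls
// concurrently, and returns the sum of their results. limit must be at least 1.
func processAll(repositories []string, limit int, process func(string) int) int {
	var (
		waitGroup sync.WaitGroup
		mu        sync.Mutex
		total     int
	)
	sem := make(chan struct{}, limit) // buffered channel used as a semaphore
	for _, repository := range repositories {
		waitGroup.Add(1)
		sem <- struct{}{} // blocks while limit goroutines are already running
		go func(repository string) {
			defer waitGroup.Done()
			defer func() { <-sem }() // release the slot when done
			n := process(repository)
			mu.Lock() // total is shared between goroutines
			total += n
			mu.Unlock()
		}(repository)
	}
	waitGroup.Wait()
	return total
}

func main() {
	repositories := []string{"microsoft/vscode", "example/one", "example/two"}
	// The callback is a stand-in that returns the length of the name.
	total := processAll(repositories, 2, func(name string) int { return len(name) })
	fmt.Println("total:", total)
}
```

Passing `repository` as an argument to the goroutine gives each one its own copy, which also avoids the loop variable capture problem in Go versions before 1.22.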
Container engine - used to resolve the seabolt dependency of the neo4j bolt protocol for Go.