JRI98/string-clusterer

string-clusterer

Go package for clustering strings. Given a slice of strings, a similarity metric, and a threshold, it groups the input strings into clusters according to their similarity.

Similarity metrics are provided by https://github.com/adrg/strutil.

Installation

go get github.com/JRI98/string-clusterer

Example

clusterer := NewClusterer()
input := []string{"apple", "aple", "banana", "bananna", "orange", "ornge"}
result := clusterer.Cluster(input)
fmt.Println(result) // [[apple aple] [banana bananna] [orange ornge]]

Available Similarity Metrics

NewHamming(caseSensitive bool)
NewJaccard(caseSensitive bool)
NewJaro(caseSensitive bool)
NewJaroWinkler(caseSensitive bool)
NewLevenshtein(caseSensitive bool)
NewOverlapCoefficient(caseSensitive bool)
NewSmithWatermanGotoh(caseSensitive bool)
NewSorensenDice(caseSensitive bool)
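
As a rough illustration of what the caseSensitive flag changes, the sketch below computes a normalized Hamming-style similarity with and without case folding. The function hammingSimilarity is hypothetical and written only for this example; it is not the constructor listed above, and the actual metrics come from adrg/strutil.

```go
package main

import (
	"fmt"
	"strings"
)

// hammingSimilarity is a hypothetical helper: it counts mismatched positions,
// treats extra characters in the longer string as mismatches, and scales the
// result to [0, 1], where 1 means identical.
func hammingSimilarity(a, b string, caseSensitive bool) float64 {
	if !caseSensitive {
		// Fold case first so "A" and "a" compare as equal.
		a, b = strings.ToLower(a), strings.ToLower(b)
	}
	ra, rb := []rune(a), []rune(b)
	if len(ra) > len(rb) {
		ra, rb = rb, ra // make ra the shorter string
	}
	if len(rb) == 0 {
		return 1 // two empty strings are identical
	}
	distance := len(rb) - len(ra)
	for i := range ra {
		if ra[i] != rb[i] {
			distance++
		}
	}
	return 1 - float64(distance)/float64(len(rb))
}

func main() {
	fmt.Println(hammingSimilarity("Apple", "apple", true))  // 0.8
	fmt.Println(hammingSimilarity("Apple", "apple", false)) // 1
}
```

With case sensitivity on, the capital "A" counts as one mismatch out of five positions. With it off, the two strings are treated as identical, which can decide whether they land in the same cluster for a given threshold.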

Repository Maintenance

Run Tests

go test

Run Benchmarks

go test -bench=. -run=^#

Run Fuzzing

go test -fuzz=FuzzCluster -run=^#
