Commit 2ee7096

2 parents d84f8bd + bf63f66 commit 2ee7096

1 file changed: +5 -1 lines changed

README.md

Lines changed: 5 additions & 1 deletion
@@ -18,7 +18,7 @@ Or check the [releases](https://github.com/tdebatty/java-string-similarity/relea
1818

1919
## Summary
2020

21-
The main characteristics of each implemented algorithm are presented below. The "cost" column gives an estimation of the computational cost to compute te similarity between two strings of length m and n respectively.
21+
The main characteristics of each implemented algorithm are presented below. The "cost" column gives an estimation of the computational cost to compute the similarity between two strings of length m and n respectively.
2222

2323
| | | Normalized? | Metric? | Type | Cost |
2424
|-------- |------- |------------- |---------- | ------ | ---- |
@@ -113,6 +113,10 @@ public class MyApp {
113113
## Weighted Levenshtein
114114
An implementation of Levenshtein that allows defining different weights for different character substitutions.
115115

116+
This algorithm is usually used for optical character recognition (OCR) applications. For OCR, the cost of substituting P and R is lower than the cost of substituting P and M, for example, because from an OCR point of view P is similar to R.
117+
118+
It can also be used for keyboard typing auto-correction. Here, for example, the cost of substituting E and R is lower because these keys are adjacent on an AZERTY or QWERTY keyboard, so the user is more likely to have mistyped one for the other.
119+
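The effect of substitution weights can be illustrated with a minimal, self-contained sketch. It does not use the library's classes: `WeightedLevenshteinSketch`, `SubstitutionCost` and `ocrCost` are hypothetical names introduced only for this illustration, and the weight of 0.5 for the P/R pair is an assumed value. Insertions and deletions are assumed to cost 1.

```java
class WeightedLevenshteinSketch {

    /** Hypothetical cost model: weight of substituting character a with b. */
    interface SubstitutionCost {
        double cost(char a, char b);
    }

    /**
     * Levenshtein distance in which the substitution cost depends on the
     * pair of characters. Insertions and deletions cost 1.0 (assumption).
     */
    static double distance(String s1, String s2, SubstitutionCost weights) {
        double[] previous = new double[s2.length() + 1];
        double[] current = new double[s2.length() + 1];

        // Turning the empty prefix of s1 into a prefix of s2 takes j insertions.
        for (int j = 0; j <= s2.length(); j++) {
            previous[j] = j;
        }

        for (int i = 1; i <= s1.length(); i++) {
            // Turning a prefix of s1 into the empty string takes i deletions.
            current[0] = i;
            for (int j = 1; j <= s2.length(); j++) {
                char a = s1.charAt(i - 1);
                char b = s2.charAt(j - 1);
                double substitution =
                        previous[j - 1] + (a == b ? 0.0 : weights.cost(a, b));
                double deletion = previous[j] + 1.0;
                double insertion = current[j - 1] + 1.0;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            // Reuse the two rows instead of allocating a full matrix.
            double[] tmp = previous;
            previous = current;
            current = tmp;
        }
        return previous[s2.length()];
    }

    /** Assumed OCR-style weights: P and R look alike, so swapping them is cheap. */
    static double ocrCost(char a, char b) {
        boolean pAndR = (a == 'P' && b == 'R') || (a == 'R' && b == 'P');
        return pAndR ? 0.5 : 1.0;
    }

    public static void main(String[] args) {
        // P -> R is a cheap substitution under the assumed weights.
        System.out.println(distance("PAT", "RAT", WeightedLevenshteinSketch::ocrCost)); // prints 0.5
        // P -> M is charged the full substitution cost.
        System.out.println(distance("PAT", "MAT", WeightedLevenshteinSketch::ocrCost)); // prints 1.0
    }
}
```

A keyboard auto-correction variant would only swap the cost function, for instance one that returns a lower weight for adjacent keys such as E and R.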
116120
```java
117121
import info.debatty.java.stringsimilarity.*;
118122
