@@ -50,7 +50,7 @@ to assist natural language translation based on translation memory.

  Let’s take a simple example of finding minimum edit distance between
  strings `ME` and `MY`. Intuitively you already know that minimum edit distance
- here is `1` operation and this operation. And it is a replacing `E` with `Y`. But
+ here is `1` operation and this operation. And it is replacing `E` with `Y`. But
  let’s try to formalize it in a form of the algorithm in order to be able to
  do more complex examples like transforming `Saturday` into `Sunday`.
@@ -75,12 +75,12 @@ to transform an empty string to `MY`. And it is by inserting `Y` and `M`.
  - Cell `(1:1)` contains number 0. It means that it costs nothing
  to transform `M` into `M`.
  - Cell `(1:2)` contains red number 1. It means that we need 1 operation
- to transform `ME` to `M`. And it is be deleting `E`.
+ to transform `ME` to `M`. And it is by deleting `E`.
  - And so on...

  This looks easy for such small matrix as ours (it is only `3x3`). But here you
  may find basic concepts that may be applied to calculate all those numbers for
- bigger matrices (let’s say `9x7` one, for `Saturday → Sunday` transformation).
+ bigger matrices (let’s say a `9x7` matrix for `Saturday → Sunday` transformation).

  According to the formula you only need three adjacent cells `(i-1:j)`, `(i-1:j-1)`, and `(i:j-1)` to
  calculate the number for current cell `(i:j)`. All we need to do is to find the
@@ -97,13 +97,13 @@ Let's draw a decision graph for this problem.

  You may see a number of overlapping sub-problems on the picture that are marked
  with red. Also there is no way to reduce the number of operations and make it
- less then a minimum of those three adjacent cells from the formula.
+ less than a minimum of those three adjacent cells from the formula.

  Also you may notice that each cell number in the matrix is being calculated
  based on previous ones. Thus the tabulation technique (filling the cache in
  bottom-up direction) is being applied here.

- Applying this principles further we may solve more complicated cases like
+ Applying this principle further we may solve more complicated cases like
  with `Saturday → Sunday` transformation.

  ![Levenshtein distance](https://cdn-images-1.medium.com/max/1600/1*fPEHiImYLKxSTUhrGbYq3g.jpeg)
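
For readers of this diff, the tabulation approach that the README text describes could be sketched roughly as below. This is a minimal Python sketch and not code taken from the repository. The function name `levenshtein_distance`, the variable names, and the matrix orientation (rows for prefixes of the target string, columns for prefixes of the source string) are assumptions made for illustration.

```python
def levenshtein_distance(source: str, target: str) -> int:
    """Minimum number of single-character edits turning source into target."""
    rows = len(target) + 1   # assumption: one row per prefix of the target
    cols = len(source) + 1   # assumption: one column per prefix of the source

    # dist[i][j] holds the edit distance between source[:j] and target[:i].
    dist = [[0] * cols for _ in range(rows)]

    # First row: turning a prefix of source into an empty string (deletions only).
    for j in range(cols):
        dist[0][j] = j

    # First column: turning an empty string into a prefix of target (insertions only).
    for i in range(rows):
        dist[i][0] = i

    # Fill the remaining cells bottom-up; each cell depends only on its
    # three already-computed neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            substitution_cost = 0 if source[j - 1] == target[i - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                      # insert target[i - 1]
                dist[i][j - 1] + 1,                      # delete source[j - 1]
                dist[i - 1][j - 1] + substitution_cost,  # replace or keep
            )

    return dist[rows - 1][cols - 1]


if __name__ == "__main__":
    print(levenshtein_distance("ME", "MY"))
    print(levenshtein_distance("Saturday", "Sunday"))
```

For the `ME` and `MY` example from the text, the sketch fills a `3x3` matrix and returns the single replacement of `E` with `Y`. The same loop handles the larger `Saturday → Sunday` matrix without any change.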