Feature/#4 train model
Showing 804 changed files with 243,015 additions and 2 deletions.
This file was deleted.
Submodule langdata added at 0fabfc
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Submodule tessdata_best added at e2aad9
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
--- | ||
BasedOnStyle: Google | ||
# Only merge empty functions. | ||
AllowShortFunctionsOnASingleLine: Empty | ||
# Do not allow short if statements. | ||
AllowShortIfStatementsOnASingleLine: false | ||
# Enforce always the same pointer alignment. | ||
DerivePointerAlignment: false | ||
IndentPPDirectives: AfterHash |
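As a rough illustration of what the four overrides above do on top of the Google base style, the sketch below shows code laid out the way these settings would be expected to format it. Every name in it (`DEMO_VERBOSE`, `DEMO_LOG`, `Counter`, `SumPositive`) is hypothetical and not taken from the Tesseract sources; it is a sketch of the formatting rules, not project code.

```cpp
// Hypothetical example; none of these names come from the Tesseract sources.
#include <cassert>
#include <cstddef>

// IndentPPDirectives: AfterHash indents nested directives after the '#'.
#if defined(DEMO_VERBOSE)
#  include <cstdio>
#  define DEMO_LOG(msg) std::puts(msg)
#else
#  define DEMO_LOG(msg) ((void)0)
#endif

struct Counter {
  // AllowShortFunctionsOnASingleLine: Empty keeps only empty bodies inline.
  Counter() {}

  // A non-empty body is split across lines, however short it is.
  void Add(int delta) {
    total += delta;
  }

  int total = 0;
};

// DerivePointerAlignment: false always applies the base style's left
// alignment ("const int* values"), whatever the file used before.
inline int SumPositive(const int* values, std::size_t count) {
  assert(values != nullptr || count == 0);
  int sum = 0;
  for (std::size_t i = 0; i < count; ++i) {
    // AllowShortIfStatementsOnASingleLine: false moves the body down.
    if (values[i] > 0)
      sum += values[i];
  }
  DEMO_LOG("summed");
  return sum;
}
```

Running `clang-format` with this configuration over code written in another layout should converge on roughly this shape. The exact output can vary between clang-format versions.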
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
Before you submit an issue, please review [the guidelines for this repository](https://github.com/tesseract-ocr/tesseract/blob/master/CONTRIBUTING.md). | ||
|
||
Please report an issue only for a BUG, not for asking questions. | ||
|
||
Note that it will be much easier for us to fix the issue if a test case that | ||
reproduces the problem is provided. Ideally this test case should not have any | ||
external dependencies. Provide a copy of the image or link to files for the test case. | ||
|
||
Please delete this text and fill in the template below. | ||
|
||
------------------------ | ||
|
||
### Environment | ||
|
||
* **Tesseract Version**: <!-- compulsory. you must provide your version --> | ||
* **Commit Number**: <!-- optional. if known - specify commit used, if built from source --> | ||
* **Platform**: <!-- either `uname -a` output, or if Windows, version and 32-bit or 64-bit --> | ||
|
||
### Current Behavior: | ||
|
||
### Expected Behavior: | ||
|
||
### Suggested Fix: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
*~ | ||
# Windows | ||
*.user.* | ||
*.idea* | ||
*.log | ||
*.tlog | ||
*.cache | ||
*.obj | ||
*.sdf | ||
*.opensdf | ||
*.lastbuildstate | ||
*.unsuccessfulbuild | ||
*.suo | ||
*.res | ||
*.ipch | ||
*.manifest | ||
*.user | ||
|
||
# Linux | ||
# ignore local configuration | ||
config.* | ||
config/* | ||
Makefile | ||
Makefile.in | ||
*.m4 | ||
|
||
# ignore help scripts/files | ||
configure | ||
libtool | ||
stamp-h1 | ||
tesseract.pc | ||
config_auto.h | ||
/doc/html/* | ||
/doc/*.1 | ||
/doc/*.5 | ||
/doc/*.html | ||
/doc/*.xml | ||
|
||
# generated version file | ||
/src/api/tess_version.h | ||
|
||
# executables | ||
/src/api/tesseract | ||
/src/training/ambiguous_words | ||
/src/training/classifier_tester | ||
/src/training/cntraining | ||
/src/training/combine_tessdata | ||
/src/training/dawg2wordlist | ||
/src/training/merge_unicharsets | ||
/src/training/mftraining | ||
/src/training/set_unicharset_properties | ||
/src/training/shapeclustering | ||
/src/training/text2image | ||
/src/training/unicharset_extractor | ||
/src/training/wordlist2dawg | ||
|
||
*.patch | ||
|
||
# files generated by libtool | ||
/src/training/combine_lang_model | ||
/src/training/lstmeval | ||
/src/training/lstmtraining | ||
|
||
# ignore compilation files | ||
build/* | ||
/bin | ||
*/.deps/* | ||
*/.libs/* | ||
*/*/.deps/* | ||
*/*/.libs/* | ||
*.lo | ||
*.la | ||
*.o | ||
*.Plo | ||
*.a | ||
*.class | ||
*.jar | ||
__pycache__ | ||
|
||
# tessdata | ||
*.traineddata | ||
|
||
# OpenCL | ||
tesseract_opencl_profile_devices.dat | ||
kernel*.bin | ||
|
||
# build dirs | ||
/build* | ||
/.cppan | ||
/cppan | ||
/*.dll | ||
/*.lib | ||
/*.exe | ||
/*.lnk | ||
/win* | ||
.vs* | ||
.s* | ||
|
||
# files generated by "make check" | ||
/tests/.dirstamp | ||
/unittest/*.trs | ||
/unittest/tmp/* | ||
|
||
# test programs | ||
/unittest/*_test | ||
/unittest/primesbitvector | ||
/unittest/primesmap | ||
|
||
# generated files from unlvtests | ||
times.txt | ||
/unlvtests/results* | ||
|
||
# snap packaging specific rules | ||
/parts/ | ||
/stage/ | ||
/prime/ | ||
/snap/.snapcraft/ | ||
|
||
/*.snap | ||
/*_source.tar.bz2 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
[submodule "abseil"] | ||
path = abseil | ||
url = https://github.com/abseil/abseil-cpp.git | ||
[submodule "googletest"] | ||
path = googletest | ||
url = https://github.com/google/googletest.git | ||
[submodule "test"] | ||
path = test | ||
url = https://github.com/tesseract-ocr/test |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
extraction: | ||
cpp: | ||
prepare: | ||
packages: | ||
- libpango1.0-dev | ||
configure: | ||
command: | ||
- ./autogen.sh | ||
- mkdir _lgtm_build_dir | ||
- cd _lgtm_build_dir | ||
- ../configure | ||
index: | ||
build_command: | ||
- cd _lgtm_build_dir | ||
- make training | ||
python: | ||
python_setup: | ||
version: 3 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
# Travis CI configuration for Tesseract | ||
|
||
language: cpp | ||
|
||
dist: xenial | ||
|
||
env: | ||
- LEPT_VER=1.77.0 | ||
|
||
notifications: | ||
email: false | ||
|
||
sudo: false | ||
|
||
os: | ||
- linux | ||
- osx | ||
|
||
addons: | ||
apt: | ||
sources: | ||
#- ubuntu-toolchain-r-test | ||
packages: | ||
- libarchive-dev | ||
- libpango1.0-dev | ||
#- g++-6 | ||
|
||
#matrix: | ||
#include: | ||
#- os: osx | ||
#install: | ||
#script: brew install tesseract --HEAD | ||
#cache: | ||
#directories: | ||
#- $HOME/Library/Caches/Homebrew | ||
#allow_failures: | ||
#- script: brew install tesseract --HEAD | ||
|
||
cache: | ||
directories: | ||
- leptonica-$LEPT_VER | ||
|
||
before_install: | ||
- if [[ $TRAVIS_OS_NAME == linux ]]; then LINUX=true; fi | ||
- if [[ $TRAVIS_OS_NAME == osx ]]; then OSX=true; fi | ||
|
||
install: | ||
#- if [[ $LINUX && "$CXX" = "g++" ]]; then export CXX="g++-6" CC="gcc-6"; fi | ||
- if test ! -d leptonica-$LEPT_VER/src; then curl -Ls https://github.com/DanBloomberg/leptonica/archive/$LEPT_VER.tar.gz | tar -xz; fi | ||
- if test ! -d leptonica-$LEPT_VER/usr; then cmake -Hleptonica-$LEPT_VER -Bleptonica-$LEPT_VER/build -DCMAKE_INSTALL_PREFIX=leptonica-$LEPT_VER/usr; fi | ||
- if test ! -e leptonica-$LEPT_VER/usr/lib/libleptonica.so; then make -C leptonica-$LEPT_VER/build install; fi | ||
|
||
script: | ||
- mkdir build | ||
- cd build | ||
- cmake .. -DLeptonica_DIR=leptonica-$LEPT_VER/build -DCPPAN_BUILD=OFF | ||
- make |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
Ray Smith (lead developer) <[email protected]> | ||
Ahmad Abdulkader | ||
Rika Antonova | ||
Nicholas Beato | ||
Jeff Breidenbach | ||
Samuel Charron | ||
Phil Cheatle | ||
Simon Crouch | ||
David Eger | ||
Sheelagh Huddleston | ||
Dan Johnson | ||
Rajesh Katikam | ||
Thomas Kielbus | ||
Dar-Shyang Lee | ||
Zongyi (Joe) Liu | ||
Robert Moss | ||
Chris Newton | ||
Michael Reimer | ||
Marius Renn | ||
Raquel Romano | ||
Christy Russon | ||
Shobhit Saxena | ||
Mark Seaman | ||
Faisal Shafait | ||
Hiroshi Takenaka | ||
Ranjith Unnikrishnan | ||
Joern Wanke | ||
Ping Ping Xiu | ||
Andrew Ziem | ||
Oscar Zuniga | ||
|
||
Community Contributors: | ||
Zdenko Podobný (Maintainer) | ||
Jim Regan (Maintainer) | ||
James R Barlow | ||
Amit Dovev | ||
Martin Ettl | ||
Shree Devi Kumar | ||
Noah Metzger | ||
Tom Morris | ||
Tobias Müller | ||
Egor Pugin | ||
Sundar M. Vaidya | ||
Stefan Weil |