Merge pull request #5 from I-can-read/feature/#4-train-model
Feature/#4 train model
genieu99 authored Mar 28, 2023
2 parents eadc315 + cbee40a commit 67699e9
Showing 804 changed files with 243,015 additions and 2 deletions.
1 change: 0 additions & 1 deletion README.md

This file was deleted.

1 change: 0 additions & 1 deletion imageToText.ipynb

This file was deleted.

1 change: 1 addition & 0 deletions langdata
Submodule langdata added at 0fabfc
Binary file removed result/result.jpeg
Binary file not shown.
Binary file removed result/result2.jpeg
Binary file not shown.
Binary file removed result/result2_new.jpeg
Binary file not shown.
Binary file removed result/result2_newBinary.jpeg
Binary file not shown.
Binary file removed result/result3_newBinary.jpeg
Binary file not shown.
Binary file removed result/result4_best.jpeg
Binary file not shown.
Binary file removed result/result4_newBinary.jpeg
Binary file not shown.
Binary file removed result/result_newBinary.jpeg
Binary file not shown.
1 change: 1 addition & 0 deletions tessdata_best
Submodule tessdata_best added at e2aad9
9 changes: 9 additions & 0 deletions tesseract-4.1.1/.clang-format
@@ -0,0 +1,9 @@
---
BasedOnStyle: Google
# Only merge empty functions.
AllowShortFunctionsOnASingleLine: Empty
# Do not allow short if statements.
AllowShortIfStatementsOnASingleLine: false
# Enforce always the same pointer alignment.
DerivePointerAlignment: false
IndentPPDirectives: AfterHash
23 changes: 23 additions & 0 deletions tesseract-4.1.1/.github/ISSUE_TEMPLATE.md
@@ -0,0 +1,23 @@
Before you submit an issue, please review [the guidelines for this repository](https://github.com/tesseract-ocr/tesseract/blob/master/CONTRIBUTING.md).

Please report an issue only for a BUG, not for asking questions.

Note that it will be much easier for us to fix the issue if a test case that
reproduces the problem is provided. Ideally this test case should not have any
external dependencies. Provide a copy of the image or link to files for the test case.

Please delete this text and fill in the template below.

------------------------

### Environment

* **Tesseract Version**: <!-- compulsory. you must provide your version -->
* **Commit Number**: <!-- optional. if known - specify commit used, if built from source -->
* **Platform**: <!-- either `uname -a` output, or if Windows, version and 32-bit or 64-bit -->

### Current Behavior:

### Expected Behavior:

### Suggested Fix:
120 changes: 120 additions & 0 deletions tesseract-4.1.1/.gitignore
@@ -0,0 +1,120 @@
*~
# Windows
*.user.*
*.idea*
*.log
*.tlog
*.cache
*.obj
*.sdf
*.opensdf
*.lastbuildstate
*.unsuccessfulbuild
*.suo
*.res
*.ipch
*.manifest
*.user

# Linux
# ignore local configuration
config.*
config/*
Makefile
Makefile.in
*.m4

# ignore help scripts/files
configure
libtool
stamp-h1
tesseract.pc
config_auto.h
/doc/html/*
/doc/*.1
/doc/*.5
/doc/*.html
/doc/*.xml

# generated version file
/src/api/tess_version.h

# executables
/src/api/tesseract
/src/training/ambiguous_words
/src/training/classifier_tester
/src/training/cntraining
/src/training/combine_tessdata
/src/training/dawg2wordlist
/src/training/merge_unicharsets
/src/training/mftraining
/src/training/set_unicharset_properties
/src/training/shapeclustering
/src/training/text2image
/src/training/unicharset_extractor
/src/training/wordlist2dawg

*.patch

# files generated by libtool
/src/training/combine_lang_model
/src/training/lstmeval
/src/training/lstmtraining

# ignore compilation files
build/*
/bin
*/.deps/*
*/.libs/*
*/*/.deps/*
*/*/.libs/*
*.lo
*.la
*.o
*.Plo
*.a
*.class
*.jar
__pycache__

# tessdata
*.traineddata

# OpenCL
tesseract_opencl_profile_devices.dat
kernel*.bin

# build dirs
/build*
/.cppan
/cppan
/*.dll
/*.lib
/*.exe
/*.lnk
/win*
.vs*
.s*

# files generated by "make check"
/tests/.dirstamp
/unittest/*.trs
/unittest/tmp/*

# test programs
/unittest/*_test
/unittest/primesbitvector
/unittest/primesmap

# generated files from unlvtests
times.txt
/unlvtests/results*

# snap packaging specific rules
/parts/
/stage/
/prime/
/snap/.snapcraft/

/*.snap
/*_source.tar.bz2
9 changes: 9 additions & 0 deletions tesseract-4.1.1/.gitmodules
@@ -0,0 +1,9 @@
[submodule "abseil"]
    path = abseil
    url = https://github.com/abseil/abseil-cpp.git
[submodule "googletest"]
    path = googletest
    url = https://github.com/google/googletest.git
[submodule "test"]
    path = test
    url = https://github.com/tesseract-ocr/test
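
As a minimal sketch (not part of this commit), the three submodule entries above can be read programmatically. It assumes Python's standard `configparser` is adequate for a file this simple; the helper name `list_submodules` is hypothetical, and a `.gitmodules` file that uses more of git's config syntax is better read with `git config -f .gitmodules --list`.

```python
import configparser

# Contents of tesseract-4.1.1/.gitmodules as shown in the diff above.
GITMODULES = """
[submodule "abseil"]
path = abseil
url = https://github.com/abseil/abseil-cpp.git
[submodule "googletest"]
path = googletest
url = https://github.com/google/googletest.git
[submodule "test"]
path = test
url = https://github.com/tesseract-ocr/test
"""


def list_submodules(text):
    """Return {name: (path, url)} for every [submodule "..."] section."""
    # Strip leading whitespace so indented keys are never read as
    # continuation lines of the previous value.
    cleaned = "\n".join(line.strip() for line in text.splitlines())
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cleaned)

    prefix = 'submodule "'
    result = {}
    for section in parser.sections():
        if section.startswith(prefix) and section.endswith('"'):
            name = section[len(prefix):-1]
            result[name] = (parser[section]["path"], parser[section]["url"])
    return result


if __name__ == "__main__":
    for name, (path, url) in list_submodules(GITMODULES).items():
        print(f"{name}: {path} <- {url}")
```
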
18 changes: 18 additions & 0 deletions tesseract-4.1.1/.lgtm.yml
@@ -0,0 +1,18 @@
extraction:
  cpp:
    prepare:
      packages:
        - libpango1.0-dev
    configure:
      command:
        - ./autogen.sh
        - mkdir _lgtm_build_dir
        - cd _lgtm_build_dir
        - ../configure
    index:
      build_command:
        - cd _lgtm_build_dir
        - make training
  python:
    python_setup:
      version: 3
57 changes: 57 additions & 0 deletions tesseract-4.1.1/.travis.yml
@@ -0,0 +1,57 @@
# Travis CI configuration for Tesseract

language: cpp

dist: xenial

env:
  - LEPT_VER=1.77.0

notifications:
  email: false

sudo: false

os:
  - linux
  - osx

addons:
  apt:
    sources:
      #- ubuntu-toolchain-r-test
    packages:
      - libarchive-dev
      - libpango1.0-dev
      #- g++-6

#matrix:
  #include:
    #- os: osx
      #install:
      #script: brew install tesseract --HEAD
      #cache:
        #directories:
          #- $HOME/Library/Caches/Homebrew
  #allow_failures:
    #- script: brew install tesseract --HEAD

cache:
  directories:
    - leptonica-$LEPT_VER

before_install:
  - if [[ $TRAVIS_OS_NAME == linux ]]; then LINUX=true; fi
  - if [[ $TRAVIS_OS_NAME == osx ]]; then OSX=true; fi

install:
  #- if [[ $LINUX && "$CXX" = "g++" ]]; then export CXX="g++-6" CC="gcc-6"; fi
  - if test ! -d leptonica-$LEPT_VER/src; then curl -Ls https://github.com/DanBloomberg/leptonica/archive/$LEPT_VER.tar.gz | tar -xz; fi
  - if test ! -d leptonica-$LEPT_VER/usr; then cmake -Hleptonica-$LEPT_VER -Bleptonica-$LEPT_VER/build -DCMAKE_INSTALL_PREFIX=leptonica-$LEPT_VER/usr; fi
  - if test ! -e leptonica-$LEPT_VER/usr/lib/libleptonica.so; then make -C leptonica-$LEPT_VER/build install; fi

script:
  - mkdir build
  - cd build
  - cmake .. -DLeptonica_DIR=leptonica-$LEPT_VER/build -DCPPAN_BUILD=OFF
  - make
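
The `install` steps in the Travis configuration above are each guarded by a path check, so a cached `leptonica-$LEPT_VER` directory skips the download, configure, and install commands it already satisfies. The sketch below (not part of this commit) models that guard logic in Python. The function name `pending_install_steps` and the `exists` predicate are hypothetical, and it is a simplification: all guards are evaluated up front rather than one after another.

```python
def pending_install_steps(lept_ver, exists):
    """Return the Leptonica install commands whose guard path is missing.

    `exists` is a predicate that takes a relative path and returns True if
    the path is already present (for example, os.path.exists).
    """
    root = f"leptonica-{lept_ver}"
    # (guard path, command) pairs in the order used by the install section.
    guarded = [
        (
            f"{root}/src",
            "curl -Ls https://github.com/DanBloomberg/leptonica/archive/"
            f"{lept_ver}.tar.gz | tar -xz",
        ),
        (
            f"{root}/usr",
            f"cmake -H{root} -B{root}/build -DCMAKE_INSTALL_PREFIX={root}/usr",
        ),
        (
            f"{root}/usr/lib/libleptonica.so",
            f"make -C {root}/build install",
        ),
    ]
    return [command for path, command in guarded if not exists(path)]


if __name__ == "__main__":
    # Cold cache: nothing exists yet, so every step is still pending.
    for command in pending_install_steps("1.77.0", lambda path: False):
        print(command)
```

Passing `os.path.exists` as the predicate would report which steps a given working directory still needs; the commands are only listed here, not executed.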
44 changes: 44 additions & 0 deletions tesseract-4.1.1/AUTHORS
@@ -0,0 +1,44 @@
Ray Smith (lead developer) <[email protected]>
Ahmad Abdulkader
Rika Antonova
Nicholas Beato
Jeff Breidenbach
Samuel Charron
Phil Cheatle
Simon Crouch
David Eger
Sheelagh Huddleston
Dan Johnson
Rajesh Katikam
Thomas Kielbus
Dar-Shyang Lee
Zongyi (Joe) Liu
Robert Moss
Chris Newton
Michael Reimer
Marius Renn
Raquel Romano
Christy Russon
Shobhit Saxena
Mark Seaman
Faisal Shafait
Hiroshi Takenaka
Ranjith Unnikrishnan
Joern Wanke
Ping Ping Xiu
Andrew Ziem
Oscar Zuniga

Community Contributors:
Zdenko Podobný (Maintainer)
Jim Regan (Maintainer)
James R Barlow
Amit Dovev
Martin Ettl
Shree Devi Kumar
Noah Metzger
Tom Morris
Tobias Müller
Egor Pugin
Sundar M. Vaidya
Stefan Weil