WIP: Document api version compare #13

Open
wants to merge 45 commits into
base: document_api
45 commits
f5d230d
fix: updated payload
patbeqo Jul 21, 2022
e2cda26
Merge pull request #10 from ShipSolver/patrik/payload-changes
patbeqo Jul 21, 2022
ba99c36
update requirements
Jul 21, 2022
2c01af3
status api bugfix
satwik18 Jul 22, 2022
c9a654f
gitignore log
satwik18 Jul 22, 2022
898144c
Status API fixes
satwik18 Jul 22, 2022
62bf60a
milestones bug fix
satwik18 Jul 22, 2022
bd4ca4b
WIP
satwik18 Jul 5, 2022
800b653
modifying db schema
dantemazza Jul 5, 2022
4da06fe
push
dantemazza Jul 5, 2022
f88f54d
fix schema
dantemazza Jul 6, 2022
b9f2b0f
get endpoints
dantemazza Jul 6, 2022
edf4592
ALL tickets API done
dantemazza Jul 7, 2022
22194a6
Fixing default date bug
dantemazza Jul 7, 2022
44137f3
Cors header
dantemazza Jul 7, 2022
ea4b6f4
Cors header
dantemazza Jul 7, 2022
59e6480
Cors header
dantemazza Jul 7, 2022
e6c65bb
message
dantemazza Jul 7, 2022
c39d48c
Stefan codeazzzzzzzzzzzzzzzzzzzzzzzzzzzz
dantemazza Jul 7, 2022
dfdf60f
Fix commit bugs for mergmerge
dantemazza Jul 7, 2022
36efba9
Fixed celery pipeline
dantemazza Jul 8, 2022
1698ab9
document api finished
dantemazza Jul 21, 2022
4826242
add .vscode to gitignore
satwik18 Jul 5, 2022
6c7a7c4
modifying db schema
dantemazza Jul 5, 2022
4ca93a9
push
dantemazza Jul 5, 2022
039e845
fix schema
dantemazza Jul 6, 2022
9f305c5
get endpoints
dantemazza Jul 6, 2022
15e334e
Fixing default date bug
dantemazza Jul 7, 2022
63b1dad
Cors header
dantemazza Jul 7, 2022
0fa2b57
Cors header
dantemazza Jul 7, 2022
9ff3099
Cors header
dantemazza Jul 7, 2022
3b3a689
message
dantemazza Jul 7, 2022
694923e
Stefan codeazzzzzzzzzzzzzzzzzzzzzzzzzzzz
dantemazza Jul 7, 2022
2991a93
Fix commit bugs for mergmerge
dantemazza Jul 7, 2022
5d894de
Fixed celery pipeline
dantemazza Jul 8, 2022
f2c58a0
s3 presigned links
dantemazza Jul 21, 2022
38c2c38
enable cognito
satwik18 Jul 22, 2022
581a853
docker changes for flask-cognito-lib
satwik18 Jul 22, 2022
3cf32fc
docker file changes
satwik18 Jul 22, 2022
35d7c35
Dockerization complete
dantemazza Jul 22, 2022
40e728e
Extraction
dantemazza Jul 22, 2022
f1ab001
satwik
dantemazza Jul 22, 2022
844c300
docker changes for flask-cognito-lib
satwik18 Jul 22, 2022
009a0a3
Co-authored-by: Dante Mazza <[email protected]>
Jul 22, 2022
ede43da
remove breakinbg changes
Jul 22, 2022
3 changes: 2 additions & 1 deletion .gitignore
@@ -3,4 +3,5 @@ __pycache__/
 *.pyc
 .vscode
 **/.env
-tmp
+tmp
+**/log.txt
18 changes: 18 additions & 0 deletions servers/app.Dockerfile
@@ -0,0 +1,18 @@
FROM python:3.9
RUN apt-get update && apt-get -y install qpdf poppler-utils && apt-get install -y build-essential libpoppler-cpp-dev pkg-config python-dev
RUN apt -y install libpq-dev
COPY tenant/requirements.txt .
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
RUN pip3 install psycopg2
RUN git -C /root clone https://github.com/ShipSolver/flask-cognito-lib.git
RUN pip3 install -e /root/flask-cognito-lib
WORKDIR /opt/metadata-extraction
ENV PYTHONPATH .
EXPOSE 6767
ENV aws_secret_access_key <redacted>
ENV aws_access_key_id <redacted>
ENV AWS_REGION="us-east-1"
ENV AWS_COGNITO_USER_POOL_ID="us-east-1_6AUY6LKPZ"
ENV AWS_COGNITO_USER_POOL_CLIENT_ID="2vukbtukva3u0oh29lf32ghmkp"
ENV AWS_COGNITO_DOMAIN="https://shipsolver-dev.auth.us-east-1.amazoncognito.com/"
17 changes: 17 additions & 0 deletions servers/celery.Dockerfile
@@ -0,0 +1,17 @@
FROM python:3.9
RUN apt-get update && apt-get -y install qpdf poppler-utils && apt-get install -y build-essential libpoppler-cpp-dev pkg-config python-dev
RUN apt -y install tesseract-ocr && apt -y install libtesseract-dev
COPY tenant/requirements.txt .
RUN pip3 install --upgrade pip
RUN pip3 install -r requirements.txt
RUN pip3 install psycopg2-binary
RUN git -C /root clone https://github.com/ShipSolver/flask-cognito-lib.git
RUN pip3 install -e /root/flask-cognito-lib
WORKDIR /opt/metadata-extraction/tenant
ENV PYTHONPATH ..
ENV aws_secret_access_key <redacted>
ENV aws_access_key_id <redacted>
ENV AWS_REGION="us-east-1"
ENV AWS_COGNITO_USER_POOL_ID="us-east-1_6AUY6LKPZ"
ENV AWS_COGNITO_USER_POOL_CLIENT_ID="2vukbtukva3u0oh29lf32ghmkp"
ENV AWS_COGNITO_DOMAIN="https://shipsolver-dev.auth.us-east-1.amazoncognito.com/"
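
Both Dockerfiles set the AWS and Cognito settings with `ENV`, which bakes the values into every image layer. One common alternative is to supply them when the container starts and have the application read and validate them. A minimal sketch, assuming the same variable names as the Dockerfiles above; `load_aws_settings` and `REQUIRED_VARS` are hypothetical names and are not part of this PR:

```python
import os

# Variable names copied from the Dockerfiles above.
# The helper below is hypothetical and only illustrates the idea.
REQUIRED_VARS = (
    "aws_access_key_id",
    "aws_secret_access_key",
    "AWS_REGION",
    "AWS_COGNITO_USER_POOL_ID",
    "AWS_COGNITO_USER_POOL_CLIENT_ID",
    "AWS_COGNITO_DOMAIN",
)


def load_aws_settings(environ=None):
    """Return the required settings, failing fast if any are missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

With Docker Compose, the values could then come from an untracked file referenced by an `env_file:` entry on each service; the `.gitignore` in this PR already excludes `**/.env`.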
@@ -23,7 +23,7 @@ services:
       dockerfile: celery.Dockerfile
     volumes:
       - .:/opt/metadata-extraction
-    command: celery -A __init__.client worker --loglevel=info -f celery.logs -Ofair -c 2
+    command: celery -A config.client worker --loglevel=info -f celery.logs -Ofair -c 2
     tty: true
   app:
     hostname: app.wlp.com
@@ -36,8 +36,8 @@ services:
       - .:/opt/metadata-extraction
     container_name: app01
     ports:
-      - "5000:5000"
-    command: python3 server/__init__.py
+      - "6767:6767"
+    command: python3 tenant/server.py
     tty: true
   flower:
     hostname: flower.wlp.com
15 changes: 9 additions & 6 deletions extraction/app.py → servers/extraction/app.py
@@ -1,10 +1,12 @@
 import os
-# from multilingual_pdf2text.pdf2text import PDF2Text
-# from multilingual_pdf2text.models.document_model.document import Document
-# import pdfplumber
-# import extraction.extract as e
+from multilingual_pdf2text.pdf2text import PDF2Text
+from multilingual_pdf2text.models.document_model.document import Document
+import pdfplumber
+import extraction.extract as e
 import json
+from celery.utils.log import get_logger

+logger = get_logger(__name__)

 def read_pdfplumber(file_name):
     with pdfplumber.open(file_name) as pdf:
@@ -26,8 +28,9 @@ def work(folder_path):

 ml_page_text = list(content)[0]["text"]
 pp_text = read_pdfplumber(pdf_file)
-
-extract_json = e.extract(ml_page_text, plumber_page=pp_text)
+for i in range(14):
+    logger.info("WE HERE----------------")
+extract_json = e.generate_doclist(e.extract(ml_page_text, plumber_page=pp_text))

 with open(f"{folder_path}/{pdf_uuid}.json", "w") as f:
     json.dump(extract_json, f, indent=2)
39 changes: 21 additions & 18 deletions extraction/const.py → servers/extraction/const.py
@@ -6,30 +6,33 @@

 #doclist_keys

-HOUSE_REF = "house_ref"
-BARCODE = "barcode"
-FIRST_PARTY = "first_party"
-NUM_PCS = "num_pcs"
-PCS = "pcs"
+BARCODE = "barcodeNumber"
+HOUSE_REF = "houseReferenceNumber"
+WEIGHT = "weight"
+NUM_PCS = "claimedNumberOfPieces"
+BOL_NUM = "BOLNumber"
+SPECIAL_SERVICES = "specialServices"
+SPECIAL_INSTRUCTIONS = "specialInstructions"
+CONSIGNEE = "consignee"
+SHIPPER = "shipper"
+COMPANY = "Company"
+NAME = "Name"
+ADDRESS = "Address"
+POSTAL_CODE = "PostalCode"
+PHONE_NUMBER = "PhoneNumber"

+NO_SIGNATURE_REQUIRED = "noSignatureRequired"
+TAILGATE_AUTHORIZED = "tailgateAuthorized"

+FIRST_PARTY = "customerName"

+PCS = "pieces"

 PKG = "pkg"
-WT_LBS = "wt(lbs)"
+WT_LBS = "weight"
 COMMODITY_DESCRIPTION = "commodity_description"
 DIMS_IN = "dims(in)"

-BOL_NUM = "bol_num"
-SPECIAL_SERVICES = "special_services"
-SPECIAL_INSTRUCTIONS = "special_instructions"

-COMPANY = "company"
-NAME = "name"
-ADDRESS = "address"
-POSTAL_CODE = "postal_code"
-PHONE_NUMBER = "phone_number"

-CONSIGNEE = "consignee"
-SHIPPER = "shipper"

 CEVA_SHIPPER_FIELDS = [COMPANY, ADDRESS]
 CEVA_CONSIGNEE_FIELDS = [NAME, ADDRESS]
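
The const.py change renames the doclist keys from snake_case to camelCase, so JSON written before this change uses different top-level keys than JSON written after it. A minimal sketch of how older output could be mapped onto the new names; the mapping is read off the old and new string values in the diff above, while `TOP_LEVEL_RENAMES` and `rename_top_level_keys` are hypothetical names that do not exist in the repository:

```python
# Old -> new top-level key names, taken from the string values in const.py.
# Nested consignee/shipper fields and per-piece keys are not handled here.
TOP_LEVEL_RENAMES = {
    "house_ref": "houseReferenceNumber",
    "barcode": "barcodeNumber",
    "first_party": "customerName",
    "num_pcs": "claimedNumberOfPieces",
    "pcs": "pieces",
    "bol_num": "BOLNumber",
    "special_services": "specialServices",
    "special_instructions": "specialInstructions",
}


def rename_top_level_keys(old_doc):
    """Hypothetical helper: copy old_doc, renaming known top-level keys."""
    return {TOP_LEVEL_RENAMES.get(key, key): value for key, value in old_doc.items()}
```

Keys that are not in the mapping, such as `consignee` and `shipper`, pass through unchanged.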
28 changes: 13 additions & 15 deletions extraction/extract.py → servers/extraction/extract.py
@@ -271,25 +271,23 @@ def generate_doclist(_list):
 HOUSE_REF: _list[HOUSE_REF] if HOUSE_REF in _list else "",
 BARCODE: _list[BARCODE] if BARCODE in _list else "",
 PCS: _list[PCS] if PCS in _list else [],
-NUM_PCS: _list[NUM_PCS] if NUM_PCS in _list else "",
+NUM_PCS: _list[NUM_PCS] if NUM_PCS in _list else 0,
+WEIGHT: _list[WEIGHT] if WEIGHT in _list else "",
 BOL_NUM: _list[BOL_NUM] if BOL_NUM in _list else "",
 SPECIAL_SERVICES: _list[SPECIAL_SERVICES] if SPECIAL_SERVICES in _list else "",
 SPECIAL_INSTRUCTIONS: _list[SPECIAL_INSTRUCTIONS] if SPECIAL_INSTRUCTIONS in _list else "",
-CONSIGNEE: {
-    COMPANY: _list[CONSIGNEE][COMPANY] if CONSIGNEE in _list and COMPANY in _list[CONSIGNEE] else "",
-    NAME: _list[CONSIGNEE][NAME] if CONSIGNEE in _list and NAME in _list[CONSIGNEE] else "",
-    ADDRESS: _list[CONSIGNEE][ADDRESS] if CONSIGNEE in _list and ADDRESS in _list[CONSIGNEE] else "",
-    POSTAL_CODE: _list[CONSIGNEE][POSTAL_CODE] if CONSIGNEE in _list and POSTAL_CODE in _list[CONSIGNEE] else "",
-    PHONE_NUMBER: _list[CONSIGNEE][PHONE_NUMBER] if CONSIGNEE in _list and PHONE_NUMBER in _list[CONSIGNEE] else ""
-},
-SHIPPER: {
-    COMPANY: _list[SHIPPER][COMPANY] if SHIPPER in _list and COMPANY in _list[SHIPPER] else "",
-    NAME: _list[SHIPPER][NAME] if SHIPPER in _list and NAME in _list[SHIPPER] else "",
-    ADDRESS: _list[SHIPPER][ADDRESS] if SHIPPER in _list and ADDRESS in _list[SHIPPER] else "",
-    POSTAL_CODE: _list[SHIPPER][POSTAL_CODE] if SHIPPER in _list and POSTAL_CODE in _list[SHIPPER] else "",
-    PHONE_NUMBER: _list[SHIPPER][PHONE_NUMBER] if SHIPPER in _list and PHONE_NUMBER in _list[SHIPPER] else ""
-}
+CONSIGNEE+COMPANY: _list[CONSIGNEE][COMPANY] if CONSIGNEE in _list and COMPANY in _list[CONSIGNEE] else "",
+CONSIGNEE+NAME: _list[CONSIGNEE][NAME] if CONSIGNEE in _list and NAME in _list[CONSIGNEE] else "",
+CONSIGNEE+ADDRESS: _list[CONSIGNEE][ADDRESS] if CONSIGNEE in _list and ADDRESS in _list[CONSIGNEE] else "",
+CONSIGNEE+POSTAL_CODE: _list[CONSIGNEE][POSTAL_CODE] if CONSIGNEE in _list and POSTAL_CODE in _list[CONSIGNEE] else "",
+CONSIGNEE+PHONE_NUMBER: _list[CONSIGNEE][PHONE_NUMBER] if CONSIGNEE in _list and PHONE_NUMBER in _list[CONSIGNEE] else "",
+SHIPPER+COMPANY: _list[SHIPPER][COMPANY] if SHIPPER in _list and COMPANY in _list[SHIPPER] else "",
+SHIPPER+NAME: _list[SHIPPER][NAME] if SHIPPER in _list and NAME in _list[SHIPPER] else "",
+SHIPPER+ADDRESS: _list[SHIPPER][ADDRESS] if SHIPPER in _list and ADDRESS in _list[SHIPPER] else "",
+SHIPPER+POSTAL_CODE: _list[SHIPPER][POSTAL_CODE] if SHIPPER in _list and POSTAL_CODE in _list[SHIPPER] else "",
+SHIPPER+PHONE_NUMBER: _list[SHIPPER][PHONE_NUMBER] if SHIPPER in _list and PHONE_NUMBER in _list[SHIPPER] else "",
+NO_SIGNATURE_REQUIRED: _list[NO_SIGNATURE_REQUIRED] if NO_SIGNATURE_REQUIRED in _list else False,
+TAILGATE_AUTHORIZED: _list[TAILGATE_AUTHORIZED] if TAILGATE_AUTHORIZED in _list else False
 }

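
In the flattened form of `generate_doclist`, each party field gets a key built by string concatenation, so with the const.py values `CONSIGNEE+COMPANY` evaluates to `"consigneeCompany"`. The same flattening can be written as a loop with `dict.get`. This is a sketch under stated assumptions rather than the code in this PR: `flatten_parties` and `PARTY_FIELDS` are hypothetical names, and the constants are re-declared locally so the block is self-contained:

```python
# Values copied from the new const.py; re-declared here for a standalone example.
CONSIGNEE = "consignee"
SHIPPER = "shipper"
PARTY_FIELDS = ("Company", "Name", "Address", "PostalCode", "PhoneNumber")


def flatten_parties(doc):
    """Hypothetical helper: flatten nested party dicts into prefixed keys."""
    flat = {}
    for party in (CONSIGNEE, SHIPPER):
        details = doc.get(party, {})  # a missing party behaves like an empty dict
        for field in PARTY_FIELDS:
            # "consignee" + "Company" gives the key "consigneeCompany"
            flat[party + field] = details.get(field, "")
    return flat
```

As in the diff, a field that is absent from the extracted data falls back to an empty string.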
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
22 changes: 0 additions & 22 deletions servers/tenant/Pipfile

This file was deleted.
