Merged
51 commits
08647ae
refactor: rename test classes to remove warnings with pytests
xmalet-nrcan Jul 11, 2025
d317c8b
test: refine ReaderFactory unit tests for PostGisTableDataReader and …
xmalet-nrcan Jul 11, 2025
c98af6a
refactor: enhance ReaderFactory to support Connection type and improv…
xmalet-nrcan Jul 11, 2025
70728df
feat: add properties for table name, schema, and formatted table name…
xmalet-nrcan Jul 11, 2025
4a8e5d2
refactor: rename CI configuration files for clarity
xmalet-nrcan Jul 11, 2025
03191d8
refactor: rename CI configuration files for clarity
xmalet-nrcan Jul 11, 2025
832f724
fix: update CI conditions to correctly handle main branch and version…
xmalet-nrcan Jul 11, 2025
f2deb52
fix: update CI conditions to correctly handle main branch without ver…
xmalet-nrcan Jul 11, 2025
cfd65a7
fix: update CI conditions to allow builds on main branch without vers…
xmalet-nrcan Jul 11, 2025
ce8ad1a
chore: bump version to 0.1.24a
github-actions[bot] Jul 11, 2025
95bbd3b
Update README.md
xmalet-nrcan Jul 11, 2025
27ee9cd
Update README.md
xmalet-nrcan Jul 11, 2025
1065039
Update ci-release.yml
xmalet-nrcan Jul 11, 2025
c214444
fix: update CI workflow to use Poetry for dependency management and t…
xmalet-nrcan Jul 11, 2025
2adad1f
chore: bump version to 0.1.24b
github-actions[bot] Jul 11, 2025
9370e3b
fix: enhance test coverage reporting in CI workflows
xmalet-nrcan Jul 11, 2025
540e381
Merge remote-tracking branch 'origin/main'
xmalet-nrcan Jul 11, 2025
112fe5a
chore: bump version to 0.1.24c
github-actions[bot] Jul 11, 2025
eca0584
Update README.md
xmalet-nrcan Jul 11, 2025
189e4a6
refactor: introduce context manager for improved resource management …
xmalet-nrcan Jul 22, 2025
763f3b7
refactor: implement context manager for session handling in abstract_…
xmalet-nrcan Jul 22, 2025
40d9308
chore: bump version to 0.1.25 in pyproject.toml
xmalet-nrcan Jul 22, 2025
7513332
fix: update CI badge link in README.md for correct repository path
xmalet-nrcan Jul 22, 2025
c4d0a88
refactor: clean up imports and improve formatting in abstract_databas…
xmalet-nrcan Jul 22, 2025
e57a766
chore: bump version to 0.1.26
github-actions[bot] Jul 22, 2025
0f4bbfb
refactor: remove timezonefinder dependency from pyproject.toml
xmalet-nrcan Jul 22, 2025
ce75e36
refactor: update session handling to use SQLAlchemy's Session and rem…
xmalet-nrcan Jul 22, 2025
366f099
ruff check and format
xmalet-nrcan Jul 22, 2025
8b8e39f
Merge remote-tracking branch 'origin/main'
xmalet-nrcan Jul 22, 2025
27f97c5
chore: bump version to 0.1.25a
github-actions[bot] Jul 22, 2025
36170f4
chore: bump version to 0.1.27
github-actions[bot] Jul 22, 2025
deaa475
feat: add session property for nested transaction handling
xmalet-nrcan Jul 22, 2025
9b5bdd5
Merge remote-tracking branch 'origin/main'
xmalet-nrcan Jul 22, 2025
25f83fb
chore: bump version to 0.1.28
github-actions[bot] Jul 22, 2025
018208e
refactor: update session management to use sessionmaker and improve e…
xmalet-nrcan Jul 22, 2025
548fdc4
chore: bump version to 0.1.29
github-actions[bot] Jul 22, 2025
7e8f3f0
refactor: update session handling to use a single session attribute a…
xmalet-nrcan Jul 22, 2025
4c1bb7a
chore: bump version to 0.1.30
github-actions[bot] Jul 22, 2025
0a89e1f
Merge remote-tracking branch 'origin/main'
xmalet-nrcan Jul 22, 2025
5199a48
refactor: improve session management in database handlers and add CI …
xmalet-nrcan Jul 23, 2025
2398ba9
file formatting with RUFF
xmalet-nrcan Jul 23, 2025
4a6d48d
chore: bump version to 0.1.31
github-actions[bot] Jul 23, 2025
368db22
update session handling
xmalet-nrcan Jul 23, 2025
c145242
Merge remote-tracking branch 'origin/main'
xmalet-nrcan Jul 23, 2025
be21de1
chore: bump version to 0.1.32
github-actions[bot] Jul 23, 2025
fdec981
update session handling
xmalet-nrcan Jul 23, 2025
2d99629
Merge branch 'main' of https://github.com/xmalet-nrcan/etl-toolbox
xmalet-nrcan Jul 23, 2025
7470134
chore: bump version to 0.1.33
github-actions[bot] Jul 23, 2025
ca13fc1
fix: correct condition for data length check in abstract_database_obj…
xmalet-nrcan Jul 23, 2025
588f6b0
Merge remote-tracking branch 'origin/main'
xmalet-nrcan Jul 23, 2025
c605136
chore: bump version to 0.1.34
github-actions[bot] Jul 23, 2025
2 changes: 1 addition & 1 deletion .github/workflows/ci-lint-and-format.yml
@@ -1,4 +1,4 @@
name: CI
name: ci-lint-and-format

on:
push:
19 changes: 11 additions & 8 deletions .github/workflows/ci-release.yml
@@ -1,25 +1,22 @@
name: ci-release.yml
name: ci-test-build-release
permissions:
contents: write
on:
push:
branches:
- main
tags:
- 'v*.*.*'
paths-ignore:
- 'pyproject.toml' # Ignore changes to this file to avoid loops
pull_request:
branches:
- main
tags:
- 'v*.*.*'
paths-ignore:
- 'pyproject.toml' # Ignore changes to this file to avoid loops
jobs:

test:
if: "!contains(github.event.head_commit.message, 'chore: bump version')" # Skip version-bump commits
if: "!contains(github.event.head_commit.message, 'chore: bump version')"

name: Tests
runs-on: ubuntu-latest
steps:
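The `if:` guard on the `test` job skips runs whose head commit is a version bump. As a rough Python sketch of that predicate (the function name `should_run_tests` is hypothetical, and treating a missing commit message as an empty string is an assumption about how the expression behaves on events that carry no `head_commit`):

```python
BUMP_MARKER = "chore: bump version"


def should_run_tests(head_commit_message):
    """Approximate `!contains(message, 'chore: bump version')`."""
    # Assumption: a missing message (no head_commit in the event) acts as "".
    message = head_commit_message or ""
    # GitHub's contains() ignores case for strings, so compare lowercased text.
    return BUMP_MARKER.lower() not in message.lower()


for msg in ("chore: bump version to 0.1.34", "fix: update CI conditions", None):
    print(repr(msg), "->", should_run_tests(msg))
```

This only models the string test; it says nothing about the `paths-ignore` filter, which is evaluated separately by the workflow trigger.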
@@ -44,10 +41,16 @@ jobs:
- name: Run Tests
run: poetry run pytest -v --cov --cov-branch --cov-report=xml tests/

- name: Upload coverage reports to Codecov
# Copy and paste the codecov/test-results-action here
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
- name: Upload test results to Codecov
if: ${{ !cancelled() }}
uses: codecov/test-results-action@v1
with:
token: ${{ secrets.CODECOV_TOKEN }}


build:
@@ -125,4 +128,4 @@ jobs:
generate_release_notes: true
draft: false
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
33 changes: 24 additions & 9 deletions .github/workflows/ci-run-tests.yml
@@ -1,4 +1,4 @@
name: CI
name: ci-test-no-build

on:
push:
@@ -19,14 +19,29 @@ jobs:
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- uses: actions/cache@v4
with:
path: ~/.cache/pip
key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
- uses: actions/cache@v4
cache: 'pip'

- name: Install Poetry
run: pipx install poetry

- name: Setup Poetry cache
uses: actions/cache@v4
with:
path: ~/.cache/pypoetry
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
- run: pip install poetry
- run: poetry install
- run: poetry run pytest -v tests/

- name: Install Dependencies
run: poetry install
- name: Run Tests
run: poetry run pytest -v --cov --cov-branch --cov-report=xml tests/

# Copy and paste the codecov/test-results-action here
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
- name: Upload test results to Codecov
if: ${{ !cancelled() }}
uses: codecov/test-results-action@v1
with:
token: ${{ secrets.CODECOV_TOKEN }}
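The Poetry cache step keys the cache on the runner OS plus a hash of `poetry.lock`, so the cache is reused until the lock file changes. The idea can be sketched in Python as follows; `cache_key` is a hypothetical helper, and the digest it produces only imitates the hash-of-hashes scheme and is not claimed to match the value `hashFiles` returns:

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(os_name, lock_files):
    """Build '<os>-poetry-<digest>' from the contents of the lock files."""
    combined = hashlib.sha256()
    for path in sorted(lock_files):
        # Hash each file, then fold the per-file digests into one digest.
        combined.update(hashlib.sha256(Path(path).read_bytes()).digest())
    return f"{os_name}-poetry-{combined.hexdigest()}"


with tempfile.TemporaryDirectory() as tmp:
    lock = Path(tmp) / "poetry.lock"
    lock.write_text("requests==2.32.0\n")
    key_before = cache_key("Linux", [lock])
    key_same = cache_key("Linux", [lock])      # unchanged file, same key
    lock.write_text("requests==2.32.3\n")
    key_after = cache_key("Linux", [lock])     # changed file, new key
```

An unchanged lock file yields the same key (cache hit); any edit to the lock file yields a new key, which forces a fresh dependency install.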
4 changes: 2 additions & 2 deletions README.md
@@ -2,8 +2,8 @@
# NRCAN ETL Toolbox


[![codecov](https://codecov.io/github/xmalet-nrcan/xm-etl-toolbox/graph/badge.svg?token=P4ISY9JL78)](https://codecov.io/github/xmalet-nrcan/xm-etl-toolbox)
[![CI](https://github.com/xmalet-nrcan/xm-etl-toolbox/actions/workflows/ci-release.yml/badge.svg)](https://github.com/xmalet-nrcan/xm-etl-toolbox/actions/workflows/ci-release.yml)
[![codecov](https://codecov.io/github/xmalet-nrcan/etl-toolbox/graph/badge.svg?token=L1B4RHVN2E)](https://codecov.io/github/xmalet-nrcan/etl-toolbox)
[![CI](https://github.com/xmalet-nrcan/etl-toolbox/actions/workflows/ci-release.yml/badge.svg)](https://github.com/xmalet-nrcan/etl-toolbox/actions/workflows/ci-release.yml)

Pour la version française de ce document, consultez [README-fr.md](README-fr.md).

100 changes: 100 additions & 0 deletions github_action.yml
@@ -0,0 +1,100 @@
name: CI

on:
push:
branches: [ main ]
pull_request:
branches: [ main ]

jobs:
lint:
name: Ruff Lint
runs-on: ubuntu-latest
strategy:
matrix:
target:
- path: ./nrcan_etl_toolbox/etl_logging
name: Logging
- path: ./nrcan_etl_toolbox/database
name: Database
- path: ./nrcan_etl_toolbox/etl_toolbox/reader
name: Data Reader
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- uses: actions/cache@v4
with:
path: ~/.cache/pip
key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
- run: pip install ruff
- name: Ruff check -- ${{ matrix.target.name }}
run: ruff check ${{ matrix.target.path }} --output-format=github

format:
name: Ruff Format Check
runs-on: ubuntu-latest
strategy:
matrix:
target:
- path: ./nrcan_etl_toolbox/etl_logging
name: Logging
- path: ./nrcan_etl_toolbox/database
name: Database
- path: ./nrcan_etl_toolbox/etl_toolbox/reader
name: Data Reader
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- run: pip install ruff
- name: Ruff format -- ${{ matrix.target.name }}
run: ruff format --check ${{ matrix.target.path }}

test:
name: Test package
runs-on: ubuntu-latest
needs: [lint, format]
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- uses: actions/cache@v4
with:
path: ~/.cache/pip
key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
- uses: actions/cache@v4
with:
path: ~/.cache/pypoetry
key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
- run: pip install poetry
- run: poetry install
- run: python -m pytest -v tests/

build:
name: Build and publish package
runs-on: ubuntu-latest
needs: test
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- run: pip install poetry
- run: poetry install
- run: poetry build
- name: Publish to PyPI
if: github.ref == 'refs/heads/main'
run: poetry publish -u ${{ secrets.PYPI_USERNAME }} -p ${{ secrets.PYPI_PASSWORD }}

delete-pypi-package:
name: Delete PyPI package (manual)
runs-on: ubuntu-latest
if: github.event_name == 'workflow_dispatch'
steps:
- name: Delete package script
run: |
echo "Manual deletion of the PyPI package, to be implemented via the PyPI API or a suitable script."
@@ -18,6 +18,8 @@
class AbstractDatabaseObjectsInterface:
engine: sqlalchemy.engine.Engine = None
session: sqlalchemy.orm.session.Session = None
SessionLocal = None

logger = CustomLogger("database_objects_handler", logger_type="default")

def __init__(self, database_url: str, db_objects_to_treat: list = None, logger_level="DEBUG"):
@@ -51,44 +53,30 @@ def _connect_to_database(self, database_url):
AbstractDatabaseObjectsInterface.session = Session(
bind=AbstractDatabaseObjectsInterface.engine, expire_on_commit=False
)

Base.metadata.create_all(AbstractDatabaseObjectsInterface.engine)
self.logger.info("Connected to database")

def _insert_object(self, db_object: Base) -> bool | None:
with self.session.begin(nested=True):
with self.session as session:
try:
# Add a single object to the session
self.logger.debug(db_object)
self.session.add(db_object)
self.session.commit()
session.begin(nested=True)
session.add(db_object)
session.commit()
return True
except UniqueViolation as v:
self.session.rollback()
self.logger.error(f"UniqueViolation - {db_object} \n{v}", stacklevel=3)
raise v

except IntegrityError as e:
self.session.rollback()
session.rollback()
if isinstance(e.orig, UniqueViolation):
if isinstance(e.orig, UniqueViolation):
constraint_match = re.search(r"unique « (.*?) »", str(e.args))
contraint_name = constraint_match.group(1) if constraint_match else None
if "uni_" in contraint_name or "pk_" in contraint_name:
return True
else:
self.logger.error(f"IntegrityError - {db_object} \n{e}", stacklevel=3)
raise e
return None
else:
# Roll back the changes on an integrity error
self.session.rollback()
self.logger.error(f"IntegrityError - {db_object} \n{e}", stacklevel=3)
return None

constraint_match = re.search(r"unique « (.*?) »", str(e.args))
contraint_name = constraint_match.group(1) if constraint_match else None
if contraint_name and ("uni_" in contraint_name or "pk_" in contraint_name):
return True
self.logger.error(f"IntegrityError - {db_object} \n{e}", stacklevel=3)
return None
except Exception as e:
# Roll back the changes on any other error
self.session.rollback()
raise e from e
session.rollback()
self.logger.error(f"Unexpected error - {db_object} \n{e}", stacklevel=3)
raise

def insert_object(self, db_object: Base):
return self._insert_object(db_object)
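The `IntegrityError` branch above extracts the violated constraint name from the error text and tolerates duplicates on `uni_` or `pk_` constraints. The revised lines also check that a name was actually found before testing it, since `"uni_" in None` would raise a `TypeError`. A standalone sketch of that check, assuming a hypothetical helper name `is_tolerated_duplicate` and illustrative sample messages (not taken from the PR):

```python
import re

# Same pattern as in the diff: it matches the French-locale message text.
CONSTRAINT_RE = re.compile(r"unique « (.*?) »")


def is_tolerated_duplicate(error_text):
    """Return True if the message names a 'uni_' or 'pk_' unique constraint."""
    match = CONSTRAINT_RE.search(error_text)
    name = match.group(1) if match else None
    # Guard against name being None before using the `in` operator.
    return bool(name) and ("uni_" in name or "pk_" in name)


# Illustrative messages only; real server output may differ.
french = "la valeur d'une clé dupliquée rompt la contrainte unique « uni_station_name »"
english = 'duplicate key value violates unique constraint "uni_station_name"'
other = "la valeur d'une clé dupliquée rompt la contrainte unique « chk_other »"

print(is_tolerated_duplicate(french))
print(is_tolerated_duplicate(english))
print(is_tolerated_duplicate(other))
```

Because the pattern relies on the French guillemets, a server that reports errors in another language produces no match, and the duplicate is then logged as an error rather than tolerated.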
@@ -161,45 +149,46 @@ def _is_like(self, col: InstrumentedAttribute, parameter: str = None) -> BinaryE
return None

def _get_element_in_database(self, table_model: type[T], condition="or", **kwargs) -> list[T] | None:
with self.session.begin(nested=True):
with self.session as session:
try:
data = table_model.query_object(session=self.session, condition=condition, **kwargs)
session.begin(nested=True)
data = table_model.query_object(session=session, condition=condition, **kwargs)
except DataError:
self.session.rollback()

session.rollback()
return None
except Exception:
self.session.rollback()
session.rollback()
return None

finally:
session.commit()
return data

def _get_or_create_element(
self, dict_element: str, table_model: type[T], condition="and", **kwargs
) -> list[T] | None:
with self.session.begin(nested=True):
try:
data = self._get_element_in_database(table_model=table_model, condition=condition, **kwargs)
if data is not None and len(data) == 0:
data = self._get_element_to_be_inserted(
dict_element=dict_element, table_model=table_model, **kwargs
)
except Exception as e:
self.session.rollback()
self.logger.warning(
f"ON _get_or_create_element with \n{table_model}, {condition}, {kwargs} \nRAISED {e}", stacklevel=3
try:
self.session.begin(nested=True)
data = self._get_element_in_database(table_model=table_model, condition=condition, **kwargs)
if data is not None and len(data) == 0:
data = self._get_element_to_be_inserted(
dict_element=dict_element, table_model=table_model, **kwargs
)
else:
if data is not None:
if len(data) == 0:
return [self._create_element(dict_element, table_model, **kwargs)]
elif len(data) <= 1:
return data

# raise Exception(f"More than one {table_model.__name__} found with the same parameters")
else:
except Exception as e:
self.session.rollback()
self.logger.warning(
f"ON _get_or_create_element with \n{table_model}, {condition}, {kwargs} \nRAISED {e}", stacklevel=3
)
else:
if data is not None:
if len(data) == 0:
return [self._create_element(dict_element, table_model, **kwargs)]
elif len(data) >= 1:
return data

# raise Exception(f"More than one {table_model.__name__} found with the same parameters")
else:
return data

def _get_element_to_be_inserted(self, dict_element: str, table_model: type[T], **kwargs) -> list[T] | None:
to_return = []
in_dict_elements = {k: v for k, v in kwargs.items() if v is not None and not table_model.is_identity_column(k)}
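The hunk above contains both `len(data) <= 1` and `len(data) >= 1`; judging by the commit message "fix: correct condition for data length check", the second appears to be the corrected form. With `<= 1`, a lookup that matched several rows would fall through every branch and return `None` implicitly. A simplified sketch of the resulting decision logic, where `get_or_create` and its `create` callback are hypothetical stand-ins and the intermediate `_get_element_to_be_inserted` step is left out:

```python
def get_or_create(found, create):
    """Decide what to return from a lookup result.

    found  -- None if the query failed, otherwise a list of matching rows
    create -- zero-argument callable that builds and stores a new row
    """
    if found is None:
        # The lookup itself failed; propagate None to the caller.
        return None
    if len(found) == 0:
        # Nothing matched: create exactly one new element.
        return [create()]
    # One or more matches: return them all (the `>= 1` case).
    return found


print(get_or_create(None, lambda: "new"))
print(get_or_create([], lambda: "new"))
print(get_or_create(["a", "b"], lambda: "new"))
```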
16 changes: 11 additions & 5 deletions nrcan_etl_toolbox/etl_toolbox/reader/reader_factory.py
@@ -1,6 +1,7 @@
import os

from sqlalchemy import Engine, Connection
import pandas as pd
from sqlalchemy import Connection, Engine
from sqlalchemy.orm import Session

from nrcan_etl_toolbox.etl_toolbox.reader.source_readers.base_reader import BaseDataReader
@@ -18,17 +19,22 @@ class ReaderFactory:
of BaseDataReader, depending on the data source type.
"""

def __init__(self, input_source: str | Engine | Session = None, schema=None, table_name=None,
**kwargs: dict[str, str] | None):
def __init__(
self,
input_source: str | Engine | Session | Connection = None,
schema=None,
table_name=None,
**kwargs: dict[str, str] | None,
):
self._input_source = input_source

self._reader = self._create_reader(input_source, schema=schema, table_name=table_name, **kwargs)

def dataframe(self):
def dataframe(self) -> pd.DataFrame:
return self.data

@property
def data(self):
def data(self) -> pd.DataFrame:
return self._reader.dataframe

@property
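The `ReaderFactory` signature now also accepts a `Connection`, and its docstring says the reader is chosen according to the data source type. How `_create_reader` does this is not shown in the diff, so the following is only a guess at the general shape of such a dispatch. The three classes are hypothetical stand-ins for the SQLAlchemy types, and `pick_reader_kind` is an invented name:

```python
class Engine:
    """Hypothetical stand-in for sqlalchemy.Engine."""


class Connection:
    """Hypothetical stand-in for sqlalchemy.Connection."""


class Session:
    """Hypothetical stand-in for sqlalchemy.orm.Session."""


def pick_reader_kind(input_source):
    """Classify an input source by type (illustrative only)."""
    if isinstance(input_source, str):
        # Assumption: strings describe a file-like or textual source.
        return "string source"
    if isinstance(input_source, (Engine, Connection, Session)):
        # All three database handle types share one branch.
        return "database handle"
    raise TypeError(f"Unsupported input source: {type(input_source).__name__}")


kinds = [pick_reader_kind(s) for s in ("data.csv", Engine(), Connection(), Session())]
print(kinds)
```

Grouping the handle types in a single `isinstance` tuple means that supporting one more type, as this PR does for `Connection`, is a one-word change in the dispatch.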