Pydian - pythonic data interchange

Pydian is a pure Python library for readable and repeatable data mappings. Pydian reduces boilerplate for data manipulation and provides a framework for expressive data wrangling.

Using Pydian, developers can collaboratively and incrementally write data mappings that are expressive, safe, and reusable. Similar to how libraries like React were able to streamline UI components for frontend development, Pydian aims to streamline data transformations for backend development.

`get` specific data, then do stuff

The key idea is the following: get data from an object and, if that succeeds, do something with it.

from pydian import get

# Some arbitrary source dict
payload = {
    'some': {
        'deeply': {
            'nested': [{
                'value': 'here!'
            }]
        }
    },
    'list_of_objects': [
        {'val': 1},
        {'val': 2},
        {'val': 3}
    ]
}

# Conveniently get values and chain operations
assert get(payload, 'some.deeply.nested[0].value', apply=str.upper) == 'HERE!'

# Unwrap list structures with [*]
assert get(payload, 'list_of_objects[*].val') == [1,2,3]

# Safely specify your logic with built-in null checking (handle `None` instead of a stack trace!)
assert get(payload, 'some.deeply.nested[100].value', apply=str.upper) is None

That's it! Additional constructs, such as Mapper, support more complex mapping operations.

What makes this different from regular operations? Pydian is designed with readability and reusability in mind:

By default, get returns None on failure. This offers a more flexible alternative to direct indexing (e.g. array[0]), which raises an exception when the index or key is missing.
For a specific field, you can concisely fit all of your functional logic into one line of Python. This improves readability and maintainability.
All functions are "pure" and can be effectively reused and imported without side effects. This encapsulates behavior and promotes reusability.
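
To make the first point concrete, here is a minimal sketch that contrasts direct indexing with a None-on-failure lookup. The `safe_get` helper is hypothetical and written with the standard library only. It is not Pydian's implementation, and it takes a list of keys and indexes rather than the string key syntax shown above.

```python
from typing import Any, Callable, Optional, Sequence, Union

Key = Union[str, int]


def safe_get(
    source: Any,
    path: Sequence[Key],
    apply: Optional[Callable[[Any], Any]] = None,
) -> Any:
    """Walk `path` through nested dicts/lists; return None if any step fails."""
    current = source
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, out-of-range index, or a value that cannot be indexed
            return None
    if current is None or apply is None:
        return current
    return apply(current)


payload = {'some': {'deeply': {'nested': [{'value': 'here!'}]}}}

# Direct indexing raises as soon as one step is missing
try:
    payload['some']['deeply']['nested'][100]['value'].upper()
except IndexError:
    print('direct indexing raised IndexError')

# The None-on-failure style keeps the same logic on one line
assert safe_get(payload, ['some', 'deeply', 'nested', 0, 'value'], apply=str.upper) == 'HERE!'
assert safe_get(payload, ['some', 'deeply', 'nested', 100, 'value'], apply=str.upper) is None
```

Because the function is skipped when the lookup fails, the caller only has to handle a single `None` case instead of wrapping each access in its own try/except.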

Developer-friendly API

If you are working with dicts, you can use:

A get function with JMESPath key syntax. Chain operations on success, else continue with None
A Mapper class that performs post-processing cleanup on "empty" values. For nuanced edge cases, conditionally DROP fields or KEEP specific values.
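
As a rough illustration of what cleanup on "empty" values can look like, the sketch below recursively removes empty entries from a nested result. The names `is_empty` and `remove_empty` are hypothetical, and treating `None`, `''`, `[]`, and `{}` as empty is an assumption made for this example. It does not reproduce Mapper's actual behavior or the DROP and KEEP constructs.

```python
from typing import Any


def is_empty(value: Any) -> bool:
    # Assumption for this sketch: only these four values count as "empty".
    # Falsy values such as 0 and False are kept.
    return value is None or value == '' or value == [] or value == {}


def remove_empty(value: Any) -> Any:
    """Recursively drop empty values from nested dicts and lists."""
    if isinstance(value, dict):
        cleaned = {k: remove_empty(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if not is_empty(v)}
    if isinstance(value, list):
        cleaned_items = [remove_empty(v) for v in value]
        return [v for v in cleaned_items if not is_empty(v)]
    return value


mapped = {
    'id': 123,
    'name': None,
    'tags': [],
    'address': {'city': 'Springfield', 'zip': None},
    'notes': {'internal': None},
}

# 'notes' disappears too, since it becomes {} once its only field is removed
assert remove_empty(mapped) == {'id': 123, 'address': {'city': 'Springfield'}}
```

Cleaning children before filtering the parent is what lets a container that holds only empty values be removed as well.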

(Experimental) If you're tired of writing one-off lambda functions, consider using:

The pydian.partials module, which provides (possibly) common 1-input, 1-output functions (import pydian.partials as p). A generic p.do wrapper creates a partial function that fills in parameters starting from the second one, whereas functools.partial fills them in starting from the first.
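
The difference between binding from the first parameter and binding from the second can be sketched with the standard library alone. The `do_sketch` function below is a hypothetical stand-in that follows the description above. It is not the source of `p.do`, and `subtract` is an illustrative helper.

```python
from functools import partial
from typing import Any, Callable


def do_sketch(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Callable[[Any], Any]:
    """Return a 1-input function whose input becomes the FIRST argument of `func`.

    The remaining parameters are filled from `args` and `kwargs`.
    """
    def wrapper(value: Any) -> Any:
        return func(value, *args, **kwargs)
    return wrapper


def subtract(a: int, b: int) -> int:
    return a - b


# functools.partial binds from the first parameter: a=10, so the input becomes b
assert partial(subtract, 10)(3) == 7

# The sketch binds from the second parameter: b=10, so the input becomes a
assert do_sketch(subtract, 10)(3) == -7

# Convenient for functions whose subject is the first argument
assert do_sketch(str.replace, 'a', 'b')('banana') == 'bbnbnb'
```

Binding from the second parameter suits a get-then-apply pipeline, where the retrieved value is usually the first argument of the function being applied.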

(Experimental) If you are working with pl.DataFrames, you can use:

A select function with simple SQL-like syntax (,-delimited, ~ for conditionals, * to get all)
Some functions for creating new dataframes (left_join, inner_join, insert for rows, alter for cols)

Note: the DataFrame module is not included by default. To install, use: pip install "pydian[dataframes]"

Examples

dicts: See get tests and Mapper tests

(Experimental) pl.DataFrames: See select tests

(Experimental) pydian.partials: See pydian.partials tests or the snippet below:

from pydian import get
import pydian.partials as p

# Arbitrary example
source = {
    'some_values': [
        250,
        350,
        450
    ]
}

# Standardize how the partial functions are written for simpler management
assert p.equals(1)(1) == True
assert p.equivalent(False)(False) == True
assert get(source, 'some_values', apply=p.index(0), only_if=p.contains(350)) == 250
assert get(source, 'some_values', apply=p.index(0), only_if=p.contains(9000)) is None
assert get(source, 'some_values', apply=p.index(1)) == 350
assert get(source, 'some_values', apply=p.keep(2)) == [250, 350]

Future Work

After 1.0, Pydian will be considered done (barring other community contributions 😃)

There may be further language support in the future (e.g. JS, Rust, Go, Julia, etc.) which could make this pattern even more useful (though still very much tbd!)

Contact

Please submit a GitHub Issue for any bugs + feature requests 🙏
