Make pandas dependency optional #290

Closed
DifferentialOrange opened this issue Apr 11, 2023 · 1 comment · Fixed by #291
DifferentialOrange commented Apr 11, 2023

Many Tarantool+Python projects have no use for datetimes (for example, every project running Tarantool 2.8 or older). They are still obliged to install pandas and pytz, both of which are rather heavy. Users have asked to make this dependency optional.

It seems that the Python packaging mechanism for providing optional dependencies is "extras". We also shouldn't forget to make the deb and rpm package dependencies optional.
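A minimal sketch of how "extras" could be declared, assuming a setuptools-based `setup.py`. The extra name `datetime` and the helper `setup_kwargs` are hypothetical and chosen only for illustration; the dependency lists are not the connector's actual ones.

```python
# Hypothetical dependency lists; not copied from the connector's setup.py.
INSTALL_REQUIRES = ["msgpack"]

# Optional groups: installed only when the user asks for the extra by name.
EXTRAS_REQUIRE = {
    "datetime": ["pandas", "pytz"],  # the extra name "datetime" is an assumption
}


def setup_kwargs():
    """Build the keyword arguments that setuptools.setup() would receive."""
    return {
        "name": "tarantool",
        "install_requires": INSTALL_REQUIRES,
        "extras_require": EXTRAS_REQUIRE,
    }

# In a real setup.py this would be followed by:
#     from setuptools import setup
#     setup(**setup_kwargs())
```

With such a declaration, a plain `pip install tarantool` would skip pandas and pytz, while `pip install "tarantool[datetime]"` would pull them in.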

This is a breaking change, since users would need to change their setup process if they want to work with datetimes.

If we cannot parse datetimes in msgpack, we may either start throwing errors for them or return some dictionaries/tuples with the raw binary data.

@LeonidVas
We discussed this: it looks like it makes sense to split the connector into two packages, with the default being the package that supports all types.
But there's a lot to think about.

DifferentialOrange self-assigned this Apr 11, 2023
DifferentialOrange added a commit that referenced this issue Apr 12, 2023
The pandas and pytz packages are required to build a tarantool.Datetime
object. Even if a user doesn't plan to work with datetimes, they are
still installed. This patch makes this code dependency optional. If the
packages are not provided and Tarantool sends a Datetime object, its
encoded bytes are simply put into a `tarantool.DatetimeRaw`
object.

Part of #290
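The fallback described in this commit message can be sketched roughly as follows. This is not the connector's code: the `DatetimeRaw` class here is a simplified stand-in, and `deps_available` and `decode_datetime` are hypothetical helper names.

```python
import importlib.util
from dataclasses import dataclass


@dataclass(frozen=True)
class DatetimeRaw:
    """Simplified stand-in: keeps the encoded bytes untouched."""
    data: bytes


def deps_available(names=("pandas", "pytz")):
    """Return True only if every optional package can be found."""
    return all(importlib.util.find_spec(name) is not None for name in names)


def decode_datetime(data, full_decoder, available):
    """Decode with full_decoder when possible, else wrap the raw bytes."""
    if not available:
        return DatetimeRaw(data)
    return full_decoder(data)
```

A caller would compute `deps_available()` once at import time and pass the result as `available`, so the raw wrapper is returned whenever the optional packages are missing.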
DifferentialOrange added a commit that referenced this issue Apr 14, 2023
Rework our implementation of the tarantool.Datetime class. Previously it
relied on pandas.Timestamp and pandas.Timedelta. There were user
complaints about pandas as a requirement since it's rather heavy. Now
our implementation of datetime uses the built-in datetime.datetime and
datetime.timedelta, as well as dateutil.relativedelta.relativedelta and
some other built-in tools.

It is expected that the implementation change won't affect users, but
some minor behavior traits were broken in this patch:
- We now rely on datetime argument validation, which differs from the
  pandas one. For example, it doesn't allow overflows for fields.
  Exceptions that a user may receive from the internal datetime have,
  of course, changed as well.
- We drop support for `__eq__` with pandas.Timestamp. We used to simply
  compare the underlying pandas.Timestamp with the argument, and now
  that is impossible. If the feature turns out to be useful later, we
  may implement the comparison in some compatible way.
- `__repr__` has been changed since the internal representation has
  changed as well.

python-dateutil is relatively small (the ver. 2.8.2 wheel on pip is
247 kB, while the msgpack 1.0.5 wheel is 316 kB and the pytz 2023.3
wheel is 502 kB; the numpy 1.24.2 wheel is 17.3 MB and the pandas 2.0.0
wheel is 12.3 MB) and a popular library for working with various
datetime cases. Since working with datetime is always bothersome, I
think it's preferable to rely on a well-tested library rather than
implement month addition from scratch.

Closes #290
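To illustrate why month addition is bothersome to write by hand, here is a minimal stdlib-only sketch. The helper `add_months` is hypothetical, and its policy of clamping the day to the end of a shorter month is an assumption made for this example; it is not necessarily what the connector or Tarantool does.

```python
import calendar
import datetime


def add_months(dt, months):
    """Shift dt by a number of months, clamping the day if needed."""
    # Count months from year 0 so that divmod handles year rollover,
    # including negative shifts.
    total = dt.year * 12 + (dt.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))
```

Even this short version has to deal with year rollover, negative shifts and month lengths, which is the kind of detail a well-tested library already covers.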
DifferentialOrange added a commit that referenced this issue Apr 14, 2023
Rework our implementation of the tarantool.Datetime class. Previously it
relied on pandas.Timestamp and pandas.Timedelta. There were user
complaints about pandas as a requirement since it's rather heavy. Now
our implementation of datetime uses the built-in datetime.datetime,
datetime.timedelta and other built-in tools.

It is expected that the implementation change won't affect users, but
some minor behavior traits were broken in this patch:
- We now rely on datetime argument validation, which differs from the
  pandas one. For example, it doesn't allow overflows for fields.
  Exceptions that a user may receive from the internal datetime have,
  of course, changed as well.
- We drop support for `__eq__` with pandas.Timestamp. We used to simply
  compare the underlying pandas.Timestamp with the argument, and now
  that is impossible. If the feature is required later, we may
  implement the comparison in some compatible way.
- `__repr__` has been changed since the internal representation has
  changed as well.

Closes #290
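The validation point above can be seen with the built-in `datetime.datetime` alone: out-of-range fields are rejected with `ValueError` rather than carried over into the next unit. The helper `try_build` is hypothetical and exists only to make the behavior easy to observe.

```python
import datetime


def try_build(**fields):
    """Return a datetime, or the ValueError raised for invalid fields."""
    try:
        return datetime.datetime(**fields)
    except ValueError as exc:
        # The built-in type rejects overflowed fields such as day=32.
        return exc
```

Code that relied on catching pandas exceptions around `tarantool.Datetime` construction would therefore need to catch the built-in ones instead.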
DifferentialOrange added a commit that referenced this issue Apr 17, 2023
Overview

  This release introduces several minor behavior changes
  to make the API more consistent.

  Starting from this release, the connector no longer depends on `pandas`.

Breaking changes

  - Allow only named `on_push` and `on_push_ctx` for `insert` and
    `replace`.
  - `tarantool.Datetime` `__repr__` has been changed.
  - `tarantool.Datetime` input arguments are validated with
    `datetime.datetime` rules.
  - `tarantool.Datetime` is no longer expected to throw
    `pandas.Timestamp` exceptions. `datetime.datetime` exceptions are
    thrown instead.
  - Drop support for the `__eq__` operator of `tarantool.Datetime` with
    `pandas.Timestamp`.
  - Remove `join` and `subscribe` connection methods.

Changes

  - Migrate to built-in `Warning` instead of a custom one.
  - Migrate to built-in `RecursionError` instead of a custom one.
  - Collect full exception traceback.
  - Package no longer depends on `pandas` (#290).

Infrastructure

  - Lint the code with `pylint`, `flake8` and `codespell`.
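The first breaking change listed in these notes (only named `on_push` and `on_push_ctx`) corresponds to Python's keyword-only parameters. A rough sketch follows; the standalone `insert` function and its body are hypothetical and only mimic the shape of such a signature, not the connector's actual method.

```python
def insert(space_name, values, *, on_push=None, on_push_ctx=None):
    """Hypothetical stand-in: parameters after the bare * are keyword-only."""
    if on_push is not None:
        # Invoke the callback with its context, as a push handler might be.
        on_push(on_push_ctx, values)
    return values


def collect(ctx, data):
    """Example callback that records pushed data in a list context."""
    ctx.append(data)
```

Calling `insert("users", (1, "a"), on_push=collect, on_push_ctx=[])` works, while passing the callback positionally raises `TypeError`, which is the kind of break users of the old positional form would see.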