update flake, black and isort to set line length limit to 79 #325

Closed · wants to merge 3 commits
.flake8 (1 addition, 1 deletion)

@@ -5,7 +5,7 @@ exclude =
     build,
     dist,
     doc/source/conf.py
-max-line-length = 115
+max-line-length = 79
 # Ignore some style 'errors' produced while formatting by 'black'
 # https://black.readthedocs.io/en/stable/guides/using_black_with_other_tools.html#labels-why-pycodestyle-warnings
 extend-ignore = E203
.isort.cfg (1 addition, 1 deletion)

@@ -1,4 +1,4 @@
 [settings]
-line_length = 115
+line_length = 79
 multi_line_output = 3
 include_trailing_comma = True
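
Note: multi_line_output = 3 is isort's vertical-hanging-indent style, so with include_trailing_comma enabled a too-long import gets wrapped roughly as in this sketch (the module and names are made up for illustration):

    from some_package.some_long_module import (
        first_helper,
        second_helper,
        third_helper,
    )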
doc/source/conf.py (10 additions, 2 deletions)

@@ -221,7 +221,13 @@
 # (source start file, target name, title,
 # author, documentclass [howto, manual, or own class]).
 latex_documents = [
-    ("index", "diffpy.utils.tex", "diffpy.utils Documentation", ab_authors, "manual"),
+    (
+        "index",
+        "diffpy.utils.tex",
+        "diffpy.utils Documentation",
+        ab_authors,
+        "manual",
+    ),
 ]
 
 # The name of an image file (relative to this directory) to place at the top of
@@ -249,7 +255,9 @@
 
 # One entry per manual page. List of tuples
 # (source start file, name, description, authors, manual section).
-man_pages = [("index", "diffpy.utils", "diffpy.utils Documentation", ab_authors, 1)]
+man_pages = [
+    ("index", "diffpy.utils", "diffpy.utils Documentation", ab_authors, 1)
+]
 
 # If true, show URL addresses after external links.
 # man_show_urls = False
pyproject.toml (1 addition, 1 deletion)

@@ -57,7 +57,7 @@ ignore-words = ".codespell/ignore_words.txt"
 skip = "*.cif,*.dat"
 
 [tool.black]
-line-length = 115
+line-length = 79
 include = '\.pyi?$'
 exclude = '''
 /(
src/diffpy/utils/diffraction_objects.py (135 additions, 60 deletions)

Large diffs are not rendered by default.
src/diffpy/utils/parsers/custom_exceptions.py (3 additions, 1 deletion)

@@ -51,5 +51,7 @@ class ImproperSizeError(Exception):
 
     def __init__(self, bad_object, message=None):
         if message is None:
-            self.message = f"The size of {bad_object} is different than expected."
+            self.message = (
+                f"The size of {bad_object} is different than expected."
+            )
         super().__init__(self.message)
src/diffpy/utils/parsers/loaddata.py (61 additions, 32 deletions)

@@ -20,56 +20,67 @@
 from diffpy.utils import validators
 
 
-def loadData(filename, minrows=10, headers=False, hdel="=", hignore=None, **kwargs):
+def loadData(
+    filename, minrows=10, headers=False, hdel="=", hignore=None, **kwargs
+):
     """Find and load data from a text file.
 
-    The data block is identified as the first matrix block of at least minrows rows and constant number of columns.
-    This seems to work for most of the datafiles including those generated by diffpy programs.
+    The data block is identified as the first matrix block of at least
+    minrows rows and constant number of columns. This seems to work for most
+    of the datafiles including those generated by diffpy programs.
 
     Parameters
     ----------
     filename
         Name of the file we want to load data from.
     minrows: int
-        Minimum number of rows in the first data block. All rows must have the same number of floating
-        point values.
+        Minimum number of rows in the first data block. All rows must have
+        the same number of floating point values.
     headers: bool
-        when False (default), the function returns a numpy array of the data in the data block.
-        When True, the function instead returns a dictionary of parameters and their corresponding
-        values parsed from header (information prior the data block). See hdel and hignore for options
-        to help with parsing header information.
+        when False (default), the function returns a numpy array of the data
+        in the data block. When True, the function instead returns a
+        dictionary of parameters and their corresponding values parsed from
+        header (information prior the data block). See hdel and hignore for
+        options to help with parsing header information.
     hdel: str
-        (Only used when headers enabled.) Delimiter for parsing header information (default '='). e.g. using
-        default hdel, the line 'parameter = p_value' is put into the dictionary as {parameter: p_value}.
+        (Only used when headers enabled.) Delimiter for parsing header
+        information (default '='). e.g. using default hdel, the line '
+        parameter = p_value' is put into the dictionary as
+        {parameter: p_value}.
     hignore: list
-        (Only used when headers enabled.) Ignore header rows beginning with any elements in hignore.
-        e.g. hignore=['# ', '['] causes the following lines to be skipped: '# qmax=10', '[defaults]'.
+        (Only used when headers enabled.) Ignore header rows beginning with
+        any elements in hignore. e.g. hignore=['# ', '['] causes the
+        following lines to be skipped: '# qmax=10', '[defaults]'.
     kwargs:
-        Keyword arguments that are passed to numpy.loadtxt including the following arguments below. (See
-        numpy.loadtxt for more details.) Only pass kwargs used by numpy.loadtxt.
+        Keyword arguments that are passed to numpy.loadtxt including the
+        following arguments below. (See numpy.loadtxt for more details.) Only
+        pass kwargs used by numpy.loadtxt.
 
     Useful kwargs
     =============
     comments: str, sequence of str
-        The characters or list of characters used to indicate the start of a comment (default '#').
-        Comment lines are ignored.
+        The characters or list of characters used to indicate the start of a
+        comment (default '#'). Comment lines are ignored.
     delimiter: str
-        Delimiter for the data in the block (default use whitespace). For comma-separated data blocks,
-        set delimiter to ','.
+        Delimiter for the data in the block (default use whitespace). For
+        comma-separated data blocks, set delimiter to ','.
     unpack: bool
-        Return data as a sequence of columns that allows tuple unpacking such as x, y =
-        loadData(FILENAME, unpack=True). Note transposing the loaded array as loadData(FILENAME).T has the same
-        effect.
+        Return data as a sequence of columns that allows tuple unpacking such
+        as x, y = loadData(FILENAME, unpack=True). Note transposing the
+        loaded array as loadData(FILENAME).T has the same effect.
     usecols:
-        Zero-based index of columns to be loaded, by default use all detected columns. The reading skips
-        data blocks that do not have the usecols-specified columns.
+        Zero-based index of columns to be loaded, by default use all detected
+        columns. The reading skips data blocks that do not have the usecols-
+        specified columns.
 
     Returns
     -------
     data_block: ndarray
-        A numpy array containing the found data block. (This is not returned if headers is enabled.)
+        A numpy array containing the found data block. (This is not returned
+        if headers is enabled.)
     hdata: dict
-        If headers are enabled, return a dictionary of parameters read from the header.
+        If headers are enabled, return a dictionary of parameters read from
+        the header.
     """
     from numpy import array, loadtxt
 
@@ -105,7 +116,12 @@ def countcolumnsvalues(line):
 
     # Check if file exists before trying to open
     if not os.path.exists(filename):
-        raise IOError(f"File {filename} cannot be found. Please rerun the program specifying a valid filename.")
+        raise IOError(
+            (
+                f"File {filename} cannot be found. "
+                "Please rerun the program specifying a valid filename."
+            )
+        )
 
     # make sure fid gets cleaned up
     with open(filename, "rb") as fid:
@@ -134,7 +150,10 @@ def countcolumnsvalues(line):
                 if hignore is not None:
                     for tag in hignore:
                         taglen = len(tag)
-                        if len(hpair[0]) >= taglen and hpair[0][:taglen] == tag:
+                        if (
+                            len(hpair[0]) >= taglen
+                            and hpair[0][:taglen] == tag
+                        ):
                             flag = False
                 # add header data
                 if flag:
@@ -187,7 +206,8 @@ class TextDataLoader(object):
     minrows: int
         Minimum number of rows in the first data block. (Default 10.)
     usecols: tuple
-        Which columns in our dataset to use. Ignores all other columns. If None (default), use all columns.
+        Which columns in our dataset to use. Ignores all other columns. If
+        None (default), use all columns.
     skiprows
         Rows in dataset to skip. (Currently not functional.)
     """
@@ -235,7 +255,8 @@ def readfp(self, fp, append=False):
         File details include:
         * File name.
         * All data blocks findable by loadData.
-        * Headers (if present) for each data block. (Generally the headers contain column name information).
+        * Headers (if present) for each data block. (Generally the headers
+          contain column name information).
         """
         self._reset()
         # try to read lines from fp first
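Note: a hedged usage sketch for readfp based on the docstring above; the default constructor is assumed sufficient and the file name is made up:

    from diffpy.utils.parsers.loaddata import TextDataLoader

    tdl = TextDataLoader()  # minrows defaults to 10 per the docstring
    with open("measurement.dat", "rb") as fp:
        tdl.readfp(fp)  # records file name, data blocks, and any headers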
@@ -258,7 +279,13 @@ def _findDataBlocks(self):
         # nf - number of words, ok - has data
         self._linerecs = numpy.recarray(
             (nlines,),
-            dtype=[("idx", int), ("nw0", int), ("nw1", int), ("nf", int), ("ok", bool)],
+            dtype=[
+                ("idx", int),
+                ("nw0", int),
+                ("nw1", int),
+                ("nf", int),
+                ("ok", bool),
+            ],
         )
         lr = self._linerecs
         lr.idx = numpy.arange(nlines)
@@ -319,7 +346,9 @@ def _findDataBlocks(self):
             if self.usecols is None:
                 data = numpy.reshape(lw.value[bb1.nw0 : ee1.nw1], (-1, bb1.nf))
             else:
-                tdata = numpy.empty((len(self.usecols), dend - dbeg), dtype=float)
+                tdata = numpy.empty(
+                    (len(self.usecols), dend - dbeg), dtype=float
+                )
                 for j, trow in zip(self.usecols, tdata):
                     j %= bb1.nf
                     trow[:] = lw.value[bb1.nw0 + j : ee1.nw1 : bb1.nf]
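
Note: the usecols branch above pulls column j out of a flattened word array with a stride of bb1.nf words per row; a standalone sketch of that indexing:

    import numpy as np

    nf = 3  # words (columns) per row in the data block
    words = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # two flattened rows
    j = 1  # zero-based column index
    print(words[j::nf])  # -> [2. 5.], i.e. the second column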
src/diffpy/utils/parsers/serialization.py (29 additions, 14 deletions)

@@ -47,16 +47,19 @@ def serialize_data(
     data_table: list or ndarray
         Data table.
     dt_colnames: list
-        Names of each column in data_table. Every name in data_table_cols will be put into the Dictionary
-        as a key with a value of that column in data_table (stored as a List). Put None for columns
-        without names. If dt_cols has less non-None entries than columns in data_table,
-        the pair {'data table': data_table} will be put in the dictionary.
-        (Default None: only entry {'data table': data_table} will be added to dictionary.)
+        Names of each column in data_table. Every name in data_table_cols
+        will be put into the Dictionary as a key with a value of that column
+        in data_table (stored as a List). Put None for columns without names.
+        If dt_cols has less non-None entries than columns in data_table, the
+        pair {'data table': data_table} will be put in the dictionary.
+        (Default None: only entry {'data table': data_table} will be added to
+        dictionary.)
     show_path: bool
-        include a path element in the database entry (default True). If 'path' is not included in hddata,
-        extract path from filename.
+        include a path element in the database entry (default True). If
+        'path' is not included in hddata, extract path from filename.
     serial_file
-        Serial language file to dump dictionary into. If None (default), no dumping will occur.
+        Serial language file to dump dictionary into. If None (default), no
+        dumping will occur.
 
     Returns
     -------
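Note: a sketch of how the documented arguments fit together; the positional signature is assumed from the parameter list above, and the file name is hypothetical:

    from diffpy.utils.parsers.loaddata import loadData
    from diffpy.utils.parsers.serialization import serialize_data

    hdata = loadData("measurement.dat", headers=True)
    data_table = loadData("measurement.dat")
    entry = serialize_data(
        "measurement.dat",
        hdata,
        data_table,
        dt_colnames=["r", "gr"],  # one name per column, None to skip
        serial_file=None,  # no dump, just return the dictionary
    )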
@@ -79,32 +82,44 @@
     data.update(hdata)
 
     # second add named columns in dt_cols
-    # performed second to prioritize overwriting hdata entries with data_table column entries
+    # performed second to prioritize overwriting hdata entries with data_
+    # table column entries
     named_columns = 0  # initial value
     max_columns = 1  # higher than named_columns to trigger 'data table' entry
     if dt_colnames is not None:
         num_columns = [len(row) for row in data_table]
         max_columns = max(num_columns)
         num_col_names = len(dt_colnames)
-        if max_columns < num_col_names:  # assume numpy.loadtxt gives non-irregular array
-            raise ImproperSizeError("More entries in dt_colnames than columns in data_table.")
+        if (
+            max_columns < num_col_names
+        ):  # assume numpy.loadtxt gives non-irregular array
+            raise ImproperSizeError(
+                "More entries in dt_colnames than columns in data_table."
+            )
         named_columns = 0
         for idx in range(num_col_names):
             colname = dt_colnames[idx]
             if colname is not None:
                 if colname in hdata.keys():
                     warnings.warn(
-                        f"Entry '{colname}' in hdata has been overwritten by a data_table entry.",
+                        (
+                            f"Entry '{colname}' in hdata has been "
+                            "overwritten by a data_table entry."
+                        ),
                         RuntimeWarning,
                     )
                 data.update({colname: list(data_table[:, idx])})
                 named_columns += 1
 
-    # finally add data_table as an entry named 'data table' if not all columns were parsed
+    # finally add data_table as an entry named 'data table' if not all
+    # columns were parsed
     if named_columns < max_columns:
         if "data table" in data.keys():
             warnings.warn(
-                "Entry 'data table' in hdata has been overwritten by data_table.",
+                (
+                    "Entry 'data table' in hdata has been "
+                    "overwritten by data_table."
+                ),
                 RuntimeWarning,
             )
         data.update({"data table": data_table})
src/diffpy/utils/resampler.py (28 additions, 21 deletions)

@@ -22,11 +22,11 @@
 def wsinterp(x, xp, fp, left=None, right=None):
     """One-dimensional Whittaker-Shannon interpolation.
 
-    Reconstruct a continuous signal from discrete data points by utilizing sinc functions
-    as interpolation kernels. This function interpolates the values of fp (array),
-    which are defined over xp (array), at new points x (array or float).
-    The implementation is based on E. T. Whittaker's 1915 paper
-    (https://doi.org/10.1017/S0370164600017806).
+    Reconstruct a continuous signal from discrete data points by utilizing
+    sinc functions as interpolation kernels. This function interpolates the
+    values of fp (array), which are defined over xp (array), at new points x
+    (array or float). The implementation is based on E. T. Whittaker's 1915
+    paper (https://doi.org/10.1017/S0370164600017806).
 
     Parameters
     ----------
@@ -37,17 +37,18 @@ def wsinterp(x, xp, fp, left=None, right=None):
     fp: ndarray
         The array of y values associated with xp.
     left: float
-        If given, set fp for x < xp[0] to left. Otherwise, if left is None (default) or not given,
-        set fp for x < xp[0] to fp evaluated at xp[-1].
+        If given, set fp for x < xp[0] to left. Otherwise, if left is None
+        (default) or not given, set fp for x < xp[0] to fp evaluated at xp[-1].
     right: float
-        If given, set fp for x > xp[-1] to right. Otherwise, if right is None (default) or not given, set fp for
-        x > xp[-1] to fp evaluated at xp[-1].
+        If given, set fp for x > xp[-1] to right. Otherwise, if right is None
+        (default) or not given, set fp for x > xp[-1] to fp evaluated at
+        xp[-1].
 
     Returns
     -------
     ndarray or float
-        The interpolated values at points x. Returns a single float if x is a scalar,
-        otherwise returns a numpy.ndarray.
+        The interpolated values at points x. Returns a single float if x is a
+        scalar, otherwise returns a numpy.ndarray.
     """
     scalar = np.isscalar(x)
     if scalar:
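Note: the textbook formula behind the docstring, f(x) = sum_k fp[k] * sinc((x - xp[k]) / dx) on a uniform grid, as a self-contained sketch (not the library's implementation):

    import numpy as np

    xp = np.linspace(0, 2 * np.pi, 16)  # uniform sample grid
    fp = np.sin(xp)  # band-limited samples
    x = np.linspace(0, 2 * np.pi, 200)  # dense evaluation points
    dx = xp[1] - xp[0]
    # np.sinc(t) is the normalized sinc, sin(pi t) / (pi t)
    fx = np.sinc((x[:, None] - xp[None, :]) / dx) @ fp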
@@ -82,10 +83,11 @@ def nsinterp(xp, fp, qmin=0, qmax=25, left=None, right=None):
     """One-dimensional Whittaker-Shannon interpolation onto the Nyquist-Shannon
     grid.
 
-    Takes a band-limited function fp and original grid xp and resamples fp on the NS grid.
-    Uses the minimum number of points N required by the Nyquist sampling theorem.
-    N = (qmax-qmin)(rmax-rmin)/pi, where rmin and rmax are the ends of the real-space ranges.
-    fp must be finite, and the user inputs qmin and qmax of the frequency-domain.
+    Takes a band-limited function fp and original grid xp and resamples fp on
+    the NS grid. Uses the minimum number of points N required by the Nyquist
+    sampling theorem. N = (qmax-qmin)(rmax-rmin)/pi, where rmin and rmax are
+    the ends of the real-space ranges. fp must be finite, and the user inputs
+    qmin and qmax of the frequency-domain.
 
     Parameters
     ----------
@@ -103,8 +105,8 @@
     x: ndarray
         The Nyquist-Shannon grid computed for the given qmin and qmax.
     fp_at_x: ndarray
-        The interpolated values at points x. Returns a single float if x is a scalar,
-        otherwise returns a numpy.ndarray.
+        The interpolated values at points x. Returns a single float if x is a
+        scalar, otherwise returns a numpy.ndarray.
     """
     # Ensure numpy array
     xp = np.array(xp)
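Note: a worked instance of the grid-size formula quoted above, N = (qmax - qmin)(rmax - rmin)/pi; values are made up, and whether the library rounds or truncates N is an assumption here:

    import numpy as np

    qmin, qmax = 0.0, 25.0  # frequency-domain bounds
    rmin, rmax = 0.0, 30.0  # real-space bounds of xp
    npoints = int(round((qmax - qmin) * (rmax - rmin) / np.pi))
    x = np.linspace(rmin, rmax, npoints)  # the Nyquist-Shannon grid
    print(npoints)  # -> 239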
@@ -122,8 +124,9 @@
 def resample(r, s, dr):
     """Resample a PDF on a new grid.
 
-    This uses the Whittaker-Shannon interpolation formula to put s1 on a new grid if dr is less than the sampling
-    interval of r1, or linear interpolation if dr is greater than the sampling interval of r1.
+    This uses the Whittaker-Shannon interpolation formula to put s1 on a new
+    grid if dr is less than the sampling interval of r1, or linear
+    interpolation if dr is greater than the sampling interval of r1.
 
     Parameters
     ----------
@@ -140,8 +143,12 @@ def resample(r, s, dr):
     """
 
     warnings.warn(
-        "The 'resample' function is deprecated and will be removed in a future release (3.8.0). \n"
-        "'resample' has been renamed 'wsinterp' to better reflect functionality. Please use 'wsinterp' instead.",
+        (
+            "The 'resample' function is deprecated and will be removed "
+            "in a future release (3.8.0). \n"
+            "'resample' has been renamed 'wsinterp' to better reflect "
+            "functionality. Please use 'wsinterp' instead."
+        ),
         DeprecationWarning,
         stacklevel=2,
    )
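
Note: a hedged migration sketch for the deprecation above; how resample chose its target grid is an assumption, not taken from the source:

    import numpy as np
    from diffpy.utils.resampler import wsinterp

    r = np.linspace(0, 10, 101)
    s = np.exp(-((r - 5.0) ** 2))
    dr = 0.05
    r_new = np.arange(r[0], r[-1] + 0.5 * dr, dr)  # assumed grid spacing
    s_new = wsinterp(r_new, r, s)  # replaces resample(r, s, dr)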