Skip to content

Part 2/3 of Cookiecutter pre-commit workflow (flake8) #54

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 2, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 12 additions & 7 deletions diffpy/snmf/containers.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import numpy as np
import numdifftools
import numpy as np


class ComponentSignal:
Expand All @@ -8,7 +8,7 @@ class ComponentSignal:
----------
grid: 1d array of floats
The vector containing the grid points of the component.
iq: 1d array of floats

The intensity/g(r) values of the component.
weights: 1d array of floats
The vector containing the weight of the component signal for each signal.
Expand Down Expand Up @@ -36,13 +36,18 @@ def apply_stretch(self, m):
Returns
-------
tuple of 1d arrays
The tuple of vectors where one vector is the stretched component, one vector is the 1st derivative of the
stretching operation, and one vector is the second derivative of the stretching operation.
The tuple of vectors where one vector is the stretched component, one vector is the 1st derivative
of the stretching operation, and one vector is the second derivative of the stretching operation.
"""
normalized_grid = np.arange(len(self.grid))
func = lambda stretching_factor: np.interp(
normalized_grid / stretching_factor, normalized_grid, self.iq, left=0, right=0
)
# func = lambda stretching_factor: np.interp(
# normalized_grid / stretching_factor, normalized_grid, self.iq, left=0, right=0
# )

# E731 do not assign a lambda expression, use a def
def func(stretching_factor):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this comment. Less chatter is better in the future, though I appreciate you wanting to communicate with me what you are doing (I think I can figure it out though.

For these lambda flake8 errors we would normally see if the function is being used in more than one place, and if it is, define a private function somewhere outside the main part of the code. If it is only being used in one place, we would simply move the lambda to where it is being used (so not assign it to a variable and then put the variable where it is being used, or we could put a # noqa: E731 at the end of this line so we leave it as is but it won't fail flake8.

How do we decide? the objective of all of this is not to pass flake8 but to make the code better, so the choice we make is the one that will make the code the best.

Also, in general, don't leave commented code in there, just delete it. Again, I appreciate the impulse (more transparency) but we don't want to carry around every old version of code in hte future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. A new pull request addressing your feedback has been made: #56

return np.interp(normalized_grid / stretching_factor, normalized_grid, self.iq, left=0, right=0)

derivative_func = numdifftools.Derivative(func)
second_derivative_func = numdifftools.Derivative(derivative_func)

Expand Down
17 changes: 8 additions & 9 deletions diffpy/snmf/factorizers.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,27 +5,26 @@
def lsqnonneg(stretched_component_matrix, target_signal):
"""Finds the weights of stretched component signals under one-sided constraint.

Solves ``argmin_x || Ax - b ||_2`` for ``x>=0`` where A is the stretched_component_matrix and b is the target_signal
vector. Finds the weights of component signals given undecomposed signal data and stretched components under a
one-sided constraint on the weights.
Solves ``argmin_x || Ax - b ||_2`` for ``x>=0`` where A is the stretched_component_matrix and b is the
target_signal vector. Finds the weights of component signals given undecomposed signal data and stretched
components under a one-sided constraint on the weights.

Parameters
----------
stretched_component_matrix: 2d array like
The component matrix where each column contains a stretched component signal. Has dimensions R x C where R is
the length of the signal and C is the number of components. Does not need to be nonnegative. Corresponds with 'A'
from the objective function.
the length of the signal and C is the number of components. Does not need to be nonnegative. Corresponds with
'A' from the objective function.

target_signal: 1d array like
The signal that is used as reference against which weight factors will be determined. Any column from the matrix
of the entire, unfactorized input data could be used. Has length R. Does not need to be nonnegative. Corresponds
with 'b' from the objective function.
The signal that is used as reference against which weight factors will be determined. Any column from the
matrix of the entire, unfactorized input data could be used. Has length R. Does not need to be nonnegative.
Corresponds with 'b' from the objective function.

Returns
-------
1d array like
The vector containing component signal weights at a moment. Has length C.

"""
stretched_component_matrix = np.asarray(stretched_component_matrix)
target_signal = np.asarray(target_signal)
Expand Down
46 changes: 24 additions & 22 deletions diffpy/snmf/io.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
from pathlib import Path

import numpy as np
import scipy.sparse
from pathlib import Path

from diffpy.utils.parsers.loaddata import loadData


Expand All @@ -10,8 +12,8 @@ def initialize_variables(data_input, number_of_components, data_type, sparsity=1
Parameters
----------
data_input: 2d array like
The observed or simulated PDF or XRD data provided by the user. Has dimensions R x N where R is the signal length
and N is the number of PDF/XRD signals.
The observed or simulated PDF or XRD data provided by the user. Has dimensions R x N where R is the signa
length and N is the number of PDF/XRD signals.

number_of_components: int
The number of component signals the user would like to decompose 'data_input' into.
Expand All @@ -20,23 +22,23 @@ def initialize_variables(data_input, number_of_components, data_type, sparsity=1
The type of data the user has passed into the program. Can assume the value of 'PDF' or 'XRD.'

sparsity: float, optional
The regularization parameter that behaves as the coefficient of a "sparseness" regularization term that enhances
the ability to decompose signals in the case of sparse data e.g. X-ray Diffraction data. A non-zero value
indicates sparsity in the data; greater magnitudes indicate greater amounts of sparsity.
The regularization parameter that behaves as the coefficient of a "sparseness" regularization term that
enhances the ability to decompose signals in the case of sparse data e.g. X-ray Diffraction data.
A non-zero value indicates sparsity in the data; greater magnitudes indicate greater amounts of sparsity.

smoothness: float, optional
The regularization parameter that behaves as the coefficient of a "smoothness" term that ensures that component
signal weightings change smoothly with time. Assumes a default value of 1e18.
The regularization parameter that behaves as the coefficient of a "smoothness" term that ensures that
component signal weightings change smoothly with time. Assumes a default value of 1e18.

Returns
-------
dictionary
The collection of the names and values of the constants used in the algorithm. Contains the number of observed PDF
/XRD patterns, the length of each pattern, the type of the data, the number of components the user would like to
decompose the data into, an initial guess for the component matrix, and initial guess for the weight factor matrix
,an initial guess for the stretching factor matrix, a parameter controlling smoothness of the solution, a
parameter controlling sparseness of the solution, the matrix representing the smoothness term, and a matrix used
to construct a hessian matrix.
The collection of the names and values of the constants used in the algorithm. Contains the number of
observed PDF/XRD patterns, the length of each pattern, the type of the data, the number of components
the user would like to decompose the data into, an initial guess for the component matrix, and initial
guess for the weight factor matrix, an initial guess for the stretching factor matrix, a parameter
controlling smoothness of the solution, a parameter controlling sparseness of the solution, the matrix
representing the smoothness term, and a matrix used to construct a hessian matrix.

"""
signal_length = data_input.shape[0]
Expand Down Expand Up @@ -74,22 +76,22 @@ def initialize_variables(data_input, number_of_components, data_type, sparsity=1
def load_input_signals(file_path=None):
"""Processes a directory of a series of PDF/XRD patterns into a usable format.

Constructs a 2d array out of a directory of PDF/XRD patterns containing each files dependent variable column in a
new column. Constructs a 1d array containing the grid values.
Constructs a 2d array out of a directory of PDF/XRD patterns containing each files dependent variable
column in a new column. Constructs a 1d array containing the grid values.

Parameters
----------
file_path: str or Path object, optional
The path to the directory containing the input XRD/PDF data. If no path is specified, defaults to the current
working directory. Accepts a string or a pathlib.Path object. Input data not on the same grid as the first file
read will be ignored.
The path to the directory containing the input XRD/PDF data. If no path is specified, defaults to the
current working directory. Accepts a string or a pathlib.Path object. Input data not on the same grid
as the first file read will be ignored.

Returns
-------
tuple
The tuple whose first element is an R x M 2d array made of PDF/XRD patterns as each column; R is the length of the
signal and M is the number of patterns. The tuple contains a 1d array containing the values of the grid points as
its second element; Has length R.
The tuple whose first element is an R x M 2d array made of PDF/XRD patterns as each column; R is the
length of the signal and M is the number of patterns. The tuple contains a 1d array containing the values
of the grid points as its second element; Has length R.

"""

Expand Down
28 changes: 15 additions & 13 deletions diffpy/snmf/optimizers.py
Original file line number Diff line number Diff line change
@@ -1,37 +1,39 @@
import numpy as np
import cvxpy
import numpy as np


def get_weights(stretched_component_gram_matrix, linear_coefficient, lower_bound, upper_bound):
"""Finds the weights of stretched component signals under a two-sided constraint

Solves min J(y) = (linear_coefficient)' * y + (1/2) * y' * (quadratic coefficient) * y where lower_bound <= y <=
upper_bound and stretched_component_gram_matrix is symmetric positive definite. Finds the weightings of stretched
component signals under a two-sided constraint.
Solves min J(y) = (linear_coefficient)' * y + (1/2) * y' * (quadratic coefficient) * y where
lower_bound <= y <= upper_bound and stretched_component_gram_matrix is symmetric positive definite.
Finds the weightings of stretched component signals under a two-sided constraint.

Parameters
----------
stretched_component_gram_matrix: 2d array like
The Gram matrix constructed from the stretched component matrix. It is a square positive definite matrix. It has
dimensions C x C where C is the number of component signals. Must be symmetric positive definite.
The Gram matrix constructed from the stretched component matrix. It is a square positive definite matrix.
It has dimensions C x C where C is the number of component signals. Must be symmetric positive definite.

linear_coefficient: 1d array like
The vector containing the product of the stretched component matrix and the transpose of the observed data matrix.
Has length C.
The vector containing the product of the stretched component matrix and the transpose of the observed
data matrix. Has length C.

lower_bound: 1d array like
The lower bound on the values of the output weights. Has the same dimensions of the function output. Each
element in 'lower_bound' determines the minimum value the corresponding element in the function output may take.
element in 'lower_bound' determines the minimum value the corresponding element in the function output may
take.

upper_bound: 1d array like
The upper bound on the values of the output weights. Has the same dimensions of the function output. Each element
in 'upper_bound' determines the maximum value the corresponding element in the function output may take.
The upper bound on the values of the output weights. Has the same dimensions of the function output. Each
element in 'upper_bound' determines the maximum value the corresponding element in the function output may
take.

Returns
-------
1d array like
The vector containing the weightings of the components needed to reconstruct a given input signal from the input
set. Has length C
The vector containing the weightings of the components needed to reconstruct a given input signal from the
input set. Has length C

"""
stretched_component_gram_matrix = np.asarray(stretched_component_gram_matrix)
Expand Down
6 changes: 4 additions & 2 deletions diffpy/snmf/polynomials.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,8 @@

def rooth(linear_coefficient, constant_term):
"""
Returns the largest real root of x^3+(linear_coefficient) * x + constant_term. If there are no real roots return 0.
Returns the largest real root of x^3+(linear_coefficient) * x + constant_term. If there are no real roots
return 0.

Parameters
----------
Expand All @@ -15,7 +16,8 @@ def rooth(linear_coefficient, constant_term):
Returns
-------
ndarray of floats
The largest real root of x^3+(linear_coefficient) * x + constant_term if roots are real, else return 0 array
The largest real root of x^3+(linear_coefficient) * x + constant_term if roots are real, else
return 0 array


"""
Expand Down
9 changes: 4 additions & 5 deletions diffpy/snmf/stretchednmfapp.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,8 @@
import numpy as np
import argparse
from pathlib import Path
from diffpy.snmf.subroutines import lift_data, initialize_components
from diffpy.snmf.containers import ComponentSignal
from diffpy.snmf.io import load_input_signals, initialize_variables

from diffpy.snmf.io import initialize_variables, load_input_signals
from diffpy.snmf.subroutines import initialize_components, lift_data

ALLOWED_DATA_TYPES = ["powder_diffraction", "pd", "pair_distribution_function", "pdf"]

Expand Down Expand Up @@ -38,7 +37,7 @@ def create_parser():
"--lift-factor",
type=float,
default=1,
help="The lifting factor. Data will be lifted by lifted_data = data + abs(min(data) * lift). Default is 1.",
help="The lifting factor. Data will be lifted by lifted_data = data + abs(min(data) * lift). Default 1.",
)
parser.add_argument(
"number-of-components",
Expand Down
Loading
Loading