44
55
66class SNMFOptimizer :
7+ """A self-contained implementation of the stretched NMF algorithm (sNMF),
8+ including sparse stretched NMF.
9+
10+ Instantiating the SNMFOptimizer class runs all the analysis immediately.
11+ The resulting matrices can then be accessed as instance attributes
12+ of the class (X, Y, and A).
13+
14+ For more information on sNMF, please reference:
15+ Gu, R., Rakita, Y., Lan, L. et al. Stretched non-negative matrix factorization.
16+ npj Comput Mater 10, 193 (2024). https://doi.org/10.1038/s41524-024-01377-5
17+ """
18+
719 def __init__ (
820 self ,
921 MM ,
@@ -17,48 +29,33 @@ def __init__(
1729 n_components = None ,
1830 random_state = None ,
1931 ):
20- """Run sNMF based on an ndarray, parameters, and either a number
21- of components or a set of initial guess matrices.
22-
23- Currently instantiating the SNMFOptimizer class runs all the analysis
24- immediately. The results can then be accessed as instance attributes
25- of the class (X, Y, and A). Eventually, this will be changed such
26- that __init__ only prepares for the optimization, which will can then
27- be done using fit_transform.
32+ """Initialize an instance of SNMFOptimizer and run the optimization.
2833
2934 Parameters
3035 ----------
3136 MM: ndarray
32- A numpy array containing the data to be decomposed. Rows correspond
33- to different samples/angles, while columns correspond to different
34- conditions with different stretching. Currently, there is no option
35- to treat the first column (commonly containing 2theta angles, sample
36- index, etc) differently, so if present it must be stripped in advance.
37+ The array containing the data to be decomposed. Shape is (length_of_signal,
38+ number_of_conditions).
3739 Y0: ndarray
38- A numpy array containing initial guesses for the component weights
39- at each stretching condition, with number of rows equal to the assumed
40- number of components and number of columns equal to the number of
41- conditions (same number of columns as MM). Must be provided if
42- n_components is not provided. Will override n_components if both are
43- provided.
40+ The array containing initial guesses for the component weights
41+ at each stretching condition. Shape is (number_of_components,
42+ number_of_conditions). Must be provided if n_components is not provided. Will override
43+ n_components if both are provided.
4444 X0: ndarray
45- A numpy array containing initial guesses for the intensities of each
46- component per row/sample/angle. Has rows equal to the rows of MM and
47- columns equal to n_components or the number of rows of Y0.
45+ The array containing initial guesses for the intensities of each component per
46+ row/sample/angle. Shape is (length_of_signal, number_of_components).
4847 A: ndarray
49- A numpy array containing initial guesses for the stretching factor for
50- each component, at each condition. Has number of rows equal to n_components
51- or the number of rows of Y0, and columns equal to the number of conditions
52- (columns of MM).
48+ The array containing initial guesses for the stretching factor for each component,
49+ at each condition. Shape is (number_of_components, number_of_conditions).
5350 rho: float
54- A stretching factor that influences the decomposition. Zero corresponds to
55- no stretching present. Relatively insensitive and typically adjusted in
56- powers of 10.
51+ The float which sets a stretching factor that influences the decomposition.
52+ Zero corresponds to no stretching present. Relatively insensitive and typically
53+ adjusted in powers of 10.
5754 eta: float
58- A sparsity factor than influences the decomposition. Should be set to zero
59- for non sparse data such as PDF. Can be used to improve results for sparse
60- data such as XRD, but due to instability, should be used only after first
61- selecting the best value for rho.
55+ The float which sets a sparsity factor that influences the decomposition.
56+ Should be set to zero for non-sparse data such as PDF. Can be used to improve
57+ results for sparse data such as XRD, but due to instability, should be used
58+ only after first selecting the best value for rho.
6259 max_iter: int
6360 The maximum number of times to update each of A, X, and Y before stopping
6461 the optimization.
@@ -71,10 +68,9 @@ def __init__(
7168 be overridden by Y0 if that is provided, but must be provided if no Y0 is
7269 provided.
7370 random_state: int
74- Used to set a reproducible seed for the initial matrices used in the
75- optimization. Due to the non-convex nature of the problem, results may vary
76- even with the same initial guesses, so this does not make the program
77- deterministic.
71+ The integer which acts as a reproducible seed for the initial matrices used in
72+ the optimization. Due to the non-convex nature of the problem, results may vary
73+ even with the same initial guesses, so this does not make the program deterministic.
7874 """
7975
8076 self .MM = MM
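As a reading aid for the shape conventions in the revised docstring, the sketch below builds placeholder arrays with the documented shapes and checks that they are mutually consistent. The sizes, the random placeholder data, and the all-ones `A` are assumptions for illustration only; the product `X0 @ Y0` ignores stretching entirely, and nothing here calls SNMFOptimizer itself.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
length_of_signal = 50
number_of_conditions = 8
number_of_components = 3

rng = np.random.default_rng(0)

# Placeholder arrays with the shapes stated in the docstring.
MM = rng.random((length_of_signal, number_of_conditions))      # data to decompose
X0 = rng.random((length_of_signal, number_of_components))      # component intensities
Y0 = rng.random((number_of_components, number_of_conditions))  # component weights
A = np.ones((number_of_components, number_of_conditions))      # stretching factors (placeholder values)

# Ignoring stretching, the plain matrix product X0 @ Y0 has the same shape as MM.
reconstruction = X0 @ Y0
assert reconstruction.shape == MM.shape

# A and Y0 both hold one entry per (component, condition) pair.
assert A.shape == Y0.shape
```

If the shapes passed to the constructor disagree with these conventions, a check like the assertions above would flag the mismatch before any optimization runs.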