feat: Add random state feature. #150
base: john-development
Changes from 1 commit
@@ -4,8 +4,20 @@


 class SNMFOptimizer:
-    def __init__(self, MM, Y0=None, X0=None, A=None, rho=1e12, eta=610, max_iter=500, tol=5e-7, components=None):
-        print("Initializing SNMF Optimizer")
+    def __init__(
+        self,
+        MM,
+        Y0=None,
+        X0=None,
+        A=None,
more descriptive name?

There are many different standards for what to name these matrices, with zero agreement between sources that use NMF. I'm inclined to eventually use what sklearn.decomposition.non_negative_factorization uses, which would mean MM->X, X->W, Y->H. But I'd like to leave this as is for the moment until there's a consensus about what would be the most clear or standard. If people will be finding this tool from the sNMF paper, there's also an argument for using the X, Y, and A names because those were used there.

OK, sounds good. There has to be a very good reason to break PEP8. The only good enough reason I can think of is to be consistent with

I'm fine with adopting the

More readable code is always better, so I prefer lower-case descriptive names. I don't actually like that scikit-learn breaks this. Shall we go with lower-case? Names can be short if they are defined in the function's docstring and in the docs too. The code just benefits from being readable, so I would say use your judgement on that.

I've started the conversion to lower case, but it's a large enough process (involving many poorly labeled sub-variables of the uppercase ones) that it feels like it should be its own separate PR. Does that make sense?
+        rho=1e12,
+        eta=610,
+        max_iter=500,
+        tol=5e-7,
+        components=None,
+        random_state=None,
+    ):

         self.MM = MM
sbillinge marked this conversation as resolved.
         self.X0 = X0
         self.Y0 = Y0
@@ -15,23 +27,22 @@ def __init__(self, MM, Y0=None, X0=None, A=None, rho=1e12, eta=610, max_iter=500
         # Capture matrix dimensions
         self.N, self.M = MM.shape
         self.num_updates = 0
+        self.rng = np.random.default_rng(random_state)
can we have a more descriptive variable name? Is this a range? What is the range?

ping on this one.

OK, let's keep the name and we can say how it is used in the docstring. Something like "The value used to initialize the random state in ..." where ... differentiates it from the docstring for

Ah, looking at the code, I see that it is generated from
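For readers with the same question about the name: `rng` is NumPy's conventional short name for a random number `Generator`, not a range. The short sketch below shows, under that reading, what `np.random.default_rng` accepts as a seed; the variable names are illustrative only and are not taken from the PR.

```python
import numpy as np

# An integer seed gives a reproducible Generator.
seeded = np.random.default_rng(42)
same_seed = np.random.default_rng(42)
assert isinstance(seeded, np.random.Generator)

# Two generators built from the same integer seed produce the same stream.
assert np.array_equal(seeded.random(3), same_seed.random(3))

# None asks NumPy to pull fresh entropy from the operating system,
# so results differ from run to run.
unseeded = np.random.default_rng(None)

# An existing Generator is returned unaltered, so callers can share one stream.
assert np.random.default_rng(seeded) is seeded
```

Because all three forms are accepted, a `random_state=None` default keeps the previous unseeded behaviour while allowing tests to pin a seed.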

         if Y0 is None:
             if components is None:
                 raise ValueError("Must provide either Y0 or a number of components.")
             else:
                 self.K = components
-                self.Y0 = np.random.beta(a=2.5, b=1.5, size=(self.K, self.M))  # This is untested
+                self.Y0 = self.rng.beta(a=2.5, b=1.5, size=(self.K, self.M))
         else:
             self.K = Y0.shape[0]

         # Initialize A, X0 if not provided
         if self.A is None:
-            self.A = np.ones((self.K, self.M)) + np.random.randn(self.K, self.M) * 1e-3  # Small perturbation
+            self.A = np.ones((self.K, self.M)) + self.rng.normal(0, 1e-3, size=(self.K, self.M))
K and M are probably good names if the matrix decomposition equation is in the docstring, so they get defined there.

I think you addressed this with your comment to

Got it. I'd like to put the matrix decomposition in the docstring, but I'm having trouble formatting it. Might have to ask about this in one of the meetings.

yes, I am not 100% sure but I think there is a way.
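On formatting the decomposition in a docstring: one common approach with Sphinx is a raw docstring containing a `.. math::` directive. The sketch below is an illustration under stated assumptions, not the PR's text. It shows only the plain product form implied by the shapes in this diff (`MM` is N x M, `X` is N x K, `Y` is K x M) and leaves out how the stretching factors `A` enter, because the diff does not say. The class name `DocstringSketch` and the parameter descriptions are hypothetical.

```python
class DocstringSketch:
    r"""Hypothetical docstring layout; not the PR's actual text.

    The data matrix is approximated by a product of two non-negative factors
    (assumption: the role of the stretching factors A is omitted here):

    .. math::

        MM \approx X Y, \qquad
        MM \in \mathbb{R}^{N \times M},\;
        X \in \mathbb{R}^{N \times K},\;
        Y \in \mathbb{R}^{K \times M}

    Parameters
    ----------
    MM : ndarray of shape (N, M)
        The data to be decomposed.
    components : int, optional
        The number of components K, used when Y0 is not given.
    random_state : int, optional
        The seed passed to ``np.random.default_rng`` for the random initial guesses.
    """


# The raw-string prefix (r""") keeps backslashes such as \approx intact.
assert r"\approx" in DocstringSketch.__doc__
```

Defining N, M, and K once in the equation lets the short names in the code stay short, which is the trade-off discussed in the thread above.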

         if self.X0 is None:
-            self.X0 = np.random.rand(self.N, self.K)  # Ensures values in [0,1]
+            self.X0 = self.rng.random((self.N, self.K))

         # Initialize solution matrices to be iterated on
         self.X = np.maximum(0, self.X0)
         self.Y = np.maximum(0, self.Y0)
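The removed `Y0` line carried the note "This is untested". With a seeded generator, the default initialization becomes straightforward to check. A minimal sketch, assuming a standalone function `init_factors` (hypothetical, written only to mirror the three draws in the diff, not the real `SNMFOptimizer` class):

```python
import numpy as np


def init_factors(n_rows, n_signals, components, random_state=None):
    """Hypothetical stand-in for the default initialization shown in the diff."""
    rng = np.random.default_rng(random_state)
    # The draws happen in the same order as in the diff: Y0, then A, then X0.
    y0 = rng.beta(a=2.5, b=1.5, size=(components, n_signals))
    a = np.ones((components, n_signals)) + rng.normal(0, 1e-3, size=(components, n_signals))
    x0 = rng.random((n_rows, components))
    return x0, y0, a


def test_init_is_reproducible():
    first = init_factors(10, 4, 2, random_state=123)
    second = init_factors(10, 4, 2, random_state=123)
    for left, right in zip(first, second):
        assert np.array_equal(left, right)


def test_init_shapes_and_ranges():
    x0, y0, a = init_factors(10, 4, 2, random_state=123)
    assert x0.shape == (10, 2) and y0.shape == (2, 4) and a.shape == (2, 4)
    # rng.random draws from [0, 1); beta draws lie in [0, 1].
    assert (x0 >= 0).all() and (x0 < 1).all()
    assert (y0 >= 0).all() and (y0 <= 1).all()
    # A is a small perturbation around 1 (standard deviation 1e-3).
    assert np.allclose(a, 1.0, atol=0.05)


test_init_is_reproducible()
test_init_shapes_and_ranges()
```

Since the random defaults are already non-negative, the `np.maximum(0, ...)` clipping in the diff only changes user-supplied `X0` or `Y0` that contain negative entries.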