Add intercept to x arrays in solution to part d) of problem 3 of Problem Set 1 #8

Merged
merged 7 commits into from
Feb 14, 2025


lorenzorossi7
Contributor

Take the file problem-sets-solutions/PS1/code/src/p03d_poisson.py. This contains the solution to part d) of problem 3 of Problem Set 1.

An intercept should be added when the training set and validation set are loaded. To do so, I replaced the line
x_train, y_train = util.load_dataset(train_path, add_intercept=False)
with
x_train, y_train = util.load_dataset(train_path, add_intercept=True)

and replaced the line
x_eval, y_eval = util.load_dataset(eval_path, add_intercept=False)
with
x_eval, y_eval = util.load_dataset(eval_path, add_intercept=True)

lorenzorossi7 and others added 5 commits October 19, 2024 16:21
…ed to input features x for training and validation set.
Add intercept to x arrays in solution to part d) of problem 3 of Problem Set 1
…sv set as test set, instead of the proper ds5_test.csv set.

Corrected this by replacing line
test_x, test_y = util.load_csv('data/ds5_train.csv')
with line
test_x, test_y = util.load_csv('data/ds5_test.csv')
@lorenzorossi7
Contributor Author

One more bug fix.
The function train_perceptron used to load the ds5_train.csv set as the test set, instead of the proper ds5_test.csv set.

I have corrected this by replacing line
test_x, test_y = util.load_csv('data/ds5_train.csv')
with line
test_x, test_y = util.load_csv('data/ds5_test.csv')

@maxim5
Owner

maxim5 commented Feb 11, 2025

I agree with your ds5_test fix, but I'm not sure about p03d_poisson.py.
Why do you suggest adding the intercept?
https://github.com/maxim5/cs229-2018-autumn/blob/main/problem-sets/PS1/src/p03d_poisson.py#L17

@maxim5 maxim5 self-assigned this Feb 11, 2025
@lorenzorossi7
Contributor Author

Here is my understanding. Feel free to ask for further clarification.

Whenever we use the notation \theta^T x, it means that we are using the "x_0=1" convention, i.e., we are adding the intercept to x. If we don't apply this convention, \theta would have n+1 elements (\theta_0,\theta_1,...,\theta_n) while x would have n elements (x_1,x_2,...,x_n), so the notation \theta^T x would not make sense.

In the notes on Generalised Linear Models and in the problem set source code, we use the notation \theta^T x, so we should add the intercept to x. If we don't add it, your solution to the problem set still runs, but the quantity \theta^T x is then given by
\theta^T x = \theta_1 x_1 + ...+\theta_n x_n,
instead of
\theta^T x = \theta_0+\theta_1 x_1 + ...+\theta_n x_n,
which is what \theta^T x symbolises across the entire set of lecture notes (see for example the part about linear regression).

Another observation that convinces me of the need for the intercept comes from comparing the section of the notes on Linear Regression (in which we use the hypothesis h(x)=\theta^T x with the convention x_0=1) with the section "Ordinary least squares" in the part about Generalised Linear Models. In the latter, the notes show that we get the same hypothesis h(x)=\theta^T x, so we must still be using the convention x_0=1. In conclusion, the discussion of Generalised Linear Models in the notes and in the problem set still uses the intercept term.
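
To make the difference concrete, here is a minimal NumPy sketch. The helper add_intercept and the numbers in x and theta are made up for illustration; this is not the repo's util.load_dataset, only a guess at what prepending the x_0=1 column amounts to.

```python
import numpy as np


def add_intercept(x):
    """Hypothetical helper: prepend a column of ones (the x_0 = 1 convention)."""
    ones = np.ones((x.shape[0], 1), dtype=x.dtype)
    return np.hstack([ones, x])


# Illustrative data: 2 examples with n = 2 features each.
x = np.array([[2.0, 3.0],
              [4.0, 5.0]])
# theta has n + 1 = 3 entries: (theta_0, theta_1, theta_2).
theta = np.array([0.5, 1.0, -1.0])

x_aug = add_intercept(x)  # shape (2, 3)

# With the intercept column: theta_0 + theta_1 x_1 + theta_2 x_2
with_intercept = x_aug @ theta

# Without it, only n parameters can be used: theta_1 x_1 + theta_2 x_2
without_intercept = x @ theta[1:]

print(with_intercept)
print(without_intercept)
```

The two results differ by exactly theta_0 for every example, which is the term the model cannot learn when the intercept column is missing.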

@maxim5
Owner

maxim5 commented Feb 11, 2025

I see, makes sense. Thanks for explaining. From the notebook, it also looks like a typo in the problem statement:
https://github.com/maxim5/cs229-2018-autumn/blob/main/problem-sets/PS1/PS1-3%20Poisson%20Regression.ipynb

Could you please update the problem statement code as well:
https://github.com/maxim5/cs229-2018-autumn/blob/main/problem-sets/PS1/src/p03d_poisson.py

And I'd also add a comment like # Original code from Stanford does not include the intercept but it makes more sense to add it.

…Problem Set 1: the intercept should be added when loading the training dataset.
@maxim5
Owner

maxim5 commented Feb 13, 2025

Thanks! I'm happy with both fixes. One last question: what was the change in PS2-5 Kernelizing the Perceptron.ipynb? Hard to tell from the diff or visually.

@lorenzorossi7
Contributor Author

Thank you for reviewing and for making this repo available.

I changed that notebook in the commit 26936f5 above.

The commit message says what I changed:

Fixed bug: the function train_perceptron used to load the ds5_train.csv set as test set, instead of the proper ds5_test.csv set.
Corrected this by replacing line
test_x, test_y = util.load_csv('data/ds5_train.csv')
with line
test_x, test_y = util.load_csv('data/ds5_test.csv')

@maxim5
Owner

maxim5 commented Feb 14, 2025

Thanks, I looked into that commit and still was confused about the notebook. Maybe just revert it? Happy to merge all other changes.

@lorenzorossi7
Contributor Author

I see that there are a bunch of other changes in the metadata of the notebook. I don't really know how they were created, but I think it happened because I ran the notebook on a different machine. I will revert the commit and make a new commit that only replaces the line
test_x, test_y = util.load_csv('data/ds5_train.csv')
with line
test_x, test_y = util.load_csv('data/ds5_test.csv')
in the notebook.
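
As a rough way to check which code actually changed in a notebook when the diff is dominated by metadata, something like the sketch below could help. It relies only on the standard library and on the usual .ipynb JSON layout (a top-level "cells" list whose entries have a "source" field). The function names are hypothetical, and the two in-memory notebooks are toy stand-ins, not the real PS2-5 notebook.

```python
import json


def load_notebook(path):
    """Read an .ipynb file as a plain dict (notebooks are JSON documents)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def cell_sources(nb):
    """Return each cell's source as one string, ignoring metadata and outputs."""
    sources = []
    for cell in nb.get("cells", []):
        src = cell.get("source", "")
        # "source" may be stored as a list of lines or as a single string.
        sources.append("".join(src) if isinstance(src, list) else src)
    return sources


def changed_cells(old_nb, new_nb):
    """List (index, old_source, new_source) for cells whose source differs.

    Cells are compared position by position; added or removed cells beyond
    the shorter notebook are not reported by this simple sketch.
    """
    old, new = cell_sources(old_nb), cell_sources(new_nb)
    return [(i, o, n) for i, (o, n) in enumerate(zip(old, new)) if o != n]


# Toy notebooks: same code cell, different metadata, one changed line.
old_nb = {
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {"cell_type": "code", "metadata": {},
         "source": ["test_x, test_y = util.load_csv('data/ds5_train.csv')\n"]},
        {"cell_type": "markdown", "metadata": {}, "source": ["Some notes\n"]},
    ],
}
new_nb = {
    "metadata": {"kernelspec": {"name": "another-kernel"}},
    "cells": [
        {"cell_type": "code", "metadata": {"collapsed": False},
         "source": ["test_x, test_y = util.load_csv('data/ds5_test.csv')\n"]},
        {"cell_type": "markdown", "metadata": {}, "source": ["Some notes\n"]},
    ],
}

diff = changed_cells(old_nb, new_nb)
for index, old_src, new_src in diff:
    print(f"cell {index}:\n- {old_src}+ {new_src}")
```

For real files, the two dicts would come from load_notebook called on the old and new versions of the notebook.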

@maxim5 maxim5 merged commit 1a6b01b into maxim5:main Feb 14, 2025
@lorenzorossi7
Contributor Author

The previous commit only reverted the old commit, but did not replace the line
test_x, test_y = util.load_csv('data/ds5_train.csv')
with the correct line
test_x, test_y = util.load_csv('data/ds5_test.csv')
in the notebook.
I wanted to keep these changes in two separate commits to be tidy.
I see that you have already merged the two branches, so I will open a new pull request to merge the code in which this line is fixed into your main branch.
Moreover, once the correct test set is loaded in that notebook, the plots shown there also need to be updated: they currently show the training set, but the intended behaviour of the function train_perceptron is to show the decision boundary on the test set. I will fix this in the next pull request and describe it there again.

@lorenzorossi7
Contributor Author

This final fix is in the first commit (e177ad5) of this other Pull Request: #11

The other commit (24e0528) of that other PR fixes a different problem with some Jupyter Notebooks.
