Polynomial Regression Tutorial
This tutorial uses a dataset stored in a CSV file. You can obtain the CSV file I'm working with here. The file describes properties of a house that matter when deciding to purchase one, such as its size, price, number of bathrooms, and number of bedrooms.
Once you have downloaded the CSV file, we can start preparing the data for use with the framework.
// Obtain data from the CSV file.
// Bundle(for:) expects a class: type(of: self) works inside a class instance;
// otherwise pass a class from the same bundle, e.g. YourClass.self.
let path = Bundle(for: type(of: self)).path(forResource: "kc_house_data", ofType: "csv")
let csvUrl = URL(fileURLWithPath: path!)
let file = try! String(contentsOf: csvUrl, encoding: .utf8)
let data = CSVReader(with: file)
For this example I will be using two features: Square Feet Living (the "sqft_living" column of the CSV) and Number of Bedrooms (the "bedrooms" column of the CSV). To obtain these two columns we look them up by name in columns, like so:
// Look up the feature columns we need (the values are still strings at this point)
let training_data_string = data.columns["sqft_living"]!
let training_data_2_string = data.columns["bedrooms"]!
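Since MLKit primarily uses Floats, we convert the training data to type Float. The training call further down also expects an output_data array, which is not defined in the snippets here; presumably it is the "price" column converted the same way:
// Features
let training_data = training_data_string.map { Float($0)! }
let training_data_2 = training_data_2_string.map { Float($0)! }
// Output (assumed to be the "price" column; not shown in the original snippets)
let output_data = data.columns["price"]!.map { Float($0)! }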
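The force-unwrapped Float($0)! conversion above crashes if any cell is empty or malformed. As a minimal sketch of a safer alternative, the helper below returns nil when any entry fails to parse. The name floats(from:) is hypothetical and not part of MLKit.

```swift
import Foundation

// Hypothetical helper (not part of MLKit): converts a CSV column of strings
// to Floats, returning nil if any entry cannot be parsed.
func floats(from column: [String]) -> [Float]? {
    var result: [Float] = []
    result.reserveCapacity(column.count)
    for entry in column {
        // Trim surrounding spaces and tabs before parsing
        guard let value = Float(entry.trimmingCharacters(in: .whitespaces)) else {
            return nil
        }
        result.append(value)
    }
    return result
}
```

It could be used as `guard let training_data = floats(from: training_data_string) else { ... }`, so a bad row surfaces as a handled error instead of a crash.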
Now that we have extracted our features, it's time to train our model. To do so we must instantiate a PolynomialLinearRegression object.
let polynomialModel = PolynomialLinearRegression()
Next, we need to instantiate our weights. The weights chosen here are arbitrary. The last weight, 1.0, is the intercept. Here we are using the Matrix class, which comes from the Upsurge framework.
let initial_weights = Matrix<Float>(rows: 3, columns: 1, elements: [-100000.0, 1.0, 1.0])
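To make the role of the three weights concrete, here is a minimal sketch of the prediction they presumably encode. It assumes the model is linear in the two features, with one weight per feature plus an intercept. The function predictPrice and its parameter names are hypothetical and not part of MLKit; check the framework's source for how the entries of the weights matrix map onto the features and the intercept.

```swift
// Hypothetical illustration (not MLKit API): a model with one weight per
// feature plus an intercept predicts
//   price = intercept + sqftWeight * sqftLiving + bedroomWeight * bedrooms
func predictPrice(sqftLiving: Float, bedrooms: Float,
                  intercept: Float, sqftWeight: Float, bedroomWeight: Float) -> Float {
    return intercept + sqftWeight * sqftLiving + bedroomWeight * bedrooms
}
```

Training adjusts these three numbers so that predictions like this one get closer to the observed prices.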
The last step is to train the model. It's just one line of code.
let weights = try! polynomialModel.train([training_data, training_data_2], output: output_data, initialWeights: initial_weights, stepSize: Float(4e-12), tolerance: Float(1e9))
Your new weights are available in the weights variable above.
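For intuition about the stepSize and tolerance arguments passed to train, here is a minimal single-feature gradient descent in plain Swift. It is a sketch under the assumption that the framework uses a similar update rule and stops once the gradient magnitude falls below tolerance. The names fitLine and maxIterations are hypothetical, and MLKit's actual implementation may differ.

```swift
// Illustrative only (not MLKit's implementation): fits y ≈ intercept + slope * x
// by gradient descent on the residual sum of squares.
func fitLine(x: [Float], y: [Float], stepSize: Float, tolerance: Float,
             maxIterations: Int = 100_000) -> (intercept: Float, slope: Float) {
    var intercept: Float = 0
    var slope: Float = 0
    for _ in 0..<maxIterations {
        // Accumulate the gradient of the squared error over all rows
        var gradIntercept: Float = 0
        var gradSlope: Float = 0
        for (xi, yi) in zip(x, y) {
            let error = intercept + slope * xi - yi
            gradIntercept += 2 * error
            gradSlope += 2 * error * xi
        }
        // Stop once the gradient is small enough
        let magnitude = (gradIntercept * gradIntercept + gradSlope * gradSlope).squareRoot()
        if magnitude < tolerance { break }
        // Step against the gradient, scaled by the step size
        intercept -= stepSize * gradIntercept
        slope -= stepSize * gradSlope
    }
    return (intercept, slope)
}

// Data generated from y = 1 + 2x
let fit = fitLine(x: [0, 1, 2, 3], y: [1, 3, 5, 7], stepSize: 0.01, tolerance: 1e-3)
```

With raw features such as square footage in the thousands and many rows, the gradient sums become very large, which would explain why the call above uses a step size as small as 4e-12 and a tolerance as large as 1e9.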