Read Me

This repository holds the peer assessment project for Coursera's Getting and Cleaning Data course. Its main purpose is to provide a script (run_analysis.R) that turns the Human Activity Recognition Using Smartphones data set into a tidy data set by applying the following transformations:

1. Merge the training and the test sets to create one data set.
2. Extract only the measurements on the mean and standard deviation for each measurement.
3. Use descriptive activity names to name the activities in the data set.
4. Appropriately label the data set with descriptive activity names.
5. Create a second, independent tidy data set with the average of each variable for each activity and each subject.
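The last transformation (one average per variable, activity and subject) can be sketched in base R with `aggregate`. This is only an illustration on made-up data: the `toy` data frame, its column names and its values are assumptions, and run_analysis.R may compute the averages differently.

```r
# Toy data: every value and column name here is made up for illustration.
toy <- data.frame(
  subject = c(1, 1, 1, 2, 2),
  activity = c("WALKING", "WALKING", "SITTING", "WALKING", "WALKING"),
  tBodyAccMeanX = c(0.2, 0.4, 0.1, 0.3, 0.5),
  tBodyAccStdX = c(-0.9, -0.7, -0.8, -0.6, -0.4)
)

# "." stands for every column that is not a grouping variable, so each
# measurement is averaged within each (subject, activity) pair.
tidy_means <- aggregate(. ~ subject + activity, data = toy, FUN = mean)

print(tidy_means)
```

The result has one row per subject and activity pair that actually occurs in the data, and one column per averaged measurement.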

How to recreate dataset.csv.txt

To recreate the data set:

1. Clone this repository using git.
2. Set your R working directory to the path of the cloned repository (using setwd or, in RStudio, by clicking Session -> Set Working Directory -> Choose Directory...).
3. Run source("run_analysis.R").

This recreates dataset.csv.txt from the raw data files in the "UCI HAR Dataset" directory.
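The working-directory and source steps can be wrapped in a small helper, sketched below. The function name `recreate_dataset` and its `repo_dir` argument are hypothetical and not part of this repository. The sketch assumes the script reads its inputs and writes its output relative to the working directory.

```r
# Hypothetical helper: run the script from inside the cloned repository
# and report whether the output file exists afterwards.
recreate_dataset <- function(repo_dir) {
  stopifnot(dir.exists(repo_dir))
  old_wd <- setwd(repo_dir)   # setwd() returns the previous directory
  on.exit(setwd(old_wd))      # restore it even if the script fails
  source("run_analysis.R")
  file.exists("dataset.csv.txt")
}
```

A call would look like `recreate_dataset("path/to/your/clone")`, where the path is a placeholder for wherever the repository was cloned.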

What this script does

1. Loads the training and test features, labels and subjects from the /UCI HAR Dataset/train/ and /UCI HAR Dataset/test/ directories.
2. Merges the training and test features into one data frame, the training and test labels into another, and the training and test subjects into a third.
3. Parses the /UCI HAR Dataset/features.txt file and adds the feature names to the features data frame.
4. Selects only the features that measure means or standard deviations and creates a new features data frame containing only those features.
5. Parses /UCI HAR Dataset/activity_labels.txt and replaces the code values in the labels data frame with their text versions.
6. Uses regular expressions to replace the column names of the features data frame with more descriptive ones by expanding abbreviations into complete words.
7. Joins the subject and labels data frames into a single data frame and adds the right column names.
8. Merges the data frame created in step 7 with the features data frame.
9. Writes the tidy data set to the dataset.csv.txt file.
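The steps above can be sketched end to end on a few made-up rows. Everything in this example is an assumption made for illustration: the toy values, the `expand_name` helper, the regular expressions and the use of `write.csv` are not taken from run_analysis.R, which may implement each step differently. The output goes to a temporary directory so that nothing in the repository is overwritten.

```r
# Stand-ins for the parsed features.txt and activity_labels.txt (made up).
feature_names <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")
activity_labels <- data.frame(code = 1:2, label = c("WALKING", "SITTING"))

# Steps 1-3: merge the training and test rows, then name the columns.
train <- data.frame(matrix(c(0.1, 0.2, -0.9, -0.8, 0.5, 0.6), nrow = 2))
test <- data.frame(matrix(c(0.3, -0.7, 0.7), nrow = 1))
features <- rbind(train, test)
names(features) <- feature_names
labels <- data.frame(code = c(1, 2, 1))
subjects <- data.frame(subject = c(1, 1, 2))

# Step 4: keep only the mean() and std() measurements.
keep <- grepl("mean\\(\\)|std\\(\\)", feature_names)
features <- features[, keep, drop = FALSE]

# Step 5: replace activity codes with their text labels.
activity <- activity_labels$label[match(labels$code, activity_labels$code)]

# Step 6: expand abbreviations (hypothetical rules, not the repository's).
expand_name <- function(x) {
  x <- sub("^t", "time", x)
  x <- sub("Acc", "Acceleration", x)
  x <- sub("-mean\\(\\)", "Mean", x)
  x <- sub("-std\\(\\)", "StandardDeviation", x)
  gsub("-", "", x)
}
names(features) <- expand_name(names(features))

# Steps 7-9: join subjects, activities and features, then write the result.
tidy <- cbind(subjects, activity = activity, features)
out_file <- file.path(tempdir(), "dataset.csv.txt")
write.csv(tidy, out_file, row.names = FALSE)
```

Using `match` for the label lookup keeps the rows in their original order, which matters because subjects, labels and features are aligned by row position rather than by a shared key.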

mlespiau/getdata-002-runanalysis