Commit fe63105

Update README.md
1 parent d291f00 commit fe63105

1 file changed, 31 additions and 7 deletions


README.md

@@ -1,8 +1,26 @@
11
# Parsing Text Using Map-Reduce Programming Model
22

3-
The evolution of big data systems is based on the foundational programming paradigm of Map-Reduce, which involves large-scale data processing on a network of commodity hardware. This project illustrates an implementation of map-reduce and parallelizes the process.
3+
### TABLE OF CONTENTS
4+
* [Objective](#objective)
5+
* [Technologies](#technologies)
6+
* [Data](#data)
7+
* [Map-Reduce](#map-reduce)
8+
* [Implementation](#implementation)
9+
* [Results](#results)
410

5-
<ins>**Concept of Map-Reduce**</ins>:
11+
## OBJECTIVE
12+
Process text and count the occurrences of each word using the map-reduce concept, mimicking Hadoop infrastructure with parallel processing. Multi-threading is used to execute two mapper and reducer functions.
13+
14+
## TECHNOLOGIES
15+
Project is created with:
16+
* Python - Multi-Threading
17+
18+
## DATA
19+
The data is made available [here](https://github.com/skotak2/Pasrsing-Text-with-MapReduce-programming-Paradigm-with-multithreading/blob/master/Data/Data.txt)
20+
21+
![GitHub Logo](https://github.com/skotak2/Pasrsing-Text-with-MapReduce-programming-Paradigm-with-multithreading/blob/master/Images/input.jpg)
22+
23+
## MAP REDUCE
624

725
Consider the following Text - "I am a human being. I am a Data Scientist"
826

@@ -29,16 +47,22 @@ Consider the following Text - "I am a human being. I am a Data Scientist"
2947

3048
Here we use multithreading to parallelize the process. The Map-Reduce job is divided into sub-tasks that run in parallel, and their sub-totals are aggregated into the final output. Both the mapping of keys to values and the subsequent aggregation through reducers are carried out by the threads.
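
A minimal sketch of this idea, run on the example sentence above. It is not the code from `mapreduce.py`: the function names `count_words` and `parallel_word_count`, the use of `ThreadPoolExecutor`, and the crude chunking and punctuation handling are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Sub-task: return the word sub-totals for one chunk of text."""
    # Assumption: lower-case everything and drop full stops before splitting.
    return Counter(chunk.lower().replace(".", "").split())


def parallel_word_count(text, workers=2):
    """Split the text into chunks, count each in a thread, merge the sub-totals."""
    words = text.split()
    # Ceiling division so that at most `workers` chunks are produced.
    size = max(1, -(-len(words) // workers))
    chunks = [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        subtotals = list(pool.map(count_words, chunks))

    # Aggregate the per-chunk sub-totals into the final output.
    total = Counter()
    for subtotal in subtotals:
        total.update(subtotal)
    return total


counts = parallel_word_count("I am a human being. I am a Data Scientist")
print(counts.most_common())
```

Each thread only ever touches its own chunk and returns its own `Counter`, so no locking is needed until the sub-totals are merged in the main thread.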
3149

32-
<ins>**Implementation:**</ins>
50+
51+
## IMPLEMENTATION
3352

3453
With the above concept in place, we implement the setup in the following steps:
3554

36-
**Step1** : Map key-value pairs with multiple mappers
55+
*Step1* : Map key-value pairs with multiple mappers
3756

38-
**Step2** : Sort the values and load them into the partition holder
57+
*Step2* : Sort the values and load them into the partition holder
3958

40-
**Step3** : Multiple reducers pick from the partitions and aggregate them
59+
*Step3* : Multiple reducers pick from the partitions and aggregate them
4160

4261
The above steps yield a list of outputs from the reducers, which can be concatenated and loaded into a dataframe or a spreadsheet.
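
The three steps can be sketched roughly as follows. This is an illustrative outline, not the repository's `mapreduce.py`: the helper names (`mapper`, `partition`, `reducer`, `run_threads`, `word_count`), the first-character routing rule, and the punctuation stripping are all hypothetical choices.

```python
import threading
from itertools import groupby


def mapper(lines, out):
    """Step 1: emit a (word, 1) pair for every word in the given lines."""
    for line in lines:
        for word in line.lower().split():
            word = word.strip(".,;:!?\"'")
            if word:
                out.append((word, 1))


def partition(pairs, n):
    """Step 2: sort the pairs and load each key into one of n partitions."""
    parts = [[] for _ in range(n)]
    for word, count in sorted(pairs):
        # Hypothetical routing rule: pick the partition from the first character.
        parts[ord(word[0]) % n].append((word, count))
    return parts


def reducer(part, out):
    """Step 3: sum the counts per word in an already sorted partition."""
    for word, group in groupby(part, key=lambda pair: pair[0]):
        out.append((word, sum(count for _, count in group)))


def run_threads(target, inputs):
    """Run one thread per input; each thread writes to its own output list."""
    outputs = [[] for _ in inputs]
    threads = [
        threading.Thread(target=target, args=(inp, out))
        for inp, out in zip(inputs, outputs)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outputs


def word_count(lines, n=2):
    """Run n mappers and n reducers, then concatenate the reducer outputs."""
    mapped = run_threads(mapper, [lines[i::n] for i in range(n)])
    pairs = [pair for out in mapped for pair in out]
    reduced = run_threads(reducer, partition(pairs, n))
    return sorted(pair for out in reduced for pair in out)


result = word_count(["I am a human being.", "I am a Data Scientist"])
print(result)
```

The concatenated `result` is a plain list of `(word, count)` tuples, which is a convenient shape for loading into a dataframe or writing out as spreadsheet rows.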
4362

44-
The code is available on - "mapreduce.py"
63+
64+
## RESULTS
65+
The word counts produced by the reducers are shown below.
66+
67+
![GitHub Logo](https://github.com/skotak2/Pasrsing-Text-with-MapReduce-programming-Paradigm-with-multithreading/blob/master/Images/output.jpg)
68+
