Commit 785d175

Merge pull request #27 from sethigeet/handle_empty_matrix_in_broadcast
Handle empty matrices in broadcast
2 parents 4afe623 + 4401ebe commit 785d175

File tree

10 files changed: +132 −88 lines changed

Contributing.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@

# Contribution Guidelines

Please note that this project is released as part of **HELLO-FOSS**.<br>
By participating in this project you agree to abide by its terms.

If you would like to contribute to the project, please follow these guidelines:

1. Fork the original WnCC repository to your personal account.
2. Clone the forked repository locally.
3. Create a new branch for your feature or bug fix.
4. Make the necessary changes and commit them.
5. Push your changes to your forked repository.
6. Submit a pull request to the main repository from your branch, explaining the changes you made and any additional information that might be helpful for review.

# Usage
> Clone the Git repository:

```shell
# Clone your fork of the GitHub repo
git clone https://github.com/your_username/Hello-Foss-CPP.git
```

> Follow the installation and compilation steps provided in the Introduction page and [README.md](README.md). <br>

By following these guidelines, you help maintain the quality and organization of the project!<br>

## Resources
- [Parallelization with MPI and OpenMPI](http://compphysics.github.io/ComputationalPhysics2/doc/LectureNotes/_build/html/parallelization.html#)
- [OpenMP](https://medium.com/swlh/openmp-on-ubuntu-1145355eeb2)<br>

***HAPPY LEARNING 😀😀😀***

README.md

Lines changed: 3 additions & 5 deletions
```diff
@@ -146,8 +146,6 @@ Here you can find out more about `MPICH`: [https://www.mpich.org/](https://www.mpich.org/)
 | |-- wrong_sum.cpp (Demo of a wrong summation example for learning purposes)
 |-- Application
 | |-- page-rank.cpp
-
-## Resources
-- [Parallelization with MPI and OpenMPI](http://compphysics.github.io/ComputationalPhysics2/doc/LectureNotes/_build/html/parallelization.html#)
-- [OpenMP](https://medium.com/swlh/openmp-on-ubuntu-1145355eeb2)
-
+### Note
+> Information about Functions in main is provided in [README.md](src/README.md) <br>
+> For contributing to this repo kindly go through the guidelines provided in [Contributing.md](Contributing.md)
```

concat

17 KB
Binary file not shown.

src/Contributing.md

Lines changed: 0 additions & 21 deletions
This file was deleted.

src/README.md

Lines changed: 11 additions & 11 deletions
````diff
@@ -5,50 +5,50 @@ A list of functions that have been implemented can be found here :-
 >This C++ code implements LU factorization using OpenMP for parallel execution of matrix updates. It optimizes the decomposition by distributing computations for the lower (L) and upper (U) triangular matrices across multiple threads.

 ### 2) Maximum element search
->The code for this function can be found in [max.cpp](max.cpp), and input for the following can be found in input.cpp
+>The code for this function can be found in [max.cpp](src/max.cpp), and input for the following can be found in input.cpp
 The code uses OpenMP for parallel programming to find the maximum element in an array. The search is distributed across multiple threads, improving performance by dividing the workload.

 ### 3) Matrix Matrix Multiplication
->The code for the following function can be found in [mm.cpp](mm.cpp)<br>
+>The code for the following function can be found in [mm.cpp](src/mm.cpp)<br>
 This code performs matrix-matrix multiplication using OpenMP to parallelize the computation across multiple threads. It optimizes the multiplication process for large matrices, reducing execution time by distributing the workload across available CPU cores.

 ### 4) Montecarlo Method
->The code for the following function can be found in [montecarlo.cpp](montecarlo.cpp)<br>
+>The code for the following function can be found in [montecarlo.cpp](src/montecarlo.cpp)<br>
 The code estimates the value of Pi using the Monte Carlo method with OpenMP for parallel processing. It simulates random points within a unit square and counts how many fall within the unit circle, then uses multiple threads to improve performance and speed up the estimation process.

 ### 5) Matrix Vector Multiplication
->The code for the following function can be found in [mv.cpp](mv.cpp)<br>
+>The code for the following function can be found in [mv.cpp](src/mv.cpp)<br>
 The code performs matrix-vector multiplication using OpenMP for parallel processing. The dynamic scheduling with a chunk size of 16 distributes the computation of each row of the matrix across multiple threads, optimizing the execution for large-scale data by balancing the load dynamically.

 ### 6) Product of elements of an array
->The code for the following function can be found in [prod.cpp](prod.cpp)<br>
+>The code for the following function can be found in [prod.cpp](src/prod.cpp)<br>
 This C++ code calculates the product of elements in an array using OpenMP to parallelize the computation. It optimizes large product calculations by summing the logarithms of array elements in parallel and exponentiating the result to obtain the final product, reducing potential overflow risks.

 ### 7) Pi reduction
->The code for the following function can be found in [pi-reduction.cpp](pi-reduction.cpp)<br>
+>The code for the following function can be found in [pi-reduction.cpp](src/pi-reduction.cpp)<br>
 This C++ code estimates the value of Pi using numerical integration with the OpenMP library for parallelization. It divides the computation of the integral into multiple threads, summing partial results in parallel using a reduction clause to optimize the performance and accuracy when calculating Pi across a large number of steps.

 ### 8) Calculation of Standard Deviation
->The code for the following function can be found in [standard_dev.cpp](standard_dev.cpp)<br>
+>The code for the following function can be found in [standard_dev.cpp](src/standard_dev.cpp)<br>
 This C++ code calculates the standard deviation of a dataset using OpenMP for parallel processing. It first computes the mean in parallel, then calculates the variance by summing the squared differences from the mean, distributing both tasks across multiple threads to improve performance with large datasets.

 ### 9) Sum of elements of an array
->The code for the following function can be found in [sum2.cpp](sum2.cpp) <br>
+>The code for the following function can be found in [sum2.cpp](src/sum2.cpp) <br>
 This C++ code computes the sum of a large array (with 10 million elements) in parallel using OpenMP. It divides the workload among multiple threads based on the total number of threads, each thread calculates a partial sum, and the results are combined in a critical section to avoid race conditions. The execution time for the sum computation is also measured and displayed.

 ### 10) Vector-Vector Dot product calculation
->The code for the following function can be found in [vvd.cpp](vvd.cpp) <br>
+>The code for the following function can be found in [vvd.cpp](src/vvd.cpp) <br>
 This C++ code calculates the dot product of two arrays using OpenMP for parallelization. It initializes two arrays, A and B, each containing 1000 elements set to 1. The dot product is computed in parallel using a dynamic scheduling strategy, with a chunk size of 100, and the results are combined using a reduction operation. The final result is printed to the console.

 ### 11) Sum calculation (wrong as pragma barrier is not calculated)
->The code for the following function can be found in [wrong_sum.cpp](wrong.cpp)<br>
+>The code for the following function can be found in [wrong_sum.cpp](src/wrong.cpp)<br>
 This C++ code computes the sum of an array using OpenMP with task-based parallelism. It initializes an array of size 600 with all elements set to 1. The code divides the summation task into segments of size 100, allowing multiple threads to process these segments concurrently. The results from each task are accumulated into a shared variable sum using a critical section to prevent data races.

 ## 0.2) Compilation
 >
 ```shell
 # compile using g++ for Openmp
-g++ - sum2.cpp -o sum2
+g++ sum2.cpp -o sum2 -fopenmp
 ./sum2

 # compile using g++ for MPI
````

src/broadcast.cpp

Lines changed: 46 additions & 42 deletions
```diff
@@ -1,56 +1,60 @@
-#include <iostream>
 #include <omp.h>
-#include <vector>
+
+#include <iostream>
 #include <stdexcept>
+#include <vector>

 // Function to broadcast two matrices
-void broadcast(const std::vector<std::vector<int>>& A, std::vector<std::vector<int>>& B) {
-    size_t rowsA = A.size();
-    size_t colsA = A[0].size();
-    size_t rowsB = B.size();
-    size_t colsB = B[0].size();
-
-    if (rowsA != rowsB && rowsB != 1) {
-        throw std::invalid_argument("Incompatible dimensions for broadcasting");
-    }
-    if (colsA != colsB && colsB != 1) {
-        throw std::invalid_argument("Incompatible dimensions for broadcasting");
-    }
+void broadcast(const std::vector<std::vector<int>>& A,
+               std::vector<std::vector<int>>& B) {
+  size_t rowsA = A.size();
+  size_t colsA = A[0].size();
+  size_t rowsB = B.size();
+  size_t colsB = B[0].size();

-    if (rowsB == 1) {
-        B.resize(rowsA, B[0]);
-    }
-    if (colsB == 1) {
-        for (auto& row : B) {
-            row.resize(colsA, row[0]);
-        }
+  if (rowsA == 0 || colsA == 0 || rowsB == 0 || colsB == 0) {
+    throw std::invalid_argument("Empty matrix cannot be broadcasted");
+  }
+  if (rowsA != rowsB && rowsB != 1) {
+    throw std::invalid_argument("Incompatible dimensions for broadcasting");
+  }
+  if (colsA != colsB && colsB != 1) {
+    throw std::invalid_argument("Incompatible dimensions for broadcasting");
+  }
+
+  if (rowsB == 1) {
+    B.resize(rowsA, B[0]);
+  }
+  if (colsB == 1) {
+    for (auto& row : B) {
+      row.resize(colsA, row[0]);
     }
+  }

-
-    #pragma omp parallel for
-    for (size_t i = 0; i < rowsA; i++) {
-        #pragma omp parallel for
-        for (size_t j = 0; j < colsA; j++) {
-            B[i][j] = A[i][j]+B[i][j];
-        }
+#pragma omp parallel for
+  for (size_t i = 0; i < rowsA; i++) {
+#pragma omp parallel for
+    for (size_t j = 0; j < colsA; j++) {
+      B[i][j] = A[i][j] + B[i][j];
     }
+  }
 }

 int main() {
-    std::vector<std::vector<int>> A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
-    std::vector<std::vector<int>> B = {{1,2,3}}; // B has only one row
-
-    try {
-        broadcast(A, B);
-        for (const auto& row : B) {
-            for (const auto& elem : row) {
-                std::cout << elem << " ";
-            }
-            std::cout << std::endl;
-        }
-    } catch (const std::invalid_argument& e) {
-        std::cerr << "Error: " << e.what() << std::endl;
+  std::vector<std::vector<int>> A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
+  std::vector<std::vector<int>> B = {{1, 2, 3}};  // B has only one row
+
+  try {
+    broadcast(A, B);
+    for (const auto& row : B) {
+      for (const auto& elem : row) {
+        std::cout << elem << " ";
+      }
+      std::cout << std::endl;
     }
+  } catch (const std::invalid_argument& e) {
+    std::cerr << "Error: " << e.what() << std::endl;
+  }

-    return 0;
+  return 0;
 }
```
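
One detail worth flagging for reviewers: even with the new guard, `A[0].size()` and `B[0].size()` are evaluated before the emptiness check runs, so a matrix with zero rows is still indexed out of range. A minimal sketch of a safer ordering follows; it is hypothetical, not part of this commit, and `broadcastChecked` is an illustrative name:

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical sketch: test for empty rows before touching A[0] or B[0].
void broadcastChecked(const std::vector<std::vector<int>>& A,
                      std::vector<std::vector<int>>& B) {
  if (A.empty() || B.empty() || A[0].empty() || B[0].empty()) {
    throw std::invalid_argument("Empty matrix cannot be broadcasted");
  }
  // Only now is it safe to read the dimensions.
  size_t rowsA = A.size(), colsA = A[0].size();
  size_t rowsB = B.size(), colsB = B[0].size();
  if ((rowsA != rowsB && rowsB != 1) || (colsA != colsB && colsB != 1)) {
    throw std::invalid_argument("Incompatible dimensions for broadcasting");
  }
  // Broadcasting and the element-wise sum would proceed exactly as in
  // broadcast.cpp above, now safe from out-of-range access.
}
```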

src/concatenate.cpp

Lines changed: 18 additions & 5 deletions
```diff
@@ -5,7 +5,7 @@
 using namespace std;

 // Function to concatenate two arrays in parallel
-void concatenate(int arr1[], int arr2[], int n1, int n2, int arr3[]) {
+void concatenate(int* arr1, int* arr2, int n1, int n2, int* arr3) {
     #pragma omp parallel for
     for (int i = 0; i < n1; i++) {
         arr3[i] = arr1[i];
@@ -18,10 +18,19 @@ void concatenate(int arr1[], int arr2[], int n1, int n2, int arr3[]) {
 }

 int main() {
-    int n1 = 5, n2 = 5;
-    int arr1[n1] = {1, 2, 3, 4, 5};
-    int arr2[n2] = {6, 7, 8, 9, 10};
-    int arr3[n1 + n2];
+
+    int n1 = 5000000, n2 = 5000000;
+
+    int* arr1 = new int[n1];
+    int* arr2 = new int[n2];
+    int* arr3 = new int[n1+n2];
+
+    // initialisation
+    for(int i = 0; i < n1; i++)
+    {
+        arr1[i] = 0;
+        arr2[i] = 1;
+    }

     // Set the number of threads
     omp_set_num_threads(2);
@@ -35,5 +44,9 @@ int main() {
     }
     cout << endl;

+    delete[] arr1;
+    delete[] arr2;
+    delete[] arr3;
+
     return 0;
 }
```
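
The switch from fixed stack arrays to `new`/`delete[]` is what lets the test grow to five million elements per array. A hypothetical alternative, not part of this commit, gets the same effect with `std::vector`, which sizes the buffers at runtime and frees them automatically:

```cpp
#include <iostream>
#include <vector>

// Hypothetical sketch: vector-based concatenation, same parallel copy loops
// as concatenate.cpp but without manual new/delete.
std::vector<int> concatenate(const std::vector<int>& a,
                             const std::vector<int>& b) {
    std::vector<int> out(a.size() + b.size());
    int n1 = static_cast<int>(a.size());
    int n2 = static_cast<int>(b.size());
    #pragma omp parallel for
    for (int i = 0; i < n1; i++) out[i] = a[i];
    #pragma omp parallel for
    for (int i = 0; i < n2; i++) out[n1 + i] = b[i];
    return out;
}

int main() {
    std::vector<int> a(5000000, 0), b(5000000, 1);
    std::vector<int> c = concatenate(a, b);
    std::cout << c.front() << " ... " << c.back() << std::endl;
    return 0;
}
```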

src/max.cpp

Lines changed: 9 additions & 3 deletions
```diff
@@ -18,14 +18,13 @@ int findMax(int elementsToProcess, const std::vector<int>& data) {

 int main(int argc, char** argv) {
     MPI_Init(&argc, &argv);
-    double time = MPI_Wtime(); // Start timing
-
     int processorsNr;
     MPI_Comm_size(MPI_COMM_WORLD, &processorsNr);
     int processId;
     MPI_Comm_rank(MPI_COMM_WORLD, &processId);

     int buff;
+    double time;

     if (processId == 0) {
         int max = 0; // Global maximum
@@ -66,6 +65,10 @@ int main(int argc, char** argv) {
             MPI_Send(data.data() + startIdx, buff, MPI_INT, i, 2, MPI_COMM_WORLD);
         }

+        // Synchronize all processes before starting the timing
+        MPI_Barrier(MPI_COMM_WORLD);
+        time = MPI_Wtime(); // Start timing
+
         // Master process finds its own max
         max = findMax(elementsToEachProcess + remainder, data);

@@ -78,8 +81,8 @@ int main(int argc, char** argv) {
             }
         }

-        std::cout << "Global maximum is " << max << std::endl;
         time = MPI_Wtime() - time; // Stop timing
+        std::cout << "Global maximum is " << max << std::endl;
         std::cout << "Time elapsed: " << time << " seconds" << std::endl;
     }

@@ -89,6 +92,9 @@ int main(int argc, char** argv) {
         std::vector<int> dataToProcess(buff);
         MPI_Recv(dataToProcess.data(), buff, MPI_INT, 0, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

+        // Synchronize before starting the computation
+        MPI_Barrier(MPI_COMM_WORLD);
+
         int theMax = findMax(buff, dataToProcess);

         // Send local max to master process
```
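
The point of this change is that `MPI_Wtime()` now starts after data distribution and after all ranks have synchronized, so the measurement covers only the search itself rather than startup and sends. A minimal self-contained sketch of the barrier-then-time pattern, hypothetical and separate from the commit:

```cpp
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // All ranks reach this point before the clock starts, so slow setup on
    // one rank cannot inflate another rank's measurement.
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    // ... the timed computation would go here ...

    double elapsed = MPI_Wtime() - t0;
    if (rank == 0) {
        std::cout << "Time elapsed: " << elapsed << " seconds" << std::endl;
    }
    MPI_Finalize();
    return 0;
}
```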

src/montecarlo.cpp

Lines changed: 9 additions & 0 deletions
```diff
@@ -2,16 +2,25 @@
 #include <iostream>
 #include <omp.h>
 #include <cstdlib>
+#include <chrono>
 using namespace std;

+unsigned long long getCurrentTimeInMilliseconds() {
+    return std::chrono::duration_cast<std::chrono::milliseconds>(
+        std::chrono::high_resolution_clock::now().time_since_epoch()).count();
+}
+
 void montecarlo(int n, int num_threads) {
     int pCircle = 0, pSquare = 0;
     double x, y, d;
     int i;
+    unsigned long long seed = getCurrentTimeInMilliseconds();
+    srand(seed); // Seeding with a modified value

     // Parallelize the loop using OpenMP
     #pragma omp parallel for private(x, y, d, i) reduction(+:pCircle, pSquare) num_threads(num_threads)
     for(i = 0; i < n; i++) {
+
         // Generate random points between 0 and 1
         x = (double)rand() / RAND_MAX;
         y = (double)rand() / RAND_MAX;
```
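
Seeding `srand` from the wall clock makes successive runs differ, but `rand()` still shares one global state across all OpenMP threads, and it is not guaranteed to be thread-safe. A hypothetical per-thread-generator variant, not from this repository (`estimatePi` is an illustrative name), avoids that shared state:

```cpp
#include <omp.h>
#include <iostream>
#include <random>

// Hypothetical sketch: each thread owns a std::mt19937, so no global RNG
// state is shared between threads.
double estimatePi(int n) {
    int inCircle = 0;
    #pragma omp parallel reduction(+:inCircle)
    {
        // Seed per thread so the streams differ.
        std::mt19937 gen(std::random_device{}() + omp_get_thread_num());
        std::uniform_real_distribution<double> dist(0.0, 1.0);
        #pragma omp for
        for (int i = 0; i < n; i++) {
            double x = dist(gen), y = dist(gen);
            if (x * x + y * y <= 1.0) inCircle++;
        }
    }
    return 4.0 * inCircle / n;
}

int main() {
    std::cout << estimatePi(10000000) << std::endl;
    return 0;
}
```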

src/wrong_sum.cpp

Lines changed: 1 addition & 1 deletion
```diff
@@ -7,7 +7,7 @@ int parallel_sum(const int arr[], int size, int step_size) {

     #pragma omp parallel
     {
-        #pragma omp for nowait
+
         for (int i = 0; i < size; i += step_size) {
             int start = i, end = i + step_size - 1;

```
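For context, this is the repository's deliberately wrong summation demo. With the `for` worksharing pragma removed, every thread in the parallel region now executes the whole loop, so each segment is handled once per thread. Assuming the loop body spawns one task per segment, as src/README.md describes, a corrected pattern would let a single thread create the tasks; a hypothetical sketch (`parallel_sum_tasks` is an illustrative name, not the file's code):

```cpp
#include <iostream>

// Hypothetical sketch: one thread creates the tasks, all threads run them.
int parallel_sum_tasks(const int arr[], int size, int step_size) {
    int sum = 0;
    #pragma omp parallel
    #pragma omp single
    for (int i = 0; i < size; i += step_size) {
        #pragma omp task shared(sum)
        {
            int partial = 0;
            for (int j = i; j < i + step_size && j < size; j++) partial += arr[j];
            #pragma omp critical
            sum += partial;  // critical section prevents a data race on sum
        }
    }
    return sum;  // tasks finish at the barrier ending the parallel region
}

int main() {
    int arr[600];
    for (int i = 0; i < 600; i++) arr[i] = 1;
    std::cout << parallel_sum_tasks(arr, 600, 100) << std::endl;  // expect 600
    return 0;
}
```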