Commit 785d175

Merge pull request #27 from sethigeet/handle_empty_matrix_in_broadcast
Handle empty matrices in broadcast
2 parents 4afe623 + 4401ebe commit 785d175

File tree

10 files changed: +132 −88 lines changed

Contributing.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@

# Contribution Guidelines

Please note that this project is released as part of **HELLO-FOSS**.<br>
By participating in this project you agree to abide by its terms.

If you would like to contribute to the project, please follow these guidelines:

1. Fork the original WnCC repository to your personal account.
2. Clone the forked repository locally.
3. Create a new branch for your feature or bug fix.
4. Make the necessary changes and commit them.
5. Push your changes to your forked repository.
6. Submit a pull request to the main repository from your branch, explaining the changes you made and any additional information that might be helpful for review.

# Usage
> Clone the Git repository:

```shell
# Clone your fork of the GitHub repo
git clone https://github.com/your_username/Hello-Foss-CPP.git
```

> Follow the installation and compilation steps provided in the Introduction page and [README.md](README.md). <br>

By following these guidelines, you help maintain the quality and organization of the project!<br>

## Resources
- [Parallelization with MPI and OpenMPI](http://compphysics.github.io/ComputationalPhysics2/doc/LectureNotes/_build/html/parallelization.html#)
- [OpenMP](https://medium.com/swlh/openmp-on-ubuntu-1145355eeb2)<br>

***HAPPY LEARNING 😀😀😀***

README.md

Lines changed: 3 additions & 5 deletions
```diff
@@ -146,8 +146,6 @@ Here you can find out more about `MPICH`: [https://www.mpich.org/](https://www.mpich.org/)
 | |-- wrong_sum.cpp (Demo of a wrong summation example for learning purposes)
 |-- Application
 | |-- page-rank.cpp
-
-## Resources
-- [Parallelization with MPI and OpenMPI](http://compphysics.github.io/ComputationalPhysics2/doc/LectureNotes/_build/html/parallelization.html#)
-- [OpenMP](https://medium.com/swlh/openmp-on-ubuntu-1145355eeb2)
-
+### Note
+> Information about Functions in main is provided in [README.md](src/README.md) <br>
+> For contributing to this repo kindly go through the guidelines provided in [Contributing.md](Contributing.md)
```

concat

17 KB
Binary file not shown.

src/Contributing.md

Lines changed: 0 additions & 21 deletions
This file was deleted.

src/README.md

Lines changed: 11 additions & 11 deletions
````diff
@@ -5,50 +5,50 @@ A list of functions that have been implemented can be found here :-
 >This C++ code implements LU factorization using OpenMP for parallel execution of matrix updates. It optimizes the decomposition by distributing computations for the lower (L) and upper (U) triangular matrices across multiple threads.

 ### 2) Maximum element search
->The code for this function can be found in [max.cpp](max.cpp), and input for the following can be found in input.cpp
+>The code for this function can be found in [max.cpp](src/max.cpp), and input for the following can be found in input.cpp
 The code uses OpenMP for parallel programming to find the maximum element in an array. The search is distributed across multiple threads, improving performance by dividing the workload.

 ### 3) Matrix Matrix Multiplication
->The code for the following function can be found in [mm.cpp](mm.cpp)<br>
+>The code for the following function can be found in [mm.cpp](src/mm.cpp)<br>
 This code performs matrix-matrix multiplication using OpenMP to parallelize the computation across multiple threads. It optimizes the multiplication process for large matrices, reducing execution time by distributing the workload across available CPU cores.

 ### 4) Montecarlo Method
->The code for the following function can be found in [montecarlo.cpp](montecarlo.cpp)<br>
+>The code for the following function can be found in [montecarlo.cpp](src/montecarlo.cpp)<br>
 The code estimates the value of Pi using the Monte Carlo method with OpenMP for parallel processing. It simulates random points within a unit square and counts how many fall within the unit circle, then uses multiple threads to improve performance and speed up the estimation process.

 ### 5) Matrix Vector Multiplication
->The code for the following function can be found in [mv.cpp](mv.cpp)<br>
+>The code for the following function can be found in [mv.cpp](src/mv.cpp)<br>
 The code performs matrix-vector multiplication using OpenMP for parallel processing. The dynamic scheduling with a chunk size of 16 distributes the computation of each row of the matrix across multiple threads, optimizing the execution for large-scale data by balancing the load dynamically.

 ### 6) Product of elements of an array
->The code for the following function can be found in [prod.cpp](prod.cpp)<br>
+>The code for the following function can be found in [prod.cpp](src/prod.cpp)<br>
 This C++ code calculates the product of elements in an array using OpenMP to parallelize the computation. It optimizes large product calculations by summing the logarithms of array elements in parallel and exponentiating the result to obtain the final product, reducing potential overflow risks.

 ### 7) Pi reduction
->The code for the following function can be found in [pi-reduction.cpp](pi-reduction.cpp)<br>
+>The code for the following function can be found in [pi-reduction.cpp](src/pi-reduction.cpp)<br>
 This C++ code estimates the value of Pi using numerical integration with the OpenMP library for parallelization. It divides the computation of the integral into multiple threads, summing partial results in parallel using a reduction clause to optimize the performance and accuracy when calculating Pi across a large number of steps.

 ### 8) Calculation of Standard Deviation
->The code for the following function can be found in [standard_dev.cpp](standard_dev.cpp)<br>
+>The code for the following function can be found in [standard_dev.cpp](src/standard_dev.cpp)<br>
 This C++ code calculates the standard deviation of a dataset using OpenMP for parallel processing. It first computes the mean in parallel, then calculates the variance by summing the squared differences from the mean, distributing both tasks across multiple threads to improve performance with large datasets.

 ### 9) Sum of elements of an array
->The code for the following function can be found in [sum2.cpp](sum2.cpp) <br>
+>The code for the following function can be found in [sum2.cpp](src/sum2.cpp) <br>
 This C++ code computes the sum of a large array (with 10 million elements) in parallel using OpenMP. It divides the workload among multiple threads based on the total number of threads, each thread calculates a partial sum, and the results are combined in a critical section to avoid race conditions. The execution time for the sum computation is also measured and displayed.

 ### 10) Vector-Vector Dot product calculation
->The code for the following function can be found in [vvd.cpp](vvd.cpp) <br>
+>The code for the following function can be found in [vvd.cpp](src/vvd.cpp) <br>
 This C++ code calculates the dot product of two arrays using OpenMP for parallelization. It initializes two arrays, A and B, each containing 1000 elements set to 1. The dot product is computed in parallel using a dynamic scheduling strategy, with a chunk size of 100, and the results are combined using a reduction operation. The final result is printed to the console.

 ### 11) Sum calculation (wrong as pragma barrier is not calculated)
->The code for the following function can be found in [wrong_sum.cpp](wrong.cpp)<br>
+>The code for the following function can be found in [wrong_sum.cpp](src/wrong.cpp)<br>
 This C++ code computes the sum of an array using OpenMP with task-based parallelism. It initializes an array of size 600 with all elements set to 1. The code divides the summation task into segments of size 100, allowing multiple threads to process these segments concurrently. The results from each task are accumulated into a shared variable sum using a critical section to prevent data races.

 ## 0.2) Compilation
 >
 ```shell
 # compile using g++ for Openmp
-g++ - sum2.cpp -o sum2
+g++ sum2.cpp -o sum2 -fopenmp
 ./sum2

 # compile using g++ for MPI
````

src/broadcast.cpp

Lines changed: 46 additions & 42 deletions
```diff
@@ -1,56 +1,60 @@
-#include <iostream>
 #include <omp.h>
-#include <vector>
+
+#include <iostream>
 #include <stdexcept>
+#include <vector>

 // Function to broadcast two matrices
-void broadcast(const std::vector<std::vector<int>>& A, std::vector<std::vector<int>>& B) {
-    size_t rowsA = A.size();
-    size_t colsA = A[0].size();
-    size_t rowsB = B.size();
-    size_t colsB = B[0].size();
-
-    if (rowsA != rowsB && rowsB != 1) {
-        throw std::invalid_argument("Incompatible dimensions for broadcasting");
-    }
-    if (colsA != colsB && colsB != 1) {
-        throw std::invalid_argument("Incompatible dimensions for broadcasting");
-    }
+void broadcast(const std::vector<std::vector<int>>& A,
+               std::vector<std::vector<int>>& B) {
+  size_t rowsA = A.size();
+  size_t colsA = A[0].size();
+  size_t rowsB = B.size();
+  size_t colsB = B[0].size();

-    if (rowsB == 1) {
-        B.resize(rowsA, B[0]);
-    }
-    if (colsB == 1) {
-        for (auto& row : B) {
-            row.resize(colsA, row[0]);
-        }
+  if (rowsA == 0 || colsA == 0 || rowsB == 0 || colsB == 0) {
+    throw std::invalid_argument("Empty matrix cannot be broadcasted");
+  }
+  if (rowsA != rowsB && rowsB != 1) {
+    throw std::invalid_argument("Incompatible dimensions for broadcasting");
+  }
+  if (colsA != colsB && colsB != 1) {
+    throw std::invalid_argument("Incompatible dimensions for broadcasting");
+  }
+
+  if (rowsB == 1) {
+    B.resize(rowsA, B[0]);
+  }
+  if (colsB == 1) {
+    for (auto& row : B) {
+      row.resize(colsA, row[0]);
     }
+  }

-
-    #pragma omp parallel for
-    for (size_t i = 0; i < rowsA; i++) {
-        #pragma omp parallel for
-        for (size_t j = 0; j < colsA; j++) {
-            B[i][j] = A[i][j]+B[i][j];
-        }
+#pragma omp parallel for
+  for (size_t i = 0; i < rowsA; i++) {
+#pragma omp parallel for
+    for (size_t j = 0; j < colsA; j++) {
+      B[i][j] = A[i][j] + B[i][j];
     }
+  }
 }

 int main() {
-    std::vector<std::vector<int>> A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
-    std::vector<std::vector<int>> B = {{1,2,3}}; // B has only one row
-
-    try {
-        broadcast(A, B);
-        for (const auto& row : B) {
-            for (const auto& elem : row) {
-                std::cout << elem << " ";
-            }
-            std::cout << std::endl;
-        }
-    } catch (const std::invalid_argument& e) {
-        std::cerr << "Error: " << e.what() << std::endl;
+  std::vector<std::vector<int>> A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
+  std::vector<std::vector<int>> B = {{1, 2, 3}};  // B has only one row
+
+  try {
+    broadcast(A, B);
+    for (const auto& row : B) {
+      for (const auto& elem : row) {
+        std::cout << elem << " ";
+      }
+      std::cout << std::endl;
     }
+  } catch (const std::invalid_argument& e) {
+    std::cerr << "Error: " << e.what() << std::endl;
+  }

-    return 0;
+  return 0;
 }
```
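
One detail worth flagging for reviewers: even with the new guard, `A[0].size()` and `B[0].size()` are evaluated before the emptiness check runs, so a matrix with zero rows is still indexed out of range. A minimal sketch of a safer ordering follows; it is hypothetical, not part of this commit, and `broadcastChecked` is an illustrative name:

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical sketch: test for empty rows before touching A[0] or B[0].
void broadcastChecked(const std::vector<std::vector<int>>& A,
                      std::vector<std::vector<int>>& B) {
  if (A.empty() || B.empty() || A[0].empty() || B[0].empty()) {
    throw std::invalid_argument("Empty matrix cannot be broadcasted");
  }
  // Only now is it safe to read the dimensions.
  size_t rowsA = A.size(), colsA = A[0].size();
  size_t rowsB = B.size(), colsB = B[0].size();
  if ((rowsA != rowsB && rowsB != 1) || (colsA != colsB && colsB != 1)) {
    throw std::invalid_argument("Incompatible dimensions for broadcasting");
  }
  // Broadcasting and the element-wise sum would proceed exactly as in
  // broadcast.cpp above, now safe from out-of-range access.
}
```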

src/concatenate.cpp

Lines changed: 18 additions & 5 deletions
```diff
@@ -5,7 +5,7 @@
 using namespace std;

 // Function to concatenate two arrays in parallel
-void concatenate(int arr1[], int arr2[], int n1, int n2, int arr3[]) {
+void concatenate(int* arr1, int* arr2, int n1, int n2, int* arr3) {
     #pragma omp parallel for
     for (int i = 0; i < n1; i++) {
         arr3[i] = arr1[i];
@@ -18,10 +18,19 @@ void concatenate(int arr1[], int arr2[], int n1, int n2, int arr3[]) {
 }

 int main() {
-    int n1 = 5, n2 = 5;
-    int arr1[n1] = {1, 2, 3, 4, 5};
-    int arr2[n2] = {6, 7, 8, 9, 10};
-    int arr3[n1 + n2];
+
+    int n1 = 5000000, n2 = 5000000;
+
+    int* arr1 = new int[n1];
+    int* arr2 = new int[n2];
+    int* arr3 = new int[n1+n2];
+
+    // initialisation
+    for(int i = 0; i < n1; i++)
+    {
+        arr1[i] = 0;
+        arr2[i] = 1;
+    }

     // Set the number of threads
     omp_set_num_threads(2);
@@ -35,5 +44,9 @@ int main() {
     }
     cout << endl;

+    delete[] arr1;
+    delete[] arr2;
+    delete[] arr3;
+
     return 0;
 }
```
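
The switch from fixed stack arrays to `new`/`delete[]` is what lets the test grow to five million elements per array. A hypothetical alternative, not part of this commit, gets the same effect with `std::vector`, which sizes the buffers at runtime and frees them automatically:

```cpp
#include <iostream>
#include <vector>

// Hypothetical sketch: vector-based concatenation, same parallel copy loops
// as concatenate.cpp but without manual new/delete.
std::vector<int> concatenate(const std::vector<int>& a,
                             const std::vector<int>& b) {
    std::vector<int> out(a.size() + b.size());
    int n1 = static_cast<int>(a.size());
    int n2 = static_cast<int>(b.size());
    #pragma omp parallel for
    for (int i = 0; i < n1; i++) out[i] = a[i];
    #pragma omp parallel for
    for (int i = 0; i < n2; i++) out[n1 + i] = b[i];
    return out;
}

int main() {
    std::vector<int> a(5000000, 0), b(5000000, 1);
    std::vector<int> c = concatenate(a, b);
    std::cout << c.front() << " ... " << c.back() << std::endl;
    return 0;
}
```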

src/max.cpp

Lines changed: 9 additions & 3 deletions
```diff
@@ -18,14 +18,13 @@ int findMax(int elementsToProcess, const std::vector<int>& data) {

 int main(int argc, char** argv) {
     MPI_Init(&argc, &argv);
-    double time = MPI_Wtime(); // Start timing
-
     int processorsNr;
     MPI_Comm_size(MPI_COMM_WORLD, &processorsNr);
     int processId;
     MPI_Comm_rank(MPI_COMM_WORLD, &processId);

     int buff;
+    double time;

     if (processId == 0) {
         int max = 0; // Global maximum
@@ -66,6 +65,10 @@ int main(int argc, char** argv) {
             MPI_Send(data.data() + startIdx, buff, MPI_INT, i, 2, MPI_COMM_WORLD);
         }

+        // Synchronize all processes before starting the timing
+        MPI_Barrier(MPI_COMM_WORLD);
+        time = MPI_Wtime(); // Start timing
+
         // Master process finds its own max
         max = findMax(elementsToEachProcess + remainder, data);

@@ -78,8 +81,8 @@ int main(int argc, char** argv) {
             }
         }

-        std::cout << "Global maximum is " << max << std::endl;
         time = MPI_Wtime() - time; // Stop timing
+        std::cout << "Global maximum is " << max << std::endl;
         std::cout << "Time elapsed: " << time << " seconds" << std::endl;
     }

@@ -89,6 +92,9 @@ int main(int argc, char** argv) {
         std::vector<int> dataToProcess(buff);
         MPI_Recv(dataToProcess.data(), buff, MPI_INT, 0, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

+        // Synchronize before starting the computation
+        MPI_Barrier(MPI_COMM_WORLD);
+
         int theMax = findMax(buff, dataToProcess);

         // Send local max to master process
```
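
The point of this change is that `MPI_Wtime()` now starts after data distribution and after all ranks have synchronized, so the measurement covers only the search itself rather than startup and sends. A minimal self-contained sketch of the barrier-then-time pattern, hypothetical and separate from the commit:

```cpp
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // All ranks reach this point before the clock starts, so slow setup on
    // one rank cannot inflate another rank's measurement.
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    // ... the timed computation would go here ...

    double elapsed = MPI_Wtime() - t0;
    if (rank == 0) {
        std::cout << "Time elapsed: " << elapsed << " seconds" << std::endl;
    }
    MPI_Finalize();
    return 0;
}
```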

src/montecarlo.cpp

Lines changed: 9 additions & 0 deletions
```diff
@@ -2,16 +2,25 @@
 #include <iostream>
 #include <omp.h>
 #include <cstdlib>
+#include <chrono>
 using namespace std;

+unsigned long long getCurrentTimeInMilliseconds() {
+    return std::chrono::duration_cast<std::chrono::milliseconds>(
+        std::chrono::high_resolution_clock::now().time_since_epoch()).count();
+}
+
 void montecarlo(int n, int num_threads) {
     int pCircle = 0, pSquare = 0;
     double x, y, d;
     int i;
+    unsigned long long seed = getCurrentTimeInMilliseconds();
+    srand(seed); // Seeding with a modified value

     // Parallelize the loop using OpenMP
     #pragma omp parallel for private(x, y, d, i) reduction(+:pCircle, pSquare) num_threads(num_threads)
     for(i = 0; i < n; i++) {
+
         // Generate random points between 0 and 1
         x = (double)rand() / RAND_MAX;
         y = (double)rand() / RAND_MAX;
```
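
Seeding `srand` from the wall clock makes successive runs differ, but `rand()` still shares one global state across all OpenMP threads, and it is not guaranteed to be thread-safe. A hypothetical per-thread-generator variant, not from this repository (`estimatePi` is an illustrative name), avoids that shared state:

```cpp
#include <omp.h>
#include <iostream>
#include <random>

// Hypothetical sketch: each thread owns a std::mt19937, so no global RNG
// state is shared between threads.
double estimatePi(int n) {
    int inCircle = 0;
    #pragma omp parallel reduction(+:inCircle)
    {
        // Seed per thread so the streams differ.
        std::mt19937 gen(std::random_device{}() + omp_get_thread_num());
        std::uniform_real_distribution<double> dist(0.0, 1.0);
        #pragma omp for
        for (int i = 0; i < n; i++) {
            double x = dist(gen), y = dist(gen);
            if (x * x + y * y <= 1.0) inCircle++;
        }
    }
    return 4.0 * inCircle / n;
}

int main() {
    std::cout << estimatePi(10000000) << std::endl;
    return 0;
}
```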

src/wrong_sum.cpp

Lines changed: 1 addition & 1 deletion
```diff
@@ -7,7 +7,7 @@ int parallel_sum(const int arr[], int size, int step_size) {

     #pragma omp parallel
     {
-        #pragma omp for nowait
+
         for (int i = 0; i < size; i += step_size) {
             int start = i, end = i + step_size - 1;

```
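For context, this is the repository's deliberately wrong summation demo. With the `for` worksharing pragma removed, every thread in the parallel region now executes the whole loop, so each segment is handled once per thread. Assuming the loop body spawns one task per segment, as src/README.md describes, a corrected pattern would let a single thread create the tasks; a hypothetical sketch (`parallel_sum_tasks` is an illustrative name, not the file's code):

```cpp
#include <iostream>

// Hypothetical sketch: one thread creates the tasks, all threads run them.
int parallel_sum_tasks(const int arr[], int size, int step_size) {
    int sum = 0;
    #pragma omp parallel
    #pragma omp single
    for (int i = 0; i < size; i += step_size) {
        #pragma omp task shared(sum)
        {
            int partial = 0;
            for (int j = i; j < i + step_size && j < size; j++) partial += arr[j];
            #pragma omp critical
            sum += partial;  // critical section prevents a data race on sum
        }
    }
    return sum;  // tasks finish at the barrier ending the parallel region
}

int main() {
    int arr[600];
    for (int i = 0; i < 600; i++) arr[i] = 1;
    std::cout << parallel_sum_tasks(arr, 600, 100) << std::endl;  // expect 600
    return 0;
}
```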