diff --git a/Contributing.md b/Contributing.md
new file mode 100644
index 0000000..fec17b2
--- /dev/null
+++ b/Contributing.md
@@ -0,0 +1,35 @@
+# Contribution Guidelines
+
+Please note that this project is released as part of **HELLO-FOSS**.
+By participating in this project you agree to abide by its terms.
+
+If you would like to contribute to the project, please follow these guidelines:
+
+1. Fork the original WnCC repository to your personal account.
+
+2. Clone the forked repository locally.
+
+3. Create a new branch for your feature or bug fix.
+
+4. Make the necessary changes and commit them.
+
+5. Push your changes to your forked repository.
+
+6. Submit a pull request to the main repository with your branch, explaining the changes you made and any additional information that might be helpful for review.
+
+# Usage
+> Clone the Git repository:
+
+```shell
+ # Clone your fork of the GitHub Repo
+ git clone https://github.com/your_username/Hello-Foss-CPP.git
+```
+> Follow the installation and compilation steps provided on the Introduction page and in [README.md](README.md).
+
+By following these guidelines, you help maintain the quality and organization of the project!
+
+## Resources
+- [Parallelization with MPI and OpenMPI](http://compphysics.github.io/ComputationalPhysics2/doc/LectureNotes/_build/html/parallelization.html#)
+- [OpenMP](https://medium.com/swlh/openmp-on-ubuntu-1145355eeb2)
+
+***HAPPY LEARNING 😀😀😀***
diff --git a/README.md b/README.md
index 0b3ac9f..45bcf9a 100644
--- a/README.md
+++ b/README.md
@@ -146,8 +146,6 @@ Here you can find out more about `MPICH`: [https://www.mpich.org/](https://www.m
| |-- wrong_sum.cpp (Demo of a wrong summation example for learning purposes)
|-- Application
| |-- page-rank.cpp
-
-## Resources
-- [Parallelization with MPI and OpenMPI](http://compphysics.github.io/ComputationalPhysics2/doc/LectureNotes/_build/html/parallelization.html#)
-- [OpenMP](https://medium.com/swlh/openmp-on-ubuntu-1145355eeb2)
-
+### Note
+> Information about the functions in `src` is provided in [README.md](src/README.md).
+> To contribute to this repo, please go through the guidelines in [Contributing.md](Contributing.md).
diff --git a/concat b/concat
new file mode 100755
index 0000000..a6339bd
Binary files /dev/null and b/concat differ
diff --git a/src/Contributing.md b/src/Contributing.md
deleted file mode 100644
index dda2152..0000000
--- a/src/Contributing.md
+++ /dev/null
@@ -1,21 +0,0 @@
-# Contribution Guidelines
-
-Please note that this project is released with a **HELLO-FOSS**.
-By participating in this project you agree to abide by its terms.
-
-Ensure your pull request adheres to the following guidelines:
-
-- Before submitting, please ensure that similar suggestions haven't already been made by searching through previous contributions.
-- Open source applications submitted must include an English-language README.md, a screenshot of the app in the README, and provide binaries for at least one operating system, ideally covering macOS, Linux, and Windows.
-- Submitted packages should be tested and documented.
-- Make an individual pull request for each suggestion.
-- Any submitted packages must be properly tested and come with clear documentation.
-- New categories, or improvements to the existing categorization are welcome.
-- Keep descriptions short and simple, but descriptive.
-- Start the description with a capital and end with a full stop/period.
-- Check your spelling and grammar.
-- Make sure your text editor is set to remove trailing whitespace.
-- The pull request should have a useful title and include a link to the package and why it should be included.
-
-By following these guidelines, you help maintain the quality and organization of the project!
-
-***HAPPY LEARNING 😀😀😀***
diff --git a/src/README.md b/src/README.md
index 5c4074a..34ea6ea 100644
--- a/src/README.md
+++ b/src/README.md
@@ -5,50 +5,50 @@ A list of functions that have been implemented can be found here :-
>This C++ code implements LU factorization using OpenMP for parallel execution of matrix updates. It optimizes the decomposition by distributing computations for the lower (L) and upper (U) triangular matrices across multiple threads.
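As an illustration of the idea only (a sketch, not the repo's file; the matrix layout is assumed), here is an in-place Doolittle factorization whose independent trailing-row updates are shared across threads:
```cpp
// Sketch: in-place LU without pivoting; L below the diagonal, U on/above it.
#include <omp.h>
#include <vector>

void lu_decompose(std::vector<std::vector<double>>& A) {
    int n = (int)A.size();
    for (int k = 0; k < n; k++) {
        // Rows below the pivot are updated independently,
        // so they are distributed across threads.
        #pragma omp parallel for
        for (int i = k + 1; i < n; i++) {
            A[i][k] /= A[k][k];                 // multiplier -> L(i,k)
            for (int j = k + 1; j < n; j++)
                A[i][j] -= A[i][k] * A[k][j];   // update U part of row i
        }
    }
}
```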
### 2) Maximum element search
->The code for this function can be found in [max.cpp](max.cpp), and input for the following can be found in input.cpp
+>The code for this function can be found in [max.cpp](src/max.cpp), and its input can be found in input.cpp
The code uses MPI to find the maximum element in an array: the master process splits the data across processes, each process finds its local maximum, and the results are combined, improving performance by dividing the workload.
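A much-shortened sketch of the same divide-and-combine idea; the repo file distributes chunks with explicit MPI_Send/MPI_Recv, while this sketch substitutes MPI_Reduce and made-up local data:
```cpp
// Sketch: each process finds a local maximum; MPI_Reduce combines them.
#include <mpi.h>
#include <iostream>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    std::vector<int> local = {rank + 1, 2 * rank + 3, 7 - rank};  // stand-in chunk
    int localMax = local[0];
    for (int v : local)
        if (v > localMax) localMax = v;
    int globalMax = 0;
    MPI_Reduce(&localMax, &globalMax, 1, MPI_INT, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0) std::cout << "Global maximum is " << globalMax << std::endl;
    MPI_Finalize();
    return 0;
}
```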
### 3) Matrix Matrix Multiplication
->The code for the following function can be found in [mm.cpp](mm.cpp)
+>The code for the following function can be found in [mm.cpp](src/mm.cpp)
This code performs matrix-matrix multiplication using OpenMP to parallelize the computation across multiple threads. It optimizes the multiplication process for large matrices, reducing execution time by distributing the workload across available CPU cores.
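A minimal sketch of the pattern (sizes and values are made up; the repo file may differ): the rows of the result are independent, so the outer loop is parallelized.
```cpp
// Sketch: parallel matrix-matrix multiplication over the rows of C.
#include <omp.h>
#include <cstdio>

int main() {
    const int N = 256;
    static double A[N][N], B[N][N], C[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) { A[i][j] = 1.0; B[i][j] = 2.0; }
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double s = 0.0;
            for (int k = 0; k < N; k++) s += A[i][k] * B[k][j];
            C[i][j] = s;
        }
    printf("C[0][0] = %f\n", C[0][0]);  // expect 2*N = 512
    return 0;
}
```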
### 4) Montecarlo Method
->The code for the following function can be found in [montecarlo.cpp](montecarlo.cpp)
+>The code for the following function can be found in [montecarlo.cpp](src/montecarlo.cpp)
The code estimates the value of Pi using the Monte Carlo method with OpenMP for parallel processing. It simulates random points within a unit square and counts how many fall within the unit circle, then uses multiple threads to improve performance and speed up the estimation process.
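A sketch of the method; the repo file seeds rand(), whereas this sketch assumes a per-thread <random> engine instead:
```cpp
// Sketch: Monte Carlo Pi; count random points that land inside the unit circle.
#include <omp.h>
#include <cstdio>
#include <random>

int main() {
    const int n = 1000000;
    int inCircle = 0;
    #pragma omp parallel reduction(+:inCircle)
    {
        std::mt19937 gen(42 + omp_get_thread_num());      // per-thread generator
        std::uniform_real_distribution<double> dist(0.0, 1.0);
        #pragma omp for
        for (int i = 0; i < n; i++) {
            double x = dist(gen), y = dist(gen);
            if (x * x + y * y <= 1.0) inCircle++;         // inside quarter circle
        }
    }
    printf("Pi is approximately %f\n", 4.0 * inCircle / n);
    return 0;
}
```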
### 5) Matrix Vector Multiplication
->The code for the following function can be found in [mv.cpp](mv.cpp)
+>The code for the following function can be found in [mv.cpp](src/mv.cpp)
The code performs matrix-vector multiplication using OpenMP for parallel processing. The dynamic scheduling with a chunk size of 16 distributes the computation of each row of the matrix across multiple threads, optimizing the execution for large-scale data by balancing the load dynamically.
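A sketch of the scheduling idea described above (dimensions and data are placeholders):
```cpp
// Sketch: matrix-vector product; rows are handed out in chunks of 16.
#include <omp.h>
#include <cstdio>
#include <vector>

int main() {
    const int N = 2000;
    std::vector<std::vector<double>> M(N, std::vector<double>(N, 1.0));
    std::vector<double> x(N, 2.0), y(N, 0.0);
    // dynamic,16: a free thread grabs the next 16 rows, balancing uneven load.
    #pragma omp parallel for schedule(dynamic, 16)
    for (int i = 0; i < N; i++) {
        double s = 0.0;
        for (int j = 0; j < N; j++) s += M[i][j] * x[j];
        y[i] = s;
    }
    printf("y[0] = %f\n", y[0]);  // expect 2*N = 4000
    return 0;
}
```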
### 6) Product of elements of an array
->The code for the following function can be found in [prod.cpp](prod.cpp)
+>The code for the following function can be found in [prod.cpp](src/prod.cpp)
This C++ code calculates the product of elements in an array using OpenMP to parallelize the computation. It optimizes large product calculations by summing the logarithms of array elements in parallel and exponentiating the result to obtain the final product, reducing potential overflow risks.
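A sketch of the log-sum trick (values are made up; note the technique requires strictly positive elements):
```cpp
// Sketch: product via exp(sum of logs) to avoid overflowing a running product.
#include <omp.h>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    std::vector<double> a(1000, 1.01);   // all elements must be positive
    double logSum = 0.0;
    #pragma omp parallel for reduction(+:logSum)
    for (int i = 0; i < (int)a.size(); i++) logSum += std::log(a[i]);
    printf("Product = %f\n", std::exp(logSum));
    return 0;
}
```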
### 7) Pi reduction
->The code for the following function can be found in [pi-reduction.cpp](pi-reduction.cpp)
+>The code for the following function can be found in [pi-reduction.cpp](src/pi-reduction.cpp)
This C++ code estimates the value of Pi using numerical integration with the OpenMP library for parallelization. It divides the computation of the integral into multiple threads, summing partial results in parallel using a reduction clause to optimize the performance and accuracy when calculating Pi across a large number of steps.
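The integral being approximated is that of 4/(1+x^2) over [0, 1], which equals Pi; a minimal sketch with the reduction clause:
```cpp
// Sketch: midpoint-rule integration of 4/(1+x^2); partial sums are reduced.
#include <omp.h>
#include <cstdio>

int main() {
    const long steps = 10000000;
    const double h = 1.0 / steps;
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < steps; i++) {
        double x = (i + 0.5) * h;        // midpoint of the i-th strip
        sum += 4.0 / (1.0 + x * x);
    }
    printf("Pi is approximately %.10f\n", sum * h);
    return 0;
}
```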
### 8) Calculation of Standard Deviation
->The code for the following function can be found in [standard_dev.cpp](standard_dev.cpp)
+>The code for the following function can be found in [standard_dev.cpp](src/standard_dev.cpp)
This C++ code calculates the standard deviation of a dataset using OpenMP for parallel processing. It first computes the mean in parallel, then calculates the variance by summing the squared differences from the mean, distributing both tasks across multiple threads to improve performance with large datasets.
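A sketch of the two parallel passes (the sample data is made up; this computes the population standard deviation):
```cpp
// Sketch: parallel mean, then parallel sum of squared deviations.
#include <omp.h>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int n = 1000000;
    std::vector<double> data(n);
    for (int i = 0; i < n; i++) data[i] = i % 10;   // sample data
    double mean = 0.0, var = 0.0;
    #pragma omp parallel for reduction(+:mean)      // pass 1: mean
    for (int i = 0; i < n; i++) mean += data[i];
    mean /= n;
    #pragma omp parallel for reduction(+:var)       // pass 2: variance numerator
    for (int i = 0; i < n; i++) var += (data[i] - mean) * (data[i] - mean);
    printf("Standard deviation = %f\n", std::sqrt(var / n));
    return 0;
}
```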
### 9) Sum of elements of an array
->The code for the following function can be found in [sum2.cpp](sum2.cpp)
+>The code for the following function can be found in [sum2.cpp](src/sum2.cpp)
This C++ code computes the sum of a large array (with 10 million elements) in parallel using OpenMP. It divides the workload among multiple threads based on the total number of threads, each thread calculates a partial sum, and the results are combined in a critical section to avoid race conditions. The execution time for the sum computation is also measured and displayed.
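A condensed sketch of the partial-sum-plus-critical pattern described above (the repo file also measures execution time, omitted here):
```cpp
// Sketch: each thread sums its share privately; a critical section combines them.
#include <omp.h>
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> a(10000000, 1);
    const int n = (int)a.size();
    long long total = 0;
    #pragma omp parallel
    {
        long long partial = 0;      // private running sum per thread
        #pragma omp for
        for (int i = 0; i < n; i++) partial += a[i];
        #pragma omp critical        // combine without a data race
        total += partial;
    }
    printf("Sum = %lld\n", total);
    return 0;
}
```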
### 10) Vector-Vector Dot product calculation
->The code for the following function can be found in [vvd.cpp](vvd.cpp)
+>The code for the following function can be found in [vvd.cpp](src/vvd.cpp)
This C++ code calculates the dot product of two arrays using OpenMP for parallelization. It initializes two arrays, A and B, each containing 1000 elements set to 1. The dot product is computed in parallel using a dynamic scheduling strategy, with a chunk size of 100, and the results are combined using a reduction operation. The final result is printed to the console.
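A sketch mirroring the description (1000 ones in each array, dynamic chunks of 100, combined with a reduction):
```cpp
// Sketch: dot product with dynamic scheduling and a reduction.
#include <omp.h>
#include <cstdio>

int main() {
    const int N = 1000;
    double A[N], B[N];
    for (int i = 0; i < N; i++) { A[i] = 1.0; B[i] = 1.0; }
    double dot = 0.0;
    #pragma omp parallel for schedule(dynamic, 100) reduction(+:dot)
    for (int i = 0; i < N; i++) dot += A[i] * B[i];
    printf("Dot product = %f\n", dot);  // expect 1000
    return 0;
}
```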
### 11) Sum calculation (wrong, as the pragma barrier is not used)
->The code for the following function can be found in [wrong_sum.cpp](wrong.cpp)
+>The code for the following function can be found in [wrong_sum.cpp](src/wrong_sum.cpp)
This C++ code computes the sum of an array using OpenMP with task-based parallelism. It initializes an array of size 600 with all elements set to 1. The code divides the summation task into segments of size 100, allowing multiple threads to process these segments concurrently. The results from each task are accumulated into a shared variable sum using a critical section to prevent data races.
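For contrast with the intentionally wrong version, a correct task-based sketch (a single thread creates the tasks, so the work is not duplicated):
```cpp
// Sketch: one task per 100-element segment; a critical section guards the sum.
#include <omp.h>
#include <cstdio>

int main() {
    const int SIZE = 600, STEP = 100;
    int a[SIZE];
    for (int i = 0; i < SIZE; i++) a[i] = 1;
    int sum = 0;
    #pragma omp parallel
    #pragma omp single                    // one thread spawns the tasks
    for (int i = 0; i < SIZE; i += STEP) {
        #pragma omp task shared(sum)
        {
            int partial = 0;
            for (int j = i; j < i + STEP; j++) partial += a[j];
            #pragma omp critical
            sum += partial;
        }
    }
    printf("Sum = %d\n", sum);            // expect 600
    return 0;
}
```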
## 0.2) Compilation
>
```shell
# compile using g++ for Openmp
-g++ - sum2.cpp -o sum2
+g++ sum2.cpp -o sum2 -fopenmp
./sum2
# compile using g++ for MPI
diff --git a/src/broadcast.cpp b/src/broadcast.cpp
index c291276..70bb6c8 100644
--- a/src/broadcast.cpp
+++ b/src/broadcast.cpp
@@ -1,56 +1,60 @@
-#include <iostream>
#include <omp.h>
-#include <vector>
+
+#include <iostream>
#include <stdexcept>
+#include <vector>
// Function to broadcast two matrices
-void broadcast(const std::vector<std::vector<int>>& A, std::vector<std::vector<int>>& B) {
- size_t rowsA = A.size();
- size_t colsA = A[0].size();
- size_t rowsB = B.size();
- size_t colsB = B[0].size();
-
- if (rowsA != rowsB && rowsB != 1) {
- throw std::invalid_argument("Incompatible dimensions for broadcasting");
- }
- if (colsA != colsB && colsB != 1) {
- throw std::invalid_argument("Incompatible dimensions for broadcasting");
- }
+void broadcast(const std::vector<std::vector<int>>& A,
+ std::vector<std::vector<int>>& B) {
+ size_t rowsA = A.size();
+ size_t colsA = A[0].size();
+ size_t rowsB = B.size();
+ size_t colsB = B[0].size();
- if (rowsB == 1) {
- B.resize(rowsA, B[0]);
- }
- if (colsB == 1) {
- for (auto& row : B) {
- row.resize(colsA, row[0]);
- }
+ if (rowsA == 0 || colsA == 0 || rowsB == 0 || colsB == 0) {
+ throw std::invalid_argument("Empty matrix cannot be broadcasted");
+ }
+ if (rowsA != rowsB && rowsB != 1) {
+ throw std::invalid_argument("Incompatible dimensions for broadcasting");
+ }
+ if (colsA != colsB && colsB != 1) {
+ throw std::invalid_argument("Incompatible dimensions for broadcasting");
+ }
+
+ if (rowsB == 1) {
+ B.resize(rowsA, B[0]);
+ }
+ if (colsB == 1) {
+ for (auto& row : B) {
+ row.resize(colsA, row[0]);
}
+ }
-
- #pragma omp parallel for
- for (size_t i = 0; i < rowsA; i++) {
- #pragma omp parallel for
- for (size_t j = 0; j < colsA; j++) {
- B[i][j] = A[i][j]+B[i][j];
- }
+#pragma omp parallel for
+ for (size_t i = 0; i < rowsA; i++) {
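+ // Note: the inner 'parallel for' adds a second level of parallelism only if nested parallelism is enabled; a single 'collapse(2)' on the outer loop is the more common pattern.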
+#pragma omp parallel for
+ for (size_t j = 0; j < colsA; j++) {
+ B[i][j] = A[i][j] + B[i][j];
}
+ }
}
int main() {
- std::vector<std::vector<int>> A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
- std::vector<std::vector<int>> B = {{1,2,3}}; // B has only one row
-
- try {
- broadcast(A, B);
- for (const auto& row : B) {
- for (const auto& elem : row) {
- std::cout << elem << " ";
- }
- std::cout << std::endl;
- }
- } catch (const std::invalid_argument& e) {
- std::cerr << "Error: " << e.what() << std::endl;
+ std::vector<std::vector<int>> A = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
+ std::vector<std::vector<int>> B = {{1, 2, 3}}; // B has only one row
+
+ try {
+ broadcast(A, B);
+ for (const auto& row : B) {
+ for (const auto& elem : row) {
+ std::cout << elem << " ";
+ }
+ std::cout << std::endl;
}
+ } catch (const std::invalid_argument& e) {
+ std::cerr << "Error: " << e.what() << std::endl;
+ }
- return 0;
+ return 0;
}
diff --git a/src/concatenate.cpp b/src/concatenate.cpp
index 7bbd123..fab03ad 100644
--- a/src/concatenate.cpp
+++ b/src/concatenate.cpp
@@ -5,7 +5,7 @@
using namespace std;
// Function to concatenate two arrays in parallel
-void concatenate(int arr1[], int arr2[], int n1, int n2, int arr3[]) {
+void concatenate(int* arr1, int* arr2, int n1, int n2, int* arr3) {
#pragma omp parallel for
for (int i = 0; i < n1; i++) {
arr3[i] = arr1[i];
@@ -18,10 +18,19 @@ void concatenate(int arr1[], int arr2[], int n1, int n2, int arr3[]) {
}
int main() {
- int n1 = 5, n2 = 5;
- int arr1[n1] = {1, 2, 3, 4, 5};
- int arr2[n2] = {6, 7, 8, 9, 10};
- int arr3[n1 + n2];
+
+ int n1 = 5000000, n2 = 5000000;
+
+ int* arr1 = new int[n1];
+ int* arr2 = new int[n2];
+ int* arr3 = new int[n1+n2];
+
+ // initialisation
+ for(int i = 0; i < n1; i++)
+ {
+ arr1[i] = 0;
+ arr2[i] = 1;
+ }
// Set the number of threads
omp_set_num_threads(2);
@@ -35,5 +44,9 @@ int main() {
}
cout << endl;
+ delete[] arr1;
+ delete[] arr2;
+ delete[] arr3;
+
return 0;
}
\ No newline at end of file
diff --git a/src/max.cpp b/src/max.cpp
index 02e7854..a458971 100644
--- a/src/max.cpp
+++ b/src/max.cpp
int findMax(int elementsToProcess, const std::vector<int>& data) {
int main(int argc, char** argv) {
MPI_Init(&argc, &argv);
- double time = MPI_Wtime(); // Start timing
-
int processorsNr;
MPI_Comm_size(MPI_COMM_WORLD, &processorsNr);
int processId;
MPI_Comm_rank(MPI_COMM_WORLD, &processId);
int buff;
+ double time;
if (processId == 0) {
int max = 0; // Global maximum
@@ -66,6 +65,10 @@ int main(int argc, char** argv) {
MPI_Send(data.data() + startIdx, buff, MPI_INT, i, 2, MPI_COMM_WORLD);
}
+ // Synchronize all processes before starting the timing
+ MPI_Barrier(MPI_COMM_WORLD);
+ time = MPI_Wtime(); // Start timing
+
// Master process finds its own max
max = findMax(elementsToEachProcess + remainder, data);
@@ -78,8 +81,8 @@ int main(int argc, char** argv) {
}
}
- std::cout << "Global maximum is " << max << std::endl;
time = MPI_Wtime() - time; // Stop timing
+ std::cout << "Global maximum is " << max << std::endl;
std::cout << "Time elapsed: " << time << " seconds" << std::endl;
}
@@ -89,6 +92,9 @@ int main(int argc, char** argv) {
std::vector<int> dataToProcess(buff);
MPI_Recv(dataToProcess.data(), buff, MPI_INT, 0, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
+ // Synchronize before starting the computation
+ MPI_Barrier(MPI_COMM_WORLD);
+
int theMax = findMax(buff, dataToProcess);
// Send local max to master process
diff --git a/src/montecarlo.cpp b/src/montecarlo.cpp
index c7d38da..f315408 100644
--- a/src/montecarlo.cpp
+++ b/src/montecarlo.cpp
@@ -2,16 +2,25 @@
#include <cstdlib>
#include <ctime>
#include <omp.h>
+#include <chrono>
using namespace std;
+unsigned long long getCurrentTimeInMilliseconds() {
+ return std::chrono::duration_cast<std::chrono::milliseconds>(
+ std::chrono::high_resolution_clock::now().time_since_epoch()).count();
+}
+
void montecarlo(int n, int num_threads) {
int pCircle = 0, pSquare = 0;
double x, y, d;
int i;
+ unsigned long long seed = getCurrentTimeInMilliseconds();
+ srand(seed); // Seed the PRNG with the current time in milliseconds
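+ // Note: rand() shares one global state across all threads; rand_r() or a per-thread <random> engine would be safer inside the parallel loop below.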
// Parallelize the loop using OpenMP
#pragma omp parallel for private(x, y, d, i) reduction(+:pCircle, pSquare) num_threads(num_threads)
for(i = 0; i < n; i++) {
+
// Generate random points between 0 and 1
x = (double)rand() / RAND_MAX;
y = (double)rand() / RAND_MAX;
diff --git a/src/wrong_sum.cpp b/src/wrong_sum.cpp
index 81ce347..fd04ade 100644
--- a/src/wrong_sum.cpp
+++ b/src/wrong_sum.cpp
@@ -7,7 +7,7 @@ int parallel_sum(const int arr[], int size, int step_size) {
#pragma omp parallel
{
- #pragma omp for nowait
+
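+ // Without a work-sharing directive here, every thread in the parallel region executes this entire loop, so the work is duplicated and the sum is inflated.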
for (int i = 0; i < size; i += step_size) {
int start = i, end = i + step_size - 1;