Skip to content

[Feature Request]: Implementing Aho-Corasick Algorithm in C++ under DSA folder. #3771

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
4 tasks done
sjain1970 opened this issue Jul 22, 2024 · 3 comments · Fixed by #3796
Closed
4 tasks done

[Feature Request]: Implementing Aho-Corasick Algorithm in C++ under DSA folder. #3771

sjain1970 opened this issue Jul 22, 2024 · 3 comments · Fixed by #3796
Assignees
Labels
CodeHarborHub - Thanks for creating an issue! documentation Improvements or additions to documentation gssoc GirlScript Summer of Code | Contributor GSSOC'24 GirlScript Summer of Code | Contributor level1 GirlScript Summer of Code | Contributor's Levels

Comments

@sjain1970
Copy link
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Feature Description

I propose adding the implementation of the Aho-Corasick Algorithm in C++ to the DSA (Data Structures and Algorithms) folder. The Aho-Corasick Algorithm is used for searching multiple patterns simultaneously in a given text, making it highly efficient for string matching tasks.

Use Case

This feature would enhance the project by providing a robust method for efficiently searching multiple patterns within a text. For example, in applications involving large-scale text processing, such as DNA sequence analysis or searching for keywords in a document, the Aho-Corasick Algorithm can significantly improve performance.

Benefits

A specific scenario where this feature would be beneficial is in cybersecurity applications, where the algorithm can be used to detect multiple malicious patterns or signatures in network traffic data. The algorithm allows simultaneous searching for multiple patterns, reducing the time complexity compared to separate searches.

Add ScreenShots

No response

Priority

High

Record

  • I have read the Contributing Guidelines
  • I'm a GSSOC'24 contributor
  • I have starred the repository
@sjain1970 sjain1970 added the enhancement New feature or request label Jul 22, 2024
Copy link

Hi @sjain1909! Thanks for opening this issue. We appreciate your contribution to this open-source project. Your input is valuable and we aim to respond or assign your issue as soon as possible. Thanks again!

@sjain1970
Copy link
Contributor Author

@ajay-dhangar sir, kindly assign me this issue under GSSOC'24.

@ajay-dhangar ajay-dhangar added documentation Improvements or additions to documentation GSSOC'24 GirlScript Summer of Code | Contributor level1 GirlScript Summer of Code | Contributor's Levels gssoc GirlScript Summer of Code | Contributor labels Jul 22, 2024
sjain1970 added a commit to sjain1970/gssocproject1 that referenced this issue Jul 22, 2024
## Fixing Issue codeharborhub#3771

## Description

This pull request adds the implementation of the Aho-Corasick algorithm in C++ to the DSA folder. The Aho-Corasick algorithm is used for searching multiple patterns simultaneously within a text. It builds a trie of the patterns and uses a failure function to efficiently search for all patterns in O(n + m + z) time complexity, where n is the length of the text, m is the total length of all patterns, and z is the number of pattern occurrences.

## Type of PR

- [ ] Bug fix
- [x] Feature enhancement
- [ ] Documentation update
- [ ] Security enhancement
- [ ] Other (specify): _______________

## Checklist
- [x] I have performed a self-review of my code.
- [x] I have read and followed the Contribution Guidelines.
- [x] I have tested the changes thoroughly before submitting this pull request.
- [x] I have provided relevant issue numbers, screenshots, and videos after making the changes.
- [x] I have commented my code, particularly in hard-to-understand areas.
- [x] I have followed the code style guidelines of this project.
- [x] I have checked for any existing open issues that my pull request may address.
- [x] I have ensured that my changes do not break any existing functionality.
- [x] Each contributor is allowed to create a maximum of 4 issues per day. This helps us manage and address issues efficiently.
- [x] I have read the resources for guidance listed below.
- [x] I have followed security best practices in my code changes.

## Additional Context

The Aho-Corasick algorithm implementation includes:
- Construction of a trie for multiple pattern strings.
- Implementation of the failure function to handle mismatches efficiently.
- Efficient search for multiple patterns within a given text.
- Clear and well-commented code for easy understanding and maintenance.

## Resources for Guidance

Please read the following resources before submitting your contribution:

- [x] [Code Harbor Hub Community Features](https://www.codeharborhub.live/community/features)
- [x] [Markdown Guide](https://www.markdownguide.org/)
Copy link

Hello @sjain1909! Your issue #3771 has been closed. Thank you for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CodeHarborHub - Thanks for creating an issue! documentation Improvements or additions to documentation gssoc GirlScript Summer of Code | Contributor GSSOC'24 GirlScript Summer of Code | Contributor level1 GirlScript Summer of Code | Contributor's Levels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants