KTL-4174 Create a job to collect index requests from GH issues #306

Open
Zofia Wiora (zwiora) wants to merge 4 commits into feature/KTL-4062-add-index-my-package-endpoint from feature/KTL-4174-Create-a-job-to-collect-index-requests

@zwiora Zofia Wiora (zwiora) commented May 25, 2026

Contributor
  • The previous naming conventions used for this feature (such as RequestIndexingService or ScraperType.MANUAL_REQUEST) were inconsistent and could be confused with distinct parts of the project, like IndexingRequestRepository. These names have been changed to use the term "user request" (for example, UserRequestIndexingService).
  • io.klibs.integration.maven.search.impl.BaseMavenSearchClient#getRemoteFileUrl now also processes untrusted data provided directly by users. That is why a URLBuilder was added to it.
  • The number of GitHub issues fetched in a single execution loop is limited. This should prevent API quota exhaustion if anyone attempts to flood the service with spam issues.
  • The duplicate check has two stages:
    1. UserRequestCheckService.findDuplicateIssueNumber filters out duplicate requests within the current batch;
    2. UserRequestIndexingService.isAlreadyIndexedOrQueued checks whether the request is already saved in the database.
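
The two-stage duplicate check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: the Coordinates and Issue types, the function names and signatures, and the in-memory set standing in for the database are all hypothetical.

```kotlin
// Hypothetical types for illustration; the real DTOs in the PR may differ.
data class Coordinates(val groupId: String, val artifactId: String)
data class Issue(val number: Int, val coordinates: Coordinates)

// Stage 1 (assumed shape): within one batch, map every later duplicate
// to the number of the issue that first requested the same coordinates.
fun findDuplicatesInBatch(batch: List<Issue>): Map<Int, Int> {
    val firstSeen = mutableMapOf<Coordinates, Int>()
    val duplicates = mutableMapOf<Int, Int>()
    for (issue in batch) {
        val original = firstSeen[issue.coordinates]
        if (original == null) {
            firstSeen[issue.coordinates] = issue.number
        } else {
            duplicates[issue.number] = original
        }
    }
    return duplicates
}

// Stage 2 (assumed shape): a set stands in for the database lookup.
fun isAlreadyStored(coordinates: Coordinates, stored: Set<Coordinates>): Boolean =
    coordinates in stored

// Keeps only issues that pass both stages.
fun selectNewRequests(batch: List<Issue>, stored: Set<Coordinates>): List<Issue> {
    val duplicates = findDuplicatesInBatch(batch)
    return batch.filter { it.number !in duplicates && !isAlreadyStored(it.coordinates, stored) }
}
```

Running the in-batch stage first means a flood of identical issues would cost at most one database lookup per distinct artifact, which seems to fit the quota concern raised above.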

@zwiora Zofia Wiora (zwiora) force-pushed the feature/KTL-4062-add-index-my-package-endpoint branch 2 times, most recently from cebe075 to 8e094b1 on June 1, 2026 12:22
* Updated only when a polling run completes without any server errors
*/
@Column(name = "user_request_check_timestamp")
var userRequestCheckTimestamp: Instant

Collaborator

It is better to have a field called "log_type" or something similar, and to have two separate records in the DB for the Maven index update and the user request check.
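
A rough sketch of what this suggestion might look like, with one record per log type instead of one column per timestamp. Everything here is hypothetical: the LogType values, the PollingLog class, and the in-memory repository are stand-ins, and in the real project this would presumably be a JPA entity with a log_type column.

```kotlin
import java.time.Instant

// Hypothetical discriminator; the real values would be chosen by the author.
enum class LogType { MAVEN_INDEX_UPDATE, USER_REQUEST_CHECK }

// One record per log type, each with its own timestamp.
data class PollingLog(val logType: LogType, val timestamp: Instant)

// In-memory stand-in for the repository, keyed by log type.
class InMemoryPollingLogRepository {
    private val records = mutableMapOf<LogType, PollingLog>()

    fun findTimestamp(type: LogType): Instant? = records[type]?.timestamp

    fun save(type: LogType, timestamp: Instant) {
        records[type] = PollingLog(type, timestamp)
    }
}
```

With this layout, adding a third kind of job would need a new enum value rather than a new column.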

Contributor Author

done

val since = mavenCentralLogRepository.retrieveUserRequestCheckTimestamp()
val runStartedAt = Instant.now()

// Get batch of issues from GitHub

Collaborator

Please don't leave excessive comments. Code should be self-documenting. Comments should appear only in non-obvious places and should answer the question "why", not "how".
Please fix this in other places as well.

Contributor Author

I removed some comments. Please let me know if you find any others that can be deleted.

@@ -0,0 +1,336 @@
package io.klibs.app.indexing

Collaborator

Move the service to the io.klibs.app.service.impl package.

Contributor Author

done

}
}

internal data class ParsedRequest(

Collaborator

Let's move these data classes to separate files in the io.klibs.app.dto package.

Contributor Author

I moved ParsedRequest to io.klibs.integration.maven.dto, but I left ProcessedRequestInfo in io.klibs.app.dto. Please let me know if that's OK.

}
}

internal data class ParsedRequest(

Collaborator

It could also be named MavenArtifactDto and live in the io.klibs.integration.maven.dto package.

Contributor Author

done

* Updates the timestamp in the database only if all issues were processed
* without server-side errors.
*/
fun checkUserRequests() {

Collaborator

Try to split up large methods like this one; ideally, each method should fit on one screen without scrolling.
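
One possible shape for such a split is sketched below, keeping the rule from the KDoc that the timestamp only advances when no issue hit a server-side error. The Outcome enum, the UserRequestPoller class, and its function-typed dependencies are hypothetical stand-ins, and advancing the timestamp on an empty batch is an assumption, not something the diff confirms.

```kotlin
import java.time.Instant

// Hypothetical result of handling one issue.
enum class Outcome { PROCESSED, REJECTED, SERVER_ERROR }

// Stand-in for the service; dependencies are passed as functions so the
// sketch runs without GitHub or a database.
class UserRequestPoller(
    private val fetchIssues: (Instant) -> List<Int>,
    private val processIssue: (Int) -> Outcome,
    private var lastCheck: Instant,
) {
    val lastCheckTimestamp: Instant get() = lastCheck

    // The top-level method only names the steps; each step is a short helper.
    fun checkUserRequests(runStartedAt: Instant = Instant.now()) {
        val issues = fetchIssues(lastCheck)
        val outcomes = processAll(issues)
        advanceTimestampIfClean(outcomes, runStartedAt)
    }

    private fun processAll(issues: List<Int>): List<Outcome> = issues.map(processIssue)

    // A server error leaves the timestamp untouched so the next run retries.
    private fun advanceTimestampIfClean(outcomes: List<Outcome>, runStartedAt: Instant) {
        if (outcomes.none { it == Outcome.SERVER_ERROR }) lastCheck = runStartedAt
    }
}
```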

Contributor Author

done

return
}

val issuesToProcess = issuesBatch.issues.mapNotNull { issue ->

Collaborator

This could be shortened to val issuesToProcess = issuesBatch.issues.associateWith { issue -> convertToValidMavenArtifact(issue) } if the whole body is moved to a separate method.
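
A small sketch of how that could read, assuming a simplified issue type and a toy parser. GitHubIssue, MavenArtifact, the "group:artifact" body format, and the body of convertToValidMavenArtifact are all assumptions made for illustration; only the associateWith call shape comes from the comment above.

```kotlin
// Hypothetical types; the PR's own classes will differ.
data class GitHubIssue(val number: Int, val body: String)
data class MavenArtifact(val groupId: String, val artifactId: String)

// Assumed behaviour: returns null when the body is not "group:artifact".
fun convertToValidMavenArtifact(issue: GitHubIssue): MavenArtifact? {
    val parts = issue.body.trim().split(":")
    if (parts.size != 2 || parts.any { it.isBlank() }) return null
    return MavenArtifact(parts[0], parts[1])
}

// associateWith keeps every issue as a key, including those mapped to null.
fun convertAll(issues: List<GitHubIssue>): Map<GitHubIssue, MavenArtifact?> =
    issues.associateWith { issue -> convertToValidMavenArtifact(issue) }
```

One difference worth noting: mapNotNull drops entries that map to null, while associateWith keeps them, so invalid issues stay in the map and would need to be handled or filtered afterwards.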

Contributor Author

done

*
* Returns null if the request is valid, or an error message if it is not
*/
internal fun validateRequest(parsed: ParsedRequest): String? {

Collaborator

This method could be extracted to a separate MavenArtifactUtils class.
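
A minimal sketch of such a utility, keeping the contract from the KDoc (null when valid, an error message otherwise). The fields of ParsedRequest and the character rule are assumptions chosen for illustration; the actual validation rules in the PR are not shown in this diff.

```kotlin
// Hypothetical fields; the real ParsedRequest may carry different data.
data class ParsedRequest(val groupId: String, val artifactId: String, val version: String?)

object MavenArtifactUtils {
    // Assumed rule: letters, digits, dots, underscores, and hyphens only.
    private val coordinatePart = Regex("[A-Za-z0-9_.-]+")

    /** Returns null if the request is valid, or an error message if it is not. */
    fun validateRequest(parsed: ParsedRequest): String? = when {
        !coordinatePart.matches(parsed.groupId) -> "Invalid groupId: '${parsed.groupId}'"
        !coordinatePart.matches(parsed.artifactId) -> "Invalid artifactId: '${parsed.artifactId}'"
        parsed.version != null && !coordinatePart.matches(parsed.version) ->
            "Invalid version: '${parsed.version}'"
        else -> null
    }
}
```

Because the function has no dependencies on the service, it can be unit-tested without a Spring context.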

Contributor Author

done

@@ -17,22 +17,22 @@ import org.springframework.transaction.annotation.Transactional
import org.springframework.web.server.ResponseStatusException

@Service
-class RequestIndexingService(
+class UserRequestIndexingService(

Collaborator

Please move this service to io.klibs.app.service.impl package as well

Contributor Author

done

@zwiora Zofia Wiora (zwiora) force-pushed the feature/KTL-4174-Create-a-job-to-collect-index-requests branch from 7c16089 to d51c715 on June 5, 2026 09:33
@zwiora Zofia Wiora (zwiora) force-pushed the feature/KTL-4062-add-index-my-package-endpoint branch from 8e094b1 to c4033c7 on June 5, 2026 09:34
@zwiora Zofia Wiora (zwiora) force-pushed the feature/KTL-4174-Create-a-job-to-collect-index-requests branch from d51c715 to 650588b on June 5, 2026 09:34

2 participants