Commit fec3db4

benedeki and lsulak authored
#245 Add the ability to query REST endpoints from Reader module (#297)
* class `Reader` an expect ancestor to all reader, that is able to query the Atum server easily
* class `ServerSetup` to pack Atum server access information
* trait `PartitioningIdProvider` to add ability to easily get Partitioning ID
* `RequestResult[R]` represents an Atum server query response.
* Offered implicits for `MonadError` type class needed for `Reader` and `ReaderWithPartitioningId` - there are `Future`, Cats `IO`
* `AtumPartitions` and `AdditionalData` moved from _Agent_ to _Module_
* `ErrorResponse` received a method to decode from Json based on http status code
* README.md update

---------

Co-authored-by: Ladislav Sulak <[email protected]>
1 parent 8603720 commit fec3db4

38 files changed: +1143 -123 lines changed

.github/workflows/jacoco_report.yml

Lines changed: 8 additions & 8 deletions
@@ -109,14 +109,14 @@ jobs:
       - name: Get the Coverage info
         if: steps.jacocorun.outcome == 'success'
         run: |
-          echo "Total agent module coverage ${{ steps.jacoco-agent.outputs.coverage-overall }}"
-          echo "Changed Files coverage ${{ steps.jacoco-agent.outputs.coverage-changed-files }}"
-          echo "Total agent module coverage ${{ steps.jacoco-reader.outputs.coverage-overall }}"
-          echo "Changed Files coverage ${{ steps.jacoco-reader.outputs.coverage-changed-files }}"
-          echo "Total model module coverage ${{ steps.jacoco-model.outputs.coverage-overall }}"
-          echo "Changed Files coverage ${{ steps.jacoco-model.outputs.coverage-changed-files }}"
-          echo "Total server module coverage ${{ steps.jacoco-server.outputs.coverage-overall }}"
-          echo "Changed Files coverage ${{ steps.jacoco-server.outputs.coverage-changed-files }}"
+          echo "Total 'agent' module coverage ${{ steps.jacoco-agent.outputs.coverage-overall }}"
+          echo "Changed files of 'agent' module coverage ${{ steps.jacoco-agent.outputs.coverage-changed-files }}"
+          echo "Total 'reader' module coverage ${{ steps.jacoco-reader.outputs.coverage-overall }}"
+          echo "Changed files of 'reader' module coverage ${{ steps.jacoco-reader.outputs.coverage-changed-files }}"
+          echo "Total 'model' module coverage ${{ steps.jacoco-model.outputs.coverage-overall }}"
+          echo "Changed files of 'model' module coverage ${{ steps.jacoco-model.outputs.coverage-changed-files }}"
+          echo "Total 'server' module coverage ${{ steps.jacoco-server.outputs.coverage-overall }}"
+          echo "Changed files of 'server' module coverage ${{ steps.jacoco-server.outputs.coverage-changed-files }}"
       - name: Fail PR if changed files coverage is less than ${{ env.coverage-changed-files }}%
         if: steps.jacocorun.outcome == 'success'
         uses: actions/github-script@v6

.github/workflows/test_filenames_check.yml

Lines changed: 2 additions & 1 deletion
@@ -41,6 +41,7 @@ jobs:
           excludes: |
             server/src/test/scala/za/co/absa/atum/server/api/TestData.scala,
             server/src/test/scala/za/co/absa/atum/server/api/TestTransactorProvider.scala,
-            server/src/test/scala/za/co/absa/atum/server/ConfigProviderTest.scala
+            server/src/test/scala/za/co/absa/atum/server/ConfigProviderTest.scala,
+            model/src/test/scala/za/co/absa/atum/testing/*
           verbose-logging: 'false'
           fail-on-violation: 'true'

README.md

Lines changed: 1 addition & 1 deletion
@@ -205,7 +205,6 @@ We can even say, that `Checkpoint` is a result of particular `Measurements` (ver
 The journey of a dataset throughout various data transformations and pipelines. It captures the whole journey,
 even if it involves multiple applications or ETL pipelines.
 
-
 ## Usage
 
 ### Atum Agent routines
@@ -247,6 +246,7 @@ Code coverage wil be generated on path:
 To make this project runnable via IntelliJ, do the following:
 - Make sure that your configuration in `server/src/main/resources/reference.conf`
   is configured according to your needs
+- When building within an IDE sure to have the option `-language:higherKinds` on in the compiler options, as it's often not picked up from the SBT project settings.
 
 ## How to Run Tests
 

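Note on the README change above: for readers unfamiliar with the flag, this is roughly how a Scala 2.12 sbt build enables it. This is a sketch only; this part of the diff does not show where or how the project sets the option, so the placement in `build.sbt` is an assumption.

```scala
// build.sbt (illustrative fragment, not taken from this commit):
// enables higher-kinded type parameters such as F[_] without a feature warning
scalacOptions += "-language:higherKinds"
```

The per-file alternative is `import scala.language.higherKinds`. Either form covers an IDE build that ignores the sbt settings only if it is applied in the IDE's own compiler options or in the source itself.
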
agent/README.md

Lines changed: 5 additions & 5 deletions
@@ -7,26 +7,26 @@
 
 ## Usage
 
-Create multiple `AtumContext` with different control measures to be applied
+Create multiple `AtumContext` with different control measures to be applied
 
 ### Option 1
 ```scala
 val atumContextInstanceWithRecordCount = AtumContext(processor = processor)
-  .withMeasureAdded(RecordCount(MockMeasureNames.recordCount1, measuredColumn = "id"))
+  .withMeasureAdded(RecordCount(MockMeasureNames.recordCount1))
 
 val atumContextWithSalaryAbsMeasure = atumContextInstanceWithRecordCount
   .withMeasureAdded(AbsSumOfValuesOfColumn(measuredColumn = "salary"))
 ```
 
-### Option 2
+### Option 2
 Use `AtumPartitions` to get an `AtumContext` from the service using the `AtumAgent`.
 ```scala
 val atumContext1 = AtumAgent.createAtumContext(atumPartition)
 ```
 
 #### AtumPartitions
-A list of key values that maintains the order of arrival of the items, the `AtumService`
-is able to deliver the correct `AtumContext` according to the `AtumPartitions` we give it.
+A list of key values that maintains the order of arrival of the items, the `AtumService`
+is able to deliver the correct `AtumContext` according to the `AtumPartitions` we give it.
 ```scala
 val atumPartitions = AtumPartitions().withPartitions(ListMap("name" -> "partition-name", "country" -> "SA", "gender" -> "female" ))
 

agent/src/main/scala/za/co/absa/atum/agent/AtumAgent.scala

Lines changed: 6 additions & 5 deletions
@@ -17,9 +17,10 @@
 package za.co.absa.atum.agent
 
 import com.typesafe.config.{Config, ConfigFactory}
-import za.co.absa.atum.agent.AtumContext.AtumPartitions
 import za.co.absa.atum.agent.dispatcher.{CapturingDispatcher, ConsoleDispatcher, Dispatcher, HttpDispatcher}
 import za.co.absa.atum.model.dto.{AdditionalDataDTO, AdditionalDataPatchDTO, CheckpointDTO, PartitioningSubmitDTO}
+import za.co.absa.atum.model.types.basic.AtumPartitions
+import za.co.absa.atum.model.types.basic.AtumPartitionsOps
 
 /**
  * Entity that communicate with the API, primarily focused on spawning Atum Context(s).
@@ -58,7 +59,7 @@ trait AtumAgent {
     atumPartitions: AtumPartitions,
     additionalDataPatchDTO: AdditionalDataPatchDTO
   ): AdditionalDataDTO = {
-    dispatcher.updateAdditionalData(AtumPartitions.toSeqPartitionDTO(atumPartitions), additionalDataPatchDTO)
+    dispatcher.updateAdditionalData(atumPartitions.toPartitioningDTO, additionalDataPatchDTO)
   }
 
   /**
@@ -75,7 +76,7 @@ trait AtumAgent {
    */
   def getOrCreateAtumContext(atumPartitions: AtumPartitions): AtumContext = {
     val authorIfNew = AtumAgent.currentUser
-    val partitioningDTO = PartitioningSubmitDTO(AtumPartitions.toSeqPartitionDTO(atumPartitions), None, authorIfNew)
+    val partitioningDTO = PartitioningSubmitDTO(atumPartitions.toPartitioningDTO, None, authorIfNew)
 
     val atumContextDTO = dispatcher.createPartitioning(partitioningDTO)
     val atumContext = AtumContext.fromDTO(atumContextDTO, this)
@@ -94,8 +95,8 @@ trait AtumAgent {
     val authorIfNew = AtumAgent.currentUser
     val newPartitions: AtumPartitions = parentAtumContext.atumPartitions ++ subPartitions
 
-    val newPartitionsDTO = AtumPartitions.toSeqPartitionDTO(newPartitions)
-    val parentPartitionsDTO = Some(AtumPartitions.toSeqPartitionDTO(parentAtumContext.atumPartitions))
+    val newPartitionsDTO = newPartitions.toPartitioningDTO
+    val parentPartitionsDTO = Some(parentAtumContext.atumPartitions.toPartitioningDTO)
     val partitioningDTO = PartitioningSubmitDTO(newPartitionsDTO, parentPartitionsDTO, authorIfNew)
 
     val atumContextDTO = dispatcher.createPartitioning(partitioningDTO)

agent/src/main/scala/za/co/absa/atum/agent/AtumContext.scala

Lines changed: 4 additions & 31 deletions
@@ -17,14 +17,13 @@
 package za.co.absa.atum.agent
 
 import org.apache.spark.sql.DataFrame
-import za.co.absa.atum.agent.AtumContext.{AdditionalData, AtumPartitions}
 import za.co.absa.atum.agent.exception.AtumAgentException.PartitioningUpdateException
 import za.co.absa.atum.agent.model._
 import za.co.absa.atum.model.dto._
+import za.co.absa.atum.model.types.basic.{AdditionalData, AtumPartitions, AtumPartitionsOps, PartitioningDTOOps}
 
 import java.time.ZonedDateTime
 import java.util.UUID
-import scala.collection.immutable.ListMap
 
 /**
  * This class provides the methods to measure Spark `Dataframe`. Also allows to add and remove measures.
@@ -91,7 +90,7 @@ class AtumContext private[agent] (
       name = checkpointName,
       author = agent.currentUser,
       measuredByAtumAgent = true,
-      partitioning = AtumPartitions.toSeqPartitionDTO(atumPartitions),
+      partitioning = atumPartitions.toPartitioningDTO,
       processStartTime = startTime,
       processEndTime = Some(endTime),
       measurements = measurementDTOs
@@ -115,7 +114,7 @@ class AtumContext private[agent] (
       id = UUID.randomUUID(),
       name = checkpointName,
       author = agent.currentUser,
-      partitioning = AtumPartitions.toSeqPartitionDTO(atumPartitions),
+      partitioning = atumPartitions.toPartitioningDTO,
       processStartTime = dateTimeNow,
       processEndTime = Some(dateTimeNow),
       measurements = MeasurementBuilder.buildAndValidateMeasurementsDTO(measurements)
@@ -206,36 +205,10 @@ class AtumContext private[agent] (
 }
 
 object AtumContext {
-  /**
-   * Type alias for Atum partitions.
-   */
-  type AtumPartitions = ListMap[String, String]
-  type AdditionalData = Map[String, Option[String]]
-
-  /**
-   * Object contains helper methods to work with Atum partitions.
-   */
-  object AtumPartitions {
-    def apply(elems: (String, String)): AtumPartitions = {
-      ListMap(elems)
-    }
-
-    def apply(elems: List[(String, String)]): AtumPartitions = {
-      ListMap(elems:_*)
-    }
-
-    private[agent] def toSeqPartitionDTO(atumPartitions: AtumPartitions): PartitioningDTO = {
-      atumPartitions.map { case (key, value) => PartitionDTO(key, value) }.toSeq
-    }
-
-    private[agent] def fromPartitioning(partitioning: PartitioningDTO): AtumPartitions = {
-      AtumPartitions(partitioning.map(partition => Tuple2(partition.key, partition.value)).toList)
-    }
-  }
 
   private[agent] def fromDTO(atumContextDTO: AtumContextDTO, agent: AtumAgent): AtumContext = {
     new AtumContext(
-      AtumPartitions.fromPartitioning(atumContextDTO.partitioning),
+      atumContextDTO.partitioning.toAtumPartitions,
       agent,
       MeasuresBuilder.mapToMeasures(atumContextDTO.measures),
       atumContextDTO.additionalData
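
Note on the change above: the call sites switch from `AtumPartitions.toSeqPartitionDTO(x)` to `x.toPartitioningDTO`, and from `AtumPartitions.fromPartitioning(dto)` to `dto.toAtumPartitions`. The imported names `AtumPartitionsOps` and `PartitioningDTOOps` suggest extension methods in `za.co.absa.atum.model.types.basic`. That file is not part of this excerpt, so the following is only a minimal sketch of the pattern. `PartitionDTO` and the two type aliases are simplified stand-ins modelled on the removed helper code, and the wrapper object `BasicTypesSketch` is hypothetical.

```scala
import scala.collection.immutable.ListMap

// Hypothetical, self-contained sketch; not the actual model module code.
object BasicTypesSketch {
  // Simplified stand-ins for the real DTO and aliases (assumed shapes).
  final case class PartitionDTO(key: String, value: String)
  type PartitioningDTO = Seq[PartitionDTO]
  type AtumPartitions  = ListMap[String, String]

  // Extension method replacing the removed AtumPartitions.toSeqPartitionDTO helper.
  implicit class AtumPartitionsOps(atumPartitions: AtumPartitions) {
    def toPartitioningDTO: PartitioningDTO =
      atumPartitions.map { case (key, value) => PartitionDTO(key, value) }.toSeq
  }

  // Extension method replacing the removed AtumPartitions.fromPartitioning helper.
  implicit class PartitioningDTOOps(partitioning: PartitioningDTO) {
    def toAtumPartitions: AtumPartitions =
      ListMap(partitioning.map(p => p.key -> p.value): _*)
  }

  def main(args: Array[String]): Unit = {
    val partitions: AtumPartitions = ListMap("country" -> "SA", "gender" -> "female")
    val dto = partitions.toPartitioningDTO
    // Converting to the DTO and back should yield the same key/value pairs.
    require(dto.toAtumPartitions == partitions)
    println(dto)
  }
}
```

With this shape, both the agent and a reader module can import the same implicits from one shared place, which matches the commit message's note that `AtumPartitions` and `AdditionalData` moved out of the agent.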

agent/src/test/scala/za/co/absa/atum/agent/AtumAgentUnitTests.scala

Lines changed: 1 addition & 1 deletion
@@ -18,8 +18,8 @@ package za.co.absa.atum.agent
 
 import com.typesafe.config.{Config, ConfigException, ConfigFactory, ConfigValueFactory}
 import org.scalatest.funsuite.AnyFunSuiteLike
-import za.co.absa.atum.agent.AtumContext.AtumPartitions
 import za.co.absa.atum.agent.dispatcher.{CapturingDispatcher, ConsoleDispatcher, HttpDispatcher}
+import za.co.absa.atum.model.types.basic.AtumPartitions
 
 class AtumAgentUnitTests extends AnyFunSuiteLike {
 

agent/src/test/scala/za/co/absa/atum/agent/AtumContextUnitTests.scala

Lines changed: 27 additions & 24 deletions
@@ -22,11 +22,11 @@ import org.mockito.ArgumentCaptor
 import org.mockito.Mockito.{mock, times, verify, when}
 import org.scalatest.flatspec.AnyFlatSpec
 import org.scalatest.matchers.should.Matchers
-import za.co.absa.atum.agent.AtumContext.AtumPartitions
 import za.co.absa.atum.agent.model.AtumMeasure.{RecordCount, SumOfValuesOfColumn}
 import za.co.absa.atum.agent.model.{Measure, MeasureResult, MeasurementBuilder, UnknownMeasure}
 import za.co.absa.atum.model.ResultValueType
 import za.co.absa.atum.model.dto.CheckpointDTO
+import za.co.absa.atum.model.types.basic._
 
 class AtumContextUnitTests extends AnyFlatSpec with Matchers {
 
@@ -95,12 +95,12 @@ class AtumContextUnitTests extends AnyFlatSpec with Matchers {
 
     val argument = ArgumentCaptor.forClass(classOf[CheckpointDTO])
     verify(mockAgent).saveCheckpoint(argument.capture())
-
-    assert(argument.getValue.name == "testCheckpoint")
-    assert(argument.getValue.author == authorTest)
-    assert(argument.getValue.partitioning == AtumPartitions.toSeqPartitionDTO(atumPartitions))
-    assert(argument.getValue.measurements.head.result.mainValue.value == "3")
-    assert(argument.getValue.measurements.head.result.mainValue.valueType == ResultValueType.LongValue)
+    val value: CheckpointDTO = argument.getValue
+    assert(value.name == "testCheckpoint")
+    assert(value.author == authorTest)
+    assert(value.partitioning == atumPartitions.toPartitioningDTO)
+    assert(value.measurements.head.result.mainValue.value == "3")
+    assert(value.measurements.head.result.mainValue.valueType == ResultValueType.LongValue)
   }
 
   "createCheckpointOnProvidedData" should "create a Checkpoint on provided data" in {
@@ -123,13 +123,14 @@ class AtumContextUnitTests extends AnyFlatSpec with Matchers {
 
     val argument = ArgumentCaptor.forClass(classOf[CheckpointDTO])
     verify(mockAgent).saveCheckpoint(argument.capture())
-
-    assert(argument.getValue.name == "name")
-    assert(argument.getValue.author == authorTest)
-    assert(!argument.getValue.measuredByAtumAgent)
-    assert(argument.getValue.partitioning == AtumPartitions.toSeqPartitionDTO(atumPartitions))
-    assert(argument.getValue.processStartTime == argument.getValue.processEndTime.get)
-    assert(argument.getValue.measurements == MeasurementBuilder.buildAndValidateMeasurementsDTO(measurements))
+    val value: CheckpointDTO = argument.getValue
+
+    assert(value.name == "name")
+    assert(value.author == authorTest)
+    assert(!value.measuredByAtumAgent)
+    assert(value.partitioning == atumPartitions.toPartitioningDTO)
+    assert(value.processStartTime == value.processEndTime.get)
+    assert(value.measurements == MeasurementBuilder.buildAndValidateMeasurementsDTO(measurements))
   }
 
   "createCheckpoint" should "take measurements and create a Checkpoint, multiple measure changes" in {
@@ -167,25 +168,27 @@ class AtumContextUnitTests extends AnyFlatSpec with Matchers {
 
     val argumentFirst = ArgumentCaptor.forClass(classOf[CheckpointDTO])
     verify(mockAgent, times(1)).saveCheckpoint(argumentFirst.capture())
+    val valueFirst: CheckpointDTO = argumentFirst.getValue
 
-    assert(argumentFirst.getValue.name == "checkPointNameCount")
-    assert(argumentFirst.getValue.author == authorTest)
-    assert(argumentFirst.getValue.partitioning == AtumPartitions.toSeqPartitionDTO(atumPartitions))
-    assert(argumentFirst.getValue.measurements.head.result.mainValue.value == "4")
-    assert(argumentFirst.getValue.measurements.head.result.mainValue.valueType == ResultValueType.LongValue)
+    assert(valueFirst.name == "checkPointNameCount")
+    assert(valueFirst.author == authorTest)
+    assert(valueFirst.partitioning == atumPartitions.toPartitioningDTO)
+    assert(valueFirst.measurements.head.result.mainValue.value == "4")
+    assert(valueFirst.measurements.head.result.mainValue.valueType == ResultValueType.LongValue)
 
     atumContext.addMeasure(SumOfValuesOfColumn("columnForSum"))
     when(mockAgent.currentUser).thenReturn(authorTest + "Another") // maybe a process changed the author / current user
     df.createCheckpoint("checkPointNameSum")
 
     val argumentSecond = ArgumentCaptor.forClass(classOf[CheckpointDTO])
     verify(mockAgent, times(2)).saveCheckpoint(argumentSecond.capture())
+    val valueSecond: CheckpointDTO = argumentSecond.getValue
 
-    assert(argumentSecond.getValue.name == "checkPointNameSum")
-    assert(argumentSecond.getValue.author == authorTest + "Another")
-    assert(argumentSecond.getValue.partitioning == AtumPartitions.toSeqPartitionDTO(atumPartitions))
-    assert(argumentSecond.getValue.measurements.tail.head.result.mainValue.value == "22.5")
-    assert(argumentSecond.getValue.measurements.tail.head.result.mainValue.valueType == ResultValueType.BigDecimalValue)
+    assert(valueSecond.name == "checkPointNameSum")
+    assert(valueSecond.author == authorTest + "Another")
+    assert(valueSecond.partitioning == atumPartitions.toPartitioningDTO)
+    assert(valueSecond.measurements.tail.head.result.mainValue.value == "22.5")
+    assert(valueSecond.measurements.tail.head.result.mainValue.valueType == ResultValueType.BigDecimalValue)
   }
 
   "addAdditionalData" should "add key/value pair to map for additional data" in {

agent/src/test/scala/za/co/absa/atum/agent/model/AtumMeasureUnitTests.scala

Lines changed: 2 additions & 1 deletion
@@ -21,9 +21,10 @@ import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructT
 import org.scalatest.flatspec.AnyFlatSpec
 import org.scalatest.matchers.should.Matchers
 import za.co.absa.atum.agent.AtumAgent
-import za.co.absa.atum.agent.AtumContext.{AtumPartitions, DatasetWrapper}
+import za.co.absa.atum.agent.AtumContext.DatasetWrapper
 import za.co.absa.atum.agent.model.AtumMeasure._
 import za.co.absa.atum.model.ResultValueType
+import za.co.absa.atum.model.types.basic.AtumPartitions
 import za.co.absa.spark.commons.test.SparkTestBase
 
 class AtumMeasureUnitTests extends AnyFlatSpec with Matchers with SparkTestBase { self =>

agent/src/test/scala/za/co/absa/atum/agent/model/MeasureUnitTests.scala

Lines changed: 1 addition & 1 deletion
@@ -19,11 +19,11 @@ package za.co.absa.atum.agent.model
 import org.scalatest.flatspec.AnyFlatSpec
 import org.scalatest.matchers.should.Matchers
 import za.co.absa.atum.agent.AtumAgent
-import za.co.absa.atum.agent.AtumContext.AtumPartitions
 import za.co.absa.atum.agent.model.AtumMeasure.{AbsSumOfValuesOfColumn, RecordCount, SumOfHashesOfColumn, SumOfValuesOfColumn}
 import za.co.absa.spark.commons.test.SparkTestBase
 import za.co.absa.atum.agent.AtumContext._
 import za.co.absa.atum.model.ResultValueType
+import za.co.absa.atum.model.types.basic.AtumPartitions
 
 class MeasureUnitTests extends AnyFlatSpec with Matchers with SparkTestBase { self =>
 