
Commit 4f68c18

docs: ✏️ fixes changes in README
1 parent ff26aea commit 4f68c18

2 files changed: +8 −8


docs/index.md (+2 −2)
```diff
@@ -113,8 +113,8 @@ withSpark {
 ### `withCached`
 
 It can easily happen that we need to fork our computation to several paths. To compute things only once we should call `cache`
-method. But there it is hard to control when we're using cached `Dataset` and when not.
-It is also easy to forget to unpersist cached data, which can break things unexpectably or take more memory
+method. However, it becomes hard to control when we're using cached `Dataset` and when not.
+It is also easy to forget to unpersist cached data, which can break things unexpectedly or take more memory
 than intended.
 
 To solve these problems we introduce `withCached` function
```
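The fork-to-several-paths case the README describes reads roughly like this with `withCached` (a minimal sketch, assuming the `dsOf` helper available inside `withSpark` and this API's `map`/`filter` extensions; the `Record` class and values are illustrative):

```kotlin
data class Record(val id: Int, val value: Int)

withSpark {
    dsOf(Record(1, 10), Record(2, 20), Record(3, 30))
        .withCached {
            // Cached once; both forks below reuse the cached data,
            // and it is unpersisted automatically when the block ends.
            filter { it.id % 2 == 0 }.show()
            map { it.value * 2 }.show()
        }
}
```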

kotlin-spark-api/src/main/kotlin/org/jetbrains/spark/api/ApiV1.kt (+6 −6)
```diff
@@ -79,7 +79,7 @@ inline fun <reified T> List<T>.toDS(spark: SparkSession): Dataset<T> =
     spark.createDataset(this, encoder<T>())
 
 /**
- * Main method of API, which gives you seamless integraion with Spark:
+ * Main method of API, which gives you seamless integration with Spark:
  * It creates encoder for any given supported type T
  *
  * Supported types are data classes, primitives, and Lists, Maps and Arrays containing them
```
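As a usage sketch for this entry point (assuming the `withSpark` block exposes the session as `spark`, in line with the README examples; `Person` is illustrative):

```kotlin
import org.jetbrains.spark.api.*

data class Person(val name: String, val age: Int)

fun main() = withSpark {
    // List<T>.toDS(spark) builds a Dataset via encoder<T>(), per the
    // signature above; data classes are among the supported types.
    listOf(Person("Ann", 23), Person("Bob", 42))
        .toDS(spark)
        .show()
}
```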
```diff
@@ -220,7 +220,7 @@ inline fun <reified L : Any?, reified R : Any?> Dataset<L>.fullJoin(right: Datas
 }
 
 /**
- * Alias for [Dataset.sort] which forces user to provide sortedcolumns from source dataset
+ * Alias for [Dataset.sort] which forces user to provide sorted columns from the source dataset
  *
  * @receiver source [Dataset]
  * @param columns producer of sort columns
```
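A brief sketch of the aliased `sort` (the `people` dataset is hypothetical; `Dataset.col` is standard Spark API):

```kotlin
// The lambda receives the source dataset, so the sort columns
// necessarily come from it, which is what the KDoc promises.
val sorted = people.sort { arrayOf(it.col("age"), it.col("name")) }
```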
```diff
@@ -232,7 +232,7 @@ inline fun <reified T> Dataset<T>.sort(columns: (Dataset<T>) -> Array<Column>) =
  * This function creates block, where one can call any further computations on already cached dataset
  * Data will be unpersisted automatically at the end of computation
  *
- * it may be useful in many situatiions, for example when one needs to write data to several targets
+ * it may be useful in many situations, for example, when one needs to write data to several targets
  * ```kotlin
  * ds.withCached {
  *     write()
@@ -241,7 +241,7 @@ inline fun <reified T> Dataset<T>.sort(columns: (Dataset<T>) -> Array<Column>) =
  * }
  * ```
  *
- * @param blockingUnpersist if execution should be blocked until everyting persisted will be deleted
+ * @param blockingUnpersist if execution should be blocked until everything persisted will be deleted
  * @param executeOnCached Block which should be executed on cached dataset.
  * @return result of block execution for further usage. It may be anything including source or new dataset
  */
```
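Assuming the signature matches the documented parameters, the block's return value lets aggregates escape the cached scope, for example (`ds` is a hypothetical dataset):

```kotlin
// Both actions hit the cached data; blockingUnpersist = true makes the
// call wait until the persisted blocks are actually freed afterwards.
val (total, sample) = ds.withCached(blockingUnpersist = true) {
    count() to takeAsList(3)
}
```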
```diff
@@ -254,8 +254,8 @@ inline fun <reified T> Dataset<Row>.toList() = KSparkExtensions.collectAsList(to
 inline fun <reified R> Dataset<*>.toArray(): Array<R> = to<R>().collect() as Array<R>
 
 /**
- * Alternative to [Dataset.show] which returns surce dataset.
- * Useful in debug purposes when you need to view contant of dataset as intermediate operation
+ * Alternative to [Dataset.show] which returns source dataset.
+ * Useful for debug purposes when you need to view content of a dataset as an intermediate operation
  */
 fun <T> Dataset<T>.showDS(numRows: Int = 20, truncate: Boolean = true) = apply { show(numRows, truncate) }
```
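Because `showDS` returns its receiver via `apply`, it can sit in the middle of a pipeline; a small sketch (the `numbers` dataset and the `map` extension are assumptions):

```kotlin
val doubled = numbers
    .showDS(numRows = 5)       // print a preview, keep the pipeline going
    .map { it * 2 }
    .showDS(truncate = false)  // inspect the transformed dataset too
```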
