From 5dd884755192cd074399620ad991ddc9848f3b1f Mon Sep 17 00:00:00 2001
From: SuryaViswanath
Date: Fri, 3 Jan 2025 13:23:18 +0000
Subject: [PATCH] docs: correct documentation mistakes and typos

---
 docs/building/introduction.rst | 46 +++++++++++++++++-----------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/docs/building/introduction.rst b/docs/building/introduction.rst
index c2f7c8bd0..2d217c51d 100644
--- a/docs/building/introduction.rst
+++ b/docs/building/introduction.rst
@@ -10,7 +10,7 @@ file, which is a YAML file that describes sources of meteorological
 fields as well as the operations to perform on them, before they are
 written to a zarr file. The input of the process is a range of dates and
 some options to control the layout of the output. Statistics will be
-computed as the dataset is build, and stored in the metadata, with other
+computed as the dataset is built, and stored in the metadata, with other
 information such as the the locations of the grid points, the list of
 variables, etc.

@@ -24,33 +24,33 @@ variables, etc.

 date
    Throughout this document, the term `date` refers to a date and time,
-   not just a date. A training dataset is covers a continuous range of
+   not just a date. A training dataset covers a continuous range of
    dates with a given frequency. Missing dates are still part of the
-   dataset, but the data are missing and marked as such using NaNs.
+   dataset, but missing data are marked as such using NaNs.
    Dates are always in UTC, and refer to date at which the data is
    valid. For accumulations and fluxes, that would be the end of the
    accumulation period.

 variable
-   A `variable` is meteorological parameter, such as temperature, wind,
+   A `variable` is a meteorological parameter, such as temperature, wind,
    etc. Multilevel parameters are treated as separate variables, one for
    each level. For example, temperature at 850 hPa and temperature at
    500 hPa will be treated as two separate variables (`t_850` and
    `t_500`).

 field
-   A `field` is a variable at a given date. It is represented by a array
+   A `field` is a variable at a given date. It is represented by an array
    of values at each grid point.

 source
-   The `source` is a software component that given a list of dates and
-   variables will return the corresponding fields. A example of source
+   The `source` is a software component that, given a list of dates and
+   variables, will return the corresponding fields. An example of source
    is ECMWF's MARS archive, a collection of GRIB or NetCDF files, a
    database, etc. See :ref:`sources` for more information.

 filter
    A `filter` is a software component that takes as input the output of
-   a source or the output of another filter can modify the fields and/or
+   a source or another filter and can modify the fields and/or
    their metadata. For example, typical filters are interpolations,
    renaming of variables, etc. See :ref:`filters` for more information.

@@ -62,19 +62,19 @@ In order to build a training dataset, sources and filters are combined
 using the following operations:

 join
-   The join is the process of combining several sources data. Each
-   source is expected to provide different variables at the same dates.
+   The join is the process of combining several sources of data. Each
+   source is expected to provide different variables for the same dates.

-pipe
+pipe
    The pipe is the process of transforming fields using filters. The
-   first step of a pipe is typically a source, a join or another pipe.
-   The following steps are filters.
+   first step of a pipe is typically a source, a join, or another pipe.
+   This can subsequently be followed by more filters.

 concat
    The concatenation is the process of combining different sets of
-   operation that handle different dates. This is typically used to
-   build a dataset that spans several years, when the several sources
-   are involved, each providing a different period.
+   operations that handle different dates. This is typically used to
+   build a dataset that spans several years, when several sources
+   are involved, each providing data for a different period.

 Each operation is considered as a :ref:`source `, therefore
 operations can be combined to build complex datasets.
@@ -87,7 +87,7 @@ First recipe
 ============

 The simplest `recipe` file must contain a ``dates`` section and an
-``input`` section. The latter must contain a `source` In that case, the
+``input`` section. The latter must contain a `source`. In that case, the
 source is ``mars``

 .. literalinclude:: yaml/building1.yaml
@@ -132,15 +132,15 @@ This will build the following dataset:
 Adding some forcing variables
 =============================

-When training a data-driven models, some forcing variables may be
+When training a data-driven model, some forcing variables may be
 required such as the solar radiation, the time of day, the day in the
 year, etc.

-These are provided by the ``forcings`` source. In that example, we add a
-few of them. The `template` option is used to point to another source,
-in that case the first instance of ``mars``. This source is used to get
-information about the grid points, as some of the forcing variables are
-grid dependent.
+These are provided by the ``forcings`` source. Let us add a few of them
+to the above example. The `template` option is used to point to another
+source, in that case the first instance of ``mars``. This source is used
+to get information about the grid points, as some of the forcing variables
+are grid dependent.

 .. literalinclude:: yaml/building3.yaml
    :language: yaml