Commit dfd45f9

Improve READMEs and organization (#46)
1 parent 7c03915 commit dfd45f9

6 files changed: +27 -8 lines changed

README.md

+6
@@ -20,3 +20,9 @@
 # Apache Arrow Experiments

 This repository is for collaborative prototyping and research in the Apache Arrow project.
+
+| Directory | Contents |
+| --------- | -------- |
+| **[data](./data)** | Various datasets that are used by the experiments in this repository or intended to be used in future Arrow experiments |
+| **[dissociated-ipc](./dissociated-ipc)** | Reference example implementation of the experimental [Arrow Dissociated IPC Protocol](https://arrow.apache.org/docs/dev/format/DissociatedIPC.html) |
+| **[http](./http)** | Examples demonstrating ways of sending and receiving data in Arrow IPC stream format (IANA media type `application/vnd.apache.arrow.stream`) over HTTP APIs |

data/README.md

+6 -6
@@ -19,13 +19,13 @@

 # Apache Arrow Data Experiments

-This subdirectory contains experimental Arrow data whose purpose has not
-yet become clear but may be useful in the future. This currently includes
-data used to generate compelling examples that is more realistic than
-generated data or the testing data found in
+This directory contains various datasets that are used by the experiments
+in this repository or intended to be used in future Arrow experiments.
+This currently includes data used to generate compelling examples that is
+more realistic than generated data or the testing data found in
 [apache/arrow-testing](http://github.com/apache/arrow-testing). This
-subdirectory is intended as a semi-temporary staging area: eventually,
-data here should find a permanent home elsewhere or be removed.
+directory is intended as a semi-temporary staging area; eventually, much
+of the data here should find a permanent home elsewhere.

 > [!IMPORTANT]
 > Please install and use [Git LFS](https://git-lfs.com) when contributing to this subdirectory. Add any new large file extensions to [`.gitattributes`](https://github.com/apache/arrow-experiments/blob/main/.gitattributes).

http/README.md

+14 -1
@@ -19,7 +19,20 @@

 # Apache Arrow HTTP Data Transport

-This area of the Apache Arrow Experiments repository is for collaborative prototyping and research on the subject of sending and receiving Arrow-formatted data over HTTP APIs.
+This area of the Apache Arrow Experiments repository is for collaborative prototyping and research on the subject of sending and receiving data in Arrow IPC stream format (IANA media type `application/vnd.apache.arrow.stream`) over HTTP APIs.
+
+The subdirectories beginning with **get** demonstrate clients receiving data from servers (HTTP GET request). Those beginning with **post** demonstrate clients sending data to servers (HTTP POST request).
+
+| Subdirectory | Purpose |
+| ------------ | ------- |
+| **[get_compressed](get_compressed)** | Demonstrates various ways of using compression when sending and receiving Arrow IPC stream data over HTTP |
+| **[get_indirect](get_indirect)** | Demonstrates a two-step sequence for fetching Arrow data from a server, in which a JSON document provides the URIs for the Arrow data |
+| **[get_multipart](get_multipart)** | Demonstrates how to send and receive a multipart HTTP response body (`multipart/mixed`) containing Arrow IPC stream data and other data |
+| **[get_range](get_range)** | Demonstrates how to use HTTP range requests to download Arrow IPC stream data of known length in multiple requests |
+| **[get_simple](get_simple)** | Contains a large set of examples demonstrating the basics of fetching an Arrow IPC stream from a server to a client in 12+ languages |
+| **[post_multipart](post_multipart)** | Demonstrates how to send and receive a multipart HTTP request body (`multipart/form-data`) containing Arrow IPC stream data and other data |
+| **[post_simple](post_simple)** | Demonstrates the basics of sending Arrow IPC stream data from a client to a server |
+

 The intent of this work is to:
 - Ensure excellent interoperability across languages.
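
Note on the **get_range** row in the hunk above: downloading a stream of known length in multiple requests comes down to choosing the byte ranges up front. The following is a minimal sketch of that planning step only, not the code in the `get_range` directory. The function name `plan_ranges` and the chunk size are hypothetical; the only fact relied on is that HTTP byte ranges are inclusive at both ends.

```python
def plan_ranges(total_length: int, chunk_size: int) -> list[str]:
    """Split [0, total_length) into Range header values of at most chunk_size bytes.

    Hypothetical helper for illustration; byte positions in an HTTP Range
    header are inclusive, so a 4-byte chunk starting at 0 is "bytes=0-3".
    """
    if total_length < 0:
        raise ValueError("total_length must be non-negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    ranges = []
    start = 0
    while start < total_length:
        # Last byte of this chunk (inclusive), clamped to the end of the data.
        end = min(start + chunk_size, total_length) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

For example, `plan_ranges(10, 4)` returns `["bytes=0-3", "bytes=4-7", "bytes=8-9"]`. Each value would be sent as the `Range` header of a separate GET request, and the response bodies concatenated in order to rebuild the stream.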

http/get_simple/README.md

+1 -1
@@ -25,7 +25,7 @@ This directory contains a set of minimal examples of HTTP clients and servers im

 The examples here assume that the server cannot determine the exact length in bytes of the full Arrow IPC stream before sending it, so they cannot set the `Content-Length` header or serve Range requests.

-The client examples here assume that the client needs to hold the full received data in memory in an Arrow data structure for further in-memory processing. (The case in which the client simply writes the result directly to a file is much simpler and can be achieved trivially by using [curl](https://curl.se) or similar.)
+Most of the client examples here assume that the client needs to hold the full received data in memory in an Arrow data structure for further in-memory processing. The case in which the client simply writes the result directly to a file is much simpler and is demonstrated by the [curl client example](curl/client).

 To enable performance comparisons to Arrow Flight RPC, the server examples generate the data in exactly the same way as in [`flight_benchmark.cc`](https://github.com/apache/arrow/blob/7346bdffbdca36492089f6160534bfa2b81bad90/cpp/src/arrow/flight/flight_benchmark.cc#L194-L245) as cited in the [original blog post introducing Flight RPC](https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/). But note that Flight example sends four concurrent streams.

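
Note on the "writes the result directly to a file" case mentioned in the hunk above: the commit points to a curl client for this. As a rough equivalent, here is a minimal Python sketch using only the standard library. It is not taken from the repository; the function name `download_to_file` is hypothetical, and the URL is whatever the caller supplies. The response body is copied to disk as-is, without parsing it as Arrow data, and no `Content-Length` header is needed because the body is read until the server ends it.

```python
import shutil
import urllib.request

# IANA media type for the Arrow IPC stream format, as named in the READMEs.
ARROW_STREAM_MEDIA_TYPE = "application/vnd.apache.arrow.stream"


def download_to_file(url: str, path: str) -> int:
    """Stream the response body for `url` into `path`; return bytes written.

    Hypothetical helper for illustration. The body is copied in chunks, so
    the full stream is never held in memory.
    """
    request = urllib.request.Request(
        url, headers={"Accept": ARROW_STREAM_MEDIA_TYPE}
    )
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
        return out.tell()
```

The resulting file holds the raw IPC stream bytes and can be opened later by any Arrow implementation that reads the stream format.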
File renamed without changes.
File renamed without changes.
