cppalliance · ashtum · Aug 25, 2024 · Aug 9, 2024
diff --git a/doc/modules/ROOT/nav.adoc b/doc/modules/ROOT/nav.adoc
@@ -1,8 +1,21 @@
 * xref:sans_io_philosophy.adoc[]
 
+* xref:http_protocol_basics.adoc[]
+
+* xref:header_containers.adoc[]
+
+* xref:message_bodies.adoc[]
+
+* Serializing
+
+* Parsing
+
 * xref:Message.adoc[]
 
 * Design Requirements
 ** xref:design_requirements/serializer.adoc[Serializer]
+** xref:design_requirements/parser.adoc[Parser]
+
+* Design Choices
 
 * xref:reference:boost/http_proto.adoc[Reference]
diff --git a/doc/modules/ROOT/pages/design_requirements/parser.adoc b/doc/modules/ROOT/pages/design_requirements/parser.adoc
@@ -0,0 +1,189 @@
+//
+// Copyright (c) 2024 Mohammad Nejati
+//
+// Distributed under the Boost Software License, Version 1.0. (See accompanying
+// file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt)
+//
+// Official repository: https://github.com/cppalliance/http_proto
+//
+
+= Parser Design Requirements
+
+== Memory Allocation and Memory Utilization
+
+The `parser` must use a single block of memory allocated during construction and
+guarantee that it will never exceed the specified size. It should also be able
+to reuse this space for parsing multiple HTTP messages (one message at a time).
+
+The `parser` must efficiently utilize the allocated memory for the following
+purposes:
+
+- Provide a mutable buffer for reading raw input (for example, from a socket).
+- Store HTTP headers and provide a non-owning, read-only view that allows
+  efficient access and iteration through header names and values.
+- O(1) access to important HTTP headers, including the request method, target,
+  and response status code.
+- Store all or part of an HTTP message and provide the necessary interfaces for
+  retrieving it.
+- Take ownership of user-provided dynamic buffers and sink objects.
+- Store the necessary state for inflate algorithms.
+
+Using a `parser` that works with a fixed-size buffer, an application can ensure
+it never exceeds capacity if all resources are provisioned at program startup.
+
+== Input Buffer Preparation
+
+The `parser` must use its understanding of the current HTTP message to provide
+an input buffer size that balances the number of I/O operations and memory
+movements. A naive approach might offer the largest possible buffer to minimize
+I/O operations, but this would result in excessive memory movement, consuming
+significant computational resources. For example:
+
+- If the exact size of the message body is known and no transformation is
+  needed, we can offer an input buffer that matches the remaining body size,
+  plus a controlled overread that balances the cost of subsequent memory
+  movement.
+- If the body type is chunked, we can provide an input buffer that accommodates
+  the remaining chunk size, plus a controlled overread to read the next chunk
+  header or the terminating chunk. This allows us to position the subsequent
+  chunk directly after the current without having to perform memory movement
+  due to the existence of a chunk header.
+
+== Use Cases and Interfaces
+
+To keep things simple, we will use the following synchronous free functions to
+demonstrate the flow of the parse operation in each example:
+
+[source,cpp]
+----
+void
+read_some(stream& s, parser& pr)
+{
+    system::error_code ec;
+    if(pr.need_data())
+    {
+        auto n = s.read_some(pr.prepare(), ec);
+        pr.commit(n);
+        if(ec == asio::error::eof)
+        {
+            pr.commit_eof();
+            ec = {};
+        }
+        if(ec.failed())
+            throw system::system_error{ec};
+    }
+    pr.parse(ec);
+    if(ec.failed() && ec != condition::need_more_input)
+        throw system::system_error{ec};
+}
+
+void
+read_header(stream& s, parser& pr)
+{
+    while(!pr.got_header())
+        read_some(s, pr);
+}
+
+void
+read(stream& s, parser& pr)
+{
+    while(!pr.is_complete())
+        read_some(s, pr);
+}
+----
+
+
+=== In-Place Body
+
+It must be possible to use the internal buffer of the `parser` for storing the
+entire or part of an HTTP message body.
+
+[source,cpp]
+----
+request_parser pr{ctx};
+pr.start();
+
+read_header(stream, pr);
+
+// When the entire body can fit in-place
+read(stream, pr);
+string_view body = pr.body();
+
+// When need to read body piece by piece
+while(!pr.is_complete())
+{
+    read_some(stream, pr);
+    auto cbs = pr.pull_body();
+    pr.consume_body(buffer::buffer_size(cbs));
+}
+----
+
+
+=== Sink Body
+
+A `sink`-like body enables algorithms to read body contents directly from the
+`parser` 's internal buffer, either in one step or multiple steps, such as when
+writing the body to a file. The `parser` takes ownership of the `sink` object,
+drives the algorithm, and provides a `ConstBufferSequence` by calling the
+relevant virtual interfaces on the `sink`.
+
+[source,cpp]
+----
+response_parser pr{ctx};
+pr.start();
+
+read_header(stream, pr);
+
+http_proto::file file;
+system::error_code ec;
+file.open("./index.html", file_mode::write_new, ec);
+if(ec.failed())
+    return ec;
+
+pr.set_body<file_body>(std::move(file));
+
+read(stream, pr);
+----
+
+
+=== Dynamic Buffer
+
+Using the dynamic buffer interface, the `parser` can store body contents
+directly into the user-provided buffer or container, avoiding double copying.
+
+[source,cpp]
+----
+response_parser pr{ctx};
+pr.start();
+
+read_header(stream, pr);
+
+std::string body;
+pr.set_body(buffers::dynamic_for(body));
+
+read(stream, pr);
+----
+
+
+=== Accessing Buffered Data
+
+The HTTP/1.1 protocol allows upgrading an established connection to a different
+protocol by sending an upgrade request and receiving a `101 Switching Protocols`
+status code in response. During this process, the `parser` might overread the
+HTTP response, such as reading part or all of a WebSocket frame after the
+response. The `parser` must provide a way to access this buffered data so it can
+be passed to another entity, like a WebSocket stream object.
+
+[source,cpp]
+----
+response_parser pr{ctx};
+pr.start();
+
+read_header(stream, pr);
+
+if(is_upgrade_successful(pr.get()))
+{
+    auto cbs = pr.buffered_data();
+    // Pass the buffered data to the next layer ...
+}
+----
diff --git a/doc/modules/ROOT/pages/header_containers.adoc b/doc/modules/ROOT/pages/header_containers.adoc
@@ -0,0 +1 @@
+= Header Containers
diff --git a/doc/modules/ROOT/pages/http_protocol_basics.adoc b/doc/modules/ROOT/pages/http_protocol_basics.adoc
@@ -0,0 +1 @@
+= HTTP Protocol Basics
diff --git a/doc/modules/ROOT/pages/message_bodies.adoc b/doc/modules/ROOT/pages/message_bodies.adoc
@@ -0,0 +1 @@
+= Message Bodies