Commit 6b27f6b

docs: update the notes for WASM
1 parent 14fd16a commit 6b27f6b

1 file changed

+272
-4
lines changed

docs/webassembly/introduction.md

Lines changed: 272 additions & 4 deletions
@@ -155,7 +155,7 @@ Here:
155155
- `i32.add` is an instruction that adds the two values together.
156156
- The function returns the result of the addition.
157157

158-
In WASM stack0base model, the result of the last expression in the function body is implicitly returned. So, we don't need to use `return` keyword.
158+
In WASM's stack-based model, the result of the last expression in the function body is implicitly returned, so we don't need to use the `return` keyword.
159159

160160
The value stack changes in this order:
161161

@@ -173,7 +173,7 @@ For example for image processing module, it can have functions for applying filt
173173

174174
Apart from functions, modules can have other components like:
175175

176-
**Global Variables**: They are variables that are accessible from anywhere in the module. They can be imported from other modules or defined within the module. They can be mutable or immutable. They can be used to store constants, configuration values, etc.
176+
**Global Variables**: These are variables that are accessible from anywhere in the module. They can be imported from other modules or defined within the module. They can be mutable or immutable. They can be used to store constants, configuration values, etc.
177177

178178
**Memory**: It's a linear array of bytes that can be accessed by the WebAssembly module. It can be used to store data that needs to be shared between functions. It can be used to store images, audio, video, etc. It can be imported from other modules or defined within the module.
179179

@@ -249,7 +249,7 @@ Instructions - References:
249249

250250
Data types define the kind of data we can work with. They set the rules for how the data is stored and processed. For example, in a weather app the temperature in one city may be a whole number like 25°C, while in another city it's a decimal number like 25.5°C. So, we represent them using `i32` (32-bit integer) and `f32` (32-bit floating point number) respectively.
251251

252-
Data types are very essential to make the code is predictable by defining the kind of data and hoe it's used, we avoid errors and ensure efficient use of memory. For example imagine pouring water into a glass. Each glass (data type) can hold a specific amount and shape of water (data). If we try to pour too much water or wrong type of liquid, it either won't fit or won't function as intended. Same goes with data types, ensuring data fits well and works as intended.
252+
Data types are essential for making code predictable. By defining the kind of data and how it's used, we avoid errors and ensure efficient use of memory. For example, imagine pouring water into a glass. Each glass (data type) can hold a specific amount and shape of water (data). If we try to pour in too much water or the wrong type of liquid, it either won't fit or won't function as intended. The same goes for data types: they ensure data fits well and works as intended.
253253

254254
Data Types - References:
255255

@@ -323,4 +323,272 @@ Tailored Framework: A framework for building WebAssembly microservices, and Wasm
323323

324324
**Debugging and Observability**: Debugging and observability are essential for any application in production. Tools like WASI logging are making it easier to debug and monitor WASM applications.
325325

326-
**Artifacts** It's really important for software supply chain. Repositories like DockerHub and Harbor are stepping to store, and track WASM packages.
326+
**Artifacts**: Artifact management is really important for the software supply chain. Registries like Docker Hub and Harbor are stepping in to store and track WASM packages.
327+
328+
## Working with WebAssembly
329+
330+
As we know, WebAssembly comes in a binary format and a text format. The binary format uses the `.wasm` extension and serves as a universal compilation target for high-level languages. The text format uses the `.wat` extension and is a human-readable representation of the binary format, useful for debugging and for understanding the structure of a WebAssembly module.
331+
332+
### Binary Format
333+
334+
To compile C++ code to the WebAssembly binary format, we can use Emscripten, a toolchain that compiles C and C++ code to WebAssembly. Other languages such as Rust and Go usually target WebAssembly through their own toolchains.
335+
336+
To compile a C++ file to WebAssembly, we can use the following command:

```bash
emcc hello.cpp -o hello.js
```

The above command compiles `hello.cpp` to WebAssembly and generates `hello.wasm` along with `hello.js`, the JavaScript glue code that loads it.
343+
344+
#### Structure of Wasm Binary
345+
346+
The binary file encodes a module. Each module starts with a fixed header made of a magic number and a version number. The magic number is `00 61 73 6d`, which is a null byte followed by `asm` in ASCII (`\0asm`). The version number is `01 00 00 00`, a little-endian 32-bit integer that indicates version 1 of the WebAssembly binary format.
347+
348+
After the header, a module consists of sections, each with a numeric ID and a specific purpose. For example, the Type Section (Section ID: 1) defines function signatures, the Function Section (Section ID: 3) declares the module's functions by pointing at those signatures, and the Memory Section (Section ID: 5) defines the memory. Function bodies live in the separate Code Section (Section ID: 10).
349+
350+
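All sections and their purpose:

- Section 0: Custom Section: It's used to carry custom data, such as debug names.
- Section 1: Type Section: It's used to define function signatures.
- Section 2: Import Section: It's used to import functions, memories, tables, and globals from other modules or from the host.
- Section 3: Function Section: It's used to declare the functions by listing the signature (type index) of each one.
- Section 4: Table Section: It's used to define the tables, including the element type and the limits.
- Section 5: Memory Section: It's used to define the memory.
- Section 6: Global Section: It's used to define the global variables.
- Section 7: Export Section: It's used to export functions, memories, tables, and globals to other modules or to the host.
- Section 8: Start Section: It's used to name a function that runs when the module is instantiated.
- Section 9: Element Section: It's used to initialize tables.
- Section 10: Code Section: It's used to hold the function bodies.
- Section 11: Data Section: It's used to initialize memory.
- Section 12: Data Count Section: It's used to record the number of data segments.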
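
To make the header and section layout concrete, here is a minimal Python sketch (not part of any WebAssembly toolchain) that checks the magic number and version and then walks the sections. The `ADD_MODULE` bytes are a hand-assembled module exporting an `add` function, and the helper names `read_u32_leb128` and `list_sections` are made up for this illustration. It assumes section lengths are encoded as unsigned LEB128 integers, as the binary format specifies.

```python
# Hand-assembled bytes for a module exporting add(i32, i32) -> i32 (illustrative).
ADD_MODULE = bytes([
    0x00, 0x61, 0x73, 0x6D,                                  # magic: "\0asm"
    0x01, 0x00, 0x00, 0x00,                                  # version 1, little-endian
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7F, 0x7F, 0x01, 0x7F,    # type section
    0x03, 0x02, 0x01, 0x00,                                  # function section
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,    # export section
    0x0A, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6A, 0x0B,  # code section
])

SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table", 5: "memory",
    6: "global", 7: "export", 8: "start", 9: "element", 10: "code", 11: "data",
}


def read_u32_leb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def list_sections(data):
    """Check the header, then return (section_name, payload_size) pairs."""
    if data[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    if int.from_bytes(data[4:8], "little") != 1:
        raise ValueError("unsupported version")
    sections = []
    pos = 8
    while pos < len(data):
        section_id = data[pos]
        size, pos = read_u32_leb128(data, pos + 1)
        sections.append((SECTION_NAMES.get(section_id, "unknown"), size))
        pos += size  # skip over the payload
    return sections


print(list_sections(ADD_MODULE))
```

For this module the walk reports the type, function, export, and code sections, in that order, with payload sizes of 7, 2, 7, and 9 bytes.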
360+
361+
Let's take an example of the Table Section:

```wasm
04 ;; Section ID: Table Section
04 ;; Section Length: 4 bytes

01 ;; Number of tables: 1
70 00 01 ;; Table type: Element type (funcref), Limits flag (no maximum), Initial size (1)
```

So, here:
- The first `04` is the Section ID for the Table Section.
- The second `04` is the Section Length in bytes, indicating that the rest of the Table Section is 4 bytes long.
- `01` is the number of tables.
- `70` is the element type, `funcref` (called `anyfunc` in early drafts).
- `00 01` are the limits of the table. The first byte `00` is a flag saying that only a minimum size follows, and the second byte `01` is that minimum size. Here, the table has a minimum size of 1 and no maximum. A flag of `01` would be followed by both a minimum and a maximum.
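
As a rough illustration of how a decoder reads the table payload, the Python sketch below decodes the four bytes that follow the section ID and length. The function name `decode_table_payload` is invented for this example, and it assumes every value fits in one byte (real decoders read LEB128 integers). It treats the limits flag `00` as "minimum only" and `01` as "minimum and maximum".

```python
REF_TYPES = {0x70: "funcref", 0x6F: "externref"}


def decode_table_payload(payload):
    """Decode a table section payload, assuming single-byte values throughout."""
    pos = 0
    count = payload[pos]
    pos += 1
    tables = []
    for _ in range(count):
        elem_type = REF_TYPES[payload[pos]]
        flag = payload[pos + 1]
        minimum = payload[pos + 2]
        pos += 3
        maximum = None
        if flag == 0x01:  # flag 01: a maximum follows the minimum
            maximum = payload[pos]
            pos += 1
        tables.append({"type": elem_type, "min": minimum, "max": maximum})
    return tables


print(decode_table_payload(bytes([0x01, 0x70, 0x00, 0x01])))
```

For the bytes `01 70 00 01` this yields one `funcref` table with a minimum of 1 and no maximum.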
375+
376+
Another example, this time of the Global Section:

```wasm
06 ;; Section ID: Global Section
19 ;; Section Length: 25 bytes.

03 ;; Number of Global Variables
7F 01 41 0B 0B ;; Global variable 1: Type (i32), Mutable, Initialization (i32.const 11, end)
;; ... the remaining two global variables are omitted here
```

So, here:
- `06` is the Section ID for the Global Section.
- `19` is the Section Length in bytes (hexadecimal `19` is 25), indicating that the rest of the Global Section is 25 bytes long.
- `03` is the number of global variables. Only the first one is shown.
- `7F` is the global variable type. Here, it's `i32`.
- `01` is the mutability of the global variable. Here, it's mutable (`00` would mean immutable).
- `41 0B` is the initialization value of the global variable. Here, it's `i32.const 11`. The final `0B` is the `end` opcode that terminates the initializer expression.
393+
394+
It's not necessary to understand the binary format in detail, but it's good to know how it's structured and how it works. Knowing it helps with performance optimization, debugging, security auditing, and advanced features. Otherwise, developers can work at a higher level of abstraction using languages like Rust or C++ that compile to WebAssembly, or from JavaScript, which loads and calls the compiled modules.
395+
396+
### Text Format
397+
398+
As we know, the WASM binary format is not human-readable. To make it readable, we have the WebAssembly Text Format (WAT). It's a simple, verbose representation of the binary format, useful for debugging and for understanding the structure of a WebAssembly module.
399+
400+
For example, the following module defines and exports an `add` function:

```wasm
(module
  ;; Define a function that adds two i32 values
  (func $add (param $a i32) (param $b i32) (result i32)
    local.get $a
    local.get $b
    i32.add)
  (export "add" (func $add))
)
```

So, here:

- `(module ...)`: Module's definition. Everything inside this helps construct the module.
- `(func $add (param $a i32) (param $b i32) (result i32) ...)`: Function's definition. Here, we define a function called `$add`.
- `(param $a i32) (param $b i32)`: Parameters of the function. Here, we define two parameters of type `i32`.
- `(result i32)`: Return type of the function. Here, we define the return type of the function as `i32`.
- `local.get $a` and `local.get $b`: Instructions that get the values of the parameters. Older toolchains spell this instruction `get_local`.
- `i32.add`: Instruction that adds the two values together.
- `(export "add" (func $add))`: Export the function. Here, we export the `$add` function under the name `add`.
425+
426+
Seen from another angle, WAT is a language with its own grammar, whitespace, and structure. In the example above, parentheses define scope and structure, while `func`, `param`, and `result` specify the code's functionality. Whitespace and indentation are only there to make the code more readable.
427+
428+
Let's break down the above code and see how it works behind the scenes:

- The first instruction gets the value of `$a` and pushes it onto the stack.
- The second instruction gets the value of `$b` and pushes it onto the stack.
- `i32.add` pops the two values from the stack, adds them together, and pushes the result back onto the stack.
- The function returns the result, which is the single value left on the stack.
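
The stack behaviour described above can be imitated with a tiny interpreter. This is only a sketch in Python, not how a real engine is implemented: the `run` function and the tuple encoding of instructions are assumptions made for illustration, and `local.get` is used as the current spelling of `get_local`.

```python
def run(instructions, local_values):
    """Execute a list of (opcode, *args) tuples on a value stack."""
    stack = []
    for op, *args in instructions:
        if op == "local.get":
            stack.append(local_values[args[0]])
        elif op == "i32.const":
            stack.append(args[0])
        elif op == "i32.add":
            right = stack.pop()
            left = stack.pop()
            stack.append((left + right) & 0xFFFFFFFF)  # wrap to 32 bits
        else:
            raise ValueError(f"unsupported instruction: {op}")
        print(f"{op:<10} stack = {stack}")
    return stack.pop()  # the last value on the stack is the result


ADD_BODY = [("local.get", "a"), ("local.get", "b"), ("i32.add",)]
result = run(ADD_BODY, {"a": 2, "b": 3})
print("result:", result)
```

With `a = 2` and `b = 3`, the stack goes from `[2]` to `[2, 3]` to `[5]`, and the function returns 5.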
434+
435+
Comments are also supported in the WebAssembly Text Format. A line comment starts with `;;` and runs to the end of the line, and a block comment is enclosed in `(;` and `;)`. Comments are ignored by the WebAssembly engine and are used to add notes, explanations, and documentation to the code.
436+
437+
438+
#### Lexical Structure
439+
440+
The lexical structure of WAT is like the grammar of a language: it defines how the code is written and how it's structured. In the example above, we have identifiers like `$add`, `$a`, and `$b`, and keywords like `module`, `func`, `param`, and `result`.
441+
442+
#### Types
443+
444+
In WAT, we have number types like `i32`, `i64`, `f32`, and `f64`. They define the kind of data we are working with: `i32` is a 32-bit integer, `i64` is a 64-bit integer, `f32` is a 32-bit floating point number, and `f64` is a 64-bit floating point number. WebAssembly also supports the vector type `v128` and reference types like `funcref` and `externref` (the garbage collection proposal adds more, such as `anyref`).
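
To get a feel for the sizes of the four number types, the sketch below uses Python's `struct` module as a stand-in: the format codes `<i`, `<q`, `<f`, and `<d` are little-endian 32-bit and 64-bit integers and floats. The `encode` helper and the `FORMATS` table are illustrative names, not part of WebAssembly.

```python
import struct

# struct format codes standing in for the four number types (little-endian)
FORMATS = {"i32": "<i", "i64": "<q", "f32": "<f", "f64": "<d"}


def encode(value_type, value):
    """Pack a Python number into the byte layout of the given type."""
    return struct.pack(FORMATS[value_type], value)


for value_type, value in [("i32", 25), ("i64", 25), ("f32", 25.5), ("f64", 25.5)]:
    raw = encode(value_type, value)
    print(f"{value_type}: {len(raw)} bytes -> {raw.hex(' ')}")
```

The 32-bit types take 4 bytes and the 64-bit types take 8. A value that does not fit, such as `2**31` packed as a signed `i32`, makes `struct.pack` raise `struct.error`, which mirrors the idea that each type can only hold data of the right size and kind.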
445+
446+
An example of vector type:
447+
448+
```wasm
(module
  ;; .... Previous definition ...
  (func $add (param $a v128) (param $b v128) (result v128)
    local.get $a
    local.get $b
    i32x4.add) ;; Treat each v128 as four i32 lanes and add them lane by lane

  ;; .... other functions ...

  (export "add" (func $add))

  ;; .... other exports ...
)
```
463+
464+
An example of reference type:

```wasm
(module
  ;; .... Previous definition ...
  ;; References are opaque: they can be passed around and tested for null,
  ;; but there is no arithmetic on them.
  (func $is_null (param $r externref) (result i32)
    local.get $r
    ref.is_null)

  ;; .... other functions ...

  (export "is_null" (func $is_null))

  ;; .... other exports ...
)
```
481+
482+
#### Instructions
483+
484+
They define the actions that the WebAssembly engine can perform. For example, `i32.add` is an instruction that adds two 32-bit integers.
485+
486+
##### if-else block
487+
488+
```wasm
(module
  ;; .... Previous definition ...
  (func $add (param $a i32) (param $b i32) (result i32)
    local.get $a
    i32.const 0
    i32.lt_s ;; Check if $a is less than 0 (signed comparison)
    if (result i32)
      i32.const 0 ;; If $a is less than 0, return 0
    else
      local.get $a
      local.get $b
      i32.add ;; Otherwise, add $a and $b
    end)
  ;; .... other functions ...
)
```
505+
506+
Here `if` and `else` are instructions that define a conditional block. These blocks can be nested within each other for complex conditions.
507+
508+
509+
##### Loop block
510+
511+
```wasm
(module
  ;; .... Previous definition ...
  ;; Adds $a to $b one step at a time; assumes $a is at least 1
  (func $add (param $a i32) (param $b i32) (result i32)
    loop $again
      ;; $b = $b + 1
      local.get $b
      i32.const 1
      i32.add
      local.set $b
      ;; $a = $a - 1, keeping the new value on the stack
      local.get $a
      i32.const 1
      i32.sub
      local.tee $a
      ;; Repeat while $a is not 0
      i32.const 0
      i32.ne
      br_if $again
    end
    local.get $b)
  ;; .... other functions ...
)
```

Here `loop` is an instruction that defines a loop block. A branch that targets a loop's label jumps back to the start of the loop, so `br_if $again` repeats the body while its condition is true. When the condition is false, execution falls through `end` and leaves the loop.

So here the way it's working is:

- `loop $again` starts the loop block and labels it `$again`.
- `local.get $b`, `i32.const 1`, `i32.add`, and `local.set $b` add `1` to `$b`.
- `local.get $a` gets the value of `$a`.
- `i32.const 1` pushes the value `1` onto the stack.
- `i32.sub` subtracts `1` from `$a`.
- `local.tee $a` stores the result back in `$a` and also leaves a copy of it on the stack.
- `i32.const 0` pushes the value `0` onto the stack.
- `i32.ne` checks if `$a` is not equal to `0`.
- `br_if $again` jumps back to the start of the loop if `$a` is not `0`.
- `end` ends the loop block, and `local.get $b` leaves the result on the stack.
545+
546+
## WASI - WebAssembly System Interface
547+
548+
WASI (WebAssembly System Interface) is a system interface for WebAssembly. It provides a set of APIs that allow WebAssembly modules to interact with the host environment in a secure and controlled manner. It allows WebAssembly modules to access system resources like files, network, and environment variables.
549+
550+
![WASI](https://github.com/user-attachments/assets/e701c4dc-bcaf-4f74-b4f3-a7d4940a4535)
551+
552+
Without WASI, WASM modules can't access the file system, network, or other system resources. This isn't a flaw; it's by design. A few more points help in understanding WASI's design:

- It ensures that whether a person is using the application in Japan, France, or the US, on a variety of devices, the application works the same way.
- It only unlocks specific permissions. For example, if a WebAssembly module needs to read a file, it is granted access to the file system and not to the network or other resources.
- WASI brings dynamic capabilities to otherwise static web pages.
- It lets an application incorporate specific functionality without burdening it with unnecessary features.
558+
559+
![WASI](https://github.com/user-attachments/assets/a271f1fb-32f5-4356-997d-6297037d2ac6)
560+
561+
### WASI Functions
562+
563+
WASI provides a set of functions that allow WebAssembly modules to interact with the host environment. These functions are divided into different categories like:
564+
565+
![WASI Functions](https://github.com/user-attachments/assets/6e324ae0-e74b-4b53-a72b-9932be008004)
566+
567+
- **File Operations**: Functions like `fd_read`, `fd_write`, `fd_seek`, etc. allow WebAssembly modules to read and write files.
568+
- **Network Activities**: Functions like `sock_send`, `sock_recv`, `sock_shutdown`, etc. allow WebAssembly modules to send and receive data over the network.
569+
- **System Information**: Functions like `args_sizes_get`, `args_get`, `environ_sizes_get`, `environ_get`, etc. allow WebAssembly modules to access system information like command-line arguments, environment variables, etc.
570+
- **Time and Clock**: Functions like `clock_time_get`, `clock_res_get`, etc. allow WebAssembly modules to access time and clock information.
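
To show what a call like `fd_read` looks like from the host's point of view, here is a toy Python model. It assumes the preview1 calling convention, where the module passes a pointer to an array of iovecs (each a 32-bit buffer pointer followed by a 32-bit buffer length) and a pointer where the number of bytes read is written. The `memory` bytearray, the `host_fd_read` function, and the `source` argument (standing in for the file behind a file descriptor) are all hypothetical names for this sketch.

```python
import struct

memory = bytearray(64)  # stand-in for the module's linear memory


def host_fd_read(source, iovs_ptr, iovs_len, nread_ptr):
    """Toy host-side fd_read: copy bytes from `source` into the guest's buffers."""
    total = 0
    for i in range(iovs_len):
        # Each iovec is two little-endian u32 values: buffer pointer, buffer length
        buf_ptr, buf_len = struct.unpack_from("<II", memory, iovs_ptr + 8 * i)
        chunk = source[total:total + buf_len]
        memory[buf_ptr:buf_ptr + len(chunk)] = chunk
        total += len(chunk)
    struct.pack_into("<I", memory, nread_ptr, total)  # report bytes read
    return 0  # errno 0 means success


# Guest side: one iovec at offset 0 describing a 16-byte buffer at offset 16
struct.pack_into("<II", memory, 0, 16, 16)
errno = host_fd_read(b"hello", iovs_ptr=0, iovs_len=1, nread_ptr=8)
nread = struct.unpack_from("<I", memory, 8)[0]
print(errno, nread, bytes(memory[16:16 + nread]))
```

The module never touches the file itself. It only describes where in its own memory the host may write, which is how WASI keeps access controlled.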
571+
572+
An example of calling the `fd_read` function:

```wasm
(module
  ;; Import the fd_read function from the WASI module
  (import "wasi_snapshot_preview1" "fd_read"
    (func $fd_read (param i32 i32 i32 i32) (result i32)))

  ;; One page of memory, exported as "memory" so the WASI host can write into it
  (memory (export "memory") 1)

  ;; A function to read data from a file descriptor.
  ;; Illustrative layout: iovec at offset 0, bytes-read count at 8, buffer at 16
  (func $read_file (result i32)
    (i32.store (i32.const 0) (i32.const 16)) ;; iovec: pointer to the buffer
    (i32.store (i32.const 4) (i32.const 64)) ;; iovec: length of the buffer

    (call $fd_read
      (i32.const 0)  ;; File descriptor (0 is stdin)
      (i32.const 0)  ;; pointer to the iovec array
      (i32.const 1)  ;; number of iovecs
      (i32.const 8)) ;; pointer to memory location where the bytes-read count will be stored
    ;; The returned i32 is a WASI errno (0 means success)
  )
  ;; Export our read_file function
  (export "read_file" (func $read_file))
)
```
