Commit fbfb88c

fix grammer module database devops
1 parent bcd61b0 commit fbfb88c

13 files changed, +123 -123 lines

database/cockroachdb.md (+3 -3)

@@ -2,10 +2,10 @@
 
 ![](./screen/CockroachDB.png)
 
-- `CockroachDB` is mainly a distributed, replicated, transactional key value store.
+- `CockroachDB` is mainly a distributed, replicated, transactional key-value store.
 
 - [The Challenges of Writing a Massive and Complex Go Application](https://youtu.be/hWNwI5q01gI)
 - Design decisions while building cockroachDB
-- Gc cost is based on number of allocations, not number of bytes
-- Values thst are used together can be allocated in one struct
+- Gc cost is based on the number of allocations, not the number of bytes
+- Values that are used together can be allocated in one struct
 - [Incomplete]

database/elastic search.md (+11 -11)

@@ -5,47 +5,47 @@
 - Elastic Search forms with index
 - Index (Logical namespace / DB Name)
 - Divided into Shrads (Types / Table / Schema)
-- Each shrad has replica
-- Each shrad is Lucene index
+- Each shard has replica
+- Each shard is Lucene index
 - For inverted search
 - Divided into Segments
 - Inverted index
 - Segments are immutable
-- Apache lucene
+- Apache Lucene
 - Powerful open-source full-text search library
 - DB normalize
-- Case dispose, punctuation
+- Case disposal, punctuation
 - Common word remove
 - Stop word list
 - Lemma
 - Mapping different usage to one
 - Tokenization
-- Stemming > Buying > Buy
+- Stemming > Buying > Buy
 - After 1 second index refresh happens
 - Doc indexed
 - Translog (30 min or 512mb)
 - Master node and Data Node available
 
 - Inverted index
-- A document is the unit of data in Elasticsearch and an inverted index is created by tokenizing the terms in the document, creating a sorted list of all unique terms and associating a list of documents with where the word can be found.
+- A document is the unit of data in Elasticsearch and an inverted index is created by tokenizing the terms in the document, creating a sorted list of all unique terms, and associating a list of documents with where the word can be found.
 - Record level inverted index (word -> record)
 - Word level inverted index (word -> record, position )
 
 ### Internal Usage
 - Porters stemmer algo
 - The Porter stemming algorithm (or 'Porter stemmer') is a process for removing the commoner morphological and inflexional endings from words in English.
-- Its main use is as part of a term normalisation process that is usually done when setting up Information Retrieval systems.
+- Its main use is as part of a term normalization process that is usually done when setting up Information Retrieval systems.
 - Murmur3 hash function
 - Why indexing?
-- Database blocks stored like linked list
+- Database blocks stored like a linked list
 - For searching N blocks
 - when not sorted searching requires `N/2` access avg
 - when duplicates then `N`
 - when sorted `logN`
 
 ### Database Indexing
 
-- Creating an index on a field in a table creates another data structure which holds the field value, and a pointer to the record it relates to.
+- Creating an index on a field in a table creates another data structure that holds the field value and a pointer to the record it relates to.
 
 - This index structure is then sorted, allowing Binary Searches to be performed on it.
 
@@ -64,13 +64,13 @@
 - All records map to index
 - Sparse index
 - Few records map to sorted index
-- Sparse index not possible if records not organized/sorted
+- Sparse index is not possible if records are not organized/sorted
 
 - Graph database which considers relation and data
 secondary index
 
 ### Notes
-- Low cardinality in database
+- Low cardinality in the database
 - Common attribute
 - Easy for indexing
 - Made efficient through bitmap indexing
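The notes above describe the inverted index: tokenize each document, collect the unique terms, and map every term back to the documents (and positions) where it occurs. A minimal sketch in Python — the whitespace/lowercase tokenizer here is a deliberate simplification of the full analysis chain (stop words, stemming, lemmatization) described above:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Build a word-level inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Simplified normalization: lowercase + whitespace split.
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index

docs = {1: "Buy cheap books", 2: "Books are cheap to buy"}
index = build_inverted_index(docs)
index["cheap"]  # both documents contain "cheap", with its position in each
```

Dropping the positions gives the record-level variant (term -> documents only); keeping them, as here, gives the word-level variant that supports phrase and proximity queries.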

database/embedded database.md (+14 -14)

@@ -1,51 +1,51 @@
 # Embedded Database
 
 ### Level DB
-- Has too much mutex contension
+- Has too much mutex contention
 - Bitcoin core, go ethereum uses it
 - Sqlite used in past chrome, now uses leveldb
 - Has different fork `rocksdb`, `hyperLevelDB`
 
 ### RocksDB
 - Log Structured Merge Tree
 - MemTable / SSTable
-- InnoDB used in mySQL
+- InnoDB used in MySQL
 
 ### References
 - [DropBox Engineering Evening on RocksDB with Dhruba Borthakur @ Rockset](https://www.youtube.com/watch?v=aKAJMd0iKtI&ab_channel=DhrubaBorthakur)
 
 - [Embedded Database: RocksDB](youtube.com/watch?v=V_C-T5S-w8g)
-- Shows benchmark between sqlite, levelDB, kyoto TreeDB
+- Shows benchmark between SQLite, level DB, Kyoto TreeDB
 - `LevelDB` was good at random reads and random write
-- In LSM database, the amount of data you can write directly proportional to how fast you can compact.
-- `Bloom filter` not very useful when you do range caps
+- In the LSM database, the amount of data you can write is directly proportional to how fast you can compact.
+- The `Bloom filter` is not very useful when you do range caps
 - Prefix Scan for locality search
-- Range scans with same key prefix
+- Range scans with the same key prefix
 - Blooms created for prefix
 - Reduce read amplification
-- Thread aware compaction used on top of leveldb
-- `Write amplification` change. Compared to how many bytes you write to the database, how many times it needs to be re written
+- Thread-aware compaction used on top of leveldb
+- `Write amplification` change. Compared to how many bytes you write to the database, how many times it needs to be re-written
 - `Read amplification` resolved
 - `Read modify write`
 
 - [RockDB internals](https://www.youtube.com/watch?v=aKAJMd0iKtI)
 - Everything is pluggable
-- Each block has index and filter block
-- Each block has starting and end index in block to perform binary search
+- Each block has an index and filter block
+- Each block has a starting and end index in block to perform a binary search
 - Database shadowing
 
 - [RocksDB Port](https://youtu.be/jGCv4r8CJEI)
 - MySQL, Mongo to use rocksdb as storage engine
 
 - [LSM Tree](https://www.youtube.com/watch?v=V1iqN2ie__w)
 - `b+ tree` used when we need less search and insertion time
-- `lsm tree` when we have write intensive dataset
+- `lsm tree` when we have written an intensive dataset
 - Write 0(1), Read logn
 - Four key concepts
-- wal - write ahead log
-- memtable - batching write
+- wal - write-ahead log
+- memorable - batching write
 - compaction - making efficient
 - bloom filters - to discard, make query efficient
 
-- [LSM based Storage Techniques Strengths and Trade Offs (SDC 2019)](https://www.youtube.com/watch?v=V1iqN2ie__w)
+- [LSM-based Storage Techniques Strengths and Trade-Offs (SDC 2019)](https://www.youtube.com/watch?v=V1iqN2ie__w)
 - To Be Continued
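The LSM concepts in the notes above (write-ahead log, memtable batching writes, immutable sorted runs) can be sketched as a toy in Python — an illustration of the idea only, not how LevelDB or RocksDB are actually laid out, and omitting compaction and bloom filters:

```python
class ToyLSM:
    """Toy LSM tree: writes go to a WAL plus an in-memory memtable (O(1));
    a full memtable is flushed to an immutable sorted run (an "SSTable")."""

    def __init__(self, memtable_limit=4):
        self.wal = []              # write-ahead log: replayed after a crash
        self.memtable = {}         # batches recent writes in memory
        self.sstables = []         # immutable sorted runs, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.wal.append((key, value))    # durability first
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Memtable becomes an immutable sorted run; WAL can be truncated.
        self.sstables.insert(0, sorted(self.memtable.items()))
        self.memtable, self.wal = {}, []

    def get(self, key):
        if key in self.memtable:         # newest data wins
            return self.memtable[key]
        for run in self.sstables:        # then scan runs, newest first
            for k, v in run:
                if k == key:
                    return v
        return None
```

Reads may touch the memtable and every run (read amplification); compaction merges runs and bloom filters skip runs that cannot hold the key, which is what the notes mean by making queries efficient.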

database/mnesia.md (+10 -10)

@@ -1,26 +1,26 @@
 # Mnesia
 
-- Mnesia combines many concepts found in traditional databases such as transactions and queries with concepts found in data management systems for telecommunications applications such as very fast realtime operations, configurable degree of fault tolerance (by means of replication) and the ability to reconfigure the system without stopping or suspending it.
+- Mnesia combines many concepts found in traditional databases such as transactions and queries with concepts found in data management systems for telecommunications applications such as very fast real-time operations, configurable degree of fault tolerance (using replication), and the ability to reconfigure the system without stopping or suspending it.
 
 - Mnesia is also interesting due to its tight coupling to the programming language Erlang, thus almost turning Erlang into a database programming language.
 
-- This has many benefits, the foremost being that the impedance mismatch between data format used by the DBMS and data format used by the programming language which is being used to manipulate the data, completely disappears.
+- This has many benefits, the foremost being that the impedance mismatch between the data format used by the DBMS and the data format used by the programming language that is being used to manipulate the data, completely disappears.
 
-- Mnesia has four methods of reading from database: read, match_object, select, qlc.
-- `read` always uses a Key-lookup on the keypos. It is basically the key-value lookup.
+- Mnesia has four methods of reading from the database: read, match_object, select, qlc.
+- `read` always uses a key lookup on the key position. It is the key-value lookup.
 - `match_object` and select will optimize the query if it can on the keypos key. That is, it only uses that key for optimization. It never utilizes further index types.
-- `qlc` has a query-compiler and will attempt to use additional indexes if possible, but it all depends on the query planner and if it triggers. erl -man qlc has the details and you can ask it to output its plan.
+- `qlc` has a query compiler and will attempt to use additional indexes if possible, but it all depends on the query planner and if it triggers. Erl -man qlc has the details and you can ask it to output its plan.
 - read is just a `key-value` lookup, also functions `index_read` and `index_write`
 
 - A power of two number of fragments is simply related to the fact the default fragmentation module `mnesia_frag` uses linear hashing so using `2^n` fragments assures that records are equally distributed (more or less, obviously) between fragments.
 
 - Using `disc_only_copies` most of the time is spent in two operations:
 - Decide which fragment holds which record
-- Retrieve the record from corresponding dets table (Mnesia backend)
+- Retrieve the record from the corresponding dets table (Mnesia backend)
 
 - `DCD` and `DCL` files represent `disc_copies` tables. The `DCD` is an image of the contents from the latest time the table was "dumped", while the `DCL` contains a log of the side-effects made to that table since it was dumped. A dump creates a new `DCD` and removes the `DCL`.
 
-- `DAT` files are `DETS:es` which contain `disc_only_copies` tables.
+- `DAT` files are `DETS: es` which contain `disc_only_copies` tables.
 
 - `SCHEMA.DAT` is a special DETS that contains the schema for that Mnesia instance.
 
@@ -29,15 +29,15 @@
 - Applies argument `Fun` to all records in the table.
 - `Fun` is a function that takes a record of the old type and returns a transformed record of the new type.
 - Argument `Fun` can also be the atom ignore, which indicates that only the metadata about the table is updated.
-- Use of ignore is not recommended, but included as a possibility for the user do to an own transformation.
+- Use of ignore is not recommended but included as a possibility for the user do to their own transformation.
 
 - `NewAttributeList` and `NewRecordName` specify the attributes and the new record type of the converted table.
 
 - Table name always remains unchanged. If `record_name` is changed, only the Mnesia functions that use table identifiers work, for example, `mnesia:write/3` works, but not `mnesia:write/1`.
 
-- A good solution could be to have more fragments and less records per fragment but trying at the same time to find the middle ground and not lose the advantages of some hard disk performance boosts like buffers and caches.
+- A good solution could be to have more fragments and fewer records per fragment but trying at the same time to find the middle ground and not lose the advantages of some hard disk performance boosts like buffers and caches.
 
-- Very often start with a few `ETS` tables when prototyping (or even on early versions of features in production), then start finding places I need data serialized and on disk, then realize I need multiple indexes, etc. and wind up moving a lot of the data management stuff that was initially in ETS into Mnesia anyway.
+- Very often start with a few `ETS` tables when prototyping (or even on early versions of features in production), then start finding places I need data serialized and on disk, then realize I need multiple indexes, etc., and wind up moving a lot of the data management stuff that was initially in ETS into Mnesia anyway.
 
 - If abstracted away the concept of data access properly it is not an issue to change the implementation of this part of your system either way.
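The note above on power-of-two fragment counts can be illustrated: with `2^n` fragments, `hash mod 2^n` reduces to masking the hash's low bits, and a uniform hash spreads records evenly across fragments. A sketch in Python — `mnesia_frag` actually hashes the record key with Erlang's own hash inside its linear-hashing scheme; the MD5-based hash here is only a stand-in:

```python
import hashlib
from collections import Counter

def fragment_for(key, n_fragments):
    """Pick the fragment holding a key; for n_fragments = 2**n the
    modulo is a cheap bit mask over the hash's low bits."""
    assert n_fragments & (n_fragments - 1) == 0, "use a power-of-two fragment count"
    h = int.from_bytes(hashlib.md5(str(key).encode()).digest()[:8], "big")
    return h & (n_fragments - 1)     # same as h % n_fragments for powers of two

# 10,000 keys land (roughly) evenly across 8 fragments:
spread = Counter(fragment_for(k, 8) for k in range(10_000))
```

With a non-power-of-two fragment count the distribution under linear hashing becomes uneven, which is why the notes recommend `2^n`.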

database/mongoDB.md (+2 -2)

@@ -1,6 +1,6 @@
 # NoSQL
 
-- No SQL duplicates but first read fast write slow but read write ration `7000:1`
+- No SQL duplicates but first read fast write slow but read-write ratio `7000:1`
 Cloud firestore
 
-- Uses ` WiredTiger storage engine`
+- Uses ` WiredTiger storage engine

database/mysql.md (+3 -3)

@@ -2,11 +2,11 @@
 
 ![](./Screen/Database.png )
 
-- `Mysql` has nine storage engine.
+- `Mysql` has nine storage engines.
 - `InnoDB` locks record level
 - `MYSIum`
 
 - In Memory database uses `AVL tree`
-- Used highly in real time header bidding to show advertisement
+- Used highly in real-time header bidding to show an advertisement
 - Relational database uses `B+ tree`
-- Set uses `Red black tree`
+- Set uses `Red Black Tree`
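The common thread of the tree structures listed above (AVL, B+, red-black) is keeping keys ordered so a lookup is `O(log n)`. Python's stdlib has no balanced tree, but binary search over a sorted array via `bisect` shows the same logarithmic probe these indexes rely on — an illustration of the shared property, not of MySQL internals:

```python
import bisect

def index_lookup(sorted_keys, key):
    """Binary search over sorted keys: O(log n), like a tree-index probe."""
    i = bisect.bisect_left(sorted_keys, key)
    return i if i < len(sorted_keys) and sorted_keys[i] == key else None

keys = sorted(range(0, 1000, 2))   # even keys only, kept sorted
index_lookup(keys, 42)   # hit: returns its position in the index
index_lookup(keys, 43)   # miss: returns None
```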

database/sqlite.md (+14 -14)

@@ -1,6 +1,6 @@
 # SQLite
 
-- An `application file format` is the file format used to persist application state to disk or to exchange information between programs. It's alternative is `fopen`.
+- An `application file format` is the file format used to persist application state to disk or to exchange information between programs. Its alternative is `fopen`.
 
 - `file format` vs `application format`
 
@@ -10,42 +10,42 @@
 
 - Sqlite file is application format.
 
-### Full Text Search
-- FTS3 has smaller file, slower search
-- FTS4 has bigger file, faster search
+### Full-Text Search
+- FTS3 has a smaller file, a slower search
+- FTS4 has bigger files, faster search
 
 - rowID
 - Integer Primary Key highly optimized
 
 - Isolation
 - Journal mode
 - Changes are written directly into the database file
-- while simultaneously a separate rollback journal file is constructed that is able to restore the database to its original state if the transaction rolls back.
+- while simultaneously a separate rollback journal file is constructed that can restore the database to its original state if the transaction rolls back.
 - Two times write.
 - In rollback mode, SQLite implements isolation by locking the database file and preventing any reads by other database connections while each write transaction is underway.
-- Readers can be be active at the beginning of a write, before any content is flushed to disk and while all changes are still held in the writer's private memory space.
-- But before any changes are made to the database file on disk, all readers must be (temporally) expelled in order to give the writer exclusive access to the database file.
-- Readers are prohibited from seeing incomplete transactions by virtue of being locked out of the database while the transaction is being written to disk.
+- Readers can be active at the beginning of a write before any content is flushed to disk and while all changes are still held in the writer's private memory space.
+- But before any changes are made to the database file on disk, all readers must be (temporally) expelled to give the writer exclusive access to the database file.
+- Readers are prohibited from seeing incomplete transactions by being locked out of the database while the transaction is being written to disk.
 - Only after the transaction is completely written and synced to disk and commits are the readers allowed back into the database.
 - Hence readers never get a chance to see partially written changes.
 
 - [WAL mode](https://www.sqlite.org/wal.html)
 - WAL mode permits simultaneous readers and writers.
 - It can do this because changes do not overwrite the original database file, but rather go into the separate write-ahead log file.
-- It means that readers can continue to read the old, original, unaltered content from the original database file at the same time that the writer is appending to the write ahead log.
+- It means that readers can continue to read the old, original, unaltered content from the original database file at the same time that the writer is appending to the write-ahead log.
 - In WAL mode, SQLite exhibits "snapshot isolation".
 - When a read transaction starts, that reader continues to see an unchanging `snapshot` of the database file as it existed at the moment in time when the read transaction started.
-- Any write transactions that commit while the read transaction is active are still invisible to the read transaction, because the reader is seeing a snapshot of database file from a prior moment in time.
+- Any write transactions that commit while the read transaction is active are still invisible to the read transaction because the reader is seeing a snapshot of the database file from a prior moment in time.
 
 - Without file param used as `in memory` database.
 
-- Functional difference between STORED columns cannot be added using the `ALTER TABLE ADD COLUMN` command.
+- Functional differences between STORED columns cannot be added using the `ALTER TABLE ADD COLUMN` command.
 
 - Only VIRTUAL columns can be added using `ALTER TABLE`.
 
-- Strict Typed tables more common though dynamic type used by default in sqlite. If a type is not convertible then it stores as that.
+- Strict Typed tables are more common though dynamic type used by default in SQLite. If a type is not convertible then it stores as that.
 
-- Partial indexing used in many case.
+- Partial indexing is used in many cases.
 
 - The life-cycle of a prepared statement object usually goes like this:
 - Create the prepared statement object using `sqlite3_prepare_v2()`.
@@ -54,7 +54,7 @@
 - Reset the prepared statement using `sqlite3_reset()` then go back to step 2. Do this zero or more times.
 - Destroy the object using `sqlite3_finalize()`.
 
-- SQLite `vaccum` command optimizes size.
+- SQLite `vacuum` command optimizes size.
 
 ### References
 - [VFS](https://www.sqlite.org/vfs.html)
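The prepare/bind/step/reset/finalize lifecycle in the notes above is the C API; Python's stdlib `sqlite3` module wraps it, with `?` placeholders for binding and `executemany` mirroring the rebind-and-step loop over one prepared statement. A small sketch, also touching the in-memory and `VACUUM` notes:

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # no file param -> in-memory database
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")  # rowid-backed

# One statement prepared once, then bound and stepped per row
# (the reset/rebind loop of the C lifecycle, handled internally):
conn.executemany("INSERT INTO t (name) VALUES (?)", [("a",), ("b",)])
conn.commit()

rows = conn.execute("SELECT id, name FROM t ORDER BY id").fetchall()
conn.execute("VACUUM")               # rebuilds the database to reclaim space
conn.close()                         # statements are finalized on close
```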

database/time series db.md (+1 -1)

@@ -1,4 +1,4 @@
-# Time series Database
+# Time Series Database
 
 ### References
 - [InfluxDB Storage Engine Internals | Metamarkets](https://www.youtube.com/watch?v=rtEalnKT25I)
