bson/_rbson/README.md
Lines changed: 88 additions & 15 deletions
@@ -183,20 +183,94 @@ When these missing features were added to achieve 100% compatibility, the true p
 - Edge case handling for all 88 tests
 3. **The Fundamental Issue**: Both implementations suffer from the same architectural limitation (Python → Bson enum → bytes), but it only becomes a significant bottleneck when you implement all the features required for production use.

+## Direct Byte-Writing Performance Results
+
+### Implementation: `_dict_to_bson_direct()`
+
+A new implementation has been added that writes BSON bytes directly from Python objects without converting to `Bson` enum types first. This eliminates the intermediate conversion layer.
+1. **Massive speedup for simple types**: 4.51x faster for documents with Python native types
+2. **Consistent ~2x improvement for real-world documents**: All realistic mixed-type documents show a 1.77x to 2.28x speedup
+3. **Slight slowdown for pure BSON types**: Documents with only BSON-specific types (ObjectId, Binary, etc.) are 10% slower due to extra Python attribute lookups
+4. **100% correctness**: All outputs were verified to be byte-identical to those of the regular implementation
+
+### Why Direct Byte-Writing is Faster
+
+1. **Eliminates heap allocations**: No need to create intermediate `Bson` enum values
+2. **Reduces function call overhead**: Writes bytes immediately instead of going through `python_to_bson()` → `write_bson_value()`
+3. **Better for common types**: Python's native types (int, str, float, bool) can be written directly without any conversion
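
The difference between the two paths can be sketched in pure Python. This is an illustration only, not the Rust code from this PR: `Tagged`, `encode_two_stage()`, and `encode_direct()` are hypothetical names, `Tagged` stands in for the `Bson` enum, and only a handful of native types are handled. The two-stage version first wraps every value in an intermediate object and then serializes it, while the direct version appends bytes to the output buffer as soon as it sees each value.

```python
import struct
from dataclasses import dataclass


@dataclass
class Tagged:
    """Hypothetical stand-in for an intermediate `Bson` enum value."""
    type_byte: int
    payload: bytes


def _encode(value):
    """Return (BSON type byte, payload bytes) for a few native Python types."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return 0x08, b"\x01" if value else b"\x00"
    if isinstance(value, int):
        if -2**31 <= value < 2**31:
            return 0x10, struct.pack("<i", value)   # int32
        return 0x12, struct.pack("<q", value)       # int64
    if isinstance(value, float):
        return 0x01, struct.pack("<d", value)       # double
    if isinstance(value, str):
        raw = value.encode("utf-8") + b"\x00"
        return 0x02, struct.pack("<i", len(raw)) + raw
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def _finish(body):
    # The total length counts the 4-byte length prefix and the trailing NUL.
    return struct.pack("<i", len(body) + 5) + bytes(body) + b"\x00"


def encode_two_stage(doc):
    """Convert every value to an intermediate object, then serialize."""
    intermediate = [(key, Tagged(*_encode(value))) for key, value in doc.items()]
    body = bytearray()
    for key, item in intermediate:
        body.append(item.type_byte)
        body += key.encode("utf-8") + b"\x00"
        body += item.payload
    return _finish(body)


def encode_direct(doc):
    """Write each element into the buffer immediately, with no wrapper objects."""
    body = bytearray()
    for key, value in doc.items():
        type_byte, payload = _encode(value)
        body.append(type_byte)
        body += key.encode("utf-8") + b"\x00"
        body += payload
    return _finish(body)


sample = {"name": "rbson", "count": 3, "ratio": 0.43, "ok": True}
assert encode_direct(sample) == encode_two_stage(sample)
```

Both functions produce the same bytes. The only difference is whether a wrapper object is allocated per value before anything is written, which is the overhead the direct path is described as removing.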
+
+### Implementation Details
+
+The direct approach is implemented in these functions:
If all optimizations are implemented successfully:
-- Current: 0.21x (5x slower)
-- Target: 0.21x × 3.5 × 1.3 × 1.2 × 1.05 = **~1.13x** (13% faster than C)
+### Combined Potential (Updated with Direct Byte-Writing Results)
+With direct byte-writing implemented:
+- **Before**: 0.21x (5x slower than C)
+- **After direct byte-writing**: 0.43x (2.3x slower than C) ✅
+- **With all optimizations**: 0.43x × 1.3 × 1.2 × 1.05 = **~0.70x** (1.4x slower than C)
+- **Optimistic target**: Could potentially reach **~0.9x to 1.0x** (parity with C)
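
The projection above is a plain product of multipliers, so it can be re-checked in a few lines. The sketch below uses only the figures quoted in the list (0.21x, 0.43x, and the 1.3, 1.2, and 1.05 estimates); the variable names and the `slowdown()` helper are illustrative, not part of the project.

```python
import math

baseline = 0.21                      # speed relative to the C extension (1.0 = parity)
after_direct = 0.43                  # measured after direct byte-writing
remaining_gains = [1.3, 1.2, 1.05]   # estimated multipliers for the remaining optimizations


def slowdown(relative_speed):
    """Convert 'fraction of C speed' into 'times slower than C'."""
    return 1.0 / relative_speed


direct_gain = after_direct / baseline
projected = after_direct * math.prod(remaining_gains)

print(f"gain from direct byte-writing: {direct_gain:.2f}x")   # 2.05x
print(f"projected relative speed: {projected:.2f}x")          # 0.70x
print(f"projected slowdown vs C: {slowdown(projected):.2f}x") # 1.42x
```

The remaining multipliers are estimates, so the projected figure is only as reliable as those estimates.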

-However, achieving this would require:
-- Significant engineering effort (weeks to months)
-- Bypassing the `bson` crate (losing its benefits)
-- Complex low-level code (harder to maintain)
+The direct byte-writing approach has already delivered the largest performance gain so far (about 2x, from 0.21x to 0.43x of C extension speed). Additional optimizations could close more of the remaining gap to C extension performance.