Commit e0d72ac

[EM] Check whether memory policy is set. (#11556)
1 parent 70e47c0 commit e0d72ac

10 files changed: 486 additions & 43 deletions

doc/tutorials/external_memory.rst

Lines changed: 34 additions & 0 deletions
@@ -274,6 +274,33 @@ with version ``>=565.47`` is required, it should come with CTK 12.7 and later
 versions. Lastly, there's a known issue with Linux 6.11 that can lead to CUDA host memory
 allocation failure with an ``invalid argument`` error.

+.. _extmem-adaptive-cache:
+
+==============
+Adaptive Cache
+==============
+
+Starting with 3.1, XGBoost introduces an adaptive cache for GPU-based external memory
+training. The feature splits the data cache into a host cache and a device cache. By
+keeping a portion of the cache on the GPU, we can reduce the amount of data transfer
+during training when there's a sufficient amount of GPU memory. The feature can be
+controlled by the ``cache_host_ratio`` parameter in the
+:py:class:`xgboost.ExtMemQuantileDMatrix`. It is disabled when the device has full C2C
+bandwidth since it's not needed there. On devices with reduced bandwidth or devices
+with PCIe connections, unless explicitly specified, the ratio is automatically estimated
+based on device memory size and the size of the dataset.
+
+However, this parameter increases memory fragmentation as XGBoost needs large memory pages
+with irregular sizes. As a result, you might see an out-of-memory error after the
+construction of the ``DMatrix`` but before the actual training begins.
+
+For reference, we tested the adaptive cache with a 128GB (512 features) dense 32-bit
+floating-point dataset using an NVIDIA A6000 GPU, which comes with 48GB of device memory. The
+``cache_host_ratio`` was estimated to be about 0.3, meaning about 30 percent of the
+quantized cache was on the host and the remaining 70 percent was in-core. Given this
+ratio, the overhead is minimal. However, the estimated ratio increases as the data size
+grows.
+
 ================================
 Non-Uniform Memory Access (NUMA)
 ================================
@@ -314,6 +341,13 @@ shown below, the `GPU0` is associated with the `0` node ID::
   NIC2 NODE SYS NODE NODE X SYS
   NIC3 SYS NODE SYS SYS SYS X

+Alternatively, one can use the ``hwloc`` command line interface. Please make sure the
+``--strict`` flag is used:
+
+.. code-block:: sh
+
+  hwloc-bind --strict --membind node:${NODEID} --cpubind node:${NODEID} ./myapp
+
 Another approach is to use the CPU affinity. The `dask-cuda
 <https://github.com/rapidsai/dask-cuda>`__ project configures optimal CPU affinity for the
 Dask interface through using the `nvml` library in addition to the Linux sched

python-package/xgboost/core.py

Lines changed: 2 additions & 0 deletions
@@ -1841,6 +1841,8 @@ def __init__( # pylint: disable=super-init-not-called
             parameter specifies the size of host cache compared to the size of the
             entire cache: :math:`host / (host + device)`.

+            See :ref:`extmem-adaptive-cache` for more info.
+
         """
         self.max_bin = max_bin
         self.missing = missing if missing is not None else np.nan
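
The docstring above defines the ratio as host / (host + device). As a rough illustration of that arithmetic only, the sketch below splits a cache of a given size into a host part and a device part. The names `CacheSplit` and `SplitCache` are hypothetical and do not exist in XGBoost, and the sketch says nothing about how XGBoost estimates the ratio automatically; the range check mirrors the `cache_host_ratio` error message in `error_msg.h`.

```cpp
#include <cassert>    // for assert
#include <cstddef>    // for size_t
#include <stdexcept>  // for invalid_argument

// Hypothetical illustration type; not part of XGBoost.
struct CacheSplit {
  std::size_t host_bytes;    // portion of the cache kept in host memory
  std::size_t device_bytes;  // portion of the cache kept in device memory
};

// Split `total_bytes` so that host / (host + device) is approximately `cache_host_ratio`.
// Assumption: `total_bytes` is far below 2^53, so the double arithmetic is precise enough.
[[nodiscard]] inline CacheSplit SplitCache(std::size_t total_bytes, double cache_host_ratio) {
  // The negated form also rejects NaN.
  if (!(cache_host_ratio >= 0.0 && cache_host_ratio <= 1.0)) {
    throw std::invalid_argument{"`cache_host_ratio` must be in range [0, 1]."};
  }
  // Truncate toward zero; any remainder goes to the device side.
  auto host = static_cast<std::size_t>(static_cast<double>(total_bytes) * cache_host_ratio);
  if (host > total_bytes) {
    host = total_bytes;  // guard against rounding past the total
  }
  std::size_t device = total_bytes - host;
  assert(host + device == total_bytes);
  return {host, device};
}
```

With a ratio of about 0.3, as in the A6000 measurement quoted in the tutorial, roughly 30 percent of the cache would land on the host side and the rest on the device. A ratio of 1 keeps the whole cache on the host, and a ratio of 0 keeps all of it on the device.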

src/common/error_msg.cc

Lines changed: 11 additions & 3 deletions
@@ -1,12 +1,14 @@
 /**
- * Copyright 2023 by XGBoost contributors
+ * Copyright 2023-2025, XGBoost contributors
  */
 #include "error_msg.h"

-#include <mutex>    // for call_once, once_flag
-#include <sstream>  // for stringstream
+#include <mutex>         // for call_once, once_flag
+#include <sstream>       // for stringstream
+#include <system_error>  // for error_code, system_category

 #include "../collective/communicator-inl.h"  // for GetRank
+#include "xgboost/collective/socket.h"       // for LastError
 #include "xgboost/context.h"                 // for Context
 #include "xgboost/logging.h"

@@ -76,4 +78,10 @@ void CheckOldNccl(std::int32_t major, std::int32_t minor, std::int32_t patch) {
     LOG(WARNING) << msg();
   }
 }
+
+[[nodiscard]] std::error_code SystemError() {
+  std::int32_t errsv = system::LastError();
+  auto err = std::error_code{errsv, std::system_category()};
+  return err;
+}
 }  // namespace xgboost::error
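
The new `SystemError()` returns a `std::error_code` instead of a message string, so callers can either format it with `.message()` or inspect the code itself. The following is a minimal standalone sketch of that pattern, not the XGBoost implementation: `LastSystemError` and `DescribeOpenFailure` are hypothetical names, and the sketch reads `errno` directly (a POSIX-only assumption) in place of `system::LastError()`.

```cpp
#include <cerrno>        // for errno
#include <cstdio>        // for fopen, fclose, FILE
#include <string>        // for string
#include <system_error>  // for error_code, system_category

// Hypothetical stand-in for error::SystemError().
// Assumption: reads errno directly, so this sketch is POSIX-only.
[[nodiscard]] inline std::error_code LastSystemError() {
  return std::error_code{errno, std::system_category()};
}

// Hypothetical helper: try to open `path` for reading.
// Returns an empty string on success, or a formatted message on failure.
[[nodiscard]] inline std::string DescribeOpenFailure(std::string const& path) {
  errno = 0;
  std::FILE* fp = std::fopen(path.c_str(), "rb");
  if (fp != nullptr) {
    std::fclose(fp);
    return {};
  }
  // Capture the error right away, before another call can overwrite errno.
  std::error_code ec = LastSystemError();
  return "Opening " + path + " failed: " + ec.message();
}
```

Because the value is an error code rather than a string, a caller could also compare it against a portable condition such as `std::errc::no_such_file_or_directory` before deciding how to report it.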

src/common/error_msg.h

Lines changed: 6 additions & 3 deletions
@@ -6,9 +6,10 @@
 #ifndef XGBOOST_COMMON_ERROR_MSG_H_
 #define XGBOOST_COMMON_ERROR_MSG_H_

-#include <cstdint>  // for uint64_t
-#include <limits>   // for numeric_limits
-#include <string>   // for string
+#include <cstdint>       // for uint64_t
+#include <limits>        // for numeric_limits
+#include <string>        // for string
+#include <system_error>  // for error_code

 #include "xgboost/base.h"     // for bst_feature_t
 #include "xgboost/context.h"  // for Context
@@ -140,5 +141,7 @@ constexpr StringView CacheHostRatioNotImpl() {
 constexpr StringView CacheHostRatioInvalid() {
   return "`cache_host_ratio` must be in range [0, 1].";
 }
+
+[[nodiscard]] std::error_code SystemError();
 }  // namespace xgboost::error
 #endif  // XGBOOST_COMMON_ERROR_MSG_H_

src/common/io.cc

Lines changed: 20 additions & 24 deletions
@@ -1,6 +1,7 @@
 /**
  * Copyright 2019-2025, by XGBoost Contributors
  */
+#include "error_msg.h"
 #if defined(__unix__) || defined(__APPLE__)

 #include <fcntl.h>  // for open, O_RDONLY, posix_fadvise
@@ -29,12 +30,10 @@
 #include <iterator>      // for distance
 #include <memory>        // for unique_ptr, make_unique
 #include <string>        // for string
-#include <system_error>  // for error_code, system_category
 #include <utility>       // for move
 #include <vector>        // for vector

 #include "io.h"
-#include "xgboost/collective/socket.h"  // for LastError
 #include "xgboost/logging.h"            // for CHECK_LE
 #include "xgboost/string_view.h"        // for StringView

@@ -134,19 +133,13 @@ std::size_t GetMmapAlignment() {
   return getpagesize();
 #endif
 }
-
-auto SystemErrorMsg() {
-  std::int32_t errsv = system::LastError();
-  auto err = std::error_code{errsv, std::system_category()};
-  return err.message();
-}
 }  // anonymous namespace

 std::vector<char> LoadSequentialFile(std::string uri) {
   auto OpenErr = [&uri]() {
     std::string msg;
     msg = "Opening " + uri + " failed: ";
-    msg += SystemErrorMsg();
+    msg += error::SystemError().message();
     LOG(FATAL) << msg;
   };

@@ -193,10 +186,11 @@ MMAPFile* detail::OpenMmap(std::string path, std::size_t offset, std::size_t len
 #if defined(xgboost_IS_WIN)
   HANDLE fd = CreateFile(path.c_str(), GENERIC_READ, FILE_SHARE_READ, nullptr, OPEN_EXISTING,
                          FILE_ATTRIBUTE_NORMAL | FILE_FLAG_OVERLAPPED, nullptr);
-  CHECK_NE(fd, INVALID_HANDLE_VALUE) << "Failed to open:" << path << ". " << SystemErrorMsg();
+  CHECK_NE(fd, INVALID_HANDLE_VALUE)
+      << "Failed to open:" << path << ". " << error::SystemError().message();
 #else
   auto fd = open(path.c_str(), O_RDONLY);
-  CHECK_GE(fd, 0) << "Failed to open:" << path << ". " << SystemErrorMsg();
+  CHECK_GE(fd, 0) << "Failed to open:" << path << ". " << error::SystemError().message();
 #endif

   std::byte* ptr{nullptr};
@@ -207,7 +201,7 @@ MMAPFile* detail::OpenMmap(std::string path, std::size_t offset, std::size_t len
 #if defined(__linux__) || defined(__GLIBC__)
   int prot{PROT_READ};
   ptr = reinterpret_cast<std::byte*>(mmap(nullptr, view_size, prot, MAP_PRIVATE, fd, view_start));
-  CHECK_NE(ptr, MAP_FAILED) << "Failed to map: " << path << ". " << SystemErrorMsg();
+  CHECK_NE(ptr, MAP_FAILED) << "Failed to map: " << path << ". " << error::SystemError().message();
   auto handle = new MMAPFile{fd, ptr, view_size, offset - view_start, std::move(path)};
 #elif defined(xgboost_IS_WIN)
   auto file_size = GetFileSize(fd, nullptr);
@@ -216,16 +210,16 @@ MMAPFile* detail::OpenMmap(std::string path, std::size_t offset, std::size_t len
   access = FILE_MAP_READ;
   std::uint32_t loff = static_cast<std::uint32_t>(view_start);
   std::uint32_t hoff = view_start >> 32;
-  CHECK(map_file) << "Failed to map: " << path << ". " << SystemErrorMsg();
+  CHECK(map_file) << "Failed to map: " << path << ". " << error::SystemError().message();
   ptr = reinterpret_cast<std::byte*>(MapViewOfFile(map_file, access, hoff, loff, view_size));
-  CHECK_NE(ptr, nullptr) << "Failed to map: " << path << ". " << SystemErrorMsg();
+  CHECK_NE(ptr, nullptr) << "Failed to map: " << path << ". " << error::SystemError().message();
   auto handle = new MMAPFile{fd, map_file, ptr, view_size, offset - view_start, std::move(path)};
 #else
   CHECK_LE(offset, std::numeric_limits<off_t>::max())
       << "File size has exceeded the limit on the current system.";
   int prot{PROT_READ};
   ptr = reinterpret_cast<std::byte*>(mmap(nullptr, view_size, prot, MAP_PRIVATE, fd, view_start));
-  CHECK_NE(ptr, MAP_FAILED) << "Failed to map: " << path << ". " << SystemErrorMsg();
+  CHECK_NE(ptr, MAP_FAILED) << "Failed to map: " << path << ". " << error::SystemError().message();
   auto handle = new MMAPFile{fd, ptr, view_size, offset - view_start, std::move(path)};
 #endif  // defined(__linux__) || defined(__GLIBC__)

@@ -238,22 +232,24 @@ void detail::CloseMmap(MMAPFile* handle) {
   }
 #if defined(xgboost_IS_WIN)
   if (handle->base_ptr) {
-    CHECK(UnmapViewOfFile(handle->base_ptr)) << "Failed to call munmap: " << SystemErrorMsg();
+    CHECK(UnmapViewOfFile(handle->base_ptr))
+        << "Failed to call munmap: " << error::SystemError().message();
   }
   if (handle->fd != INVALID_HANDLE_VALUE) {
-    CHECK(CloseHandle(handle->fd)) << "Failed to close handle: " << SystemErrorMsg();
+    CHECK(CloseHandle(handle->fd)) << "Failed to close handle: " << error::SystemError().message();
   }
   if (handle->file_map != INVALID_HANDLE_VALUE) {
-    CHECK(CloseHandle(handle->file_map)) << "Failed to close mapping object: " << SystemErrorMsg();
+    CHECK(CloseHandle(handle->file_map))
+        << "Failed to close mapping object: " << error::SystemError().message();
   }
 #else
   if (handle->base_ptr) {
     CHECK_NE(munmap(handle->base_ptr, handle->base_size), -1)
-        << "Failed to call munmap: `" << handle->path << "`. " << SystemErrorMsg();
+        << "Failed to call munmap: `" << handle->path << "`. " << error::SystemError().message();
   }
   if (handle->fd != 0) {
     CHECK_NE(close(handle->fd), -1)
-        << "Failed to close: `" << handle->path << "`. " << SystemErrorMsg();
+        << "Failed to close: `" << handle->path << "`. " << error::SystemError().message();
   }
 #endif
   delete handle;
@@ -293,7 +289,7 @@ std::shared_ptr<MallocResource> MemBufFileReadStream::ReadFileIntoBuffer(StringV
   std::unique_ptr<FILE, std::function<int(FILE*)>> fp{fopen(path.c_str(), "rb"), fclose};

   auto err = [&] {
-    auto e = SystemErrorMsg();
+    auto e = error::SystemError().message();
     LOG(FATAL) << "Failed to read file `" << path << "`. System error message: " << e;
   };
 #if defined(__linux__)
@@ -302,7 +298,7 @@ std::shared_ptr<MallocResource> MemBufFileReadStream::ReadFileIntoBuffer(StringV
     err();
   }
   if (posix_fadvise(fd, offset, length, POSIX_FADV_SEQUENTIAL) != 0) {
-    LOG(FATAL) << SystemErrorMsg();
+    LOG(FATAL) << error::SystemError().message();
   }
 #endif  // defined(__linux__)

@@ -358,12 +354,12 @@ AlignedMemWriteStream::~AlignedMemWriteStream() = default;
 [[nodiscard]] std::size_t TotalMemory() {
 #if defined(__linux__)
   struct sysinfo info;
-  CHECK_EQ(sysinfo(&info), 0) << SystemErrorMsg();
+  CHECK_EQ(sysinfo(&info), 0) << error::SystemError().message();
   return info.totalram * info.mem_unit;
 #elif defined(xgboost_IS_WIN)
   MEMORYSTATUSEX status;
   status.dwLength = sizeof(status);
-  CHECK(GlobalMemoryStatusEx(&status)) << SystemErrorMsg();
+  CHECK(GlobalMemoryStatusEx(&status)) << error::SystemError().message();
   return static_cast<std::size_t>(status.ullTotalPhys);
 #else
   LOG(FATAL) << "Not implemented";
