[gpu] performance and functionality improvements (#1265)
* [gpu] performance and functionality improvements
* Capturing disk usage statistics to reduce excessive disk space use
* created exit handler to clean up environment on completion or failure
* created prepare function to prepare for the installation
* when sufficient memory is available, configure a ramdisk
* reduce noise by turning off -x in utility functions
* added descriptive comments before the obscurely coded
compare_versions_lte and compare_versions_lt functions
* removed some intermediate driver versions
* added cuda url for 12.6
* execute_with_retries now logs on failure, captures runtime and
cleans before installing on debian
* saving OS installation and NV .run files and their temp files to ramdisk
* piping source .xz file directly to xz instead of saving to disk first
* new utility function "is_debuntu" checks for the frequently used
condition of whether the running OS is either debian or ubuntu
* added support for specifying an http proxy (thank you प्रकाश)
* moving load of kernel module to later in the code and exercising
modprobe of all modules to avoid regression
* fixed problem with attempting to fetch from incorrect vault
directory when rocky kernel package is not found in primary repo
* using correct cran-r signing key for ubuntu18
* corrected file check condition for /etc/apt/trusted.gpg
* do not update all packages on rocky ; move preparation to prepare function
* increasing memory to make use of ramdisk
* using something a little smaller
* create mount_ramdisk function and call it ; fix up the version comparison functions ; create ge and le comparisons for OSs
* iterating better, caching results of system calls ; renamed to repair_old_backports
* comparing correct version numbers
* rocky uses a tmpfs on /tmp in the base image
* tested on rocky and ubuntu
* tested harder on rocky
* cuda 11 no longer available for debian 12
* cuda v11 no longer supported on debian12
* corrected use of ubuntu regex for rocky version
* re-enabling spark job tests
* correct a couple of edge cases
* added instructions for manually running tests
* open a monitor session by default
* cleaning up cuda and cudnn url generation
* condition better
* cleaned up generation of NVIDIA_CUDA_URL
* updated versions and GPU accelerators in the documentation
* ensure this test is skipped based on cuda version rather than dataproc version alone
* fix for /usr/local/cuda-12.4/bin/nvcc: No such file or directory
* correcting path to run-bazel-tests.sh
* runing variable definition
* cleaned up skip conditions
* order of operations
* works with 2.0-rocky8
* remove redundant conditional check
* supported version limits are tightened up a bit ; clean up rocky vault install code
* corrected syntax errors
* failure to run dnf here should not fail the entire installer
* order matters here
* 2.2-ubuntu22 works with cuda 11, other 2.2 do not
* fixes ubuntu22 kernel version mismatch error
* disabling rocky9 builds due to out of date base dataproc image
* cuda 2.0 not supported in debian12
* some 2.0-rocky8 single instance tests fail
* intended to use <= and not >=
* simplify gpu resource script
* setting default discoveryScript ; testing pyspark in its own function
* remove spark: prefix from property names
* comment out quite a few tests
* new version numbers
* fixed a syntax error with documentation
* mustn't forget the commas
* half as many tasks with twice as much cpu and gpu each
* pause before first ssh ; correct variable name
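
For readers without the diff open, the version comparison helpers and the "is_debuntu" check mentioned above could look roughly like the sketch below. This is a hypothetical reconstruction, not the code from this commit: the function bodies, the reliance on GNU `sort -V`, and the optional file argument to `is_debuntu` (added only so the check can be exercised against a test file) are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the functions in the actual script may differ.

# True when $1 <= $2 in version order. `sort -V` (GNU coreutils) ranks
# version strings, so the first argument must be the smaller of the two.
compare_versions_lte() {
  [ "$1" = "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" ]
}

# True when $1 < $2. Equal versions are rejected first, then lte decides.
compare_versions_lt() {
  [ "$1" = "$2" ] && return 1
  compare_versions_lte "$1" "$2"
}

# True when the running OS is Debian or Ubuntu, judged by the ID field of
# os-release. The path parameter is an assumption made for testability.
is_debuntu() {
  local os_release="${1:-/etc/os-release}"
  local os_id
  # Source the file in a subshell so its variables do not leak out.
  os_id="$(. "${os_release}" && printf '%s' "${ID}")"
  [[ "${os_id}" == "debian" || "${os_id}" == "ubuntu" ]]
}
```

A caller might then guard a step with `if is_debuntu && compare_versions_lte "${CUDA_VERSION}" "12.6"; then ...`, where `CUDA_VERSION` is likewise a placeholder name.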
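
The note that execute_with_retries now logs on failure and captures runtime can be illustrated with a minimal sketch. The attempt count, the delay, the `RETRY_ATTEMPTS` and `RETRY_DELAY` variables, the log wording, and the choice to pass the command as separate arguments are assumptions for illustration; the Debian cleanup step mentioned in the same bullet is not modeled here.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the implementation from this commit.

# Run the given command, retrying on failure. Each attempt is timed with
# bash's SECONDS counter and every failure is logged to stderr.
execute_with_retries() {
  local -r max_attempts="${RETRY_ATTEMPTS:-3}"   # assumed default
  local -r delay_seconds="${RETRY_DELAY:-5}"     # assumed default
  local attempt start elapsed
  for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
    start="${SECONDS}"
    if "$@"; then
      elapsed=$(( SECONDS - start ))
      echo "succeeded after ${elapsed}s on attempt ${attempt}: $*" >&2
      return 0
    fi
    elapsed=$(( SECONDS - start ))
    echo "attempt ${attempt}/${max_attempts} failed after ${elapsed}s: $*" >&2
    # Wait before the next attempt, but not after the last one.
    if (( attempt < max_attempts )); then
      sleep "${delay_seconds}"
    fi
  done
  return 1
}
```

Usage would be along the lines of `execute_with_retries apt-get install -y -q some-package`, with the package name standing in for whatever the installer needs.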