Commit

Merge branch 'master' into newrand
simsurace authored Jan 25, 2022
2 parents 336ca46 + 7244289 commit 196d717
Showing 6 changed files with 9 additions and 12 deletions.
2 changes: 1 addition & 1 deletion .buildkite/pipeline.yml
@@ -34,4 +34,4 @@ steps:

env:
JULIA_PKG_SERVER: "" # it often struggles with our large artifacts
-CODECOV_TOKEN: "ea64fa23-14d4-4123-a7ce-b4f4208cd455"
+CODECOV_TOKEN: "17a4c091-2903-476b-8609-c613436a30f8"
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,6 +1,6 @@
*.jl.*.cov
*.jl.cov
*.jl.mem
-/Manifest.toml

test.jl
+Manifest.toml
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "BinomialGPU"
uuid = "c5bbfde1-2136-42cd-9b65-d5719df69ebf"
authors = ["Simone Carlo Surace"]
-version = "0.2.6"
+version = "0.3.0"

[deps]
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
9 changes: 3 additions & 6 deletions README.md
@@ -1,7 +1,7 @@
# BinomialGPU

[![Build status](https://badge.buildkite.com/70a8c11259658ad6f836a4981791ed144bac80e65302291d0d.svg?branch=master)](https://buildkite.com/julialang/binomialgpu-dot-jl)
-[![Coverage](https://codecov.io/gh/simsurace/BinomialGPU.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/simsurace/BinomialGPU.jl)
+[![Coverage](https://codecov.io/gh/JuliaGPU/BinomialGPU.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/JuliaGPU/BinomialGPU.jl)

This package provides a function `rand_binomial!` to produce `CuArrays` with binomially distributed entries, analogous to `CUDA.rand_poisson!` for Poisson-distributed ones.
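A minimal usage sketch of that API (assuming CUDA.jl is installed and a CUDA-capable GPU is available; the array sizes and parameter values here are illustrative only):

```julia
using CUDA, BinomialGPU

# GPU output array; rand_binomial! overwrites it in place.
A = CUDA.zeros(Int, 1024)

# Scalar parameters: every entry drawn from Binomial(128, 0.5).
rand_binomial!(A, count = 128, prob = 0.5f0)

# Per-entry parameters, using the same keyword form as the README:
counts = CUDA.fill(128, 1024)
probs  = CUDA.rand(1024)
rand_binomial!(A, count = counts, prob = probs)
```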

@@ -42,8 +42,5 @@ rand_binomial!(A, count = counts, prob = probs)

## Issues

-* The sampler is fast: it is about one order of magnitude faster than other samplers. But it is still an open question whether it can be made faster, whether there are other samplers with competitive speed, and it shows some non-intuitive behavior:
-* The functionality to draw random numbers within CUDA.jl kernels is still under development. A new function `rand()` has recently become available, but it hasn't been tried within this package. See [issue #7](https://github.com/JuliaGPU/BinomialGPU.jl/issues/7).
-* The speed is faster in Julia 1.5.4 than in the current Julia 1.6 release candidate. See [issue #8](https://github.com/JuliaGPU/BinomialGPU.jl/issues/8).
-* The speed is slower when using optimal thread allocation than when defaulting to 256 threads. See [issue #2](https://github.com/JuliaGPU/BinomialGPU.jl/issues/2)
-* Are there any other samplers that are comparably fast or faster? I compared the following: sample an array of size `(1024, 1024)` with `count = 128` and `prob` of size `(1024, 1024)` with uniformly drawn entries. Timings on an RTX2070 card: BinomialGPU.jl 1.4ms, PyTorch 11ms, CuPy 18ms, tensorflow 400ms. Please let me know if you know samplers that are not yet listed.
+* The speed is slower when using optimal thread allocation than when defaulting to 256 threads. See [issue #2](https://github.com/JuliaGPU/BinomialGPU.jl/issues/2)
+* Are there any other samplers that are comparably fast or faster? I compared the following: sample an array of size `(1024, 1024)` with `count = 128` and `prob` of size `(1024, 1024)` with uniformly drawn entries. Timings on an RTX2070 card: BinomialGPU.jl 0.8ms, PyTorch 11ms, CuPy 18ms, tensorflow 400ms. Timings for other samplers are very welcome; please open an issue if you find one.
1 change: 1 addition & 0 deletions src/BinomialGPU.jl
@@ -5,6 +5,7 @@ using Random

using CUDA: cuda_rng, i32


# user-level API
include("rand_binomial.jl")
export rand_binomial!
5 changes: 2 additions & 3 deletions src/kernels.jl
@@ -18,7 +18,7 @@ function stirling_approx_tail(k)::Float32
    elseif k == 5
        return 0.0138761288230707f0
    elseif k == 6
        return 0.0118967099458917f0
    elseif k == 7
        return 0.0104112652619720f0
    elseif k == 8
@@ -31,6 +31,7 @@ function stirling_approx_tail(k)::Float32
end
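The hard-coded constants in `stirling_approx_tail` are tails of Stirling's series for log(k!); a hedged CPU-side sketch in plain Julia (the helper name `stirling_tail` is hypothetical, not part of the package) that reproduces them:

```julia
# tail(k) = log(k!) - [(k + 1/2)·log(k + 1) - (k + 1) + log(2π)/2],
# i.e. the part of log(k!) that Stirling's approximation leaves out.
stirling_tail(k) = log(factorial(k)) -
    ((k + 0.5) * log(k + 1.0) - (k + 1.0) + 0.5 * log(2π))

stirling_tail(6)  # ≈ 0.0118967, matching the k == 6 branch above
```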



# BTRS algorithm, adapted from the tensorflow library (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/random_binomial_op.cc)

## Kernel for scalar parameters
@@ -268,6 +269,4 @@ function kernel_naive_full!(A, count, prob, randstates)
return
end



## COV_EXCL_STOP
