About Efficiency Sparse Dense Multiplication #550
How interesting, thanks for sharing! I don't believe SuiteSparse:GraphBLAS currently uses MKL or any dense acceleration libraries for dense or bitmap operations, but perhaps someday it could. CC @DrTimothyAldenDavis, who may have more insight.

In Python, you can enable the SuiteSparse:GraphBLAS "burble" diagnostic output with:

```python
import graphblas as gb

gb.ss.burble.enable()
# or as a context manager, such as
# with gb.ss.burble:
#     ...
```

This will print a lot of detail, including dtypes, which internal methods are used, and timings. When trying to optimize performance, one thing to watch out for is "Generic" operations (i.e., if the output contains "Generic"), which can occur for kernels that aren't pre-compiled or JIT-compiled (btw, the SuiteSparse JIT is currently disabled by default until GraphBLAS/python-suitesparse-graphblas#118 is merged).
Thanks, I'm glad you find it useful ❤️!
Of course! We're a little busy at times and may not always reply promptly, but we try :) Possibly related: #552
I don't use the Intel sparse MKL for sparse-times-dense. I tried to, with Intel's help, but the sparse MKL was slower than any of my methods. I've benchmarked the two libraries. The only time I could get a speedup from an MKL library is for the dense-times-dense methods in the dense BLAS (dgemm, sgemm, cgemm, and zgemm), for the PLUS_TIMES semirings on those 4 data types.
The timings you post above are not enough for me to know what you are doing. Are you computing just y = A*x, where A is a sparse matrix and x is a dense vector or matrix? If so, then you are not getting the fastest GraphBLAS kernel, because I must assume that y could be SPARSE, not DENSE. The Intel MKL would assume y must be dense. I cannot do that, per the GraphBLAS spec: I cannot fill in y with zeros, and y can be sparse if A has any empty rows.

If you want to mimic what the MKL does, compute it as follows:

```
y = 0 ;   // a dense vector of all zeros
```

Then I know that y must be dense on output, even if A has one or more empty rows with no entries at all, and I use a different algorithm inside. The burble should say "saxpy4", "saxpy5", "dot4", or "dot5". If it says "saxpy3" or "dot2", then you are getting a slower kernel; those two kernels compute the sparsity of y. Without the ACCUM operator, I must compute the sparsity pattern of y, which is costly. If I then find it to be dense, I drop the pattern and you get a dense vector y, but it took me extra time to do so.

Show me the burble output. I would also need to know whether the matrices are held by row or by column (that would be apparent in the burble output).
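To make the semantic difference concrete, here is a small self-contained Python sketch (plain dictionaries and numpy, not GraphBLAS itself; the function names are just for illustration) contrasting a BLAS/MKL-style SpMV, where the output is always dense, with a GraphBLAS-style SpMV, where an empty row of A produces *no entry* in y rather than an explicit zero:

```python
import numpy as np

def spmv_dense_style(A_rows, x, n):
    """BLAS/MKL-style: output is a dense vector; empty rows become 0."""
    y = np.zeros(n)
    for i, row in A_rows.items():          # row is {col: value}
        y[i] = sum(v * x[j] for j, v in row.items())
    return y

def spmv_sparse_style(A_rows, x):
    """GraphBLAS-style: output keeps a sparsity pattern; an empty row
    of A produces NO entry in y (not an explicit zero)."""
    return {i: sum(v * x[j] for j, v in row.items())
            for i, row in A_rows.items()}

# A 3x3 sparse matrix with row 1 completely empty
A = {0: {0: 2.0}, 2: {1: 3.0, 2: 1.0}}
x = np.array([1.0, 1.0, 1.0])

print(spmv_dense_style(A, x, 3))   # [2. 0. 4.]
print(spmv_sparse_style(A, x))     # {0: 2.0, 2: 4.0} -- no entry for row 1
```

Computing the second form is why the slower "saxpy3"/"dot2" kernels must first work out the sparsity pattern of y; pre-filling y with zeros and accumulating removes that work.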
If the matrix C is missing even just a single entry, then I cannot store it in a dense format. Per the spec, I am not permitted to put in zeros. What is a zero? In a graph algorithm it means "add an edge here with weight zero", but the existence of an edge with weight zero is not the same thing as "there is no edge here". The Intel MKL can make this assumption because it only has one semiring to deal with. I cannot; the sparsity pattern of the result must be preserved.

If, on the other hand, you want to give me a matrix C that has been filled in with explicit zeros so that it is now a dense matrix, then that is fine. I just cannot do that myself inside GraphBLAS; it would break the contract with the user. In some cases I would have to fill in with +infinity, -infinity, etc., and the fill-in value would change the moment you use another semiring (which happens all the time in graph algorithms).
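The user-side fill-in described above can be sketched in a few lines of numpy (dictionaries stand in for the sparse storage; this is an illustration of the idea, not GraphBLAS code). Note how the fill value is the additive identity of the semiring, so it differs between semirings:

```python
import numpy as np

# Stored entries of a 2x2 sparse matrix: (row, col) -> value
C = {(0, 0): 1.0, (1, 1): 3.0}

# Densify for PLUS_TIMES: the additive identity is 0
C_plus = np.zeros((2, 2))
for (i, j), v in C.items():
    C_plus[i, j] = v

# Densify for MIN_PLUS: the additive identity is +infinity
C_min = np.full((2, 2), np.inf)
for (i, j), v in C.items():
    C_min[i, j] = v

print(C_plus)   # [[1. 0.] [0. 3.]]
print(C_min)    # [[ 1. inf] [inf  3.]]
```

The same stored entries yield two different dense matrices, which is exactly why GraphBLAS cannot pick a fill value on its own.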
Hi, thank you so much for a great project. May I ask a small question about sparse and dense matrix operations? I want to compute a multiplication between a sparse and a dense matrix. In detail, the sparsity of the two matrices is given below:
where X is a sparse matrix and C is a dense one. In theory, C should be stored in a dense format, rather than a sparse one, to make the operation efficient. However, using the bitmap/dense matrix formats provided by GraphBLAS is not as efficient as MKL code for the sparse * dense operation.
Could you please suggest a format from GraphBLAS to solve this problem? Thank you so much in advance!
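For reference, the operation being asked about, sparse X times dense C, has the row-wise "saxpy" structure discussed in the replies above: each stored entry of a row of X scales a row of C and accumulates into a dense output row. A minimal pure-numpy sketch of a CSR-times-dense kernel (for illustration only; real libraries block and parallelize this):

```python
import numpy as np

def csr_times_dense(indptr, indices, data, C):
    """Compute X @ C where X is given in CSR form (indptr, indices, data)
    and C is a dense numpy matrix."""
    n = len(indptr) - 1
    out = np.zeros((n, C.shape[1]))
    for i in range(n):
        # Each stored entry X[i, indices[p]] scales row indices[p] of C
        for p in range(indptr[i], indptr[i + 1]):
            out[i] += data[p] * C[indices[p]]
    return out

# X = [[2, 0], [0, 0]] in CSR form; note row 1 is empty
indptr, indices, data = [0, 1, 1], [0], [2.0]
C = np.array([[1.0, 2.0], [3.0, 4.0]])
print(csr_times_dense(indptr, indices, data, C))  # [[2. 4.] [0. 0.]]
```

The empty row of X illustrates the earlier point: a dense-output kernel can simply leave that row as zeros, whereas a spec-conforming GraphBLAS multiply must treat it as having no entries unless the output is known to be dense.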