
ROCm Port (#1087) · ggerganov/llama.cpp@6bbc598

Aug 26, 2023 - github.com
The commit details a series of updates and fixes made to the repository. The changes include using hipBLAS based on the existing cuBLAS code path, updating the Makefile for the CUDA kernels, expanding the architecture list and making it overrideable, and fixing multi-GPU support across multiple AMD architectures with rocblas_initialize(). Other updates include adding hipBLAS to the README, introducing a new build argument, fixing half2 decomposition, adding intrinsics polyfills for AMD, and optimizing AMD assembly.
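
Building hipBLAS support on top of the existing cuBLAS path is usually done by aliasing the CUDA/cuBLAS symbols to their HIP/hipBLAS counterparts at the preprocessor level, so the same host and kernel code compiles for both backends; the intrinsics polyfills cover warp primitives that differ between NVIDIA and AMD. The sketch below illustrates that general technique only; the GGML_USE_HIPBLAS guard, include paths, and the selection of mapped symbols are assumptions, not a verbatim copy of the commit.

```cpp
// Sketch: compile the existing CUDA/cuBLAS code path against ROCm by
// aliasing CUDA names to their HIP/hipBLAS equivalents.
// Guard macro, include paths, and symbol selection are illustrative.
#if defined(GGML_USE_HIPBLAS)
#include <hip/hip_runtime.h>
#include <hipblas/hipblas.h>

#define cudaError_t            hipError_t
#define cudaSuccess            hipSuccess
#define cudaGetErrorString     hipGetErrorString
#define cudaMalloc             hipMalloc
#define cudaFree               hipFree
#define cudaStream_t           hipStream_t
#define cudaStreamSynchronize  hipStreamSynchronize

#define cublasHandle_t         hipblasHandle_t
#define cublasCreate           hipblasCreate
#define cublasSetStream        hipblasSetStream
#define cublasSgemm            hipblasSgemm
#else
#include <cuda_runtime.h>
#include <cublas_v2.h>
#endif

// Polyfill for a warp intrinsic that exists on NVIDIA but not in the same
// form on AMD: HIP's __shfl_xor has no sync-mask parameter, so the mask is
// accepted and ignored here.
#if defined(GGML_USE_HIPBLAS)
static __device__ __forceinline__ float __shfl_xor_sync(unsigned mask, float var, int lane_mask, int width) {
    (void) mask;
    return __shfl_xor(var, lane_mask, width);
}
#endif
```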

The commit also allows overriding CC_TURING, uses "ROCm" instead of "CUDA" in naming, ignores all build directories, and adds Dockerfiles. Fixes were made to llama-bench and to the -nommq help text for non-CUDA/HIP builds. The changes were co-authored by seven contributors.

Key takeaways:

  • The repository has been updated to use hipBLAS based on cuBLAS.
  • The Makefile has been updated for the CUDA kernels, and the architecture list has been expanded and made overrideable.
  • Issues with multi-GPU support across multiple AMD architectures have been fixed with rocblas_initialize() (see the sketch after this list).
  • Several co-authors contributed to these updates, including YellowRoseCx, ardfork, funnbot, Engininja2, Kerfuffle, jammm, and jdecourval.
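
The multi-GPU fix hinges on rocBLAS otherwise initializing its GPU kernels lazily; calling rocblas_initialize() once up front forces that initialization before work is dispatched across devices. Below is a hedged sketch of how such a call might be wired in; the guard macro, the backend_init name, and the placement are assumptions and not taken from the commit.

```cpp
// Sketch: eagerly initialize rocBLAS before any work is scheduled, so a mix
// of AMD GPU architectures starts up cleanly.
// Guard macro, function name, and placement are assumptions.
#if defined(GGML_USE_HIPBLAS)
#include <hip/hip_runtime.h>
#include <rocblas/rocblas.h>   // include path varies between ROCm versions

static void backend_init(void) {
    int device_count = 0;
    if (hipGetDeviceCount(&device_count) != hipSuccess || device_count == 0) {
        return; // no usable AMD GPUs
    }
    // Without this eager call, lazy per-device initialization can misbehave
    // when several GPU architectures are present in one system.
    rocblas_initialize();
}
#endif
```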
