Add support for Qwen2VL by HimariO · Pull Request #10361 · ggerganov/llama.cpp

Dec 15, 2024 - github.com
The results posted in the pull request come from a series of ROPE (rotary position embedding) tests run with the Metal backend on an Apple M4 Max with 27,648 MB of memory. The tests cover a range of ROPE configurations, varying the data type (f32 and f16), tensor dimensions, mode, and other parameters. Most configurations pass and are marked "OK", but several fail with NMSE (normalized mean squared error) values above the 0.000000100 threshold. These failures span different modes and configurations, pointing to potential issues in how the Metal backend handles specific ROPE operations.
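
Since the pass/fail decision hinges on NMSE staying below 0.000000100 (1e-7), here is a minimal sketch of such a check, assuming NMSE is defined as the squared error between the backend output and the CPU reference, normalized by the reference's squared magnitude; the helper below is illustrative, not the exact code in test-backend-ops:

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Normalized mean squared error: sum of squared differences divided by the
// sum of squared reference values (assumed definition).
static double nmse(const float * ref, const float * out, size_t n) {
    double err = 0.0;  // accumulated squared difference
    double den = 0.0;  // accumulated squared reference magnitude
    for (size_t i = 0; i < n; i++) {
        const double d = (double) ref[i] - (double) out[i];
        err += d * d;
        den += (double) ref[i] * (double) ref[i];
    }
    return den > 0.0 ? err / den : 0.0;
}

int main() {
    // Toy data standing in for CPU (reference) vs. Metal (backend) outputs.
    std::vector<float> ref = {0.10f, 0.20f, 0.30f, 0.40f};
    std::vector<float> out = {0.10f, 0.20f, 0.30f, 0.40001f};

    const double e = nmse(ref.data(), out.data(), ref.size());
    std::printf("NMSE = %.9f -> %s\n", e, e <= 1e-7 ? "OK" : "FAIL");
    return 0;
}
```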

The failures appear in both f32 and f16 runs and are concentrated in modes 8 and 24, suggesting that these modes need further investigation or optimization. Many other configurations, including those with adjusted parameters such as af=1.424500, pass, indicating that certain parameter combinations avoid the problem. Overall, the Metal backend performs well in most scenarios, but these specific configurations still need to be addressed to ensure robust behavior across all ROPE operations.
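
Modes 8 and 24 likely correspond to the multi-section (M-RoPE) and vision RoPE variants introduced for Qwen2VL (GGML_ROPE_TYPE_MROPE and GGML_ROPE_TYPE_VISION in ggml.h, assuming those constants carry the values 8 and 24). A hypothetical sketch of the kind of parameter grid seen in the results table might look like this; the struct and field names are illustrative, not the actual test-backend-ops types:

```cpp
#include <cstdio>
#include <vector>

// Hypothetical container for one ROPE test configuration; the fields mirror
// the columns of the results table (type, n_dims, mode, af = attn_factor).
struct RopeTestCase {
    const char * type;   // "f32" or "f16"
    int          n_dims; // number of rotated dimensions
    int          mode;   // 0 = normal, 2 = NeoX, 8 = M-RoPE, 24 = vision (assumed values)
    float        af;     // attention scaling factor, e.g. the af=1.424500 cases
};

int main() {
    std::vector<RopeTestCase> cases;
    for (const char * type : {"f32", "f16"}) {
        for (int mode : {0, 2, 8, 24}) {
            for (float af : {1.0f, 1.4245f}) {
                cases.push_back({type, 128, mode, af});
            }
        }
    }
    for (const auto & tc : cases) {
        std::printf("ROPE(type=%s, n_dims=%d, mode=%d, af=%f)\n",
                    tc.type, tc.n_dims, tc.mode, tc.af);
    }
    return 0;
}
```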

Key takeaways:

  • Some ROPE tests fail on the Metal backend in specific configurations.
  • Failing cases are flagged when NMSE exceeds the 0.000000100 threshold.
  • Configurations with different dimensions and modes show a mix of passes and failures; modes 8 and 24 are the most affected.
  • Both f32 and f16 types are tested, and failures occur in both under similar conditions.