
(R) Exploring the limitations of Kolmogorov-Arnold networks in classification: insights from software training and hardware implementation

In summary: MLP vastly outperforms KAN in terms of training computational efficiency

Paper: https://arxiv.org/pdf/2407.17790

Abstract:

Kolmogorov-Arnold Networks (KANs), a new type of neural network, have recently gained popularity and attention due to their ability to replace multi-layer perceptrons (MLPs) in artificial intelligence (AI) with higher accuracy and interpretability. However, KAN evaluation is still limited and cannot provide in-depth analysis of a specific domain. Moreover, there has been no research on the implementation of KANs in hardware design, which would directly demonstrate whether KANs are really superior to MLPs in practical applications. As a result, in this paper, we focus on verifying KANs for classification problems, which is a common but important topic in AI, using four different types of datasets. Moreover, the corresponding hardware implementation is considered using the Vitis high-level synthesis (HLS) tool. To the best of our knowledge, this is the first paper to implement hardware for KAN. The results indicate that KANs cannot be more accurate than MLPs in very complex datasets while consuming significantly higher hardware resources. Therefore, MLP remains an effective approach for achieving accuracy and efficiency in software and hardware implementation.
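For intuition on why KAN layers are heavier than MLP layers, here's a back-of-the-envelope parameter count. This is a sketch based on the parameterization in the original KAN paper (roughly grid + spline_order B-spline coefficients per edge; the grid=5, k=3 defaults are an assumption, not this paper's exact models), while an MLP layer stores a single weight per edge:

```python
def mlp_layer_params(n_in: int, n_out: int) -> int:
    # one weight per edge plus a bias per output unit
    return n_in * n_out + n_out

def kan_layer_params(n_in: int, n_out: int, grid: int = 5, k: int = 3) -> int:
    # each edge carries a learnable 1-D spline with ~(grid + k) coefficients
    return n_in * n_out * (grid + k)

# example: a 13 -> 64 layer (13 features, as in the Wine dataset)
print(mlp_layer_params(13, 64))  # 896
print(kan_layer_params(13, 64))  # 6656
```

Roughly an order of magnitude more learnable parameters per layer, which is consistent with the longer training times and higher resource usage reported below.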

Highlights:

Except for the training time of the Dry Bean dataset, all three other datasets consistently show that KANs require significantly longer training times than MLPs, ranging from 6.55 times (151.4 vs. 23.1 s) to 36.68 times (198.1 vs. 5.4 s). (…) Except for the Wine dataset, MLPs consistently have faster loss reduction and lower loss values than KANs. Overall, KANs are not better than MLPs in terms of training time and loss reduction.

In general, KANs fail to demonstrate higher accuracy than MLPs, and the symbolic formula representation of KANs even performs worse than MLPs in classification challenges. Moreover, KANs also require a significant amount of time and effort from developers to create symbolic formulas in the final stages.

(Specialized hardware testing:)

These results indicate that implementing symbolic formulas on hardware requires much more hardware resources compared to normal matrix multiplication in MLPs. Furthermore, as the size of KAN models increases, there is a corresponding increase in required hardware resources.
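As an illustration of the gap the authors describe: an MLP forward pass reduces to one dense matrix-vector product, which maps directly onto a MAC array in hardware, while a KAN-style layer must evaluate a distinct nonlinear function on every edge before summing. A minimal NumPy sketch (the per-edge toy cubic polynomials are hypothetical stand-ins for splines, not the paper's HLS code):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)             # one sample with 4 input features
W = rng.normal(size=(3, 4))        # MLP weights: 3 outputs x 4 inputs

# MLP layer: the whole transform is a single dense matrix-vector product
mlp_out = np.maximum(W @ x, 0.0)   # ReLU after one GEMV

# KAN-style layer: a separate learned 1-D function on every edge,
# evaluated individually before summation (12 function evaluations here)
coeffs = rng.normal(size=(3, 4, 4))  # 4 polynomial coefficients per edge
kan_out = np.array([
    sum(np.polyval(coeffs[i, j], x[j]) for j in range(4))
    for i in range(3)
])

print(mlp_out.shape, kan_out.shape)  # both (3,)
```

Each of those per-edge function evaluators must be instantiated (or time-multiplexed) in hardware, which is why resource usage grows with model size much faster than for a plain weight matrix.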

Visual highlights:

GPU training

The difference in loss in favor of MLP can be huge. However, the fast KAN convergence on the Wine dataset is remarkable; the dataset has only 178 samples with 13 features each.

GPU training

FPGA implementation

Code: https://github.com/Zeusss9/KAN_Analysis

submitted by /u/StartledWatermelon