Customer Segmentation Strategy Based on Hybrid Clustering Algorithm
WANG Hong1,2, SUN Hong1,2
(1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. Shanghai Key Lab of Modern Optical System, University of Shanghai for Science and Technology, Shanghai 200093, China)
Abstract: To address the shortcomings of hierarchical clustering and K-means clustering, an improved algorithm that combines the two is proposed. It resolves both the poor scalability of hierarchical clustering and the sensitivity of K-means to the initial cluster centers. The improved algorithm is evaluated through a computational complexity analysis and through tests on data from the UCI database. The results show that the hybrid clustering algorithm raises sample clustering accuracy to 94%, with higher execution efficiency and better practicality. The algorithm is also applied to customer segmentation management at an automobile sales company, where it yields clearly differentiated customer segments, indicating that the improved algorithm has a stronger ability to segment customers and to explain customer behavior characteristics.
Keywords: hierarchical clustering; K-means algorithm; hybrid clustering algorithm; customer segmentation; auto sales
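Partitioning clustering and hierarchical clustering are two common clustering techniques. Partitioning methods execute efficiently, but the random selection of initial centers lowers their clustering accuracy. Hierarchical clustering fits the characteristics of the data comparatively well [1], but once a split or merge has been performed it cannot be revised, which degrades clustering quality, and its time complexity is relatively high [2]. Given the respective strengths and weaknesses of partitioning and hierarchical clustering in processing data, this paper fully combines the characteristics of the two algorithms and proposes to……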
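The trade-off described here (K-means depends on its starting centers, while hierarchical clustering is costly on large data) suggests one common way to combine the two: run an agglomerative pass on a small sample to obtain k starting centers, then refine them with K-means over the full data set. The following is a minimal sketch under that assumption. It is not necessarily the procedure this paper proposes, since the description is cut off at this point. The function names, the centroid-distance merge rule, and the `sample_size` and `seed` parameters are all illustrative choices.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def agglomerative_centers(sample, k):
    """Merge the two clusters with the closest centroids until k remain.

    Assumed merge rule (centroid distance); cost grows quickly with the
    sample size, which is why only a small sample is passed in.
    """
    clusters = [[p] for p in sample]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return [centroid(c) for c in clusters]


def kmeans(points, centers, max_iter=100):
    """Lloyd's iterations from the given centers; stops when they stop moving."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every point.
        labels = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        # Update step: recompute each center; keep the old one if it is empty.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def hybrid_cluster(points, k, sample_size=50, seed=0):
    """Seed K-means with centers found by agglomerative merging on a sample."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    seeds = agglomerative_centers(sample, k)
    return kmeans(points, seeds)
```

Calling `hybrid_cluster(points, k)` on a list of coordinate tuples returns one label per point and the final centers. Because the starting centers come from the merge step rather than from random picks, repeated runs with the same `seed` start K-means from the same place.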