Two papers accepted in IEEE ICDE 2023


1
Title: A Competition-Aware Approach to Accurate TV Show Recommendation
Authors: Hong-Kyun Bae, Yeon-Chang Lee, Kyungsik Han, and Sang-Wook Kim
Abstract
As the number of TV shows increases, designing recommendation systems that provide users with TV shows they will favor becomes increasingly important. In the TV show domain, watching a TV show (i.e., giving implicit feedback to the show) among the TV shows broadcast in the same time frame implies that the currently watched show is the winner in the competition with the others (i.e., the losers). However, previous studies have not considered this notion of limited competitions when estimating a user's preferences for TV shows. In this paper, we propose a new recommendation framework that takes this notion into account based on pair-wise models. Our framework is composed of the following ideas: (i) identify winners and losers by determining pairs of competing TV shows; (ii) learn the pairs of competing TV shows based on the confidence for the pair-wise preference between the winner and the loser; (iii) recommend the most favorable TV shows by considering time factors with respect to users and TV shows. Using a real-world TV show dataset, our experimental results show that the proposed framework consistently improves recommendation accuracy by up to 38%, compared with the best state-of-the-art method. The code and datasets of our framework will be available at an external link (https://github.com/hongkyunbae/tvshow_rs) upon acceptance.
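The winner/loser idea in the abstract can be illustrated with a BPR-style pair-wise update: the watched show in a time slot is pushed above each concurrently broadcast show, weighted by a confidence for that competing pair. This is a minimal sketch of that general technique, not the authors' implementation; all function names, the latent-factor model, and the confidence weighting are illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pairwise_update(user_vec, winner_vec, loser_vec,
                    confidence=1.0, lr=0.05, reg=0.01):
    """One SGD step pushing score(user, winner) above score(user, loser).

    Scores are dot products of latent factors; `confidence` scales the
    gradient for this competing pair (an assumed stand-in for the paper's
    pair-wise confidence). Returns the score difference before the update.
    """
    score_diff = sum(u * (w - l)
                     for u, w, l in zip(user_vec, winner_vec, loser_vec))
    # Gradient of -confidence * log sigmoid(score_diff)
    grad = confidence * sigmoid(-score_diff)
    for k in range(len(user_vec)):
        u, w, l = user_vec[k], winner_vec[k], loser_vec[k]
        user_vec[k]   += lr * (grad * (w - l) - reg * u)
        winner_vec[k] += lr * (grad * u - reg * w)
        loser_vec[k]  += lr * (-grad * u - reg * l)
    return score_diff
```

Repeated updates over observed (winner, loser) pairs drive the watched show's predicted score above its same-time-slot competitors.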
2
Title: Orchestrating Large-Scale SpGEMMs using Dynamic Block Distribution and Data Transfer Minimization on Heterogeneous Systems
Authors: Taehyeong Park, Seokwon Kang, Myung-Hwan Jang, Sang-Wook Kim, and Yongjun Park
Abstract
Sparse matrix-matrix multiplication (SpGEMM) is one of the most important kernels in many emerging applications such as databases, deep learning, graph analysis, and recommendation systems. Since SpGEMM requires a huge amount of computation, many SpGEMM techniques have been implemented on Graphics Processing Units (GPUs) to fully exploit data-parallelism. However, traditional SpGEMM techniques often do not fully utilize the GPU because most non-zero elements of target sparse matrices are concentrated in a few hub nodes, while non-hub nodes barely have any non-zero elements. This data-specific characteristic (the power law) incurs significant performance degradation due to load imbalance between GPU cores and low utilization of each core. Many recent implementations have tried to solve this challenge with smart pre-/post-processing, but because of their large overheads, the net performance hardly improves and sometimes even worsens. Moreover, non-hub nodes are inherently unsuitable for GPU computation even after such optimizations. More importantly, performance is no longer dominated by kernel execution but by data transfers, such as device-to-host transfers and file I/Os, due to the rapid growth of GPU computing power and input data size. To solve these challenges, this paper proposes Dynamic Block Distributor, a novel full-system-level SpGEMM orchestration framework for heterogeneous systems, which improves overall performance by enabling efficient CPU-GPU collaboration and further minimizing data transfer overhead between all system elements. It first divides the whole matrix into smaller units and then offloads the computation of each unit to the appropriate computing unit, either a GPU or a CPU, based on its workload type and the run-time resource utilization status. It also minimizes data transfer overhead with simple but well-suited techniques: Row Collecting, I/O Overlapping, and I/O Binding.
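The core routing idea, dividing the matrix into units and sending each to the better-suited device, can be sketched in a few lines: blocks of rows dominated by hub nodes (many non-zeros, regular parallel work) go to the GPU queue, while sparse non-hub blocks stay on the CPU. This is an illustrative sketch only; the block size, the nnz-based workload estimate, and the threshold are assumptions, not the paper's actual policy, which also factors in run-time resource utilization.

```python
def distribute_blocks(row_nnz, block_rows=4, gpu_threshold=8):
    """Partition rows into fixed-size blocks and assign each to a device.

    row_nnz: number of non-zero elements in each row of the sparse matrix.
    Returns a list of (block_start_row, device, workload) tuples, where
    workload is the block's total nnz (a simple proxy for its cost).
    """
    assignments = []
    for start in range(0, len(row_nnz), block_rows):
        block = row_nnz[start:start + block_rows]
        workload = sum(block)
        # Dense "hub" blocks offer regular parallelism -> GPU;
        # nearly-empty "non-hub" blocks are cheaper on the CPU.
        device = "gpu" if workload >= gpu_threshold else "cpu"
        assignments.append((start, device, workload))
    return assignments
```

In a power-law matrix, a handful of hub-heavy blocks end up on the GPU while the long tail of near-empty blocks runs on the CPU, avoiding the load imbalance the abstract describes.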
Our experiments show that it speeds up SpGEMM execution latency, including both kernel execution and device-to-host transfers, by 3.17x on average, and the overall execution time by 1.84x on average, compared to the state-of-the-art cuSPARSE library.

Update: