Spatio-Temporal data analytics:

Spatio-temporal data, generated either by GPS-equipped devices (such as iPhones) or by manual user input, contain a large amount of noise, which significantly lowers the quality of data analysis in many applications. To address this problem, we propose a reinforcement learning algorithm that robustly reduces the influence of the noise. The algorithm can also automatically determine the number of clusters from the data distribution. In addition, spatio-temporal data (such as social media data) often arrive as a stream, so it is natural to design an online data processing model for detecting events hidden in the streams. To this end, we propose an online algorithm for hierarchical categorization and detection of events in the streams. The results are published in SIGIR, ACM TIST, CIKM, ICDE, IJCAI, SIGMOD, and PVLDB.

Social media data mining and analytics

Social Network Analysis:

In this area, we study node influence and sampling techniques on social networks, and we apply the resulting algorithms to recommendation. Our theoretical study shows that the convergence speed of node sampling is related to the topology of the social network, which motivates us to propose a dynamic tuning algorithm for the sampling. The algorithm significantly speeds up the convergence of social network sampling. The related results are published in ICDE, ACM TODS, TKDE, KIS, and WWWJ.
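To make the idea of node sampling concrete, here is a minimal sketch of a Metropolis-Hastings random walk, a standard technique for drawing approximately uniform node samples from a graph that can only be explored neighbor by neighbor. It is not the dynamic tuning algorithm described above; the toy graph, the function name `mh_random_walk`, and all parameters are assumptions made for illustration.

```python
import random
from collections import Counter

def mh_random_walk(adj, start, steps, rng):
    """Metropolis-Hastings random walk on a connected undirected graph.

    The acceptance rule below makes the stationary distribution uniform
    over nodes, regardless of node degree.
    """
    current = start
    visits = []
    for _ in range(steps):
        candidate = rng.choice(adj[current])
        # Accept the move with probability min(1, deg(current) / deg(candidate)).
        if rng.random() < len(adj[current]) / len(adj[candidate]):
            current = candidate
        visits.append(current)
    return visits

# Hypothetical toy graph: hub node 0 with four neighbors, plus an edge 1-2.
adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0]}

rng = random.Random(0)
visits = mh_random_walk(adj, start=0, steps=20000, rng=rng)
freq = Counter(visits)
# Fraction of steps spent at each node; each should be close to 1/5.
shares = {node: freq[node] / len(visits) for node in adj}
```

How many steps the walk needs before the visit shares settle depends on the shape of the graph, which is the kind of topology dependence the paragraph above refers to.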

Web data crawling:

Spatio-temporal data are stored and managed in various hidden web databases, which provide only restricted access to the data. However, mining tasks often require a large amount of data from these hidden databases. We address the problem with two techniques: sampling techniques for interface extension and crawling techniques for kNN-based hidden databases. The related results are published in TKDE.
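As an illustration of what crawling a kNN-based hidden database involves, the sketch below simulates a database that answers only top-k nearest-neighbor queries and retrieves all of its records by recursively subdividing a bounding rectangle. This is a simple quadtree-style strategy chosen for illustration, not the published technique; `make_knn_interface`, `crawl`, and the random toy data are hypothetical.

```python
import math
import random

def make_knn_interface(records, k):
    """Simulate a hidden database that only answers top-k nearest-neighbor queries."""
    def knn(x, y):
        ranked = sorted(records, key=lambda p: math.dist(p, (x, y)))
        return ranked[:k]
    return knn

def crawl(knn, k, x0, y0, x1, y1, found=None, depth=0, max_depth=20):
    """Collect records in the rectangle [x0, x1] x [y0, y1] using kNN queries only."""
    if found is None:
        found = set()
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    results = knn(cx, cy)
    found.update(results)
    if len(results) < k:
        return found  # the whole database holds fewer than k records
    radius = max(math.dist(p, (cx, cy)) for p in results)
    half_diag = math.dist((x0, y0), (x1, y1)) / 2
    # If the cell fits strictly inside the query's result radius, every record
    # in the cell was returned. The depth limit is only a safety guard.
    if half_diag < radius or depth >= max_depth:
        return found
    quadrants = [(x0, y0, cx, cy), (cx, y0, x1, cy),
                 (x0, cy, cx, y1), (cx, cy, x1, y1)]
    for ax, ay, bx, by in quadrants:
        crawl(knn, k, ax, ay, bx, by, found, depth + 1, max_depth)
    return found

# Hypothetical toy data: 50 random points in the unit square, interface limit k = 5.
rng = random.Random(1)
records = [(rng.random(), rng.random()) for _ in range(50)]
knn = make_knn_interface(records, k=5)
found = crawl(knn, 5, 0.0, 0.0, 1.0, 1.0)
```

The sketch issues more queries in dense regions, where the k-th neighbor is close and cells must shrink further before they are fully covered. A practical crawler would also need to account for query budgets and rate limits, which are ignored here.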

Application scenarios of social media data mining

Relevant publication

