Spatio-temporal data analysis:

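The spatio-temporal attributes of data are either produced by GPS-enabled smart devices (such as smartphones) or entered into a system directly by users, and they contain a large amount of noise. This noise seriously degrades the quality of data analysis results. To address the problem, our work lets the clustering results obtained from different attributes mutually reinforce one another so as to cancel out the effect of noise, which improves the clustering quality of spatio-temporal data. In addition, based on the distribution of the data (spatio-temporal data often follow a Gaussian distribution in space), we proposed an algorithm that determines the number of clusters automatically. Social media data carry textual, temporal, and spatial attributes and are produced as a data stream. Based on these characteristics, we proposed a real-time event detection algorithm that detects events as they occur in time and space. A time series generally refers to a kind of big data generated densely along the time dimension (for example, stock information). Applications often need to find the set of time series that share the longest similar segment with a given time series, and conventional spatial indexing techniques cannot support such queries efficiently. We proposed a diamond-shaped index structure that efficiently supports the querying and analysis of this kind of variable-length segment matching. The main research results have been published in top international conferences and journals, including SIGIR, ACM TIST, CIKM, ICDE, IJCAI, SIGMOD, and PVLDB.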
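The variable-length segment matching query can be illustrated with a brute-force baseline. The sketch below is not the diamond-shaped index. It assumes, purely for illustration, that two segments are "similar" when they have the same length and every pair of aligned values differs by at most a tolerance `eps`. The names `longest_similar_segment` and `best_matches` are hypothetical.

```python
def longest_similar_segment(a, b, eps):
    """Return (length, start_a, start_b) of the longest aligned similar segment.

    Assumed similarity (not taken from the original work): segments
    a[i:i+n] and b[j:j+n] are similar when |a[i+t] - b[j+t]| <= eps for all t.
    """
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)  # run lengths computed for the previous element of a
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                # Extend the similar run that ended one step earlier in both series.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], j - cur[j])
        prev = cur
    return best


def best_matches(query, collection, eps):
    """Names of the series sharing the longest similar segment with `query`."""
    lengths = {name: longest_similar_segment(query, series, eps)[0]
               for name, series in collection.items()}
    longest = max(lengths.values(), default=0)
    return sorted(name for name, n in lengths.items() if n == longest and n > 0)


if __name__ == "__main__":
    query = [1, 2, 3, 4, 10]
    collection = {"s1": [9, 2.1, 3.1, 3.9, 0], "s2": [5, 5, 2.0, 8, 8]}
    print(best_matches(query, collection, eps=0.2))
```

Each comparison costs time proportional to the product of the two series lengths, and every series in the collection is scanned. That exhaustive scan is the cost a dedicated index is meant to avoid.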

Social media data mining and analysis system

Social network analysis:

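Research in this area mainly covers sampling techniques for social networks, the computation of node influence, and applications of social networks to recommendation. Through theoretical analysis, we found that the convergence speed of node sampling algorithms on social networks is closely related to the topology of the network. We proposed using the local structure around a node to adjust the walk probabilities of the algorithm dynamically, which speeds up the convergence of sampling. The local structure of a node is also used to analyze how the corresponding user chooses social friends, and from that to judge how strongly neighboring nodes influence the user's behavior. With this technique, we improved the performance of composite recommendation algorithms and of node influence algorithms. Related papers have been published in ICDE, ACM TODS, TKDE, KIS, and WWWJ.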
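To make the role of walk probabilities concrete, here is a minimal sketch of a standard Metropolis-Hastings random walk for node sampling. This textbook technique is swapped in for illustration only; it is not the dynamic adjustment scheme proposed in the work above. The sketch assumes an undirected, connected graph without self-loops, given as a dict from node to neighbor list. The function names are hypothetical.

```python
import random


def mh_transition_probs(graph, u):
    """Transition probabilities of the Metropolis-Hastings walk out of node u.

    A neighbor v is proposed with probability 1/deg(u) and accepted with
    probability min(1, deg(u)/deg(v)), so P(u, v) = min(1/deg(u), 1/deg(v)).
    Because P is symmetric, the uniform distribution over nodes is stationary.
    """
    deg_u = len(graph[u])
    probs = {v: min(1.0 / deg_u, 1.0 / len(graph[v])) for v in graph[u]}
    probs[u] = max(0.0, 1.0 - sum(probs.values()))  # probability of staying put
    return probs


def mh_random_walk(graph, start, steps, rng):
    """Simulate the walk and return the node occupied after each step."""
    node = start
    visited = []
    for _ in range(steps):
        candidate = rng.choice(graph[node])
        accept = min(1.0, len(graph[node]) / len(graph[candidate]))
        if rng.random() < accept:
            node = candidate
        visited.append(node)
    return visited


if __name__ == "__main__":
    # Toy graph (an assumption for the demo): a triangle a-b-c plus a leaf d.
    graph = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": ["a", "b"], "d": ["a"]}
    print(mh_transition_probs(graph, "a"))
    print(mh_random_walk(graph, "a", 10, random.Random(0)))
```

How quickly such a walk approaches its stationary distribution depends on the graph's topology, which is why changing the walk probabilities can change the cost of sampling.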

Data acquisition:

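Spatio-temporal data resources are stored in a wide variety of database systems. The interfaces of these systems are usually designed to be simple and are not suited to bulk retrieval. Research here covers two aspects: techniques for extending query interfaces, and techniques for acquiring spatial data resources through kNN interfaces. Interface extension uses data sampling algorithms to extend the query capability of the original database interface. The kNN-interface acquisition technique provides a fast data retrieval algorithm for multidimensional spatial databases, together with a proof that the algorithm's performance depends only on the size of the database and not on the spatial distribution of the data. Related work has been published in TKDE.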
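The idea of retrieving a whole database through a kNN interface can be sketched in one dimension. The code below is a simplified illustration, not the published algorithm, and it makes no claim about query cost. It assumes that all points are distinct numbers inside a known range `[lo, hi]` and that `k >= 2`. The helper `make_knn_interface` is a hypothetical stand-in for a real restrictive interface.

```python
def make_knn_interface(points, k):
    """Simulate a restrictive interface: each query returns only the k nearest points."""
    def knn(q):
        return sorted(points, key=lambda p: abs(p - q))[:k]
    return knn


def crawl_1d(knn, lo, hi, k):
    """Collect every point in [lo, hi] using only kNN queries.

    Assumptions (for illustration): points are distinct numbers inside
    [lo, hi] and k >= 2. Returns (set of found points, number of queries).
    """
    found = set()
    queries = 0
    pending = [(lo, hi)]  # intervals that may still hide unseen points
    while pending:
        a, b = pending.pop()
        if a > b:
            continue
        q = (a + b) / 2
        result = knn(q)
        queries += 1
        found.update(result)
        if len(result) < k:
            break  # fewer than k answers: the interface returned the whole database
        r = max(abs(p - q) for p in result)
        # Every point strictly closer to q than r was returned, so only the
        # parts of [a, b] outside the open interval (q - r, q + r) remain.
        pending.append((a, q - r))
        pending.append((q + r, b))
    return found, queries


if __name__ == "__main__":
    points = [0.5, 1.0, 2.5, 3.3, 7.0, 7.1, 9.9]
    knn = make_knn_interface(points, k=2)
    found, queries = crawl_1d(knn, 0.0, 10.0, k=2)
    print(sorted(found), queries)
```

Each query certifies that an open interval around the query point has been fully retrieved, so the crawler only needs to revisit the uncovered remainder. Extending the same covering argument to several dimensions is where the real algorithmic difficulty lies.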

Application scenarios of social media data mining

Related publications

[1] Dichao Li, Zhiguo Gong, and Defu Zhang. A Common Topic Transfer Learning Model for Crossing City POI Recommendations. IEEE Transactions on Cybernetics. Accepted (2018).

[2] Na Ta, Guoliang Li, Tianyu Zhao, Jianhua Feng, Hanchao Ma, Zhiguo Gong: An Efficient Ride-Sharing Framework for Maximizing Shared Route. IEEE Trans. Knowl. Data Eng. 30(2): 219-233 (2018)

[3] Yuhong Li, Jie Bao, Yanhua Li, Yingcai Wu, Zhiguo Gong, and Yu Zheng. Mining the Most Influential k-Location Set From Massive Trajectories. IEEE Transactions on Big Data (2018).

[4] Juan Lu, Zhiguo Gong, Xuemin Lin. A Novel and Fast SimRank Algorithm. IEEE Transactions on Knowledge and Data Engineering (2017)

[5] Zhenguo Yang, Qing Li, Zheng Lu, Yun Ma, Zhiguo Gong, Wenyin Liu. Dual Structure Constrained Multimodal Feature Coding for Social Event Detection in Flickr Data. ACM Transactions on Internet Technology (2017)

[6] Hui Yan, Zhiguo Gong, Nan Zhang, Tao Huang, Hua Zhong, Jun Wei. Crawling Hidden Objects with kNN Queries. IEEE Transactions on Knowledge and Data Engineering. (2016)

[7] Huiqi Hu, Guoliang Li, Zhifeng Bao, Jianhua Feng, Zhiguo Gong. Top-k Spatial-Textual Similarity Join. IEEE Transactions on Knowledge and Data Engineering (2016).

[8] Zhuojie Zhou, Nan Zhang, Zhiguo Gong, and Gautam Das. Faster Random Walks By Rewiring Online Social Networks On-The-Fly. ACM Transactions on Database Systems (2016).

[9] Hui Yan, Zhiguo Gong, Nan Zhang, Tao Huang, Hua Zhong, Jun Wei. Aggregate Estimation in Hidden Databases with Checkbox Interfaces. IEEE Transactions on Knowledge and Data Engineering, 27(5): 1192-1204 (2015).

[10] Ruicheng Zhong, Guoliang Li, Kian-Lee Tan, Lizhu Zhou, Zhiguo Gong. G-Tree: An Efficient and Scalable Index for Spatial Search on Road Networks. IEEE Transactions on Knowledge and Data Engineering, 27(8): 2175-2189 (2015).

[11] Bailong Liao, Leong Hou U, Man Lung Yiu, and Zhiguo Gong. Beyond Millisecond Latency kNN Search on Commodity Machine. IEEE Transactions on Knowledge and Data Engineering, 27(10): 2618-2631 (2015).

[12] Minghe Yu, Guoliang Li, Ting Wang, Jianhua Feng, Zhiguo Gong. Efficient Filtering Algorithms for Location-Aware Publish/Subscribe. IEEE Transactions on Knowledge and Data Engineering, 27(4): 950-963 (2015).

[13] Yiyang Yang, Zhiguo Gong, Leong Hou U. Identifying Points of Interest Using Heterogeneous Features. ACM Transactions on Intelligent Systems and Technology, 5(4): 68:1-68:27 (2014).

[14] Leong Hou U, Hongjun Zhao, Man Lung Yiu, Yuhong Li, and Zhiguo Gong. Towards Online Shortest Paths Computation. IEEE Transactions on Knowledge and Data Engineering, 26(4): 1012-1025 (2014).

[15] Jinjin Guo, Zhiguo Gong. A Density-based Nonparametric Model for Online Event Discovery from the Social Media Data. In Proceedings of the 26th IJCAI, 2017.

[16] Yiyang Yang, Zhiguo Gong, Qing Li, Leong Hou U, Ruichu Cai, Zhifeng Hao. A Robust Noise Resistant Algorithm for POI Identification from Flickr Data. In Proceedings of the 26th IJCAI, 2017.

[17] Ngai Meng Kou, Yan Li, Hao Wang, Leong Hou U, Zhiguo Gong. Crowdsourced Top-k Queries by Confidence-Aware Pairwise Judgments. In Proceedings of ACM SIGMOD, 2017.

[18] Jinjin Guo, Zhiguo Gong. A Nonparametric Model for Event Discovery in the Geospatial-Temporal Space. In Proceedings of ACM CIKM, 2016.

[19] Yuhong Li, Jie Bao, Yanhua Li, Yingcai Wu, Zhiguo Gong, Yu Zheng. ACM SIGSPATIAL, 2016.

[20] Ngai Meng Kou, Leong Hou U, Nikos Mamoulis, Zhiguo Gong. Weighted Coverage based Reviewer Assignment. In Proceedings of ACM SIGMOD, 2015.

[21] Yuhong Li, Leong Hou U, Man Lung Yiu, and Zhiguo Gong. Quick-Motif: An Efficient and Scalable Framework for Exact Motif Discovery. In Proceedings of ICDE, 2015.

[22] Ngai Meng Kou, Leong Hou U, Nikos Mamoulis, Yuhong Li, Ye Li, Zhiguo Gong. A Topic-based Reviewer Assignment System. PVLDB 8(12): 1852-1863 (2015).

[23] Yuhong Li, Yu Zheng, Shenggong Ji, Wenjun Wang, Leong Hou U, and Zhiguo Gong. Location Selection for Ambulance Stations: A Data-Driven Approach. In Proceedings of ACM SIGSPATIAL, 2015.
