Tag Archives: Topic Modeling (TM)

Identification of topic evolution: network analytics with piecewise linear representation and word embedding

Understanding the evolutionary relationships among scientific topics and learning the evolutionary process of innovations are crucial issues for strategic decision makers in governments, firms and funding agencies when they carry out forward-looking research activities. However, traditional co-word network analysis for topic identification cannot effectively extract semantic relationships from context, and the fixed time window method cannot adequately reflect the evolution process of topics. This study proposes a framework for identifying topic evolutionary pathways based on network analytics. First, keyword networks are constructed, in which a piecewise linear representation method is used to divide time periods and a Word2Vec model is used to capture semantics from the context of titles and abstracts. Second, a community detection algorithm is used to identify topics in the networks. Finally, evolutionary relationships between topics are represented by measuring the topic similarity between adjacent time periods, and topic evolutionary pathways are then identified and visualized. An empirical study on information science demonstrates the reliability of the methodology, with subsequent empirical validations.

https://doi.org/10.1007/s11192-022-04273-1

Author(s): Lu Huang, Xiang Chen, Yi Zhang, Changtian Wang, Xiaoli Cao, Jiarun Liu
Organization(s): Beijing Institute of Technology, University of Technology Sydney
Source: Scientometrics
Year: 2022
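
The final step of the framework in the abstract above, linking topics across adjacent time periods by similarity, can be illustrated with a minimal sketch. The abstract does not say which similarity measure the authors use, so Jaccard similarity over keyword sets is an assumption here, as are the function names, the threshold value and the toy topics.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity between two keyword sets (assumed measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_topics(periods, threshold=0.3):
    """Link topics in adjacent periods whose similarity reaches the threshold.

    periods: list of {topic_id: set_of_keywords}, ordered by time.
    Returns a list of (period_index, earlier_topic, later_topic, similarity).
    """
    links = []
    for t in range(len(periods) - 1):
        pairs = product(periods[t].items(), periods[t + 1].items())
        for (i, kw_i), (j, kw_j) in pairs:
            score = jaccard(kw_i, kw_j)
            if score >= threshold:
                links.append((t, i, j, score))
    return links


# Hypothetical topics for two consecutive periods.
periods = [
    {"A1": {"citation", "impact", "journal"},
     "A2": {"retrieval", "query", "ranking"}},
    {"B1": {"citation", "impact", "altmetrics"},
     "B2": {"retrieval", "query", "embedding", "ranking"}},
]
links = link_topics(periods)
for link in links:
    print(link)
```

Each returned tuple is one edge of an evolutionary pathway. Chaining edges across consecutive periods gives the pathways that the abstract describes as identified and visualized.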

Topic analysis and forecasting for science, technology and innovation: Methodology with a case study focusing on big data research

Highlights

  • Data-driven clustering approach to group topics with high accuracy
  • Similarity measure approach to trace the interaction between topics in time series
  • Analyzing changes in the TF-IDF values of related topics to predict future trends
  • Technology Roadmapping to blend historical analysis and expert-based forecasting

The number and extent of current Science, Technology & Innovation topics are changing all the time, and their induced accumulative innovation, or even disruptive revolution, will heavily influence the whole of society in the near future. By addressing and predicting these changes, this paper proposes an analytic method to (1) cluster associated terms and phrases to constitute meaningful technological topics and their interactions, and (2) identify changing topical emphases. Our results are carried forward to present mechanisms that forecast prospective developments using Technology Roadmapping, combining qualitative and quantitative methodologies. An empirical case study of Awards data from the United States National Science Foundation, Division of Computer and Communication Foundation, is performed to demonstrate the proposed method. The resulting knowledge may hold interest for R&D management and science policy in practice.

http://www.sciencedirect.com/science/article/pii/S0040162516000160

Author(s): Yi Zhang, Guangquan Zhang, Hongshu Chen, Alan L. Porter, Donghua Zhu, Jie Lu
Organization(s): University of Technology Sydney, Georgia Institute of Technology, Beijing Institute of Technology
Source: Technological Forecasting and Social Change
Year: 2016
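
The third highlight above, tracking changes in TF-IDF values over time, can be sketched as follows. This is only an illustration under stated assumptions: the document-level TF-IDF averaged per year, the first-to-last difference used as a trend signal, and all function names and sample records are hypothetical and not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def yearly_tfidf(docs):
    """Mean TF-IDF of each term per year.

    docs: list of (year, tokens) records. IDF is computed over all records.
    """
    n_docs = len(docs)
    df = Counter()
    for _, tokens in docs:
        df.update(set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}

    sums = defaultdict(lambda: defaultdict(float))
    docs_per_year = Counter()
    for year, tokens in docs:
        docs_per_year[year] += 1
        for term, count in Counter(tokens).items():
            sums[year][term] += (count / len(tokens)) * idf[term]

    return {
        year: {term: s / docs_per_year[year] for term, s in terms.items()}
        for year, terms in sums.items()
    }


def trend(scores, term):
    """Change in mean TF-IDF between the first and last year (assumed signal)."""
    years = sorted(scores)
    return scores[years[-1]].get(term, 0.0) - scores[years[0]].get(term, 0.0)


# Hypothetical tokenized records with their years.
docs = [
    (2010, ["data", "mining", "database"]),
    (2010, ["database", "query"]),
    (2011, ["big", "data", "hadoop"]),
    (2011, ["big", "data", "cloud"]),
]
scores = yearly_tfidf(docs)
print(trend(scores, "big"))       # positive: emphasis is rising
print(trend(scores, "database"))  # negative: emphasis is falling
```

A rising value suggests growing topical emphasis and a falling value a decline. In the paper such quantitative signals are combined with expert judgment through Technology Roadmapping.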

Map of science with topic modeling: Comparison of unsupervised learning and human-assigned subject classification

The delineation of coordinates is fundamental for the cartography of science, and accurate and credible classification of scientific knowledge presents a persistent challenge in this regard. We present a map of Finnish science based on unsupervised-learning classification, and discuss the advantages and disadvantages of this approach vis-à-vis those generated by human reasoning. We conclude that from theoretical and practical perspectives there exist several challenges for human reasoning-based classification frameworks of scientific knowledge, as they typically try to fit new-to-the-world knowledge into historical models of scientific knowledge, and cannot easily be deployed for new large-scale data sets. Automated classification schemes, in contrast, generate classification models only from the available text corpus, thereby identifying credibly novel bodies of knowledge. They also lend themselves to versatile large-scale data analysis, and enable a range of Big Data possibilities. However, we also argue that it is neither possible nor fruitful to declare one or another method a superior approach in terms of realism to classify scientific knowledge, and we believe that the merits of each approach are dependent on the practical objectives of analysis.

Full-text available at http://onlinelibrary.wiley.com/doi/10.1002/asi.23596/full

Author(s): Arho Suominen and Hannes Toivanen
Organization(s): VTT Technical Research Centre of Finland Ltd, Lappeenranta University of Technology
Source: Journal of the Association for Information Science and Technology
Year: 2015