Tag: topic modeling

  • Comparing methods to extract technical content for technological intelligence

    We are developing indicators for the emergence of science and technology (S&T) topics. To do so, we extract information from various S&T information resources. This paper compares alternative ways of consolidating messy sets of key terms (e.g., using natural language processing on abstracts and titles, together with various keyword sets). Our process includes combinations of…

  • Clustering scientific documents with topic modeling

    Topic modeling is a type of statistical modeling that uses machine learning to discover the latent “topics” that occur in a collection of documents. Latent Dirichlet allocation (LDA) is currently a popular modeling approach. In this paper, we investigate methods, including LDA and its extensions, for separating a set of scientific publications into several…