Category Archives: Tech mining

Different approaches of bibliometric analysis for data analytics applications in non-profit organisations (FULL-TEXT)

Profitable companies that use data analytics gain in cost reduction, demand prediction, and decision-making. In non-profit organisations (NPOs), data analysis can likewise help understand and identify patterns of donors, volunteers, and anticipated future cash, gifts, and grants. This article presents a bibliometric study of 2673 records to discover the use of data analytics in different NPOs and understand its contribution. We characterise the associations between data analysis techniques and NPOs using the Bibliometrics R tool, a co-term analysis, and a scientific evolutionary pathways analysis, and identify how the research topics in this field have changed over time. Three key conclusions may be drawn from the findings: (1) In the sphere of NPOs, robust, conventional statistics-based data analysis procedures are dominant at all times; (2) Healthcare and public affairs are two crucial sectors that involve data analytics to support decision-making and problem-solving; (3) Artificial Intelligence (AI) based data analytics is a recently emerging trend, especially in the healthcare-related sector; however, it is still at an immature stage, and more effort is needed to nourish its development. The research findings can inform future research and add value to the existing literature on the subject of data analytics.
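As a rough illustration of what a co-term analysis counts, the sketch below tallies how often two terms appear on the same record. It is not the authors' pipeline or the R tool they mention; the function name `co_term_counts` and the sample terms are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_term_counts(records):
    """Count how many records contain each unordered pair of terms.

    `records` is a list of term lists, one per publication (hypothetical input).
    Terms are lower-cased and de-duplicated within a record before pairing.
    """
    pairs = Counter()
    for terms in records:
        unique = sorted({t.lower() for t in terms})
        # Sorting first means each pair is always stored in the same order.
        pairs.update(combinations(unique, 2))
    return pairs


if __name__ == "__main__":
    sample = [
        ["Data analytics", "NPO", "Healthcare"],
        ["data analytics", "healthcare"],
        ["NPO", "machine learning"],
    ]
    for pair, count in co_term_counts(sample).most_common(3):
        print(pair, count)
```

Pairs with high counts are the candidates for links in a co-term map; real studies usually normalise these counts before clustering or plotting.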


Author(s): Idrees Alsolbi, Mengjia Wu, Yi Zhang, Sudhanshu Joshi, Manu Sharma, Siamak Tafavogh, Ashish Sinha, Mukesh Prasad
Organization(s): University of Technology Sydney, Commonwealth Bank Health Society
Source: Journal of Smart Environments and Green Computing
Year: 2022

Combining tech mining and semantic TRIZ for technology assessment: Dye-sensitized solar cell as a case (FULL-TEXT)

In a competitive business environment, an early understanding of the dynamics of technological change is crucial to help policymakers and managers make better-informed decisions. Bibliometric analyses help in studying trends and technological evolution. Tech mining (text analyses of science and technology information resources) enhances bibliometric analyses. However, more often than not, such analyses focus on a specific technological area, and mainly result in incremental advance forecasts. An analysis of the interconnected dynamics of technology change warrants new approaches for identifying technology emergence, technological substitution, and the influences of vital socioeconomic forces. This paper introduces a unique combination that applies tech mining and semantic TRIZ to Dye-Sensitized Solar Cell (DSSC) technology as a case study. This methodological combination brings broader insights to the emergence of DSSC in conjunction with related technologies that affect its progress, enriching the associated technological progression’s empirical characterization.

  • Techmining-semantic TRIZ helps to understand the competitive influence among technologies.
  • Understanding the architectural design of the system helps to clarify the role of the different components.
  • Understanding the components’ role in the system helps to guide the techmining analysis and to understand the different trends.
  • Using S-AO rather than SAO problem solving, the present work is able to find other technologies, competing or not, that help to understand whether they will support the emergence of the original or the competing technology.
  • These cross-tech components have different roles in other architectures. Perovskites enhance silicon solar cell efficiency.

Author(s): J.M. Vicente-Gomila, M.A. Artacho-Ramírez, Ma Ting, A.L. Porter
Organization(s): Universitat Politècnica de València, Beijing Institute of Technology, Search Technology
Source: Technological Forecasting and Social Change
Year: 2021

Tracking developments in artificial intelligence research: constructing and applying a new search strategy (FULL-TEXT)

Artificial intelligence, as an emerging and multidisciplinary domain of research and innovation, has attracted growing attention in recent years. Delineating the domain composition of artificial intelligence is central to profiling and tracking its development and trajectories. This paper puts forward a bibliometric definition for artificial intelligence which can be readily applied, including by researchers, managers, and policy analysts. Our approach starts with benchmark records of artificial intelligence captured by using a core keyword and specialized journal search. We then extract candidate terms from high-frequency keywords of benchmark records, refine keywords and complement with the subject category “artificial intelligence”. We assess our search approach by comparing it with three other recent search strategies of artificial intelligence, using a common source of articles from the Web of Science. Using this source, we then profile patterns of growth and international diffusion of scientific research in artificial intelligence in recent years, identify top research sponsors in funding artificial intelligence and demonstrate how diverse disciplines contribute to the multidisciplinary development of artificial intelligence. We conclude with implications for search strategy development and suggestions of lines for further research.
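The candidate-term step described above can be pictured with a small sketch. It only covers the frequency count over benchmark records; the manual refinement the abstract mentions is not modelled. The function name, the `min_count` cut-off and the sample keywords are assumptions for illustration, not values from the paper.

```python
from collections import Counter


def candidate_terms(benchmark_keywords, core_terms, min_count=2):
    """Rank keywords from benchmark records that are not already core terms.

    `benchmark_keywords` is a list of keyword lists, one per benchmark record.
    `min_count` is a hypothetical frequency cut-off.
    """
    core = {t.lower() for t in core_terms}
    freq = Counter()
    for keywords in benchmark_keywords:
        # Count each keyword once per record (document frequency).
        freq.update({k.lower() for k in keywords})
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, n) for term, n in ranked if n >= min_count and term not in core]


if __name__ == "__main__":
    records = [
        ["artificial intelligence", "neural network", "machine learning"],
        ["Machine Learning", "neural network"],
        ["artificial intelligence", "machine learning", "robotics"],
    ]
    print(candidate_terms(records, core_terms=["artificial intelligence"]))
```

The ranked list is only a starting point: an analyst would still need to drop ambiguous terms before adding the rest to the search query.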


Author(s): Na Liu, Philip Shapira, Xiaoxu Yue
Organization(s): Shandong Technology and Business University, University of Manchester, Tsinghua University
Source: Scientometrics
Year: 2021

Parameter tuning Naïve Bayes for automatic patent classification

In an era of exponential technological growth, business intelligence professionals are more in need than ever of an organized patent landscape in which to conduct technology forecasting and industry positioning. However, the construction of such a system requires time and trained experts, both of which are expensive investments for such a small part of any actual analysis. A natural solution is to employ machine learning (ML), a branch of artificial intelligence that uses statistical information to find patterns and make inferences. The primary benefit of using ML is that these algorithms do not require explicit instruction. In this paper, I present an analysis of feature selection for automatic patent categorization. For a corpus of 7,309 patent applications from the World Patent Information (WPI) Test Collection (Lupu, 2019), I assign International Patent Classification (IPC) section codes using a modified Naïve Bayes classifier. I compare precision, recall, and f-measure for a variety of meta-parameter settings including data smoothing and acceptance threshold. Finally, I apply the optimized model to IPC class and group codes and compare the results of patent categorization to academic literature.
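To make the two meta-parameters named in the abstract concrete, here is a minimal sketch of a multinomial Naïve Bayes classifier with additive smoothing and an acceptance threshold, plus a precision/recall/F-measure helper. It is a generic textbook version, not the modified classifier from the paper; the class name, default values and toy tokens are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class ThresholdedNaiveBayes:
    """Multinomial Naive Bayes with additive smoothing and an acceptance threshold."""

    def __init__(self, alpha=1.0, threshold=0.5):
        self.alpha = alpha          # smoothing strength, must be > 0 (assumed default)
        self.threshold = threshold  # minimum posterior needed to accept a label

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def posteriors(self, tokens):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        log_scores = {}
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(count / n_docs)
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                smoothed = self.word_counts[label][tok] + self.alpha
                score += math.log(smoothed / (total + self.alpha * vocab_size))
            log_scores[label] = score
        # Normalise in log space so the posteriors sum to 1.
        top = max(log_scores.values())
        exp = {label: math.exp(s - top) for label, s in log_scores.items()}
        norm = sum(exp.values())
        return {label: e / norm for label, e in exp.items()}

    def predict(self, tokens):
        """Return every label whose posterior reaches the threshold (may be empty)."""
        return {l for l, p in self.posteriors(tokens).items() if p >= self.threshold}


def precision_recall_f1(predicted, actual):
    """Micro-averaged scores over parallel lists of predicted and true label sets."""
    tp = sum(len(p & a) for p, a in zip(predicted, actual))
    n_pred = sum(len(p) for p in predicted)
    n_true = sum(len(a) for a in actual)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Raising the threshold makes the classifier abstain more often, which tends to trade recall for precision; sweeping `alpha` and `threshold` over a grid and comparing the three scores is the kind of tuning the abstract describes.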

Author(s): Caitlin Cassidy
Organization(s): Search Technology
Source: World Patent Information
Year: 2020

A Multi-match Approach to the Author Uncertainty Problem (Full-Text)

The ability to identify the scholarship of individual authors is essential for performance evaluation. A number of factors hinder this endeavor. Common and similarly spelled surnames make it difficult to isolate the scholarship of individual authors indexed on large databases. Variations in name spelling of individual scholars further complicate matters. Common family names in scientific powerhouses like China make it problematic to distinguish between authors possessing ubiquitous and/or anglicized surnames (as well as the same or similar first names). The assignment of unique author identifiers provides a major step toward resolving these difficulties. We maintain, however, that in and of themselves, author identifiers are not sufficient to fully address the author uncertainty problem. In this study we build on the author identifier approach by considering commonalities in fielded data between authors containing the same surname and first initial of their first name. We illustrate our approach using three case studies.
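One way to picture matching on commonalities in fielded data is the sketch below: records are only compared when surname and first initial agree, and two records are linked when enough fields share a value. This is an illustration of the general idea, not the authors' procedure; the field names, the `min_matches` cut-off and the sample records are hypothetical.

```python
from itertools import combinations


def name_key(name):
    """Reduce 'Surname, Given' to (surname, first initial), lower-cased."""
    surname, _, given = name.partition(",")
    return surname.strip().lower(), given.strip()[:1].lower()


def shared_fields(a, b, fields):
    """Count the fields in which two records have at least one value in common."""
    return sum(1 for f in fields if set(a.get(f, ())) & set(b.get(f, ())))


def cluster_records(records, fields=("affiliations", "coauthors", "keywords"),
                    min_matches=2):
    """Group record indices that likely belong to the same author."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if name_key(records[i]["author"]) != name_key(records[j]["author"]):
            continue
        if shared_fields(records[i], records[j], fields) >= min_matches:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Requiring matches in more than one field is a design choice that trades recall for precision: a single shared keyword is weak evidence, while a shared affiliation plus a shared co-author is much stronger.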

For FULL-TEXT see 

Author(s): Stephen F. Carley, Alan L. Porter, Jan L. Youtie
Organization(s): Georgia Institute of Technology
Source: Journal of Data and Information Science
Year: 2019 (online. 2017 print)

Patent Portfolio Model for Measuring Strategic Technological Strength

As technological innovation plays an important role in today’s knowledge economy, intellectual property, as the most important output of technological development, is highly valued for generating a monopoly position that provides payoffs to innovation. Intellectual Property Management (IPM) helps organizations to identify, enhance and evaluate their technological strength. The Patent Portfolio Model (PPM) is built for assessing the advantages and disadvantages of an organization and identifying development opportunities and optimal distribution, in order to support decision-making for optimizing resource allocation and developing the layout of technical fields. A case study of a research institute in China shows that this method is feasible and meets the needs of different institutions, and can provide suggestions for R&D technology management.


Author(s): Li Shuyin, Zhang Xian, Xu Haiyun, Fang Shu
Organization(s): Chengdu Library and Information Center, Chinese Academy of Sciences
Source: IEEE Xplore: 2019 Portland International Conference on Management of Engineering and Technology (PICMET)
Year: 2019

Application of Text-Analytics in Quantitative Study of Science and Technology

The quantitative study of science, technology and innovation (ST&I) has experienced significant growth with advancements in disciplines such as mathematics, computer science and information sciences. Early studies applied statistical methods and graph theory to citations or co-authorship; the state of the art in quantitative methods now leverages natural language processing and machine learning. However, there is no unified methodological approach within the research community or a comprehensive understanding of how to exploit text-mining potentials to address ST&I research objectives. Therefore, this chapter intends to present the state of the art of text mining within the framework of ST&I. The major contribution of the chapter is twofold. First, it provides a review of the literature on how text mining extended the quantitative methods applied in ST&I and highlights major methodological challenges. Second, it discusses two detailed hands-on case studies on how to implement the text analytics routine.

Author(s): Samira Ranaei, Arho Suominen, Alan Porter, Tuomo Kässi
Organization(s): Lappeenranta University of Technology (LUT), VTT Technical Research Centre of Finland
Source: Springer Handbook of Science and Technology Indicators
Year: 2019

Does deep learning help topic extraction? A kernel k-means clustering method with word embedding

Topic extraction presents challenges for the bibliometric community, and its performance still depends on human intervention and its practical areas. This paper proposes a novel kernel k-means clustering method incorporated with a word embedding model to create a solution that effectively extracts topics from bibliometric data. The experimental results of a comparison of this method with four clustering baselines (i.e., k-means, fuzzy c-means, principal component analysis, and topic models) on two bibliometric datasets demonstrate its effectiveness across either a relatively broad range of disciplines or a given domain. An empirical study on bibliometric topic extraction from articles published by three top-tier bibliometric journals between 2000 and 2017, supported by expert knowledge-based evaluations, provides supplemental evidence of the method’s ability in topic extraction. Additionally, this empirical analysis reveals insights into both overlapping and diverse research interests among the three journals that would benefit journal publishers, editorial boards, and research communities.
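For readers unfamiliar with kernel k-means, the sketch below shows the generic algorithm on document vectors built by averaging word embeddings. It is a plain textbook version under several assumptions (an RBF kernel, averaged embeddings, caller-supplied initial labels) and should not be read as the method proposed in the paper; all function names and parameter values are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an assumed bandwidth parameter."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_vector(tokens, embeddings):
    """Represent a document as the average of its known word embeddings."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        raise ValueError("no token has an embedding")
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(len(vecs[0]))]


def kernel_kmeans(points, init_labels, k, kernel=rbf_kernel, max_iter=50):
    """Cluster points using distances computed in the kernel's feature space."""
    n = len(points)
    K = [[kernel(points[i], points[j]) for j in range(n)] for i in range(n)]
    labels = list(init_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Squared norm of each cluster centre, expressed through kernel values.
        centre_sq = [
            sum(K[a][b] for a in m for b in m) / len(m) ** 2 if m else None
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best, best_dist = labels[i], math.inf
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster cannot attract points
                dist = K[i][i] - 2 * sum(K[i][j] for j in m) / len(m) + centre_sq[c]
                if dist < best_dist:
                    best, best_dist = c, dist
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels
```

Like ordinary k-means, this procedure can settle in a poor local optimum, so the result depends on the initial labels; in practice several initialisations are usually tried.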

Author(s): Yi Zhang, Jie Lu, Feng Liu, Qian Liu, Alan Porter, Hongshu Chen, Guangquan Zhang
Organization(s): University of Technology Sydney, Beijing Institute of Technology, Georgia Institute of Technology
Source: Journal of Informetrics
Year: 2018

Chapter 2 – Lessons From 10 Years of Nanotechnology Bibliometric Analysis

This chapter summarizes the 10-year experiences of the Program in Science, Technology, and Innovation Policy (STIP) at Georgia Institute of Technology (Georgia Tech) in support of the Center for Nanotechnology in Society at Arizona State University (CNS-ASU) in understanding, characterizing, and conveying the development of nanotechnology research and application. This work was labeled “Research and Innovation Systems Assessment” (RISA) by CNS-ASU. CNS-ASU was designed to implement a set of methods to anticipate societal impacts (including environmental, health, and safety impacts) and lay the foundation for making changes to emerging technologies at an early stage in their development.

RISA concentrates on identifying and documenting quantifiable aspects of nanotechnology, including academic, commercial/industrial, and government nanoscience and nanotechnology (nanotechnologies) activity, research, and projects. RISA at CNS-ASU engaged in the first systematic attempt of its kind to define, characterize, and track a field of science and technology. A key element to RISA was the creation of a replicable approach to bibliometrically defining nanotechnology. Researchers in STIP, and beyond, could then query the resulting datasets to address topical areas ranging from basic country and regional concentrations of publications and patents to findings about social science literature, environmental, health, and safety research and usage, to study corporate entry into nanotechnology and to explore application areas as special interests arose. Key features of the success of the program include the following:

  • Having access to “large-scale” R&D abstract datasets
  • Analytical software
  • A portfolio that balances innovative long-term projects, such as webscraping to understand nanotechnology developments in small and medium-sized companies, with research characterizing the emergence of nanotechnology that more readily produces articles
  • Relationships with diverse networks of scholars and companies working in the nanotechnology science and social science domains
  • An influx of visiting researchers
  • A strong core of students with social science, as well as some programming background
  • A well-equipped facility and management by the principals through weekly problem-solving meetings, mini-deadlines, and the production of journal articles rather than thick final reports.

Author(s): Jan Youtie, Alan L. Porter, Philip Shapira, Nils Newman
Organization(s): Georgia Institute of Technology, Search Technology
Source: Nanotechnology Environmental Health and Safety (Third Edition)
Year: 2018

Business intelligence through patent filings: An analysis of IP management strategies of ICT companies (full-text)

Business intelligence enables enterprises to make effective, good-quality business decisions. In the knowledge economy, patents are seen as strategic assets for companies as they provide a competitive advantage and at the same time ensure the freedom to operate and form the basis for new alliances. Publication or disclosure of intellectual property (IP) strategy based on patent filings is rarely available in the public domain. Because of this, the only way to understand IP strategy is to look at patent filings, analyze them and, based on the trends, deduce strategy. This paper tries to uncover IP strategies of five US and Indian IT companies by analyzing their patent filings. Business intelligence gathered by means of patent analytics can be used to understand the strategies used by companies in advocating their patent portfolio and aligning their business needs with patenting activities. This study reveals that the Indian companies are far behind in protecting their IP, although they are now correcting course and have started aggressively protecting their inventions. It is also observed that the rival companies in the study are not directly competing with each other in the same technological domain. Different patent filing strategies are used by firms to gain a competitive advantage. Companies make use of disclosure as a strategy or try to cover many aspects of a technology in a single patent, thereby signaling their dominance in a technological area and at the same time adding information.

Author(s): Shabib-Ahmed Shaikh, Tarun Kumar Singhal
Organization(s): Symbiosis International University (SIU), Symbiosis Centre for Management Studies
Source: Journal of Intelligence Studies in Business
Year: 2018