Tag Archives: Big Data

Big Data and Business: Tech Mining to Capture Business Interests and Activities around Big Data

Innovations around “Big Data” can be characterized in terms of rapid technology development and deployment dynamics. For this purpose, combining “tech mining” (extraction of usable intelligence) from publication and patent databases with tech mining of business-related databases can elucidate activities and interests of business communities regarding Big Data innovation pathways. In this paper, we focus on a commercially oriented database, ABI/INFORM, as a source from which to extract business intentions. We select this database to help gauge “hot topics” in industry with regard to Big Data. Our results show that certain types of firms can be clustered into thematic groups relating to Big Data discussions and activities. We demonstrate that such analyses can illuminate themes being pursued by businesses. Like social media analyses, this text mining can provide useful intelligence to inform more in-depth investigation that mobilizes other data sources and techniques.
http://ieeexplore.ieee.org/abstract/document/7723686/

Author(s): Ying Huang, Jan Youtie, Alan L. Porter, Douglas K.R. Robinson, Scott W. Cunningham, Donghua Zhu
Organization(s): Beijing Institute of Technology, Georgia Institute of Technology
Source: 2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom)
Year: 2016

Early social science research about Big Data

Recent emerging technology policies seek to diminish negative impacts while equitably and responsibly accruing and distributing benefits. Social scientists play a role in these policies, but relatively little quantitative research has been undertaken to study how social scientists inform the assessment of emerging technologies. This paper addresses this gap by examining social science research on ‘Big Data’, an emerging technology of wide interest. This paper analyzes a dataset of fields extracted from 488 social science and humanities papers written about Big Data. Our focus is on understanding the multi-dimensional nature of societal assessment by examining the references upon which these papers draw. We find that eight sub-literatures are important in framing social science research about Big Data. These results indicate that the field is evolving from general sociological considerations toward applications issues and privacy concerns. Implications for science policy and technology assessment of societal implications are discussed.

http://spp.oxfordjournals.org/content/early/2016/06/23/scipol.scw021.abstract

Author(s): Jan Youtie, Alan L. Porter and Ying Huang
Organization(s): Georgia Institute of Technology, Beijing Institute of Technology
Source: Science and Public Policy
Year: 2016

Big Data in the Social Sciences

Recent emerging technology policies seek to diminish negative impacts while equitably and responsibly accruing and distributing benefits.  Social scientists play a role in these policies, but relatively little quantitative research has been performed to study how social scientists inform the assessment of emerging technologies. This paper addresses this gap by examining social science research on “Big Data” – an emerging technology of wide interest. This paper analyzes a dataset of fields extracted from 488 social science and humanities papers written about Big Data. Our focus is on understanding the multi-dimensional nature of societal assessment by examining the references upon which these papers draw. We find that eight sub-literatures are important in framing social science research about Big Data. These results indicate that the field is evolving from general sociological considerations toward applications issues and privacy concerns. Implications for science policy and technology assessment of societal implications are discussed.

Preprint available at http://works.bepress.com/jan_youtie/80/

Author(s): Jan Youtie and Alan Porter
Organization(s): Georgia Institute of Technology
Source: Science and Public Policy
Year: 2016

Meta Data: Big Data Research Evolving across Disciplines, Players, and Topics

We present a meta-analysis of Big Data research activity since 2009. Our purpose here is to present “tech mining” (bibliometric and text analyses of research publication abstract record sets) to provide a research landscape of who is doing what, where, and when. Our larger purpose is to help Forecast Innovation Pathways for Big Data and analytics over the coming decade. We download 7006 research publication abstracts from Web of Science resulting from a search algorithm devised to recall a high percentage of core Big Data research and a moderate percentage of peripherally related research (fair recall). We find interesting engagement of different disciplines in Big Data over time. On a national level, the USA and China dominate these fundamental research publications to a striking degree. Mapping topics presents interesting evidence on what topics are emerging in this dynamic field.

Author(s): Alan L. Porter, Ying Huang, Jannik Schuehle, Jan Youtie
Organization(s): Georgia Institute of Technology
Source: 4th IEEE International Congress on Big Data (BigData Congress)
http://www.researchgate.net/publication/280529689_MetaData_BigData_Research_Evolving_Across_Disciplines_Players_and_Topics
Year: 2015

A systematic method to create search strategies for emerging technologies based on the Web of Science: illustrated for ‘Big Data’

Bibliometric and “tech mining” studies depend on a crucial foundation: the search strategy used to retrieve relevant research publication records. Database searches for emerging technologies can be problematic in many respects, for example the rapid evolution of terminology, the use of common phraseology, or the extent of “legacy technology” terminology. Searching on such legacy terms may or may not pick up R&D pertaining to the emerging technology of interest. A challenge is to assess the relevance of legacy terminology in building an effective search model. Common-usage phraseology additionally confounds certain domains in which broader managerial, public interest, or other considerations are prominent. In contrast, searching for highly technical topics is relatively straightforward. In setting forth to analyze “Big Data,” we confront all three challenges: emerging terminology, common usage phrasing, and intersecting legacy technologies. In response, we have devised a systematic methodology to help identify research relating to Big Data. This methodology uses complementary search approaches, starting with a Boolean search model and subsequently employing contingency term sets to further refine the selection. The four search approaches considered are: (1) core lexical query, (2) expanded lexical query, (3) specialized journal search, and (4) cited reference analysis. Of special note here is the use of a “Hit-Ratio” that helps distinguish Big Data elements from less relevant legacy technology terms. We believe that such a systematic search development positions us to do meaningful analyses of Big Data research patterns, connections, and trajectories. Moreover, we suggest that such a systematic search approach can help formulate more replicable searches with high recall and satisfactory precision for other emerging technology studies.

http://link.springer.com/article/10.1007/s11192-015-1638-y

Author(s): Ying Huang, Jannik Schuehle, Alan L. Porter, and Jan Youtie
Organization(s): Beijing Institute of Technology and Georgia Institute of Technology
Source: Scientometrics
Year: 2015
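
The abstract above mentions a “Hit-Ratio” for separating Big Data terms from less relevant legacy terms but does not define it. The sketch below is illustrative only and rests on an assumed definition: the share of records retrieved by a candidate term that also appear in the core lexical query's result set. The function names, record IDs, candidate terms, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
def hit_ratio(candidate_hits: set[str], core_hits: set[str]) -> float:
    """Assumed definition: fraction of a candidate term's records
    that also appear in the core query's result set."""
    if not candidate_hits:
        return 0.0
    return len(candidate_hits & core_hits) / len(candidate_hits)


def select_terms(
    candidates: dict[str, set[str]],
    core_hits: set[str],
    threshold: float,
) -> list[str]:
    """Return, in alphabetical order, the candidate terms whose
    hit ratio is at least the threshold."""
    return sorted(
        term
        for term, hits in candidates.items()
        if hit_ratio(hits, core_hits) >= threshold
    )


# Hypothetical record IDs retrieved by a core lexical query.
core = {"r1", "r2", "r3", "r4"}

# Hypothetical record IDs retrieved by each candidate term.
candidates = {
    "mapreduce": {"r1", "r2", "r3", "r9"},          # 3 of 4 records overlap
    "data mining": {"r1", "r5", "r6", "r7", "r8"},  # 1 of 5 records overlap
}

# The threshold is an arbitrary illustrative cut-off, not a value from the paper.
selected = select_terms(candidates, core, threshold=0.5)
print(selected)
```

Under this assumed definition, a term whose records mostly fall inside the core set would be kept for the expanded query, while a broad legacy term with little overlap would be set aside or combined with further restricting terms.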