C. Lee Giles, Professor, Information Sciences & Technology, Pennsylvania State University


Sage 4101

May 4, 2016 12:00 PM - 1:30 PM


Collections of scholarly documents are usually not thought of as big data. However, large collections often contain many millions of publications, authors, citations, equations, and figures, along with large-scale related data and structures such as social networks, slides, and data sets. We discuss scholarly big data challenges, insights, methodologies, and applications. We illustrate scholarly big data issues with examples of specialized search engines and recommendation systems based on the SeerSuite software, such as CiteSeerX. Using information extraction and data mining, we illustrate applications and semantics of information in such diverse areas as computer science, chemistry, archaeology, acknowledgements, citation recommendation, collaboration recommendation, and others.


Dr. C. Lee Giles is the David Reese Professor of Information Sciences and Technology at the Pennsylvania State University, with appointments in the departments of Computer Science and Engineering, and Supply Chain and Information Systems. His research interests are in intelligent cyberinfrastructure and big data, specialty search engines, information retrieval, digital libraries, web services, knowledge and information extraction, data mining, entity disambiguation, and social networks. He has published over 400 papers in these areas, with over 28,000 citations and an h-index of 79 according to Google Scholar. He was a co-creator of the popular search engine CiteSeer (now CiteSeerX) and related scholarly search engines. He is a fellow of the ACM, IEEE, and INNS, and recently received the Gabor Award from the International Neural Network Society (INNS).

