Library Workshops and Events

Event Details

Preparing Files for Text and Data Mining

Open to:
Faculty, Graduate Students, Postdocs, Staff, Undergraduate Students

Text mining, a process for extracting information from unstructured text, requires everyday files (PDF, Word, HTML, etc.) to be transformed into plain text files. Once your files are in a plain text format (no bold, no italics, no underlining, etc.), they are ready for automated processing and computer analysis.

This hands-on workshop will demonstrate a free Java-based program called Tika and help attendees use it to do this work. More specifically, attendees will install Tika and use it to convert most common file formats into plain text. With plain text in hand, participants will be able to use the many text mining services available online.
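As a rough illustration of the conversion step described above, the following Python sketch runs Tika's command-line application on a single file and saves the extracted text. It is not taken from the workshop materials. It assumes Java is installed and that the Tika application jar has been downloaded and saved as `tika-app.jar` (an assumed file name). The helper names `tika_command`, `output_path`, and `convert` are hypothetical.

```python
import subprocess
from pathlib import Path

# Assumption: the Tika application jar was downloaded and saved under this name.
TIKA_JAR = Path("tika-app.jar")


def tika_command(source, jar=TIKA_JAR):
    """Build the command line that asks Tika for plain text output."""
    return ["java", "-jar", str(jar), "--text", str(source)]


def output_path(source, out_dir):
    """Name the plain text file after the source, e.g. report.pdf -> report.txt."""
    return Path(out_dir) / (Path(source).stem + ".txt")


def convert(source, out_dir, jar=TIKA_JAR):
    """Run Tika on one file and save the extracted text (requires Java and the jar)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # Tika writes the extracted text to standard output; capture it as bytes.
    result = subprocess.run(tika_command(source, jar), capture_output=True, check=True)
    target = output_path(source, out_dir)
    target.write_bytes(result.stdout)
    return target
```

Calling `convert("report.pdf", "plain-text")` would, under these assumptions, write the extracted text to `plain-text/report.txt`. Looping over a directory of files with the same function is one way to prepare a whole collection for analysis.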

Date
Thursday, August 30, 2018
Time
12:00PM - 1:00PM
Location
Navari Family Center for Digital Scholarship // Hesburgh Library 2nd Floor // Consultation Room 247
Instructor
Eric Lease Morgan
Categories
CDS | Text Mining & Analysis Workshops
Registration has closed.

Contact Info

Eric Lease Morgan

Hesburgh Library
131 Hesburgh Library
University of Notre Dame
Notre Dame, IN 46556

(574) 631-8604
emorgan@nd.edu

Julie Vecchio

Hesburgh Library
250 Hesburgh Library
University of Notre Dame
Notre Dame, IN 46556

(574) 631-4900
jvecchio@nd.edu