In this course, students are introduced to natural language processing as it applies to data mining, text mining, and machine learning tasks on unstructured big data. The course offers a broad survey of the major tasks in natural language understanding, with some coverage of natural language generation. Topics include document clustering and classification, automated tagging and highlighting, semantic search, and text normalization to support machine learning applications. The focus is on best practices for choosing the right tool and method for an application, illustrated with real-world case studies. Students will gain experience building solutions from real-world data sets, using WordNet and data from some leading websites.
Text Classification, Clustering, Tagging, Synopsizing; Taxonomy Alignment; Corpus Analytics; Semantic Query Analysis