DESCRIPTION
It is no secret that the world is drowning in text and data. This causes real problems for everyday users who need to make sense of all the information available, and for software engineers who want to make their text-based applications more useful and user-friendly. Whether building a search engine for a corporate website, automatically organizing email, or extracting important nuggets of information from the news, dealing with unstructured text can be daunting.
Taming Text is a hands-on, example-driven guide to working with unstructured text in the context of real-world applications. It explores how to automatically organize text, using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. This book gives examples illustrating each of these topics, as well as the foundations upon which they are built.
KEY POINTS
„h One-stop shop for learning how to process text
„h Clear, concise, and practical advice
„h Builds on high quality open source libraries
"synopsis" may belong to another edition of this title.
Grant Ingersoll
is an independent consultant developing search and natural language processing tools. He has worked on a number of text processing applications involving information retrieval, question answering, clustering, summarization, and categorization. Grant is a committer, as well as a speaker and trainer, on the Apache Lucene Java project and a co-founder of the Apache Mahout machine-learning project.
Thomas Morton writes software and performs research in the area of text processing and machine learning. He has been the primary developer and maintainer of the OpenNLP text processing project and Maximum Entropy machine learning project for the last 5 years. Currently, he works as a software architect for Comcast Interactive Media in Philadelphia.
Drew Farris is a professional software developer and technology consultant whose interests focus on large scale analytics, distributed computing and machine learning. He has contributed to a number of open source projects including Apache Mahout, Lucene and Solr, and holds a master's degree in Information Resource Management from Syracuse University's iSchool and a B.F.A in Computer Graphics.
"About this title" may belong to another edition of this title.
Seller: Bay State Book Company, North Smithfield, RI, U.S.A.
Condition: good. The book is in good condition with all pages and cover intact, including the dust jacket if originally issued. The spine may show light wear. Pages may contain some notes or highlighting, and there might be a "From the library of" label. Boxed set packaging, shrink wrap, or included media like CDs may be missing. Seller Inventory # BSM.130EY
Seller: BooksRun, Philadelphia, PA, U.S.A.
Paperback. Condition: Fair. First Edition. The item might be beaten up but readable. May contain markings or highlighting, as well as stains, bent corners, or any other major defect, but the text is not obscured in any way. Seller Inventory # 193398838X-7-1
Seller: World of Books (was SecondSale), Montgomery, IL, U.S.A.
Condition: Very Good. Item in very good condition! Textbooks may not include supplemental items i.e. CDs, access codes etc. Seller Inventory # 00104134840
Seller: BooksRun, Philadelphia, PA, U.S.A.
Paperback. Condition: Very Good. First Edition. It's a well-cared-for item that has seen limited use. The item may show minor signs of wear. All the text is legible, with all pages included. It may have slight markings and/or highlighting. Seller Inventory # 193398838X-11-1
Seller: Better World Books, Mishawaka, IN, U.S.A.
Condition: Good. Pages intact with minimal writing/highlighting. The binding may be loose and creased. Dust jackets/supplements are not included. Stock photo provided. Product includes identifying sticker. Better World Books: Buy Books. Do Good. Seller Inventory # 5476353-6
Seller: Better World Books: West, Reno, NV, U.S.A.
Condition: Very Good. Pages intact with possible writing/highlighting. Binding strong with minor wear. Dust jackets/supplements may not be included. Stock photo provided. Product includes identifying sticker. Better World Books: Buy Books. Do Good. Seller Inventory # 11066053-75
Seller: Wonder Book, Frederick, MD, U.S.A.
Condition: Good. Good condition. A copy that has been read but remains intact. May contain markings such as bookplates, stamps, limited notes and highlighting, or a few light stains. Bundled media such as CDs, DVDs, floppy disks or access codes may not be included. Seller Inventory # N04O-00902
Seller: ThriftBooks-Atlanta, AUSTELL, GA, U.S.A.
Paperback. Condition: As New. No Jacket. Pages are clean and are not marred by notes or folds of any kind. ~ ThriftBooks: Read More, Spend Less. Seller Inventory # G193398838XI2N00
Seller: ThriftBooks-Dallas, Dallas, TX, U.S.A.
Paperback. Condition: Very Good. No Jacket. May have limited writing in cover pages. Pages are unmarked. ~ ThriftBooks: Read More, Spend Less. Seller Inventory # G193398838XI4N00
Seller: ThriftBooks-Phoenix, Phoenix, AZ, U.S.A.
Paperback. Condition: Very Good. No Jacket. May have limited writing in cover pages. Pages are unmarked. ~ ThriftBooks: Read More, Spend Less. Seller Inventory # G193398838XI4N00