Section 02 Team 02: Text Mining

Emily Kosciulek and Bobby Trivett
ITEC-200 6:45pm

Introduction to Text Mining and its Business Advantages
I. Executive Summary

Data mining, and more specifically text mining, offers the information technology field new ways to analyze data and text and to draw new information from them. The ability to see patterns in data or text is an important breakthrough for many fields, allowing businesses to look at vast amounts of numerical and textual data and draw conclusions that could not have been reached without this technology. The mining business has grown considerably over the past several years (Hearst 1). Several products now on the market make it easy to sift through mountains of data in a very short time, and some are available for home use. According to Gartner Group Advanced Technology Research, data mining is at the top of the five key technology areas that will have a major impact across a wide range of industries within the next five to ten years. This quickly growing technology is already being incorporated into several fields and will change the face of information technology.

II. Technology

Text mining is the use of a computer to discover new ideas and information by extracting text from different databases. This allows new facts and theories to be derived from pre-existing data. For consumers, the most familiar technology resembling text mining is a search engine like Google (Mailvaganam 11). However, search engines find information that has already been created or analyzed by other people, while text mining uses pre-existing data to draw conclusions that other researchers have not already made. Text mining is a subfield of data mining. Data mining derives facts and theories from the structured facts stored in databases, while text mining looks at the text itself, which the computer then “reads.” The foundation of this research is the find feature in applications like Microsoft Word. The desire to find data easily and readily evolved into what we know today as text and data mining (Mailvaganam 12).
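The difference between finding existing text and mining it can be illustrated with a minimal Python sketch. It is not taken from any of the cited sources: the three sample documents, the stopword list, and the function names are all invented for illustration. The `find` function only reports where a word already appears, while `co_occurrences` counts which words tend to appear together, a very simple form of pattern discovery.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical mini-corpus; the documents are invented for illustration.
DOCUMENTS = [
    "Magnesium deficiency is linked to stress.",
    "Stress is a common trigger for migraine.",
    "Migraine patients often report stress at work.",
]

# A tiny, assumed stopword list; real tools use much larger ones.
STOPWORDS = {"is", "a", "to", "for", "at", "often", "the", "of", "and"}


def tokenize(text):
    """Lowercase the text and return its words, minus common stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def find(term, documents):
    """Word-processor style 'find': indexes of documents containing the term."""
    return [i for i, doc in enumerate(documents) if term in tokenize(doc)]


def co_occurrences(documents):
    """Count how often each pair of words appears in the same document."""
    pairs = Counter()
    for doc in documents:
        words = sorted(set(tokenize(doc)))
        pairs.update(combinations(words, 2))
    return pairs


hits = find("stress", DOCUMENTS)
pairs = co_occurrences(DOCUMENTS)
print("Documents mentioning 'stress':", hits)
print("Most frequent word pair:", pairs.most_common(1))
```

In this toy corpus the search simply lists the documents that contain a word, whereas the co-occurrence count surfaces a relationship between two words that no single document states outright.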
The fields best equipped to benefit from data and text mining usually involve businesses that deal with enormous amounts of data at one time. Organizations like Amazon.com, which receives numerous orders a day, and call centers and sales groups that hold customer information are all prime candidates for data and text mining. Each company can upload all the data it wants to use into a data warehouse. From this point the mining most commonly takes one of three paths. First, the software can run complex algorithms to find patterns such as consumer behavior. Second, online analytical processing (OLAP) applications can build relationships between data points to discover current and historical trends. Third, reporting engines can crunch numbers for things like sales or income levels. This is all packaged for use by managers, suppliers, business analysts and sales representatives (Hearst 5).
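The three paths described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ORDERS` rows stand in for a data warehouse, and the data, column layout, and function names are hypothetical rather than drawn from any product named in the sources.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical "warehouse" rows: (order_id, region, product, amount).
ORDERS = [
    (1, "East", "diapers", 20.0),
    (1, "East", "wipes", 5.0),
    (2, "West", "diapers", 20.0),
    (2, "West", "wipes", 5.0),
    (3, "East", "coffee", 8.0),
]


def frequent_pairs(orders):
    """Path 1, pattern discovery: count products bought together in one order."""
    baskets = defaultdict(set)
    for order_id, _region, product, _amount in orders:
        baskets[order_id].add(product)
    pairs = Counter()
    for items in baskets.values():
        pairs.update(combinations(sorted(items), 2))
    return pairs


def roll_up(orders):
    """Path 2, OLAP-style roll-up: total sales for each (region, product) cell."""
    cube = defaultdict(float)
    for _order_id, region, product, amount in orders:
        cube[(region, product)] += amount
    return dict(cube)


def sales_report(orders):
    """Path 3, reporting: total sales per region."""
    totals = defaultdict(float)
    for _order_id, region, _product, amount in orders:
        totals[region] += amount
    return dict(totals)


print(frequent_pairs(ORDERS).most_common(1))
print(roll_up(ORDERS))
print(sales_report(ORDERS))
```

Real warehouses hold far more rows and dimensions, but the division of labor is the same: one pass looks for patterns, one summarizes along chosen dimensions, and one produces the totals that end up in a report.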
III. Business Advantages
Text mining and data mining provide significant advantages for all workplaces, especially in the areas of finance, retail, communication, and marketing, and are valuable technologies that can result in increased revenue for any business. Text and data mining are currently used in the fields of security, medicine, software, online media, marketing, and academia, and the technology is quickly expanding into new industries. Businesses using data mining possess two significant advantages over their competitors. The first main advantage is that text and data mining discover new and previously unknown patterns and trends. According to Kurt Thearling, data mining “enables companies to determine relationships among ‘internal’ factors such as price, product positioning, or staff skills, and ‘external’ factors such as economic indicators, competition, and customer demographics.” Data mining tools help a business predict future trends in consumer behavior, allowing it to make more knowledgeable decisions (Thearling).
The information gathered through data mining allows a company to practice targeted marketing, so that customers receive promotions relevant to their purchase history. By sending targeted promotions, the company saves money by reducing the number of mailings and increasing the likely response rate to those mailings. For example, if a retailer determines, with the help of data mining, which customers have bought diapers in the past month, it can send diaper promotions to those customers only and save money by not sending them to customers who have never bought diapers. American Express uses data mining in this way to suggest products to its customers.
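The diaper example can be sketched as a simple filter over a purchase log. The customers, dates, and the `recent_buyers` function below are hypothetical and only illustrate the idea; a real retailer would run a comparable query against its data warehouse.

```python
from datetime import date, timedelta

# Hypothetical purchase log: (customer, product, purchase date).
PURCHASES = [
    ("alice", "diapers", date(2009, 6, 10)),
    ("bob", "coffee", date(2009, 6, 12)),
    ("carol", "diapers", date(2009, 3, 1)),
    ("alice", "wipes", date(2009, 6, 15)),
]


def recent_buyers(purchases, product, today, window_days=30):
    """Return customers who bought the product within the last window_days."""
    cutoff = today - timedelta(days=window_days)
    return sorted(
        {
            customer
            for customer, bought, when in purchases
            if bought == product and cutoff <= when <= today
        }
    )


today = date(2009, 6, 23)
targets = recent_buyers(PURCHASES, "diapers", today)

# Mailings avoided compared with sending the promotion to every customer.
all_customers = {customer for customer, _product, _when in PURCHASES}
mailings_saved = len(all_customers) - len(targets)

print("Send diaper promotion to:", targets)
print("Mailings saved:", mailings_saved)
```

Only the customer with a diaper purchase inside the 30-day window receives the promotion, which is where the savings on mailings come from.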
By sorting through large amounts of data, text and data mining pick up patterns, trends, and relationships that analysts may miss because digging through so much data is time-consuming (Thearling). The technology makes links between seemingly unconnected documents and creates visual maps to help the user understand the information (Guernsey). Because data mining can determine these relationships, businesses that are especially focused on consumers will find that this technology allows them to predict the purchasing behavior of their customers and act accordingly when stocking their inventory. WalMart uses this technology to manage inventory and pinpoint possible new products by examining consumer purchasing patterns (Thearling). By using technology instead of manpower to analyze massive databases, businesses save time and money and get more accurate and thorough results.
The second main advantage is that text and data mining make the acquisition of relevant knowledge easier and faster. An article titled “Data Mining-Data Mining Advantages” compares data mining to finding a needle in a haystack. Data mining software scans entire databases and can read through thousands of pages an hour. For example, data mining software from the Chicago-based firm SPSS can scan 250,000 pages an hour, which far surpasses the human reading rate of about sixty pages per hour (Guernsey). Publishers and researchers in the academic world find data mining especially useful as a means of sifting through large databases, saving time formerly spent doing research. This quality makes data mining a very valuable investment not only for academic institutions, but for nonprofit organizations and foundations as well.
In conclusion, this recent technology is a very valuable and versatile tool that can be used in a variety of workplaces and situations. Because data mining picks up trends and patterns by scanning documents quickly, businesses can use this technology to improve efficiency and effectiveness and to decrease expenses.

Figure 1: Data mining architecture (Thearling)

Figure 2: A diagram of the data mining process

Presentation can be viewed at:
http://docs.google.com/Presentation?id=dgcsqt3x_22gqvqghfj&hl=en

Works Cited

1. "Data Mining." Data Mining. 2005. Scianta Intelligence. 23 June 2009 <http://scianta.com>.

2. "Data Mining-Data Mining Advantages." Execution for System. 2009. Exforsys. 23 June 2009 <http://www.exforsys.com/>.

3. Guernsey, Lisa. "Digging for Nuggets of Wisdom." New York Times 16 Oct. 2003. 19 June 2009 <nytimes.com>.

4. Hearst, Marti. "What is Text Mining?" 17 Oct. 2003. University of California Berkeley. 19 June 2009 <http://people.ischool.berkeley.edu/~hearst/text-mining.html>.

5. Mailvaganam, Hari. "Database Warehousing Review - Text Mining." Database Warehouse Review. May 2007. 31 May 2009 <http://www.dwreview.com/Data_mining/text_mining.html>.

6. Palace, Bill. "What is Data Mining?" Data Mining. 1996. University of California at Los Angeles. 19 June 2009 <http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm>.

7. "Text Mining: SAS Text Miner." Text Mining: SAS Text Miner. 29 May 2009. SAS. 31 May 2009 <http://www.sas.com/technologies/analytics/datamining/textminer/>.

8. "Text Mining." StatSoft. 2008. StatSoft. 19 June 2009 <http://www.statsoft.com>.

9. Thearling, Kurt. "An Introduction to Data Mining." An Introduction to Data Mining. Thearling.com. 19 June 2009 <http://www.thearling.com/index.htm>.
