BT9001, Data Mining

Dear students get fully solved assignments
Send your semester & Specialization name to our mail id :
“ help.mbaassignments@gmail.com ”
or
Call us at : 08263069601


[ Spring 2015 ] ASSIGNMENT

PROGRAM: BSc IT
SEMESTER: FIFTH
SUBJECT CODE & NAME: BT9001, Data Mining
CREDITS: 2
BK ID: B1188
MAX. MARKS: 60


Note: Answer all questions.



Q. 1. What is Online Analytical Processing (OLAP)? Explain its benefits.

Answer: Online Analytical Processing (OLAP) is a category of software tools that provides analysis of data stored in a database. OLAP tools enable users to analyze different dimensions of multidimensional data; for example, they provide time-series and trend-analysis views. OLAP is often used in data mining. The maturation of those concepts is realized in OLAP. OLAP is designed to convert data into usable information by allowing the aggregation of data. This process allows you to answer




Q. 2. Define noisy data. Briefly explain the data smoothing techniques.

Answer: Noisy data is meaningless data. The term has often been used as a synonym for corrupt data, but its meaning has expanded to include data from unstructured text that cannot be understood by machines. Noisy data unnecessarily increases the amount of storage space required and can also adversely affect the results of any data mining analysis. Statistical analysis can use information gleaned from historical data to weed out noisy data and facilitate data mining. Noisy data can be caused by hardware failures, programming errors, and gibberish input from speech or optical character recognition (OCR) programs. Spelling errors, industry abbreviations, and slang can also impede machine reading.
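The question also asks about smoothing. One commonly taught technique is binning, where sorted values are split into equal-frequency bins and each value is replaced by the mean of its bin (regression and clustering are other usual options). The following is only a minimal sketch of smoothing by bin means; the function name `smooth_by_bin_means` and the sample values are illustrative assumptions, not part of the assignment text.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into equal-frequency bins of
    `bin_size` items, and replace every value by the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        # Every member of the bin takes the bin mean.
        smoothed.extend([mean] * len(bin_values))
    return smoothed


# Hypothetical sample data (for illustration only).
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(smooth_by_bin_means(prices, 3))
# prints [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```

Larger bins smooth more aggressively but hide more local variation, so the bin size is a judgment call that depends on the data.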







Q. 3. Briefly explain mining quantitative association rules.

Answer: A lot of research has gone into understanding the composition and nature of proteins, yet many things remain to be understood satisfactorily. It is now generally believed that amino acid sequences of proteins are not random, and thus the patterns of amino acids that we observe in protein sequences are also non-random. We attempt to decipher the nature of associations between the different amino acids that are present in a protein. This very basic analysis provides insights into the co-occurrence of certain amino acids in a protein. Such association rules are desirable for enhancing our understanding of protein composition and hold the potential to give clues regarding the global interactions amongst some particular sets o




Q. 4. Briefly explain Agent Based and Database Approaches to web mining.

Answer: Web content mining is the mining, extraction, and integration of useful data, information, and knowledge from Web page content. The heterogeneity and the lack of structure that permeates much of the ever-expanding information sources on the World Wide Web, such as hypertext documents, makes automated discovery, organization, and management of Web-based information difficult. Traditional search and indexing tools of the Internet and the World Wide Web such as Lycos, Alta Vista, WebCrawler, ALIWEB [6], MetaCrawler, and others provide some comfort to users, but they do not generally provide structural information nor categorize, filter, or interpret documents. In recent years these factors have prompted researchers to develop more intelligent tools for information




Q. 5. Define text mining. State the text retrieval methods.

Answer: Text mining, also referred to as text data mining and roughly equivalent to text analytics, is the process of deriving high-quality information from text. High-quality information is typically derived by identifying patterns and trends through means such as statistical pattern learning. Text mining usually involves structuring the input text (usually parsing, along with the addition of some derived linguistic features and the removal of others, and subsequent insertion into a database), deriving patterns within the structured data, and finally evaluating and interpreting the output. 'High quality' in text



Q. 6. How is data mining used in the telecommunication field? Explain.

Answer: Data mining is widely used in diverse areas. A number of commercial data mining systems are available today, yet many challenges remain in this field. In one telecommunication study, the goal of the data mining analysis was to determine whether cluster analysis could be used to find interesting segments in the business sector of the telecommunication market. The sample consisted of data on companies that were clients of a telecommunication company. The K-means algorithm was applied, showing that a microsegmentation approach based on data for each individual client gives additional insight beyond the usual approach to industrial market segmentation.
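To make the K-means step concrete, here is a minimal sketch in plain Python. It is not the method or data of the study mentioned above: the function name `kmeans`, the two features (monthly calls, monthly spend), the client values, and the starting centroids are all hypothetical, chosen only to show how the algorithm alternates between assigning points and recomputing centroids.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    recompute each centroid as the mean of its members, and stop
    when the centroids no longer change (or after `iterations`)."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical clients: (monthly calls, monthly spend).
clients = [(10, 20), (12, 22), (11, 19), (80, 150), (85, 160), (78, 155)]
centroids, clusters = kmeans(clients, centroids=[(10, 20), (80, 150)])
print(centroids[1])  # prints (81.0, 155.0)
print(clusters[0])   # prints [(10, 20), (12, 22), (11, 19)]
```

In practice the features would usually be scaled first and the starting centroids chosen at random or with a seeding heuristic; fixed starting points are used here only so the result is reproducible.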



