Dear students, get fully solved assignments.
Send your semester & specialization name to our mail id: "help.mbaassignments@gmail.com"
or call us at: 08263069601
ASSIGNMENT
PROGRAM | Master of Science in Information Technology (MSc IT) Revised Fall 2011
SEMESTER | 4
SUBJECT CODE & NAME | MIT401 - Data Warehousing and Data Mining
CREDIT | 4
BK ID | B1633
MAX. MARKS | 60
Note: Answer all questions. Answers to 10-mark questions should be
approximately 400 words. Each question is followed by an evaluation scheme.
Q1
Differentiate between OLTP and Data Warehouses.
Answer: The data warehouse and the OLTP database are both relational
databases. However, the objectives of the two databases differ.
The OLTP database records transactions in real time and aims to automate
the clerical data entry processes of a business entity. Addition,
modification and deletion of data in the OLTP database are essential, and
the semantics of the front-end application affect the organization of the
data in the database.
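The row-level additions, modifications and deletions described above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the `orders` table, its columns and the sample values are hypothetical and do not come from the course material.

```python
import sqlite3

# In-memory database standing in for an OLTP store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# Addition: each business transaction is recorded as it happens.
with conn:
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 120.0))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("bob", 75.5))

# Modification: an existing row is corrected in place.
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE customer = ?", (130.0, "alice"))

# Deletion: a cancelled order is removed.
with conn:
    conn.execute("DELETE FROM orders WHERE customer = ?", ("bob",))

rows = conn.execute("SELECT id, customer, amount FROM orders ORDER BY id").fetchall()
print(rows)
```

Each `with conn:` block commits on success and rolls back on error, which is the kind of short, self-contained unit of work an OLTP system processes.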
The data warehouse, on the other hand, does not cater to the real-time
operational requirements of the enterprise. It is more a
storehouse of current and historical
Q2
Explain the Data Warehouse Kimball life cycle.
Answer: The Kimball Lifecycle methodology was conceived during the
mid-1980s by members of the Kimball Group and other colleagues at Metaphor
Computer Systems, a pioneering decision support company. Since then, it has
been successfully utilized by thousands of data warehouse
and business intelligence (DW/BI) project teams across virtually
Q3
Describe Hypercube and Multicube.
Answer: Multidimensional databases can
present their data to an application using two types of cubes: hypercubes and
multicubes. In the hypercube model, as shown in the following illustration, all
data appears logically as a single cube. All parts of the manifold represented
by this hypercube have identical dimensionality.
In the multicube model, data is segmented
into a set of smaller cubes, each of which is composed of a subset of the
available dimensions, as shown in the following illustration:
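The two models can also be sketched in code. The following is a minimal illustration in plain Python, assuming hypothetical dimensions (product, region, month) and made-up figures: the hypercube addresses every cell by all dimensions, while the multicube holds several smaller cubes, each over a subset of them.

```python
# Hypothetical dimensions and figures, for illustration only.
products = ["pen", "book"]
regions = ["north", "south"]
months = ["jan", "feb"]

# Hypercube: one logical cube; every cell is addressed by all three dimensions.
hypercube = {
    (p, r, m): 0.0 for p in products for r in regions for m in months
}
hypercube[("pen", "north", "jan")] = 40.0
hypercube[("book", "south", "feb")] = 25.0

# Multicube: several smaller cubes, each over a subset of the dimensions.
multicube = {
    "sales": {  # product x region x month
        ("pen", "north", "jan"): 40.0,
        ("book", "south", "feb"): 25.0,
    },
    "list_price": {  # product only
        ("pen",): 2.0,
        ("book",): 12.5,
    },
}

# Number of dimensions used to address cells in each cube.
hyper_dims = {len(key) for key in hypercube}
multi_dims = {name: {len(key) for key in cube} for name, cube in multicube.items()}
print(hyper_dims)
print(multi_dims)
```

In this sketch a measure such as list price, which depends only on the product, gets its own one-dimensional cube in the multicube model instead of being stretched across region and month.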
Hypercubes and multicubes differ in terms
of available
Q4
List and explain the strategies for data reduction.
Answer: Data reduction is the process of
minimizing the amount of data that needs to be stored in a data storage
environment. Data reduction can increase storage efficiency and reduce costs.
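As a small illustration of storing less data, the sketch below rolls day-level records up to monthly totals. Aggregation is an assumed example of a reduction technique chosen for this sketch, and the sales records are hypothetical.

```python
from collections import defaultdict

# Hypothetical daily sales records: (date, amount).
daily_sales = [
    ("2011-01-03", 100.0),
    ("2011-01-17", 50.0),
    ("2011-02-02", 80.0),
    ("2011-02-20", 20.0),
    ("2011-02-25", 10.0),
]

# Aggregate to one total per month; the day-level detail is dropped.
monthly_totals = defaultdict(float)
for date, amount in daily_sales:
    month = date[:7]  # "YYYY-MM"
    monthly_totals[month] += amount

reduced = sorted(monthly_totals.items())
print(len(daily_sales), "rows reduced to", len(reduced))
print(reduced)
```

The trade-off is that queries below the monthly level can no longer be answered from the reduced data.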
Strategies for data reduction:
Take advantage of existing information: First of all, we
don't want to reinvent the wheel. There's a lot of existing information out
there for community health coalitions to take advantage of. Know your
community's history! Has this initiative or something similar been tried here
before? Even a failed attempt has valuable information to offer. Take advantage
of existing knowledge on risk reduction before you work like crazy to come up
with strategies and
Q5
Describe the K-means method for clustering. List its advantages and drawbacks.
Answer: k-means clustering is a method of
vector quantization, originally from signal processing, that is popular for
cluster analysis in data mining. k-means clustering aims to partition n
observations into k clusters in which each observation belongs to the cluster
with the nearest mean, serving as a prototype of the cluster. This results in a
partitioning of the data space into Voronoi cells.
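The partitioning described above can be sketched with the usual iterative heuristic, often called Lloyd's algorithm. The text does not name a specific procedure, so this choice is an assumption, as are the function name `kmeans`, the random initialisation and the sample points. It is a minimal sketch in plain Python, not a production implementation.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's iteration on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial means: k of the input points
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each mean moves to the average of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                )
            else:
                new_centroids.append(old)  # keep the old mean for an empty cluster
        if new_centroids == centroids:
            break  # means stopped moving, so assignments are stable
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D observations forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
for c, members in zip(centroids, clusters):
    print(c, members)
```

The heuristic converges to a local optimum that depends on the initial means, which is why it is commonly rerun with several different seeds.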
The problem is computationally difficult
(NP-hard);
Q6
Describe Multilevel Databases and Web Query Systems.
Answer: Multilevel Databases: The main idea behind this approach is that
the lowest level of the database contains semi-structured information stored in
various Web repositories, such as hypertext documents. At the higher level(s),
metadata or generalizations are extracted from the lower levels and organized in
structured collections, i.e. relational or object-oriented databases. For
example, Han et al. use a multilayered database where each layer is obtained
via generalization and transformation operations performed on the lower layers.
Khosla et al. propose the creation and maintenance of meta-databases at each
information-providing domain and the use of a global schema for the
meta-database. King & Novak propose the incremental integration of a portion
of the schema from each information source, rather than relying on a global
heterogeneous database schema. The