CHAPTER 6 DATABASES AND DATA WAREHOUSES
1. List, describe, and provide an example of each of the five characteristics of high-quality information.
Characteristics of high-quality information include:
• Accuracy - Accuracy is the degree to which information on a map or in a digital database matches true or accepted values. It concerns the quality of the data and the number of errors in a dataset or map. For a GIS database, one can consider horizontal and vertical accuracy with respect to geographic position, as well as attribute, conceptual, and logical accuracy.
o The level of accuracy required varies greatly from one application to another.
o Highly accurate data can be very difficult and costly to produce and compile.
• Completeness - Having all necessary or normal parts, components, or steps; entire.
• Consistency - Consistency is one of the ACID properties. It ensures that any changes to values in an instance are consistent with changes to other values in the same instance. A consistency constraint is a predicate on data that serves as a precondition, post-condition, and transformation condition on any transaction. The database management system (DBMS) assumes that consistency holds for each transaction in an instance; ensuring that a transaction actually has this property is the responsibility of the user.
• Uniqueness - In computing, a unique type guarantees that an object is used in a single-threaded way, with at most a single reference to it. If a value has a unique type, a function applied to it can be made to update the value in place in the object code. In-place updates improve the efficiency of functional languages while maintaining referential transparency. Unique types can also be used to integrate functional and imperative programming.
• Timeliness - Occurring at a suitable or opportune time; well-timed.
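As a rough illustration, three of these characteristics can be checked mechanically on a small set of records. The sketch below is only an assumption about how such checks might look: the field names and sample rows are invented, and uniqueness is read here in the data-quality sense (each entity recorded only once). Accuracy and consistency are left out because they need an outside reference or business rules to compare against.

```python
from datetime import date

# Hypothetical customer records; field names and values are made up for illustration.
records = [
    {"id": 1, "name": "Ann Lee", "email": "ann@example.com", "updated": date(2024, 5, 1)},
    {"id": 2, "name": "Bo Chan", "email": None, "updated": date(2024, 5, 2)},
    {"id": 2, "name": "Bo Chan", "email": None, "updated": date(2024, 5, 2)},
    {"id": 3, "name": "Cy Diaz", "email": "cy@example.com", "updated": date(2019, 1, 15)},
]

def incomplete(rows):
    """Completeness: ids of rows that have a missing (None) field."""
    return sorted({r["id"] for r in rows if any(v is None for v in r.values())})

def duplicated(rows):
    """Uniqueness: ids that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        if r["id"] in seen:
            dupes.add(r["id"])
        else:
            seen.add(r["id"])
    return sorted(dupes)

def stale(rows, today, max_age_days):
    """Timeliness: ids of rows not updated within max_age_days of today."""
    return sorted({r["id"] for r in rows if (today - r["updated"]).days > max_age_days})

print(incomplete(records))
print(duplicated(records))
print(stale(records, date(2024, 6, 1), 365))
```

What counts as "too old" or "complete enough" depends on the application, so the thresholds here are placeholders.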
2. Define the relationship between a database and a database management system.
A database is the heart of an organisation. It stores key business information such as:
Sales Data – customers, sales, contacts
Inventory Data – orders, stock, delivery
Student Data – names, addresses, grades
All businesses use a database of some type. Effective managers know the value of extracting important data from it.
A database is a structured collection of related data.
Everyday examples include a filing cabinet, an address book, a telephone directory, and a timetable.
In Access, your database is your collection of related tables.
Database management systems (DBMS) – software through which users and application programs interact with a database.
Database advantages from a business perspective include:
Increased flexibility
Increased scalability and performance
Reduced information redundancy
Increased information integrity (quality)
Increased information security
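The relationship between a program, a DBMS, and a database can be sketched with Python's built-in sqlite3 module, where SQLite plays the part of the DBMS. The table and column names below are hypothetical, loosely based on the "Student Data" example above.

```python
import sqlite3

# An in-memory SQLite database; SQLite acts as the DBMS in this sketch.
conn = sqlite3.connect(":memory:")

# Hypothetical table, loosely following the "Student Data" example.
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, grade INTEGER)"
)
conn.executemany(
    "INSERT INTO student (name, grade) VALUES (?, ?)",
    [("Ann", 82), ("Bo", 74), ("Cy", 91)],
)
conn.commit()

# The program states what it wants in SQL; the DBMS decides how to find it.
top = conn.execute(
    "SELECT name FROM student WHERE grade >= ? ORDER BY name", (80,)
).fetchall()
print(top)
```

The program never touches the stored file or memory layout directly. It only sends requests to the DBMS, which is the point of the definition above.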
3. Describe the advantages an organisation can gain by using a database.
• Reduced data redundancy
• Reduced updating errors and increased consistency
• Greater data integrity and independence from application programs
• Improved data access to users through use of host and query languages
• Improved data security
• Reduced data entry, storage, and retrieval costs
• Facilitated development of new application programs
• A database gives users access to data, which they can view, enter, or update, within the limits of the access rights granted to them. Databases become all the more useful as the amount of data stored continues to grow.
• A database can be local, meaning that it can be used on one machine by one user only, or it can be distributed, meaning that the information is stored on remote machines and can be accessed over a network.
• The primary advantage of using databases is that they can be accessed by multiple users at once.
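Reduced redundancy and fewer updating errors can be pictured with a small sketch. It assumes two hypothetical tables, customer and sale, in which the customer's city is stored once and every sale refers to the customer by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE sale (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'Ann', 'Leeds');
    INSERT INTO sale VALUES (10, 1, 25.0);
    INSERT INTO sale VALUES (11, 1, 40.0);
""")

# One update in one place, because the city is stored only once.
conn.execute("UPDATE customer SET city = 'York' WHERE id = 1")

# Every sale now reports the new city; no copy was left behind to go stale.
cities = conn.execute(
    "SELECT sale.id, customer.city FROM sale "
    "JOIN customer ON customer.id = sale.customer_id ORDER BY sale.id"
).fetchall()
print(cities)
```

Had the city been copied into every sale row, the same change would need one update per row, and any row that was missed would disagree with the others.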
4. Define the fundamental concepts of the relational database model.
The fundamental assumption of the relational model is that all data is represented as mathematical n-ary relations, an n-ary relation being a subset of the Cartesian product of n domains. In the mathematical model, reasoning about such data is done in two-valued predicate logic, meaning there are two possible evaluations for each proposition: either true or false (and in particular no third value such as unknown or not applicable, either of which is often associated with the concept of NULL). Some think two-valued logic is an important part of the relational model, while others think a system that uses a form of three-valued logic can still be considered relational.
Data are operated upon by means of a relational calculus or relational algebra, these being equivalent in expressive power.
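Three common relational algebra operators (restriction, projection, and natural join) can be sketched in a few lines. The representation is an assumption chosen for illustration: a tuple is a frozenset of (attribute name, value) pairs and a relation is a set of such tuples. The student and course data are invented.

```python
def relation(*rows):
    """Build a relation (a set of tuples) from dictionaries."""
    return {frozenset(row.items()) for row in rows}

def restrict(rel, predicate):
    """Keep the tuples for which predicate(tuple_as_dict) is true."""
    return {t for t in rel if predicate(dict(t))}

def project(rel, names):
    """Keep only the named attributes; duplicates collapse because the result is a set."""
    return {frozenset((k, v) for k, v in t if k in names) for t in rel}

def natural_join(left, right):
    """Combine tuples that agree on every attribute name they share."""
    out = set()
    for a in left:
        for b in right:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

students = relation(
    {"sid": 1, "name": "Ann", "course": "DB"},
    {"sid": 2, "name": "Bo", "course": "DB"},
    {"sid": 3, "name": "Cy", "course": "AI"},
)
courses = relation({"course": "DB", "room": "R1"}, {"course": "AI", "room": "R2"})

# Names of students taking the DB course.
db_names = project(restrict(students, lambda t: t["course"] == "DB"), {"name"})
# Each student paired with the room of their course.
joined = natural_join(students, courses)
```

Every operator takes relations and returns a relation, so operators can be nested freely, as in the db_names line.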
The relational model of data permits the database designer to create a consistent, logical representation of information. Consistency is achieved by including declared constraints in the database design, which is usually referred to as the logical schema. The theory includes a process of database normalization whereby a design with certain desirable properties can be selected from a set of logically equivalent alternatives. The access plans and other implementation and operation details are handled by the DBMS engine, and are not reflected in the logical model. This contrasts with common practice for SQL DBMSs in which performance tuning often requires changes to the logical model.
The basic relational building block is the domain or data type, usually abbreviated nowadays to type. A tuple is an unordered set of attribute values. An attribute is an ordered pair of attribute name and type name. An attribute value is a specific valid value for the type of the attribute. This can be either a scalar value or a more complex type.
A relation consists of a heading and a body. A heading is a set of attributes. A body (of an n-ary relation) is a set of n-tuples. The heading of the relation is also the heading of each of its tuples.
A relation is defined as a set of n-tuples. In both mathematics and the relational database model, a set is an unordered collection of items, although some DBMSs impose an order on their data. In mathematics, a tuple has an order and allows for duplication. E.F. Codd originally defined tuples using this mathematical definition. Later, it was one of E.F. Codd's great insights that using attribute names instead of an ordering would be much more convenient (in general) in a computer language based on relations. This insight is still being used today. Though the concept has changed, the name "tuple" has not. An immediate and important consequence of this distinguishing feature is that in the relational model the Cartesian product becomes commutative.
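The commutativity claim can be checked on a toy case. The sketch below compares the Cartesian product of ordered tuples with the product of tuples written as sets of (attribute name, value) pairs. The attribute names "n" and "c" are made up for the example.

```python
from itertools import product

# Ordered (mathematical) tuples: swapping the operands changes the result.
a = {(1,), (2,)}
b = {("x",)}
ordered_ab = {s + t for s, t in product(a, b)}
ordered_ba = {t + s for t, s in product(b, a)}

# Relational tuples: frozensets of (attribute name, value) pairs, no ordering.
ra = {frozenset({("n", 1)}), frozenset({("n", 2)})}
rb = {frozenset({("c", "x")})}
named_ab = {s | t for s, t in product(ra, rb)}
named_ba = {t | s for t, s in product(rb, ra)}

print(ordered_ab == ordered_ba)
print(named_ab == named_ba)
```

With positions, (1, "x") and ("x", 1) are different tuples. With names, both orders produce the same set of attribute values, so the two products are equal.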
A table is an accepted visual representation of a relation; a tuple is similar to the concept of row, but note that in the database language SQL the columns and the rows of a table are ordered.
A relvar is a named variable of some specific relation type, to which at all times some relation of that type is assigned, though the relation may contain zero tuples.
The basic principle of the relational model is the Information Principle: all information is represented by data values in relations. In accordance with this Principle, a relational database is a set of relvars and the result of every query is presented as a relation.
The consistency of a relational database is enforced, not by rules built into the applications that use it, but rather by constraints, declared as part of the logical schema and enforced by the DBMS for all applications. In general, constraints are expressed using relational comparison operators, of which just one, "is subset of" (⊆), is theoretically sufficient. In practice, several useful shorthands are expected to be available, of which the most important are candidate key (really, superkey) and foreign key constraints.
http://en.wikipedia.org/wiki/Relational_model
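The idea that constraints are declared in the schema and enforced by the DBMS, rather than by each application, can be sketched with SQLite. The course and enrolment tables are hypothetical. SQLite only enforces foreign keys when the foreign_keys pragma is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
conn.executescript("""
    CREATE TABLE course (code TEXT PRIMARY KEY);
    CREATE TABLE enrolment (
        student TEXT NOT NULL,
        code TEXT NOT NULL REFERENCES course(code),  -- foreign key constraint
        UNIQUE (student, code)                       -- key constraint
    );
    INSERT INTO course VALUES ('DB101');
    INSERT INTO enrolment VALUES ('Ann', 'DB101');
""")

def try_insert(student, code):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO enrolment VALUES (?, ?)", (student, code))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("Bo", "DB101"),   # valid row
    try_insert("Ann", "DB101"),  # duplicate of an existing (student, code) pair
    try_insert("Cy", "XX999"),   # refers to a course that does not exist
]
print(results)
```

The application code contains no checking logic of its own. Any other program using the same database would be held to the same rules.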
5. Describe the benefits of a data-driven website.
A data-driven website is a site that its managers can update easily and quickly so that it displays the requested information to the user in the most effective way. A static website offers its users information that is rarely updated, whereas a data-driven website is constantly updated with more recent and accurate information. The benefits of a data-driven website are numerous.
The first and major benefit is that the content of the website can be changed without specialized knowledge or expertise. Managing the website can be done with minimal training. The website administrator does not need to know HTML or programming in order to make changes, and updating a data-driven website takes only a couple of clicks and a few seconds.
The second benefit is speed. When the website manager changes the content of a data-driven website, the change appears almost in real time.
Thirdly, data-driven websites are inherently scalable. Expanding the website is simple, which leaves plenty of room for growth, and the graphics, layout, or interactivity can be changed at any time. This suits companies that start out small, grow into medium-sized businesses, and later become large corporations.
The fourth advantage is a reduced error rate. Data entry employees are bound to make mistakes when making changes, and anyone responsible for maintaining a website can also make many errors. As a result, the system will experience inconsistencies, bugs, and flaws that slow down the interactivity of the website and may corrupt the integrity of the information posted on it. Fixing all of these problems requires significant effort and resources. Data-driven websites ease this problem by making it easy to fix any issues with the system, and online customers are much happier when they do not have to deal with the mistakes of the website's creators.
The fifth advantage is efficiency. In a society that experiences frequent changes in trends and patterns, keeping information up to date can be an extensive task, and companies using data-driven websites can make modifications very productively. The background template, layout, design, interface, and structure of the website need to be created and stored only once. When a website administrator leaves the company, someone else can take over without the company needing to search hard for another website administrator. This increases the reliability and stability of the website, along with the company's reputation and goodwill.
http://sites.google.com/site/b188sjsu/Home/database/data-driven-website
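The core mechanism, a page built from whatever the database holds at the moment of the request, can be sketched as follows. This is a minimal illustration with an invented product table and a hand-written template, not a description of any particular website platform.

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)", [("Kettle", 19.5), ("Toaster", 24.0)]
)

def render_page():
    """Build the product list from the rows currently in the database."""
    rows = conn.execute("SELECT name, price FROM product ORDER BY name").fetchall()
    items = "".join(f"<li>{escape(name)}: {price:.2f}</li>" for name, price in rows)
    return f"<ul>{items}</ul>"

before = render_page()

# A content change is a data change; the template above is not edited.
conn.execute("UPDATE product SET price = 17.0 WHERE name = 'Kettle'")
after = render_page()

print(before)
print(after)
```

The layout is written once in render_page, while the content lives in the table, which is why the person updating prices needs no HTML knowledge.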
6. Describe the roles and purposes of data warehouses and data marts in an organisation.
Data warehouse – a logical collection of information, gathered from many different operational databases, that supports business analysis activities and decision-making tasks.
The primary purpose of a data warehouse is to aggregate information throughout an organisation into a single repository for decision-making purposes.
Data warehouses are optimized for speed of data analysis. Frequently data in data warehouses are denormalised via a dimension-based model. Also, to speed data retrieval, data warehouse data are often stored multiple times—in their most granular form and in summarized forms called aggregates. Data warehouse data are gathered from the operational systems and held in the data warehouse even after the data has been purged from the operational systems.
A data warehouse provides a common data model for all data of interest regardless of the data's source. This makes it easier to report and analyze information than it would be if multiple data models were used to retrieve information such as sales invoices, order receipts, general ledger charges, etc.
Prior to loading data into the data warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis.
Information in the data warehouse is under the control of data warehouse users so that, even if the source system data is purged over time, the information in the warehouse can be stored safely for extended periods of time.
Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.
Data warehouses can work in conjunction with and, hence, enhance the value of operational business applications, notably customer relationship management (CRM) systems.
Data warehouses facilitate decision support system applications such as trend reports (e.g., the items with the most sales in a particular area within the last two years), exception reports, and reports that show actual performance versus goals.
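A toy version of these ideas, with granular facts gathered in one place, a summarized copy kept alongside them, and a trend-style report run against the summary, might look like the sketch below. The table names and sales figures are invented, and a single SQLite database stands in for the warehouse.

```python
import sqlite3

wh = sqlite3.connect(":memory:")
wh.executescript("""
    -- Granular fact rows, as if gathered from operational systems (values are made up).
    CREATE TABLE sales_fact (sale_date TEXT, region TEXT, item TEXT, units INTEGER);
    INSERT INTO sales_fact VALUES ('2023-03-01', 'North', 'Kettle', 5);
    INSERT INTO sales_fact VALUES ('2023-09-12', 'North', 'Toaster', 2);
    INSERT INTO sales_fact VALUES ('2024-01-20', 'North', 'Kettle', 4);
    INSERT INTO sales_fact VALUES ('2024-02-02', 'South', 'Toaster', 7);

    -- A summarized copy (an aggregate) stored next to the granular data.
    CREATE TABLE sales_by_region_item AS
        SELECT region, item, SUM(units) AS units
        FROM sales_fact
        GROUP BY region, item;
""")

# Trend-style report: the best-selling item in the North region.
top_north = wh.execute(
    "SELECT item, units FROM sales_by_region_item "
    "WHERE region = 'North' ORDER BY units DESC LIMIT 1"
).fetchone()
print(top_north)
```

The report reads only the small aggregate table, which is the reason such summaries are stored in addition to the granular rows.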
A data mart is a subset of an organizational data store, usually oriented to a specific purpose or major data subject, that may be distributed to support business needs. Data marts are analytical data stores designed to focus on specific business functions for a specific community within an organization. Data marts are often derived from subsets of data in a data warehouse, though in the bottom-up data warehouse design methodology the data warehouse is created from the union of organizational data marts. A data mart is used for the following purposes:
Easy access to frequently needed data
Creates collective view by a group of users
Improves end-user response time
Ease of creation
Lower cost than implementing a full Data warehouse
Potential users are more clearly defined than in a full Data warehouse
A data mart can contain star schemas and other tables for more than one warehouse pack. For example, a single data mart might contain data for the following reporting needs:
• Single customer analysis for performance engineers
• Infrastructure analysis for network analysts
• Summarized, overall customer health for service level agreement management
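In the simplest case, a data mart derived from a warehouse is a filtered copy of the rows that one group of users needs. The sketch below assumes a hypothetical warehouse table with a subject column and builds a mart for network analysts from it. Real data marts usually involve star schemas and scheduled loads, which are not shown.

```python
import sqlite3

wh = sqlite3.connect(":memory:")
wh.executescript("""
    CREATE TABLE warehouse_facts (subject TEXT, customer TEXT, metric TEXT, value REAL);
    INSERT INTO warehouse_facts VALUES ('network', 'Acme', 'latency_ms', 42.0);
    INSERT INTO warehouse_facts VALUES ('network', 'Birch', 'latency_ms', 57.0);
    INSERT INTO warehouse_facts VALUES ('billing', 'Acme', 'invoice_total', 1200.0);

    -- The mart holds only the rows that one group of users needs.
    CREATE TABLE network_mart AS
        SELECT customer, metric, value
        FROM warehouse_facts
        WHERE subject = 'network';
""")

mart_rows = wh.execute(
    "SELECT customer, value FROM network_mart ORDER BY customer"
).fetchall()
print(mart_rows)
```

Because the mart is smaller and narrower than the warehouse, queries against it touch less data, which is one reason for the improved end-user response time listed above.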
http://en.wikipedia.org/wiki/Data_warehouse
http://publib.boulder.ibm.com/tividd/td/TEDW/SC32-1497-00/en_US/HTML/srfmst157.htm
http://en.wikipedia.org/wiki/Data_mart