database management system
Also found in: Dictionary, Thesaurus, Medical, Acronyms, Wikipedia.
database management system[′dad·ə‚bās ′man·ij·mənt ‚sis·təm]
Database management system
A collection of interrelated data together with a set of programs to access the data, also called database system, or simply database. The primary goal of such a system is to provide an environment that is both convenient and efficient to use in retrieving and storing information.
A database management system (DBMS) is designed to manage a large body of information. Data management involves both defining structures for storing information and providing mechanisms for manipulating the information. In addition, the database system must provide for the safety of the stored information, despite system crashes or attempts at unauthorized access. If data are to be shared among several users, the system must avoid possible anomalous results due to multiple users concurrently accessing the same data.
Examples of the use of database systems include airline reservation systems, company payroll and employee information systems, banking systems, credit card processing systems, and sales and order tracking systems.
A major purpose of a database system is to provide users with an abstract view of the data. That is, the system hides certain details of how the data are stored and maintained. Thereby, data can be stored in complex data structures that permit efficient retrieval, yet users see a simplified and easy-to-use view of the data. The lowest level of abstraction, the physical level, describes how the data are actually stored and details the data structures. The next-higher level of abstraction, the logical level, describes what data are stored, and what relationships exist among those data. The highest level of abstraction, the view level, describes parts of the database that are relevant to each user; application programs used to access a database form part of the view level.
The overall structure of the database is called the database schema. The schema specifies data, data relationships, data semantics, and consistency constraints on the data.
Underlying the structure of a database is the logical data model: a collection of conceptual tools for describing the schema.
The entity-relationship data model is based on a collection of basic objects, called entities, and of relationships among these objects. An entity is a “thing” or “object” in the real world that is distinguishable from other objects. For example, each person is an entity, and bank accounts can be considered entities. Entities are described in a database by a set of attributes. For example, the attributes account-number and balance describe one particular account in a bank. A relationship is an association among several entities. For example, a depositor relationship associates a customer with each of her accounts. The set of all entities of the same type and the set of all relationships of the same type are termed an entity set and a relationship set, respectively.
Like the entity-relationship model, the object-oriented model is based on a collection of objects. An object contains values stored in instance variables within the object. An object also contains bodies of code that operate on the object. These bodies of code are called methods. The only way in which one object can access the data of another object is by invoking a method of that other object. This action is called sending a message to the object. Thus, the call interface of the methods of an object defines that object's externally visible part. The internal part of the object—the instance variables and method code—are not visible externally. The result is two levels of data abstraction, which are important to abstract away (hide) internal details of objects. Object-oriented data models also provide object references which can be used to identify (refer to) objects.
In record-based models, the database is structured in fixed-format records of several types. Each record has a fixed set of fields. The three most widely accepted record-based data models are the relational, network, and hierarchical models. The latter two were widely used once, but are of declining importance. The relational model is very widely used. Databases based on the relational model are called relational databases.
The relational model uses a collection of tables (called relations) to represent both data and the relationships among those data. Each table has multiple columns, and each column has a unique name. Each row of the table is called a tuple, and each column represents the value of an attribute of the tuple.
The size of a database can vary widely, from a few megabytes for personal databases, to gigabytes (a gigabyte is 1000 megabytes) or even terabytes (a terabyte is 1000 gigabytes) for large corporate databases.
The information in a database is stored on a nonvolatile medium that can accommodate large amounts of data; the most commonly used such media are magnetic disks. Magnetic disks can store significantly larger amounts of data than main memory, at much lower costs per unit of data.
To improve reliability in mission-critical systems, disks can be organized into structures generically called redundant arrays of independent disks (RAID). In a RAID system, data are organized with some amount of redundancy (such as replication) across several disks. Even if one of the disks in the RAID system were to be damaged and lose data, the lost data can be reconstructed from the other disks in the RAID system. See Computer storage technology
Logically, data in a relational database are organized as a set of relations, each relation consisting of a set of records. This is the view given to database users. The underlying implementation on disk (hidden from the user) consists of a set of files. Each file consists of a set of fixed-size pieces of disk storage, called blocks. Records of a relation are stored within blocks. Each relation is associated with one or more files. Generally a file contains records from only one relation, but organizations where a file contains records from more than one relation are also used for performance reasons.
One way to retrieve a desired record in a relational database is to perform a scan on the corresponding relation; a scan fetches all the records from the relation, one at a time.
Accessing desired records from a large relation using a scan on the relation can be very expensive. Indices are data structures that permit more efficient access of records. An index is built on one or more attributes of a relation; such attributes constitute the search key. Given a value for each of the search-key attributes, the index structure can be used to retrieve records with the specified search-key values quickly. Indices may also support other operations, such as fetching all records whose search-key values fall in a specified range of values.
A database schema is specified by a set of definitions expressed by a data-definition language. The result of execution of data-definition language statements is a set of information stored in a special file called a data dictionary. The data dictionary contains metadata, that is, data about data. This file is consulted before actual data are read or modified in the database system. The data-definition language is also used to specify storage structures and access methods.
Data manipulation is the retrieval, insertion, deletion, and modification of information stored in the database. A data-manipulation language enables users to access or manipulate data as organized by the appropriate data model. There are basically two types of data-manipulation languages: Procedural data-manipulation languages require a user to specify what data are needed and how to get those data; nonprocedural data-manipulation languages require a user to specify what data are needed without specifying how to get those data.
A query is a statement requesting the retrieval of information. The portion of a data-manipulation language that involves information retrieval is called a query language. Although technically incorrect, it is common practice to use the terms query language and data-manipulation language synonymously.
Database languages support both data-definition and data-manipulation functions. Although many database languages have been proposed and implemented, SQL has become a standard language supported by most relational database systems. Databases based on the object-oriented model also support declarative query languages that are similar to SQL.
SQL provides a complete data-definition language, including the ability to create relations with specified attribute types, and the ability to define integrity constraints on the data.
Query By Example (QBE) is a graphical language for specifying queries. It is widely used in personal database systems, since it is much simpler than SQL for nonexpert users.
Forms interfaces present a screen view that looks like a form, with fields to be filled in by users. Some of the fields may be filled automatically by the forms system. Report writers permit report formats to be defined, along with queries to fetch data from the database; the results of the queries are shown formatted in the report. These tools in effect provide a new language for building database interfaces and are often referred to as fourth-generation languages (4GLs). See Human-computer interaction
Often, several operations on the database form a single logical unit of work, called a transaction. An example of a transaction is the transfer of funds from one account to another. Transactions in databases mirror the corresponding transactions in the commercial world.
Traditionally database systems have been designed to support commercial data, consisting mainly of structured alphanumeric data. In recent years, database systems have added support for a number of nontraditional data types such as text documents, images, and maps and other spatial data. The goal is to make databases universal servers, which can store all types of data. Rather than add support for all such data types into the core database, vendors offer add-on packages that integrate with the database to provide such functionality.
database management system(database)
A database management system (DBMS) can be an extremely complex set of software programs that controls the organisation, storage and retrieval of data (fields, records and files) in a database. It also controls the security and integrity of the database. The DBMS accepts requests for data from the application program and instructs the operating system to transfer the appropriate data.
When a DBMS is used, information systems can be changed much more easily as the organisation's information requirements change. New categories of data can be added to the database without disruption to the existing system.
Data security prevents unauthorised users from viewing or updating the database. Using passwords, users are allowed access to the entire database or subsets of the database, called subschemas (pronounced "sub-skeema"). For example, an employee database can contain all the data about an individual employee, but one group of users may be authorised to view only payroll data, while others are allowed access to only work history and medical data.
The DBMS can maintain the integrity of the database by not allowing more than one user to update the same record at the same time. The DBMS can keep duplicate records out of the database; for example, no two customers with the same customer numbers (key fields) can be entered into the database.
Query languages and report writers allow users to interactively interrogate the database and analyse its data.
If the DBMS provides a way to interactively enter and update the database, as well as interrogate it, this capability allows for managing personal databases. However, it may not leave an audit trail of actions or provide the kinds of controls necessary in a multi-user organisation. These controls are only available when a set of application programs are customised for each data entry and updating function.
A business information system is made up of subjects (customers, employees, vendors, etc.) and activities (orders, payments, purchases, etc.). Database design is the process of deciding how to organize this data into record types and how the record types will relate to each other. The DBMS should mirror the organisation's data structure and process transactions efficiently.
Organisations may use one kind of DBMS for daily transaction processing and then move the detail onto another computer that uses another DBMS better suited for random inquiries and analysis. Overall systems design decisions are performed by data administrators and systems analysts. Detailed database design is performed by database administrators.
The three most common organisations are the hierarchical database, network database and relational database. A database management system may provide one, two or all three methods. Inverted lists and other methods are also used. The most suitable structure depends on the application and on the transaction rate and the number of inquiries that will be made.
Database machines are specially designed computers that hold the actual databases and run only the DBMS and related software. Connected to one or more mainframes via a high-speed channel, database machines are used in large volume transaction processing environments. Database machines have a large number of DBMS functions built into the hardware and also provide special techniques for accessing the disks containing the databases, such as using multiple processors concurrently for high-speed searches.
The world of information is made up of data, text, pictures and voice. Many DBMSs manage text as well as data, but very few manage both with equal proficiency. Throughout the 1990s, as storage capacities continue to increase, DBMSs will begin to integrate all forms of information. Eventually, it will be common for a database to handle data, text, graphics, voice and video with the same ease as today's systems handle data.
See also: intelligent database.