Global overview

We’ve just seen the basic components inside a database. We now need to step back to see the big picture.

A database is a collection of information that can easily be accessed and modified. But a simple bunch of files could do the same. In fact, the simplest databases like SQLite are nothing more than a bunch of files. But SQLite is a well-crafted bunch of files because it allows you to:

use transactions that ensure data are safe and coherent
quickly process data even when you’re dealing with millions of data

More generally, a database can be seen as the following figure:

Before writing this part, I’ve read multiple books/papers and every source had its on way to represent a database. So, don’t focus too much on how I organized this database or how I named the processes because I made some choices to fit the plan of this article. What matters are the different components; the overall idea is that a database is divided into multiple components that interact with each other.

The core components:

The process manager: Many databases have a pool of processes/threads that needs to be managed. Moreover, in order to gain nanoseconds, some modern databases use their own threads instead of the Operating System threads.
The network manager: Network I/O is a big issue, especially for distributed databases. That’s why some databases have their own manager.
File system manager: Disk I/O is the first bottleneck of a database . Having a manager that will perfectly handle the Operating System file system or even replace it is important.
The memory manager: To avoid the disk I/O penalty a large quantity of ram is required. But if you handle a large amount of memory, you need an efficient memory manager. Especially when you have many queries using memory at the same time.
Security Manager: for managing the authentication and the authorizations of the users
Client manager: for managing the client connections
…

The tools:

Backup manager: for saving and restoring a database.
Recovery manager: for restarting the database in a coherent state after a crash
Monitor manager: for logging the activity of the database and providing tools to monitor a database
Administration manager: for storing metadata (like the names and the structures of the tables) and providing tools to manage databases, schemas, tablespaces, …
…

The query Manager:

Query parser: to check if a query is valid
Query rewriter: to pre-optimize a query
Query optimizer: to optimize a query
Query executor: to compile and execute a query

The data manager:

Transaction manager: to handle transactions
Cache manager: to put data in memory before using them and put data in memory before writing them on disk
Data access manager: to access data on disk

For the rest of this article, I’ll focus on how a database manages an SQL query through the following processes:

the client manager
the query manager
the data manager (I’ll also include the recovery manager in this part)

2 Global overview

Global overview

results matching ""

No results matching ""