ALL ABOUT POSTGRESQL DATABASE FOR BEGINNER
POSTGRESQL DATABASE
Table of Contents
Major features of Postgresql
Portal: Written in ANSI C and Supports Windows, Linux, Mac and major UNIX Platforms.
Reliable: have ACID Compliant
a) Supports Transactions
b) Supports Savepoints
c) Uses Write Ahead Logging
Scalable
a) Uses Multiversion Concurrency Control
b) Uses Row-Level Locking
c) Supports Table Partitioning
d) Supports Tablespaces
Secure: Employs Host-Based Access Control
a) Provides Object-Level Permissions
b) Supports Logging
c) SSL
Available:
a) Replication Support
b) Support for High Availability
Architecture diagram:
Step1) user requests from front-end to make connection with database the libpq library allows a single frontend to make multiple connections to backend processes.
Step2) postmaster process get requests and use /var/lib/pgsql/data/ pg_hba.conf files in which informations is stored to make connection with which ip/hostname with what type of authentication.
Step3) After authentication and authorization postmaster spawns process per user “postgres” server process.
Then requests simply to shared memory area where different type of memory are present like temp_buffer, work_mem etc.
Details for memory components
shared_buffers
If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system. There are some workloads where even large settings for shared_buffers are effective, but because PostgreSQL also relies on the operating system cache, it is unlikely that an allocation of more than 40% of RAM to shared_buffers will work better than a smaller amount.
temp_buffers
Sets the maximum number of temporary buffers used by each database session. These are session-local buffers used only for access to temporary tables. The default is eight megabytes (8MB). The setting can be changed within individual sessions, but only before the first use of temporary tables within the session; subsequent attempts to change the value will have no effect on that session.
work_mem
Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files.
maintenance_work_mem
Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY.
Wal buffer
In WAL_buffer we have all modified information like all insert, update and delete operations and write data by background process and when check point occur.
Background process in Postgresql database -
· Mandatory process: These process are not having an option of Enable/Disable.
· Postgresql utilizes separate processes instead of threads.
Command to check running process in postgresql
- BGWriter – which writes uncommitted data from shared memory buffer and then write it to data file.
- WAL Writer – WAL stands for Write Ahead logs, all modified data are first written into WAL buffers and then it is written into datafile and which help us in recovery for database. Before Commit - Uncommitted updates are in memory After Commit -Committed updates written from shared memory to disk (write-ahead log file) After Checkpoint -Modified data pages are written from shared memory to the data files.
- Checkpoint Process – The process is responsible for flushing all dirty data pages to disk (called a checkpoint) A check pointer, a background process, periodically performs check pointing. Whenever the check pointer starts, it writes a XLOG record called checkpoint record to the current WAL segment. This record contains the location of the latest REDO point.
- Stats collector process – It is an optional process, default is ON. Collect information about cluster activity like Number of access to tables and indexes, total number of rows in each table. Collection of Statistics add some overhead to query execution, but allows the query planner to make better choice.
Enable Stats collection in Postgresql database
PostgreSQL has a set of statistics access functions, as well as a set of predefined statistics views. The views use the predefined statistics functions.
By default only a small number of statistics are collected. The statistics collection is controlled by the following configuration parameters:
Track counts - controls whether statistics are collected about table and index accesses
Track_functions - enables tracking of user-defined functions
Track activities - enables monitoring of the current command being executed by any server process
To turn on statistics collection, we need to make change in postgresql.conf file and set the following parameters:
track_activities = on
track_counts = on
track_functions = all
Insertion operations with WAL
- A check pointer, a background process, periodically performs check pointing. Whenever the check pointer starts, it writes a XLOG record called checkpoint record to the current WAL segment. This record contains the location of the latest REDO point.
- Issuing the first INSERT statement, PostgreSQL loads the TABLE_A's page into the shared buffer pool, inserts a tuple into the page, creates and writes a XLOG record of this statement into the WAL buffer at the location LSN_1, and updates the TABLE_A's LSN from LSN_0 to LSN_1.
- As this transaction commits, PostgreSQL creates and writes a XLOG record of this commit action into the WAL buffer, and then, writes and flushes all XLOG records on the WAL buffer to the WAL segment file, from LSN_1.
- Issuing the second INSERT statement, PostgreSQL inserts a new tuple into the page, creates and writes this tuple's XLOG record to the WAL buffer at LSN_2, and updates the TABLE_A's LSN from LSN_1 to LSN_2.
- When this statement's transaction commits, PostgreSQL operates in the same manners as in step 3.
- Imagine when the operating system failure should occur. Even though all of data on the shared buffer pool are lost, all modifications of the page have been written into the WAL segment files as history data.
Note: Memory calculations
Look for 'shared_buffers = Xkb' Set this equal to about 1/4 of your total memory. If you have 2GB of memory you would change this to 'shared buffer = 512MB'.
Then look for 'work_mem =' and set this value equivalent to 1/64 of your total RAM Ex) 2GB memory = 32MB of work_mem
Then 'maintenance_work_mem =' and set to approximately 1/16 of your total RAM. Ex) 2GB memory = 128MB of maintenance_work_memory.
Then look for 'work_mem =' and set this value equivalent to 1/64 of your total RAM Ex) 2GB memory = 32MB of work_mem
Then 'maintenance_work_mem =' and set to approximately 1/16 of your total RAM. Ex) 2GB memory = 128MB of maintenance_work_memory.
Oracle instance vs postgresql instance
In Oracle database we have different instance for every single database where as in Postgresql database we have single instance and have multiple database on it.
Default location for postgresql database is /usr/pgsql-9.4/lib where all database configuration file whereas in oracle default location is /u01/app/oracle/product/12.1.0.2/db.
In postgresql default database are already created like postgres and template0 and template1 (which are templates in postgresql). INWEB, DWH, SALES are seprates tablespace used by users.
Difference between template0 and template1
At the point you initialize a database cluster, template0 and template1 are the same. Any location-specific stuff you want to make available to every database you create by using CREATE DATABASE should go into template1. So, for example, if you add the procedural language PL/python to template1, every database you create later will include PL/python.
The database template0 is intended to be a "simple" template. It should contain only standard database objects--the ones created by initializing the cluster. As a "simple" template, it should never be changed. Never.
The database template0 is intended to be a "simple" template. It should contain only standard database objects--the ones created by initializing the cluster. As a "simple" template, it should never be changed. Never.
LETS STARTS WITH DATABASE:
Before do anything, you must initialize a database storage area on disk. We call this a database cluster.
Postgres uses directory (might be referred by environment variable PGDATA)traditionally called database cluster to store all necessary data managed by Postgres instance.which is mandatory parameter for starting Postgres instance.
Contains configuration files, and in default setup also files for all databases residing within a particular Postgres cluster.
A database cluster is a collection of databases that is managed by a single instance of a running database server. After initialization, a database cluster will contain a database named postgres, which is meant as a default database for use by utilities, users and third party applications. The database server itself does not require the postgres database to exist, but many external utility programs assume it exists. Another database created within each cluster during initialization is called template1. As the name suggests, this will be used as a template for subsequently created databases
It is completely up to you where you choose to store your data. There is no default, although locations such as /usr/local/pgsql/data or /var/lib/pgsql/data are popular
Create new directory locations where you want to create database files:
initdb will attempt to create the directory you specify if it does not already exist. Of course, this will fail if initdb does not have permissions to write in the parent directory. It's generally recommendable that the PostgreSQL user own not just the data directory but its parent directory as well, so that this should not be a problem. If the desired parent directory doesn't exist either, you will need to create it first, using root privileges if the grandparent directory isn't writable. So the process might look like this:
root# mkdir /usr/local/pgsql
root# chown postgres /usr/local/pgsql
root# su postgres
postgres$ initdb -D /usr/local/pgsql/data
Initdb will refuse to run if the data directory exists and already contains files; this is to prevent accidentally overwriting an existing installation.
FILE LOCATIONS:
1) Data directory (string)
Specifies the directory to use for data storage. This parameter can only be set at server start.
Before change this directory take backup and change this parameter in postgresql.conf
Data directory = '/mnt/volume-nyc1-01/postgresql/9.5/main'.
2) When a connection is initialized, PostgreSQL will read through the pg_hba.conf one entry at a time, from the top down. As soon a matching record is found, PostgreSQL will stop searching and allow or reject the connection, based on the found entry. If PostgreSQL does not find a matching entry in the pg_hba.conf file, the connection fails completely.
Where TYPE means
Host
A host entry is used to specify remote hosts that are allowed to connect to the PostgreSQL server. PostgreSQL's postmaster backend must be running with the -i option (TCP/IP) in order for a host entry to work correctly.
Local
A local entry is semantically the same as a host entry. However, you do not need to specify a host that is allowed to connect. The local entry is used for client connections that are initiated from the same machine that the PostgreSQL server is operating on.
Database means if you want to specifies particular database or all database
ip_addr, netmask
The ip_addr and netmask fields specify either a specific IP address, or range of IP addresses, that are allowed to connect to the PostgreSQL server. Such a range can by specified by describing an IP network with an associated netmask. Otherwise, for a single IP address, the netmask field should be set to 255.255.255.255.
If you are unsure of how to specify a netmask, view the online Linux Networking HOWTO, at http://www.thelinuxreview.com/howto/networking, or consult your system administrator.
auth_method
The authentication method specifies the type of authentication the server should use for a user trying to connect to PostgreSQL. The following is a list of options available for auth_method:
Trust
The trust method allows any user from the defined host to connect to a PostgreSQL database without the use of a password, as any PostgreSQL user. You are trusting the host-based authentication with the use of this method, and any user on the specified host. This is a dangerous condition if the specified host is not a secure machine, or provides access to users unknown to you.
Reject
The reject method automatically denies access to PostgreSQL for that host or user. This can be a prudent setting for sites that you know are never allowed to connect to your database server.
Password
The password method specifies that a password must exist for a connecting user. The use of this method will require the connecting user to supply a password that matches the password found in the global pg_shadow system table for their username. If you use the password method, the password will be sent in clear text.
Crypt
The crypt method is similar to the password method. When using crypt, the password is not sent in clear text, but through a simple form of encryption. The use of this method is not very secure, but is better than using the clear text password method.
krb4, krb5
The krb4 and krb5 methods are used to specify Version 4 or 5 of the Kerberos authentication system. The installation and configuration of Kerberos is beyond the scope of this book, but if you wish to authenticate via Kerberos, these methods are available.
The pg_ident.conf file
When specifying the ident term as a host record's authentication method, PostgreSQL uses the pg_ident.conf file to map the identifying username to a PostgreSQL username. The identifying username is the name provided by the connecting client's identd service (RFC 1413), which is required to identify the name of the system account initiating the connection. This method is similar to the trust method, but restricts access based on the identifying username.
A single line record to define an ident map consist of 3 tokens: the name of the map, the identifying username, and the translated PostgreSQL username. This syntax is entered as follows, where each token is separated by spaces, or tabs:
mapname
identname
postgresqlname
mapname
The map name used in the pg_hba.conf file to refer to the ident map.
Identname
The identifying username, which is generally the name of the system user attempting to establish a connection to the database. This is the name provided by the identd daemon, which must be running on the system attempting to connect.
Postgresqlname
The database username which is allowed for the preceding identifying username. You may specify several lines with the same identname , but with different postgresqlname values, in order to allow a single system user access to several accounts, which do not all need to be on the same database.
Yum install postgresql
Yum install postgresql-server
How to start postgresql database
[root@delora12db1 ~]# /etc/init.d/postgresql start
Initializing database: [OK]
Starting postgresql service: [OK]
psql –h hostname –p 5432 –d dbname –U username –W
[root@delora12db1 ~]# sudo -u postgres psql postgres
Could not change directory to "/root"
Welcome to psql 8.1.23, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? For help with psql commands
\g or terminate with semicolon to execute query
\q to quit
How to check port in postgresql database
Add caption |
Default port number – 5432.
To check version of postgresql database
How to Enabling access for network/remote users
Add/edit the following line in your
postgresql.conf
: listen addresses = 'your ip address'
Add the following line as the first line of
pg_hba.conf
, to allow access to all databases for all users with an encrypted password.
Where,
Host – host means a remote connection.
Database – all database ‘for all databases’
User – specifies username
CIDR-ADDRESS - consists of two parts: IP-address/sub-net mask.
Method= trust effectively means "no authentication". Other authentication methods
Include GSSAPI, SSPI, LDAP, RADIUS, and PAM. PostgreSQL connections can also be made
using SSL, in which case client SSL certificates provide authentication.
Total number of tables
Create, select and delete database
Postgres=# create database yellow;
CREATE DATABASE
postgres=# \du
List of roles
Role name | Superuser | Create role | Create DB | Connections | Member of
-----------+-----------+-------------+-----------+-------------+-----------
anand | yes | yes | yes | no limit |
anand123 | no | no | no | no limit |
postgres | yes | yes | yes | no limit |
root | yes | yes | yes | no limit |
(4 rows)
Oracle=# \l
List of databases
Name | Owner | Encoding
-----------+----------+----------
oracle | root | UTF8
postgres | postgres | UTF8
template0 | postgres | UTF8
template1 | postgres | UTF8
yellow | root | UTF8
(5 rows)
postgres=# \c oracle
You are now connected to database "oracle".
Create table
oracle=# create table company(id int primary key not null , name text not null , age int not null, address char(50));
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "company_pkey" for table "company"
CREATE TABLE
oracle=# \d
List of relations
Schema | Name | Type | Owner
--------+---------+-------+-------
Public | company | table | root
(1 row)
Oracle=# insert into company values (1,'anand', 12,'delhi');
INSERT 0 1
Oracle=# select * from company;
Id | name | age | address
----+-------+-----+----------------------------------------------------
1 | anand | 12 | Delhi
(1 row)
Command to create users and roles
postgres=# CREATE USER davide WITH PASSWORD 'jw8s0F4';
CREATE ROLE
postgres=# CREATE ROLE miriam WITH LOGIN PASSWORD 'jw8s0F4' VALID UNTIL '2005-01-01';
CREATE ROLE
postgres=# CREATE ROLE admin WITH CREATEDB CREATEROLE;
CREATE ROLE
Or simply use pg_user
Command to create schemas
Create tablespace
The data store in Postgres (which is what Advanced Server is based on) is quite a bit different than the way Oracle stores data. Gone are the usual logical storage components. Data blocks, extents and segments just don't exist in a Postgres database. Actually, segments do exist when a table gets bigger than 1GB but that's a story for a different entry.
Postgres stores data as files and directories. Tablespaces are directories and tables (and indexes) are files inside that directory. Postgres ships with two tablespaces installed by default: pg_default and pg_global. The tablespaces cannot be changed or dropped. Pg_global contains objects that are visible across databases (i.e. clusterwide). If you don't have any other tablespaces defined, pg_default is where everything else goes.
BACKUP AND RECOVERY IN POSTGRESQL
1) In postgresql sql dump it will create a text file with SQL commands
2) It usually run as postgres user. (Which is super user)
3) Snapshot in time
4) Can be restored in a newer postgres version.
5) Mostly non-blocking (you can take backup when database is in running mode).
6) Two command line utilities pg_dump and pg_dumpall.
a) Pg_dump db_name > db_backup.sql
b) Pg_dumpall > cluster_backup.sql
Restore commands –
a) Psql db_name < db_backup.sql
b) Psql –f cluster_backup.sql postgres
Using compression
a) Pg_dump db_name | gzip > db_backup.gz
b) Pg_dumpall | gzip > cluster_backup.gz
Restore database
a) Gunzip –c db_backup.gz | psql db_name
b) Gunzip –c cluster_backup.gz | psql postgres
Example:
Command to take backup (here we take backup from vidit database)
-bash-3.2$ pg_dump oracle > /var/lib/pgsql/backups/vidit.sql
Checking backup
-bash-3.2$ cd /var/lib/pgsql/backups/
-bash-3.2$ ll
Total 8
-rw-r--r-- 1 postgres postgres 1644 Sep 23 03:49 vidit.sql
-rwxrwxrwx 1 postgres postgres 0 Sep 23 03:35 oracle. Sql
-rwxrwxrwx 1 postgres postgres 20 Sep 23 03:17 vidit_backup.gz
-bash-3.2$ psql Anand < /var/lib/pgsql/backups/vidit.sql
SET
SET
SET
COMMENT
SET
SET
SET
CREATE TABLE
ALTER TABLE
CREATE TABLE
ALTER TABLE
ALTER TABLE
REVOKE
REVOKE
GRANT
GRANT
Verifying data in Anand database (make sure before restore any data we should create database)
-bash-3.2$ psql
Welcome to psql 8.1.23, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? For help with psql commands
\g or terminate with semicolon to execute query
\q to quit
postgres=# \c Anand
You are now connected to database "Anand".
anand=#
anand=# \d
List of relations
Schema | Name | Type | Owner
--------+---------+-------+-------
public | company | table | anand
public | prod | table | root
(2 rows)
Comments
Post a Comment