147. Moving objects between schemas
To move one table from its current
schema to a new schema, use the following:
ALTER TABLE cust SET SCHEMA anotherschema;
If you want to move all objects, you
might consider just renaming the schema itself using the following query:
ALTER SCHEMA existingschema RENAME TO anotherschema;
This only
works if another schema with that name does not already exist.
Otherwise, you'll need to run ALTER TABLE for each table you
would like to move.
Views, sequences, functions, aggregates, and domains can also be moved to a new schema with the corresponding ALTER ... SET SCHEMA commands.
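As a sketch of what those commands look like (the object names here are illustrative, not taken from the example above):
ALTER VIEW sales_summary SET SCHEMA anotherschema;
ALTER SEQUENCE cust_id_seq SET SCHEMA anotherschema;
ALTER FUNCTION add_customer(text) SET SCHEMA anotherschema;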
When you move a table to a new schema, all of the indexes, triggers, and rules defined on that table will also be moved to the new schema. If you've used a SERIAL data type, and an implicit sequence has been created, then that also moves to the new schema. Schemas are purely an administrative concept, and do not affect the location of the table's data files. Tablespaces don't work this way, as we will see in later recipes.
Databases, users/roles, languages, and conversions don't exist in a schema. Schemas exist in a particular database. Schemas don't exist within schemas; they are not arranged in a tree or hierarchy.
Text search objects exist in a specific schema, though there are no commands to move them to a new schema. Similarly, operators, operator classes, and operator families exist in a schema, though there are no commands to move them to new schemas. Casts don't exist in a schema, though the data types and functions they reference do.
148. Explain tablespaces in PostgreSQL
Tablespaces
allow us to store PostgreSQL data across different devices. A tablespace is
just a directory where we store PostgreSQL data files. We use symbolic links
from the data directory to the tablespace. A tablespace can contain objects
from multiple databases, so it's possible to be in a position where there are
no objects visible in the current database.
We might want to do that for performance or administrative ease, or because the database has run out of disk space.
Creating a tablespace is simple, as follows:
CREATE TABLESPACE new_tablespace LOCATION '/usr/local/pgsql/new_tablespace';
The command to remove a tablespace is also simple:
DROP TABLESPACE new_tablespace;
A tablespace can only be dropped when it is empty.
Tablespaces can contain both permanent and temporary objects. Permanent data objects are tables, indexes,
and toast objects.
Toast objects are created in, and always live in, the same tablespace as their main table; also, you cannot manipulate their privileges or ownership.
Indexes can
exist in separate tablespaces, as a performance option, though that requires
explicit specification on the CREATE INDEX statement. The default is to create
indexes in the same tablespace as the table to which they belong.
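For example, a minimal sketch of that explicit specification (the table and column names are illustrative only):
CREATE INDEX mytable_val_idx ON mytable (val) TABLESPACE new_tablespace;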
Temporary objects may also exist in a tablespace. These exist when users have explicitly created temporary tables, or there may be implicitly created data files when large queries overflow their work_mem settings. These files are created according to the setting of the temp_tablespaces parameter. Users can change their setting of temp_tablespaces away from the default value specified in postgresql.conf.
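As an illustrative sketch (the user name is just a placeholder), a session or a user can point temporary objects at a particular tablespace like this:
SET temp_tablespaces = 'new_tablespace';
ALTER USER eliza SET temp_tablespaces = 'new_tablespace';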
You can give each tablespace a specific owner, using the following query:
ALTER TABLESPACE new_tablespace OWNER TO eliza;
You may also wish to set default_tablespace for a user, so that tables are automatically created there, by issuing the following query:
ALTER USER eliza SET default_tablespace = 'new_tablespace';
As each tablespace has different I/O
characteristics, we may wish to alter the planner cost parameters for each
tablespace. These can be set with the following command:
ALTER TABLESPACE new_tablespace SET (seq_page_cost = 0.05, random_page_cost = 0.1);
Moving objects between tablespaces
If you alter the tablespace of a table, its toast objects will move also:
ALTER TABLE mytable SET TABLESPACE new_tablespace;
ALTER INDEX mytable_val_idx SET TABLESPACE new_tablespace;
Moving tablespaces means bulk-copying data. Copying happens sequentially, block by block, and that performs well, but there's no way to avoid the fact that the bigger the table, the longer it will take. The table is fully locked (AccessExclusiveLock) while the copy takes place, so this can cause an effective outage for your application.
You can set the default tablespace for the current session, or for a whole database, as follows:
SET default_tablespace = 'new_tablespace';
ALTER DATABASE mydb SET default_tablespace = 'new_tablespace';
Note: take care that you do not run the next command by mistake, though:
ALTER DATABASE mydb SET TABLESPACE new_tablespace;
This literally moves all objects that do not have an explicitly defined tablespace into new_tablespace. For a large database, this will take a very long time, and your database will be completely locked while it runs.
149. How do I make sure that the user X cannot access the table Y?
The current user must either be a
superuser, the owner of the table, or must have a GRANT option for the table.
Also, you can't revoke rights from a user who is a superuser.
To revoke all rights on table mysecrettable from user userwhoshouldnotseeit, one must run the following SQL command:
REVOKE ALL ON mysecrettable FROM userwhoshouldnotseeit;
However, because the table is
usually also accessible to all users through role PUBLIC, the following must
also be run:
REVOKE ALL ON mysecrettable FROM PUBLIC;
To make sure that some user no
longer can access a table, the right(s) to that table must be revoked from both
PUBLIC and that specific user.
150. What default rights does the PUBLIC role have on newly created tables?
By default all users have a set of
rights (SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, and TRIGGER) to
all newly created tables through the special role PUBLIC.
151. Granting user access to a table
Access to schema is also needed
In order to access any table, the
user first needs access to the schema containing the table:
GRANT USAGE ON SCHEMA someschema TO someuser;
Granting access to a table through a group role
CREATE GROUP webreaders;
GRANT SELECT ON pages TO webreaders;
GRANT INSERT ON viewlog TO webreaders;
GRANT webreaders TO tim, bob;
Now, both tim and bob have the SELECT privilege on table pages, and INSERT on table viewlog. You can also add privileges to the group role after assigning it to users. So, after:
GRANT INSERT, UPDATE, DELETE ON comments TO webreaders;
Both bob and
tim have all these privileges on table comments.
Granting access to all objects in schema
Version 9.0 added a capability to GRANT
or REVOKE privileges on all objects of a certain kind in a specific schema:
GRANT SELECT ON ALL TABLES IN SCHEMA staging TO bob;
You still
need to grant the privileges on the schema itself in a separate grant
statement.
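For example, a minimal sketch of that separate grant, reusing the schema and user from the statement above:
GRANT USAGE ON SCHEMA staging TO bob;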
152. Temporarily preventing a user from connecting
To modify other users, you must either be a superuser or
have the CREATEROLE privilege (in the latter case only non-superuser roles can
be altered).
To temporarily prevent the user from logging in, run the following:
pguser=# alter user bob nologin;
Preventing the user from connecting can also be achieved by setting the connection limit for that user to 0:
pguser=# alter user bob connection limit 0;
To let the user connect again, run the following:
pguser=# alter user bob login;
This sets a flag in the system catalog, telling PostgreSQL not to let the user log in. It does not kick out already connected users.
153. Limiting number of concurrent connections by a user
Preventing a user from connecting can be achieved by setting the connection limit for that user to 0:
pguser=# alter user bob connection limit 0;
To allow bob 10 concurrent connections, run the following:
pguser=# alter user bob connection limit 10;
To allow an unlimited number of connections by this user, run the following:
pguser=# alter user bob connection limit -1;
postgres=# ALTER ROLE fred CONNECTION LIMIT 1;
Note that even if you set connection
limit to zero for superusers, they will still be able to connect.
The connection limit is applied during session connection. Raising this limit will never affect any already connected users. Lowering the limit doesn't have any effect on them either, unless they try to disconnect and reconnect.
You can check a user's limit, and how many connections they currently have, with the following queries:
postgres=> SELECT rolconnlimit FROM pg_roles WHERE rolname = 'fred';
postgres=> SELECT count(*) FROM pg_stat_activity WHERE usename = 'fred';
Users can't
raise or lower their own connection limit, in case you were worried they might
be able to override this somehow.
154. Forcing NOLOGIN users to disconnect
In order to make sure that all users
whose login privilege has been revoked are disconnected right away, run the
following SQL statement as a superuser:
SELECT pg_terminate_backend(procpid)
FROM pg_stat_activity a
JOIN pg_roles r ON a.usename = r.rolname AND NOT rolcanlogin;
On older versions of PostgreSQL, where the pg_terminate_backend() function does not exist, you can get the same effect from the shell by running the following as user postgres on the database server:
postgres@hvost:~$ psql -t -c "\
select 'kill ' || procpid from pg_stat_activity a \
join pg_roles r on a.usename = r.rolname and not rolcanlogin;" \
| bash
This incantation constructs proper
kill commands from a query, and then feeds them to the shell for execution.
155. Checking that all users have a secure password
PostgreSQL has no built-in facilities to make sure that you
are using strong passwords.
The best you can do is to make sure that all users' passwords are encrypted, and that your pg_hba.conf file does not allow logins with a plain password. That is, always use md5 as the login method for users.
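As an illustrative sketch of what such a pg_hba.conf rule might look like (the address range is a placeholder, not from the original):
# TYPE  DATABASE  USER  CIDR-ADDRESS  METHOD
host    all       all   0.0.0.0/0     md5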
To see which users have unencrypted
passwords, use the following query:
test2=# select usename, passwd from pg_shadow where passwd not like 'md5%' or length(passwd) <> 35;
 usename  |    passwd
----------+--------------
 tim      | weakpassword
 asterisk | md5chicken
To see users with encrypted
passwords, use the following:
test2=# select usename, passwd from pg_shadow where passwd like 'md5%' and length(passwd) = 35;
 usename |               passwd
---------+-------------------------------------
 bob2    | md518cf038878cd04fa207e7f5602013a36
156. How to Audit DDL changes
One can collect DDL (Data Definition Language) statements from the database logs in order to audit changes to the database structure. Edit your postgresql.conf file, and set the following:
log_statement = 'ddl'
Setting it to 'mod' or 'all' also works for this. Don't forget to reload the configuration:
/etc/init.d/postgresql reload
Was the change committed?
It is possible to have some statements recorded in the log file, but not visible in the database structure. Most DDL commands in PostgreSQL can be rolled back, so what is in the log is just a list of commands executed by PostgreSQL, not what was actually committed. The log file is not transactional, and also keeps commands that were rolled back.
Who made the change?
To be able to know which database user made the DDL changes, you have to make sure that this information is logged as well. In order to do so, you may have to change the log_line_prefix parameter to include the %u format string.
A recommended minimal log_line_prefix format string for auditing DDL is '%t %u %d', which tells PostgreSQL to log the timestamp, database user, and database name at the start of every log line.
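For example, the corresponding postgresql.conf entries might look like this (a sketch combining the two settings discussed above):
log_statement = 'ddl'
log_line_prefix = '%t %u %d '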
157. Auditing data changes
1. Collecting data changes from the server log:
Set log_statement = 'mod' or 'all' in postgresql.conf, then collect all INSERT, UPDATE, DELETE, and TRUNCATE commands from the log.
2. Collecting changes using triggers (see the sketch below).
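A minimal sketch of the trigger-based approach, assuming a hypothetical audited table cust and an audit table created just for this illustration (none of these names come from the original text):
-- audit table; old_row/new_row use the row type of the audited table
CREATE TABLE cust_audit (
    changed_at  timestamptz NOT NULL DEFAULT now(),
    changed_by  text        NOT NULL DEFAULT current_user,
    operation   text        NOT NULL,
    old_row     cust,
    new_row     cust
);

CREATE OR REPLACE FUNCTION cust_audit_trig() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO cust_audit(operation, new_row) VALUES (TG_OP, NEW);
        RETURN NEW;
    ELSIF TG_OP = 'UPDATE' THEN
        INSERT INTO cust_audit(operation, old_row, new_row) VALUES (TG_OP, OLD, NEW);
        RETURN NEW;
    ELSE  -- DELETE
        INSERT INTO cust_audit(operation, old_row) VALUES (TG_OP, OLD);
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cust_audit_after_trig
    AFTER INSERT OR UPDATE OR DELETE ON cust
    FOR EACH ROW EXECUTE PROCEDURE cust_audit_trig();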
158. Choosing good names for database objects
The standard names for indexes in PostgreSQL are:
{tablename}_{columnname(s)}_{suffix}
where the suffix is one of the following:
pkey for a Primary Key constraint
key for a Unique constraint
excl for an Exclusion constraint
idx for any other kind of index
The standard suffix for sequences is:
seq for all sequences
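As an illustrative sketch (the table and column names are hypothetical), these are the kinds of names that result when you follow the convention, or when PostgreSQL chooses names itself:
CREATE TABLE track (trackid serial, code text UNIQUE);
-- PostgreSQL names the implicit objects track_trackid_seq and track_code_key
CREATE INDEX track_code_idx ON track (code);
-- an explicitly named index following the same convention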
Tables can have multiple triggers fired on each event. Triggers are executed in alphabetical order, so trigger names should include some kind of action name to differentiate them and to allow the order to be specified. It might seem a good idea to put INSERT, UPDATE, or DELETE in the trigger name, though that can get confusing if you have triggers that work on both UPDATE and DELETE, and may end up as a mess.
A useful naming convention for triggers is:
{tablename}_{actionname}_{after|before}_trig
159. Handling objects with quoted names
PostgreSQL object names can contain spaces and mixed-case characters if we enclose the names in double quotes when creating them. PostgreSQL folds all unquoted names to lowercase when they are used within an SQL statement, which means that:
SELECT * FROM mycust;
is exactly the same as:
SELECT * FROM MYCUST;
If you want to access a table that was created with a quoted name, then you must use the quoted name, such as the following:
postgres=# SELECT count(*) FROM "MyCust";
There is a handy function named quote_ident(). This function puts double quotes around a value if PostgreSQL would require that for an object name, such as:
postgres=# select quote_ident('MyCust');
 quote_ident
-------------
 "MyCust"
postgres=# select quote_ident('mycust');
 quote_ident
-------------
 mycust
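A minimal sketch of where quote_ident() is typically useful, building dynamic SQL safely (the function and table names here are hypothetical):
CREATE OR REPLACE FUNCTION count_rows(tabname text) RETURNS bigint AS $$
DECLARE
    result bigint;
BEGIN
    -- quote_ident() adds double quotes only when the name needs them
    EXECUTE 'SELECT count(*) FROM ' || quote_ident(tabname) INTO result;
    RETURN result;
END;
$$ LANGUAGE plpgsql;
-- usage: SELECT count_rows('MyCust');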
160. Why is the database version important?
PostgreSQL has internal version numbers for the data file
format, the database catalog layout, and the crash recovery format. Each of
these is checked as the server runs, to ensure that the data isn't corrupted.
PostgreSQL doesn't change these internal formats within a single release; they only change across releases.
From a user
perspective, each release differs in terms of the way the server behaves. If
you know your application well, then it should be possible to assess the
differences just by reading the release notes for each version. In many cases,
a retest of the application is the safest thing to do.
161. Identifying and removing duplicates
Identify the duplicates using a query, such as the following:
SELECT * FROM cust WHERE customerid IN
  (SELECT customerid FROM cust GROUP BY customerid HAVING count(*) > 1);
Use a query block like the following to delete all the exact duplicate rows, leaving just one row from each set of duplicates:
BEGIN;
LOCK TABLE new_cust IN ROW EXCLUSIVE MODE;
DELETE FROM new_cust WHERE ctid NOT IN
  (SELECT min(ctid) FROM new_cust
   WHERE customerid IN (4)  -- specify the exact duplicate ids here
   GROUP BY customerid);
COMMIT;
Then follow that with:
VACUUM new_cust;
to clean up the table after the deletions.
If we're looking for duplicates of more than one column (or even all columns), then we have to use SQL of the following form:
SELECT * FROM mytable
WHERE (col1, col2, …, colN) IN
  (SELECT col1, col2, …, colN FROM mytable
   GROUP BY col1, col2, …, colN HAVING count(*) > 1);
with (col1, col2, …, colN) as the list of columns of the key.
The DELETE query that we showed only works with PostgreSQL, because it uses the ctid value, which is the internal identifier of each row in the table. If you wanted to run that query against more than one column, it would look like the following:
DELETE FROM mytable
WHERE ctid NOT IN
  (SELECT min(ctid) FROM mytable
   -- need a WHERE clause to filter only exact duplicates
   GROUP BY col1, col2, …, colN);
Uniqueness is always enforced by an index.
Unique constraints can be marked as "deferrable". However, there are a number of restrictions that make this feature not very usable in PostgreSQL 9.0. The restrictions are as follows:
You must define a constraint as DEFERRABLE on the CREATE TABLE. You cannot define this on a CREATE TABLE AS SELECT, nor can it be added later with an ALTER TABLE command.
You cannot mix deferrable unique constraints with Foreign Keys. You will get an error message if you try to add a Foreign Key that refers to a unique constraint that is deferrable.
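A minimal sketch of defining and using a deferrable unique constraint (the table and column names are hypothetical):
CREATE TABLE slots (
    slotid integer UNIQUE DEFERRABLE INITIALLY IMMEDIATE
);
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
UPDATE slots SET slotid = slotid + 1;  -- transient duplicates are allowed until commit
COMMIT;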
162. Loading data from a spreadsheet
If your spreadsheet data is neatly laid out in a single worksheet, save it with CSV as the file type. We can then load it into an existing PostgreSQL table, using the psql \COPY command:
postgres=# \COPY sample FROM sample.csv CSV HEADER
postgres=# SELECT * FROM sample;
 key | value
-----+-------
   1 | c
   2 | d
Or, from the command line, this would be as follows:
psql -c '\COPY sample FROM sample.csv CSV HEADER'
Note that the file can
include a full file path if the data is in a different directory. The psql
\COPY command transfers data
from the client system where you run the command through to the database
server, so the file is on the client.
If you are submitting
SQL through another type of connection, then you would use the following SQL
statement:
COPY sample FROM '/mydatafiledirectory/sample.csv' CSV HEADER;
Note that the preceding SQL statement runs on the database server and can only be executed by a superuser, so you would need to transfer the data to the server yourself and then load it. The COPY statement shown in the preceding SQL statement uses an absolute path to identify data files, which is required.
The COPY (or \COPY) command does not create the table for you; that must be done beforehand. Note also that the HEADER option does nothing but ignore the first line of the input file, so the names of the columns in the .csv file don't need to match those of the Postgres table. If you specify HEADER and the file does not have a header line, then all it does is ignore the first data row. Unfortunately, there's no way for PostgreSQL to tell whether the first line of the file is truly a header or not. Be careful.
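For instance, a table matching the sample output shown above might be created like this (the column types are an assumption, not stated in the original):
CREATE TABLE sample (
    key   integer,
    value text
);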
163. How to load data from flat files
Use pgloader to load data into your database, driven by a configuration file. A typical example, cookbook_pgloader.conf, follows:
[pgsql]
host = 192.168.0.5
base = pgloader
user = dim
log_file = /tmp/pgloader.log
log_min_messages = DEBUG
client_min_messages = WARNING
lc_messages = C
client_encoding = 'utf-8'
copy_every = 10000
null = ""
empty_string = "\ "
max_parallel_sections = 4
[load]
table = load_example
filename = simple/simple.data
format = csv
datestyle = dmy
field_sep = |
trailing_sep = True
columns = a:1, b:3, c:2
reject_log = /tmp/simple.rej.log
reject_data = /tmp/simple.rej
section_threads = 4
We can then run the load using that configuration, as follows:
pgloader --summary --vacuum --config cookbook_pgloader.conf
164. Explain pgloader. What is the difference between COPY and pgloader?
pgloader copes gracefully with errors. COPY loads all rows in a single transaction, so a single error is enough to abort the load. pgloader breaks down an input file into reasonably sized chunks, and loads them piece by piece. If some rows cause errors, then pgloader will iteratively check them so that it can skip those bad rows.
pgloader is written in Python, and connects to PostgreSQL through the standard Python client interface. Yes, pgloader is less efficient than loading data files using a COPY command, but running a COPY has many more restrictions: the file has to already be in the right place on the server, has to be in the right format, and must be unlikely to throw errors on load. pgloader has additional overhead, but it also has the ability to load data using multiple parallel threads, so it can be faster to use as well. pgloader's ability to call out to reformat functions written in Python is often essential; straight COPY is just too simple.
pgloader also allows loading from fixed-width files, which COPY cannot.
If you need to completely reload the table from fresh, then specify --truncate on the command line of pgloader.
165. How to reload data in parallel with pg_restore?
If you are reloading data that has been unloaded from PostgreSQL, then you may want to use the pg_restore utility instead. The pg_restore utility has an option to reload data in parallel, -j number_of_threads, though this is only possible if the dump was produced using the custom pg_dump format.
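A hedged sketch of what that looks like end to end (the database names, file name, and thread count are illustrative; the target database is assumed to already exist and be empty):
pg_dump -Fc mydb > mydb.dump
pg_restore -j 4 -d mydb_restored mydb.dump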
166. Default port number for PostgreSQL
5432
167. What is the difference between pg_ctl stop fast and immediate?
You can issue a
database server stop using fast mode as follows:
pg_ctl -D datadir -m fast stop
You must use -m
fast if you wish to shut
down as soon as possible. Normal shutdown means "wait for all users to
finish before we exit". That can take a very long time, though all the
while new connections are refused.
When you do a fast stop, all users have their transactions aborted
and all connections are disconnected. It's not very polite to users, but it
still treats the server and its data with care, which is good.
PostgreSQL is similar to other database
systems, in that it does do a shutdown checkpoint before it closes. This means
that the startup that follows will be quick and clean. The more work the
checkpoint has to do, the longer it will take to shut down.
The basic command to perform
an emergency restart on the server is the following:
pg_ctl -D datadir stop -m immediate
We
must use an immediate stop mode.
When you do an immediate stop, all users have their transactions
aborted, and all connections are disconnected. There is no clean shutdown, nor
politeness of any kind.
An immediate mode stop is similar to a database crash. Some cached
files will need to be rebuilt, and the database itself needs to undergo crash
recovery when it comes back up.
Note that for DBAs with Oracle experience, immediate mode is the
same thing as a shutdown abort. PostgreSQL
immediate mode stop is not the same thing as shutdown immediate on Oracle.
168. How to restart the server quickly
The basic command to restart the server is the following:
pg_ctl -D datadir restart -m fast
The actual shutdown will happen much faster if we issue a normal
checkpoint first, as the shutdown checkpoint will have much less work to do.
So, flush all dirty shared_buffers
to disk with the following command issued by a database superuser:
psql -c "CHECKPOINT"
169. Reloading the server configuration files
Some PostgreSQL configuration parameters can only be changed by
"reloading" the whole configuration file. On all platforms, there is
a specific command to reload the server, which is as follows:
UBUNTU/DEBIAN: pg_ctlcluster 9.0 main reload
RED HAT/FEDORA: service postgresql reload
                or pg_ctl -D /var/lib/pgsql/data reload
SOLARIS: pg_ctl -D /var/lib/pgsql/data reload
MAC OS: pg_ctl -D /var/lib/pgsql/data reload
FREEBSD: pg_ctl -D /var/lib/pgsql/data reload
You can also reload the configuration files while still connected to PostgreSQL. This can be done from the command line as follows, if you are a superuser:
postgres=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
This is also often executed from an admin tool, such as pgAdmin3.
To reload the configuration files, we send the SIGHUP signal to
the postmaster, which then passes that on to all connected backends. That's why
some people call reloading the server "sigh-up-ing".
If you look at the catalog table pg_settings, you'll see that there is a column named context. Each setting has a time and a place where it can be changed. Some parameters can only be reset by a server reload, and so the value of context for those parameters will be 'sighup'. Here are a few of the ones you'll want to change sometimes during server operation (there are others!):
postgres=# SELECT name, setting, unit, (source = 'default') AS is_default
FROM pg_settings
WHERE context = 'sighup'
AND (name like '%delay' OR name like '%timeout')
AND setting != '0';
As reloading the configuration file is achieved by sending the SIGHUP signal, we can also reload the configuration file for just a single backend using the kill command. First, find out the pid of the backend using pg_stat_activity. Then, from the OS prompt, issue the following:
kill -SIGHUP pid
Or we can do both at once, as in the following command:
kill -SIGHUP \
  `psql -t -c "select procpid from pg_stat_activity limit 1"`
though that is only useful with a sensible WHERE clause.
170. Preventing new connections
In certain emergencies, you may need to lock down the server completely, or just prevent specific users from accessing the database. Connections can be prevented in a number of ways, as follows:
Pause/Resume the session pool.
Stop the server (not recommended).
Restrict connections to zero for a specific database, by setting the connection limit to zero:
ALTER DATABASE foo_db CONNECTION LIMIT 0;
This will prevent normal users from connecting to that database, though it will still allow superuser connections.
Restrict connections to zero for a specific user, by setting the connection limit to zero:
ALTER USER foo CONNECTION LIMIT 0;
This will prevent the user from connecting, though it will still allow connection if the user is a superuser, so luckily you cannot shut yourself out accidentally.
Change the host-based authentication (HBA) file to refuse all incoming connections, and then reload the server:
Create a new file named pg_hba_lockdown.conf, and add the following two lines to the file. This puts rules in place that will completely lock down the server, including superusers. Please have no doubt that this is a serious and drastic action.
# TYPE  DATABASE  USER  CIDR-ADDRESS  METHOD
local   all       all                 reject
host    all       all   0.0.0.0/0     reject
If you still want superuser access, then try something like the following:
# TYPE  DATABASE  USER      CIDR-ADDRESS  METHOD
local   all       postgres                ident
local   all       all                     reject
host    all       all       0.0.0.0/0     reject
which will prevent connection to the database by anyone except the postgres operating system user connecting locally as the postgres database user. Be careful not to confuse the second and third columns: the second column is the database and the third column is the username. It's worth keeping the header line just for that reason. The method "ident" should be replaced by other authentication methods if a more complex configuration is in use.
Copy the existing pg_hba.conf to pg_hba_access.conf, so that it can be replaced again later, if required.
Copy pg_hba_lockdown.conf to pg_hba.conf.
Reload the server.