Thursday, December 2, 2010


In this post we'll examine PostgreSQL's schema support.

We encountered the default schema called "public" in an earlier post when exploring pgAdmin. Colloquially, when someone talks about a database's schema they typically mean the structure of the tables (that this table has these fields with these types). Here, a schema is a higher level of organization, a container or namespace within a database. A schema contains a set of tables, views, stored procedures and triggers and so on, all the usual components of a database. However, one can have multiple schema within a PostgreSQL database. To clarify this, let's start with a concrete example and consider the pros and cons of different ways of organizing the data.

Suppose we have a supermarket company, SuperFoods Inc., who have two stores, say Sausalito, CA and Petaluma, CA. (Clearly they have recovered from the bad economy and layoffs of this earlier post.) Each of the two stores has a set of information: employees and contact details, details about the physical store itself, replenishment orders and order history etc.

Imagine the data are stored in two separate databases. The CEO of SuperFoods asks one of his minions to produce a report on the sales history of the two stores. The problem here is that databases are independent silos. One cannot write a SQL query that will collate the data from the two stores. psql can only connect to one database at a time. We can use external tools, such as a JDBC driver in Java that will make a connection to each of the two databases and collect the data, but we cannot query the two databases simultaneously from the psql command line.

Imagine now that we take the two databases and put them inside a single SuperFoods database. We create a container, a schema, for each of the two original databases. Each schema has a unique name, "sausalito" and "petaluma", and this causes each of the tables in the schema to have a qualified name that includes the schema name.

That is, we have



(Technically, it would include the database name too and be "SuperFoods".sausalito.employees etc. under the hood but the user doesn't see this or need to see it.)

Importantly, when we connect to the SuperFoods database we can now access the sales report tables from both stores. We cannot do

select * from sales;

because PostgreSQL is going to complain. There is an abmiguity here. There are multiple sales tables. Which do you mean? Thus, one has to qualify it with the schema name:

select * from sausalito.sales;

to specify the Sausalito sales and

select * from petaluma.sales;

to access the Petaluma sales.

We can now combine those queries with operators such as UNION and UNION ALL (covered in this post) to get something closer to what the CEO wants:

select 'sausalito', sum(sales) from sausalito.sales where year=2010
select 'petaluma', sum(sales) from petaluma.sales where year=2010;

Great. Goal achieved.

OK, now consider this structure from the perspective of a human resources person at the Sausalito store. She needs to update the employees table for her store and she isn't interested in the Petaluma tables. Writing 'sausalito.' before each table in the queries is a bit of a nuisance. PostgreSQL helps us out with the notion of the search path. The search path says where to look, similar to classpaths in most programming languages.

Let's connect to a database, set the search path and create a new table.

psql -d SuperFoods

testdb=#set search_path TO sausalito;

testdb=# \d
              List of relations
  Schema   |   Name    | Type  |    Owner    
 sausalito | employees | table | postgresuser
 sausalito | sales     | table | postgresuser
(2 rows)

Notice we see sales and employees not sausalito.sales. When we set the search path it means, work on the database from the perspective of the sausalito schema. Now when one does

select * from sales;

it does not complain. It knows that you mean sausalito.sales. There is no ambiguity now because you cleared that up earlier when you set the search path.

You can even select the data from the other schema using the qualified name

select * from petaluma.sales;

from within the sausalito schema.

Creating tables once one has set the search_path has the same implicit name qualification:

testdb=# create table promotions(name varchar, value int);
testdb=# \d
               List of relations
  Schema   |    Name    | Type  |    Owner    
 sausalito | employees  | table | postgresuser
 sausalito | promotions | table | postgresuser
 sausalito | sales      | table | postgresuser
(3 rows)

How do we create a schema? That is a synch:

testdb=# create schema sausalito;
testdb=# create schema petaluma;

Further, you can dump data from a single schema from the operating system command line

localhost:~ postgresuser$ pg_dump --schema sausalito testdb > sausalito.sql

We can see that using schema gives us all the tools that exist for single databases but we can collect related tables together, in SuperFoods by store, and yet compare data mong those different collections. Now we understand more about schema we can refine our notion about them. They are a namespace or rather a namespace within a database. PostgreSQL only sees one database here: SuperFoods.

Schema can be very powerful and useful way of partitioning data. However, they are typically used for more complex sets of tables, more complex than those typically needed for beginner exercises. For simpler cases, an additional field that defines the subset of data might suffice. Imagine that we include an additional 'store' field in the sales table. A single table could then be queried for Sausalito sales, Petaluma sales or both.

Also, imagine that SuperFoods assigns a regional manager to the two stores. That manager is effectively an employee at both stores. We want to avoid replication of data in databases and so would not want to include his personal details in both schema. In this case a single, more unified database structure might be more appropriate. Given those, schema are a valuable addition to PostgreSQL and worth considering when designing one's database.


  1. Replies
    1. It's useful for me and many other beginners. Perhaps you are not a beginner anymore but already are knowledgeable (which is good)

    2. im a begginer...and i didint get there

    3. Very useful - even in 2018

    4. useful for sure, even in 2018. I agree

  2. I found it useful. Thanks

  3. Seriously 0% useful

  4. Thanks, the one key bit of info for me was that I could combine schema data by using UNION. I've had trouble finding this anywhere else. Job done, I can go to bed happy - cheers!

  5. Useful for beginners like me :)

  6. Very cool -- that was a very clear and useful explanation! Thank you!

  7. Useful for a beginner!

  8. Useful for a beginner okay.

    I came across this page while trying to find the answer to a related problem - you mention using Java & JDBC to merge the data from two databases but if you use two schema instead it seems the avenue is no longer open to you because there is no way to specify the schema in the jdbc connection url!

    Which is a real surprise to me and I've been using java & postgresql for many years. I just never bothered using separate schema until recently.

  9. With big tables the separation by schemas can be helpful for partitioning the data. Putting all data into one table can make it hard for the database to maintain the huge index structures. Insert operations might slow down due to index or primary key reorganization - same is true for updates. The more the database grows the more the DBMS might busy to maintain the structures. With a schema you will get many small tables that are easier to maintain.

  10. for begineers this info is usefull

  11. appreciate this alot, thanks

  12. Very useful... thank you!!!

  13. Good one, Thanks.

  14. Clear and easy to understand. Never before used schemas, but this is the next thing I'm going to implement in my postgres databases.
    Thank you

  15. Very useful, thanks.

  16. Great explanation! Thank you very much!

  17. It's useful for me. For the ones said this article is unhelpful: if you can share more information you think is useful, it will benefit all of us, otherwise your comment is just another 0% useful.

  18. Eat poo and go to the loo

  19. ¿Schemas can share tables or other types of database objects?

  20. Thanks for putting it in simple and easy to understandable steps without complex jargons :-)

    It was very useful

  21. very useful, it was confusing me

  22. Fuck that first guy, this was very useful.

  23. great thanks, coming from ms sql the terminology is a bit different, good to have a nice brief guide to go on!

  24. Very comprehensible and intelligible. Thank you

  25. 100% useful :)
    Coming from MySql, didn't know about schemas in PG.

  26. Hi, thank you for let me getting the AHA moment with schema ...

  27. Excellent post! I've gotten great info from a number of your postings. Much appreciated.

  28. Very good one ..really use full.

    Narayana Kantamani

  29. Very useful! Thank you!

  30. Thank you very much for this simple and easy to understand explanation of what schema are and how they fit into database design.

  31. Thanks to the creator

  32. Dikesempatan kali ini, kami ingin memperkenalkan kepada anda website QQ terbaru yang akan memberikan permainan yang lebih seru dengan 8 jenis permainan yang bisa anda nikmati hanya dengan menggunakan 1 user id.
    bandar ceme terbaik
    paito warna terlengkap
    syair sgp

  33. Hal kedua yang harus Anda lakukan adalah pekerjaan rumah Anda. Bukan jenis studi yang mungkin Anda lakukan di sekolah yang tampaknya praktis tidak berguna, karena pokerpelangi

  34. Situs terpercaya poker online PKV GAMES terbaru dan terpercaya serta gampang menang. buruan gabung dengan jutaan Jackpot sudah menanti.
    * MIYAQQ