Category Archives: Postgresql for the SQL Server DBA

SQL Server vs Postgresql my take.

I came across an interesting post about Microsoft SQL Server vs. Postgresql.  Currently, my company is working to migrate our application towards Postgresql.  I’m excited about this as I find learning something new interesting and fun.  While I love SQL Server, I also acknowledge that its not right for all situations.  However, this article seems a bit too biased towards Postgesql and comes across as petty at times.

1.2 The author seems to have an issue with Microsoft’s usage of the standard SELECT INTO syntax.  Seems picky and odd to have an issue with standard syntax.  The author also has an odd love of the $$ instead of single quote to define quoted text.  The author complains that the OUTPUT syntax is tedious because of the need for a variable but then later procliams this is a valuable feature.

1.4 The author seems preoccupied with the wide swath of languages supported for procedures (functions in Postgresql) but this is just an excuse to offload application server work on the valuable database server.  Seems more logical to keep that work out in the application environment where it can be scaled more easily.  Not that anyone actually does that in SQL Server either but it worries me that it would be worse in a Postgres world.  I would welcome feedback on this.

1.9  The article completely ignores the idea that there is plenty of scripting available to automate much of what happend with SQL Server.  The author ignores the job engine that comes with SQL Server and has great configurability.  The author also is condescending that there might be automation via powershell, criticizes people for using GUIs, and also completely ignores Microsoft’s movement towards headless (Core) installations.

1.11 The author is critical of SQL Server documentation because it’s not funny.  However, there is plenty of example code in the documentation.  Having funny comments in the code is cool and all but is it really a benefit to the database?  Will my queries run faster becuase the code developers are clever with their history of the church and dates?

1.13 All the anecdotes about customer service seem directed to companies other than Microsoft.  The article is critical that “all” people just end up fixing it and the the vendor never does.  I’ve had many different support experiences with Microsoft and have seen thier support to help with resolution, same goes with Oracle.  Simply putting all my support hope in the usage of mail lists, forums and such seems like a poor decision.  I use all of these resources myself and rarely need to call support but it is a reasonable approach to have a support solution.  These products are far too complex to just rely on community resources.  You want organizations that can pay folks to focus on diffferent aspects of the product and gain expertise… People that can open bugs and place pressure on getting these resolved.  Microsoft is not perfect but I have benefited from their support many times.

1.14 While the author is right that there is a good amount of flexability with dumps, SQL Server’s backup facilities are much easier to use.  They can be easily scripted in many ways.  They can be encrypted, compressed and set to differential, full or transaction logs with the single command.  While there are methods to handle logs and such within Postgresql, they are not as easy to implement as SQL Server.

1.16 The author says they do not know long it takes to install SQL Server but they say they were aware of many organizations still on SQL Server 2008 R2.  This isn’t because its exceptionally hard to upgrade but more of a function of license costs.  I don’t know of the issues with moving up major versions of Postgresql but given my understanding of enterprise database software, no major upgrade is done without significant analysis and testing.  It’s a little more complicated than and apt-get update.

Something I also noticed that I can’t personally find a solution for is a good tool for profiling.  In SQL Server, I can use SQL Profiler or XEvents to see currently running transactions for issues.  Or, I can take these traces to do replay work.  The first I’ve not found any good solutions.  And for the former I have not thoroughly investigated.  I welcome those suggestions as well.

Finally the author seems very proud of his/her work but does not take credit or identify themselves.  If they are so proud, why do they hide?  Seems odd.   Comments, criticisms and questions welcomed.

Postgresql for the SQL Server DBA: Backups, copy that file

So the next method of backup we will discuss is a file system level backup.  The file system backup involves shutting down the postgresql server and then creating a tar of the directory containing all the data files for the database.  It’s a very simple but inconvenient method to backup the database because the server must be down for the backup to be created.  On the SQL Server side, the most similar option is detaching and attaching databases.

Postgresql documentation also recommends a variation on this process where you take a frozen snapshot of the data directory and the WAL files.  This would eliminate the need to shutdown the database server.  However, in this scenario, Postgresql will assume that the database server was not properly shutdown and perform recovery.  This is the reason you need the WAL files available. You can reduce the recovery time by performing a checkpoint prior to the backup.

So lets see how the file copy backup works.

First, find the OID (similar to the database_id in sys.databases) as this is the name of the directory in the data directory where data files are stored.

Query window to find the OID number using query: SELECT oid,datname FROM pg_database.

You can place the data directory anywhere depending on the database setup but common locations are /usr/local/pgsql/data or /var/lib/pgsql/data.  Here you can see the directory for dvdrental, which is 16399.

Directory listing of the base data directory.

You can then stop the postgresql server.

Stopping the posgresql server service.

Now that the server is down,  you can run the tar command to create the tar file of the database files.

Creating a tar of the database data directory: tar -cf /root/Downloads/backup.tar /var/lib/postgresql/9.1/main/base/16399.  16339 is the OID for the database.

Finally, restart the postgresql server.

Start the postgresql server service.

See my future post about how to perform archiving of WAL files for point-in-time restores.

Postgresql for the SQL Server DBA: Backups, taking a dump via GUI

In my article Postgresql for the SQL Server DBA: Backups, taking a dump, I talk about how to perform a dump via the command-line.  In this article, I’ll show how to perform this backup with a GUI tool pgAdmin in Linux.  pgAdmin is available for Windows, Mac OSX, Linux, and more.  So let’s get started and create a backup.

Start backup with pgAdmin

Now you can enter the file name and choose any options you want to change.  Here I’m staying with the default options for this example.

Backup options screen in pgAdmin

Now you see the results as the backup completes.

pgAdmin backup output screen.

Postgresql for the SQL Server DBA: Backups, taking a dump.

One of the most important tasks DBAs are responsible for is backups. It is critical that we protect our data and the first defense is having a good, reliable backup.  In SQL Server, we can perform this work within SSMS, PowerShell, or via the command line tools (e.g. sqlcmd).  These are all tools that come with SQL Server.  Postgresql does not come with a GUI but there are free ones available.  In this article, I’ll be discussing the process using a native command-line tool.  So let’s get started.

Basic backups via dump

The most basic backup method available for Postgresql  is the dump tool called pg_dump.  This tool allows you to take a back up of the database schema and data while the database is online.  The backup will be consistent to when the backup was started.  Additionally, this tool can be used to extract just the schema.  There are many options you can use to change the default behavior of the dump command.  Although the pg_dump tool does allow for backups while the database is online, it does not allow for point-in-time restores.  Additionally, there is nothing included with the pg_dump command to encrypt backups.  And, depending on the size of the database, the backup times can be significant with pg_dump. Also,  pg_dump opens a transaction when it starts that can effect other transactions on the system needing to take locks.

Below is an example of a basic dump of the database.  First we issue the pg_dump command with the username (postgres), database name (dvdrental), and the dump file name:

The pg_dump command: pg_dump -U postgres dvdrental > /root/Downloads/dvdrental_backup_nogui.sql

If you look at the file created by the dump utility, you will see that the file is just a plain text file with all the commands to create the schema and insert the data into the database.

A view of the script file created by the dump command.

You can also compress your dumps to conserve on backup space.

Dump command with compression: pg_dump -U postgres dvdrental | gzip > /root/Downloads/dvdrental_backup_nogui_compressed.sql

As you can see, the size with compression is much smaller compared to the uncompressed version:


Please see my future topics on using free GUI tools for backups, file system level backups, and performing WAL archiving to provide point-in-time restores.


Postgresql for the SQL Server DBA Series

I have been a DBA for over 10 years and have worked with SQL Server and Oracle.  I spent the first 6 years doing both Oracle and SQL Server, and I have used SQL Server exclusively for the last 4 years.  As a DBA, I’m always reading and trying to learn.  This series is a learning process for me that I wanted to share with the community.  In this blog series, I’m going to be exploring Postgresql, an RDBMS platform that has been gaining interest in the DBA community.  Postgresql has been around for some time but, with the increased costs for Oracle and SQL Server licenses combined with Oracle’s ownership of MySQL, there has been a greater acceptance of Postgresql.  In this series, I will  explore how the SQL Server DBA can perform DBA responsibilities in Postgresql.  I hope this series is both educational and interesting for the SQL Server DBA looking to expand his or her skillset or take on new projects with a different RDBMS.