Settings

Theme

Can PostgreSQL pickup where MySQL left off?

news.cnet.com

40 points by ruffdev 14 years ago · 45 comments

Reader

pilif 14 years ago

When I started with PHP3 development in 1998, PostgreSQL was still called Postgres95 and it had some severe limitations which were directly adverse to the features needed for web development:

1) performance was very slow

2) There was an effective row size limit of 8K over the whole row which worked against people who wrote forum software (big posts) or were (ugh) storing binaries in the database.

3) Even today, PostgreSQL requires a bit more maintenance and configuration until it works right, whereas MySQL ran out of the box. Back then, VACUUM wasn't at all optional and still required an exclusive lock over the tables it was working on.

4) There was a misconception that each Postgres user needed to be a Unix system user (this might or might not be true. Probably isn't, but I and others were certainly thinking that), so it was working against the usual cheap virtual hosting.

5) The MySQL extension for PHP was much more advanced than the postgres one, leading to people preferring MySQL which in the end lead to the virtual hosts only installing MySQL at which point Postgres' fate was sealed.

6) MySQL was coined as a spiritual successor to mSQL (just look at the names), which was very popular at the time as it was the first SQL-database for "normal people". You either did flat-files or mSQL.

7) MySQL had Windows support from the beginning. Postgres95 was still using cygwin (if you could get it to work). Back then, many people were using Windows as their development environment and being able to run the whole stack on your development machine does have some advantages.

By now, aside of the maintenance thing, Postgres has lost all these drawbacks, but now it's too late as people are using what they know is working for them, so they are just going with MySQL (or SQLite if they need more simplicity, which is interesting in itself as MySQL for ages provided an embedded linkable and serverless variant which also never took on).

  • mgkimsal 14 years ago

    #2 - even if you don't store binaries, just very long posts, it's a problem.

    #4 - whether it's a misconception or not, the majority of tutorials and even the default utilities make that assumption. Working around that is not something as widely documented as the defaults.

    Also, forum software is a big arena. Not being able to get accurate count() values back hurt(s) postgresql. It's hurting people who now rely on MySQL's innodb as well, but you always have the option of MyISAM for certain tables that you want accurate count()s on for pagination. I've heard for years that "it's so easy to do this in PostgreSQL, just write some triggers and stored procedures... " yet... if it's "so easy", why not just bundle that in as default functionality (or perhaps even just a script that would create triggers for you) in postgresql?

    I don't ever recall hearing about the "embedded linkable and serverless variant" of MySQL, and have been doing PHP since 1996. Perhaps it was a licensing issue? MySQL seemed to take off largely due to a rise in PHP, but sqlite seems to have taken off more because of embeddability and the public domain aspect of it first, well, before I ever saw PHP adoption of sqlite.

    • pilif 14 years ago

      > I don't ever recall hearing about the "embedded linkable and serverless variant" of MySQL, and have been doing PHP since 1996. Perhaps it was a licensing issue?

      http://dev.mysql.com/doc/refman/5.1/en/libmysqld.html

      pretty much a licensing issue as all of MySQL is released under the GPL or a proprietary license. This also includes the mysql client by the way, but they made a license exception there that allowed linking against PHP (extensions mysql and mysqli) and later there was a reimplementation of the on-the-wire protocol inside a PHP extension (mysqlnd) released under the PHP license.

      > Not being able to get accurate count() values back hurt(s) postgresql.

      count() is totally accurate within the limits of MVCC. If you need it to be 100% accurate across statements, make your transaction SERIALIZABLE.

      There's one thing about count(): count(*) under MySQL, if using MyISAM tables, is optimized so it doesn't have to actually count and thus is much faster than, say count(row), count(whatever) with a where clause or count([asterisk as to not confuse the HN parser.]) in any other database.

      But the moment you use InnoDB or any other database that supports transactions, count(whatever) unfortunately requires counting in all cases.

      • mgkimsal 14 years ago

        re: count() - I guess I should have also said "reasonably fast". Waiting 8 seconds for a count() isn't practical for many apps.

    • jacques_chester 14 years ago

      > why not just bundle that in as default functionality (or perhaps even just a script that would create triggers for you) in postgresql?

      Multi-version concurrency control means that a single universal definition of count() is basically insensible.

    • bretthoerner 14 years ago

      Under what circumstances is count() inaccurate in Postgres?

      • mgkimsal 14 years ago

        See earlier comment - I misstated my point - it's not so much that it's inaccurate as slow. Dog slow. InnoDB in MySQL 'fixes' the speed with estimates, but that's generally not very useful either in most cases (sometimes, it is).

        I was confusing my earlier innodb count() experiences (inaccurate estimates) with postgresql count() experiences (dog slow).

  • masklinn 14 years ago

    > 2) There was an effective row size limit of 8K over the whole row which worked against people who wrote forum software and were (ugh) storing binaries in the database.

    Could you not just use the dedicated TEXT or BLOB type for that kind of stuff, and store them out-of-row? (yes, would probably have been even slower)

    • pilif 14 years ago

      Nope. There was blob-support, but that relied on special library functions to extract the data to a file which you'd then have to manually read.

      Of course this also means that you couldn't search in these fields or do anything else you'd do directly on the database.

      TOAST tables were added in 7.1 AFAIK which lifted that limit and allowed for arbitrary sized rows by moving the big fields away into their own storage.

      The old functionality is still there (http://www.postgresql.org/docs/9.0/interactive/largeobjects....) though I totally failed and still fail to see any practical use.

      • masklinn 14 years ago

        > Nope. There was blob-support, but that relied on special library functions to extract the data to a file which you'd then have to manually read.

        OK.

        > The old functionality is still there (http://www.postgresql.org/docs/9.0/interactive/largeobjects....) though I totally failed and still fail to see any practical use.

        Memory constraints maybe? That's about the only justification I could see for using that over bytea: if you store single binary objects in the GB+ range, you may want to interact with them as streams.

Zak 14 years ago

I wonder why MySQL became super-popular and Postgres didn't. I remember reading for years about how Postgres had a bunch of stuff (e.g. transactions) that MySQL didn't (until recently) and that anybody who knew anything about databases would choose Postgres, yet people still chose MySQL.

What did MySQL have during the last decade that made it better for certain applications than Postgres?

  • buro9 14 years ago

    MySQL was simple.

    You didn't have to deal with schema/tablespaces, or the chance that someone used stored procs, or security (almost everyone just used the equivalent of a root account).

    MySQL just worked.

    For the most part, with zero monitoring, it stayed working until the server ran out of disk space or died.

    MySQL was always available.

    No matter which hosting company you looked at, no matter how little you wanted to spend, MySQL was pre-installed and ready for use.

    I never thought it was the better choice, but I understood why it dominated. Once things like Wordpress, vBulletin, phpBB and other stuff was written only for MySQL the choice was then made for you.

    I've long been a fan of applications that give you a choice of DB, but it's now unrealistic as many applications have designed schemas to cope with the quirks and limitations of MySQL specifically. These would now be harder to migrate than simply changing a database string, and most application devs don't invest in such things when what they have works well enough and is all anyone is asking for.

    • Zak 14 years ago

      The second half of your explanation, which boils down to "it's everywhere" makes sense now, but is only true because it was already in demand.

      That it was faster (for certain applications), easier and more reliable does make sense. I understand, for example why PHP got popular even though there were other options available at the time that I believe were better from a software engineering[0] perspective. Things get popular by being better at the things the majority of users care about, even if they're awful at everything else.

      [0] Did I really just use that term? I can't think of a better one to describe the issue I'm talking about.

      • sjs 14 years ago

        For many people, including hosting admins, MySQL was the path of least resistance. MySQL required less maintenance than Postgres back in the day. It's pretty much exactly the same situation as with PHP. The tool created by mere mortals is usable by mere mortals, so they use it. The tool created by scholars is technically superior at core functions but is slightly more difficult or cumbersome to use so is used by fewer mortals.

        One thing people seem to forget is that Postgres has not been around longer than MySQL, both were released to the public in 1995. Postgres skipped a few version numbers so maybe that makes it seem older.

        • stonemetal 14 years ago

          Counting public releases of non university projects sure, but at least according to the wiki page it has been under active development since 1986, with first release in 1989. Where Mysql started development in 1994.

    • igorw 14 years ago

      Actually, phpBB 2.0.0 which was released in 2002 supported: IBM DB2, MS Access, MS SQL, MySQL3, MySQL4, Oracle... and Postgres 7.

      But you are right, many of the other popular projects did not make this effort.

  • epo 14 years ago

    MySQL is a triumph of branding and fanboy self-delusion.

    MySQL made it easier initially by omitting lots of difficult stuff (data integrity, management, transactions) thus making easier to get something working quickly. It also had a reputation for being faster then PostgreSQL, a reputation spread by fanboys (in truth people who were bigging up the only database they had a faint comprehension of).

    Then it became ubiquitious. Now, like some nasty rash, we can't get rid of it.

    • masklinn 14 years ago

      > It also had a reputation for being faster then PostgreSQL

      It was indeed much faster when reading (probably still is on MyISAM) due to the lack of everything (especially transactions and any concept of data integrity).

    • lamby 14 years ago

      > MySQL made it easier initially by omitting lots of difficult stuff (data integrity, management, transactions) thus making easier to get something working quickly.

      Remind anyone of NoSQL? :)

  • jacques_chester 14 years ago

    MySQL (MyISAM, really) was faster for loading chunks of text and spitting them into Perl and PHP. Back in the 133Mhz days, people cared about that. Given that MyISAM was pretty much flat files in a clown suit, you didn't get much in the way of serious database functionality.

  • Udo 14 years ago

    I can say from a web dev's perspective that the ubiquitousness of MySQL is only part of the allure (an important part though). MySQL makes simple stuff very easy to implement, it's a great solution if you just want your database to store your stuff and get out of the way. Sure, like any other DB it requires some expertise if you want performance, but it makes this incredibly simple. It just works and you don't spend a lot of time arguing with the software.

    It also has a single feature that I find incredibly cool: it supports automatically generated row IDs. It's been a while since I last tried Oracle and Postgres, but at the time you had to create a second table just for a sequence number and it was extremely awkward to handle.

    Could the auto-increment feature in MySQL be better? Sure, for example it would be nice to have an option where the row is identified by a hash value instead of a continuous sequence. But even in its limited form today it beats having to wrestle with the glued-together equivalent as implemented in "real databases".

    • masklinn 14 years ago

      > It also has a single feature that I find incredibly cool: it supports automatically generated row IDs. It's been a while since I last tried Oracle and Postgres, but at the time you had to create a second table just for a sequence number and it was extremely awkward to handle.

      Must have been a while indeed. Postgres's SERIAL datatype has been available at least since 6.4. Postgres 6.4 was released in 1998.

    • wlll 14 years ago

      Postgres has had auto-incrementing row ids for a while, see this example: https://gist.github.com/1178237

      • Udo 14 years ago

        I may be wrong, but isn't that just a syntactical shortcut for creating a separate sequence table?

        • masklinn 14 years ago

          Yes (that's noted in the gist). But the end result's the same, instead of writing AUTOINCREMENT you write SERIAL. That's about it, the only issue may happen if you drop the table (the sequence will not be dropped, not a huge issue in practice)

          • Udo 14 years ago

            Did you notice any differences in performance?

            Edit: I'm not saying there are, but it feels like there might be some repercussions when you take the underlying implementation into account. It would be interesting to know if someone conducted comparative tests about this.

  • wisty 14 years ago

    MySQL was popular for the same reason No-SQL is - it lacked features, but it was fast as hell with no tuning. Postgres did more, so it had to be slower, and it was tuned to work on minimal resources (and not steal all your RAM). Like MongoDB today, MySQL was tuned to work fast, at the costs of both reliability, and being a poor citizen on a shared host.

    There are other factors - you could limit its use of disk space (so hosts liked it), but mostly it was a good key-value store with some database functionality.

    People who wanted a "real" database used Postgres. People who wanted to write a web app used MySQL.

  • Tichy 14 years ago

    I seem to faintly remember that in the beginning Postgres was Unix only, and MySQL was available for Windows, too.

  • Nelson69 14 years ago

    Well, MySQL has been easier to set up, that cannot be underestimated; I've basically saved off a pair of configuration files I've just reused a dozen times to set up PostgreSQL. MySQL also works pretty fast right out of the box, if you data is smaller than physical memory, it might be nearly as fast as you can make it by default. PostgreSQL has fairly conservative performance settings when first installed, you can make it really sing but it takes some tuning and in some cases that tuning requires a fairly deep understanding of your data. Think most MySQL users are not DBAs, all the data specific settings for a PostgreSQL like database might frighten them some; it's a really odd juxtaposition when you think about it, stereotypically, geeks are drawn to the knobs and dials you can play with but I honestly think in this case it turned them away.

    Then a little bit of support from ISPs and corporations and it's a no-brainer for most folks. All the oss CMS and blogs and everything support MySQL really really well. ALl the ISPs pretty much use MySQL as the "database" solution.

  • smackay 14 years ago

    The query browser and admin tools were easier to use. They weren't really better, rather the presentation was simpler. The Postgres tools (pgAdmin) shows a little too much detail right away (namespaces, etc.) and looks a lot more complicated to use which might put off people who just need to get it working and build from there.

  • enobrev 14 years ago

    When I first got into full-time pro web development (about a decade ago), I started learning PHP. Just about every tutorial I read or book I wished I could afford had 'MySQL' next to 'PHP' as though they were one in the same.

    From there forward, any time I was looking into a new host, they ALL had both PHP and MySQL. Any job I was bidding, discussion I was having in IRC or on Forums, anywhere - the two came as a matching set. I can't say that's WHY it became more popular, but I've always assumed their popularity were closely aligned.

  • sandGorgon 14 years ago

    Right now - the biggest problem is Wordpress which has ZERO postgres support. A PG-supported version of Wordpress will go a long, long way in popularizing Postgres.

    • jacques_chester 14 years ago

      It'll never happen. Wordpress core might be ported, but there's thousands of plugins and themes that assume MySQL.

  • dramaticus3 14 years ago

    MySQL got the wrong results faster.

    • davidw 14 years ago

      I've always likened it to riding a bicycle downhill with no brakes. Yes, it's faster, but...

      Of course, now they've converged a lot, with Postgresql getting significant speed upgrades, and Mysql having InnoDB, with a few 'minor details' like transactions.

mkup 14 years ago

Well, PostgreSQL is licensed under BSD, which makes closed-source, commercial forks possible. That's why it is attractive for large businesses, like Apple.

GPL-licensed MySQL serves the opposite purpose: it is impossible to fork it into commercial DB product, grow it as times goes by, and threaten Oracle. It is nearly impossible to create commercial product for this market, because there's a free product there: MySQL. MySQL is like a ditch around the Oracle stronghold, a ditch to keep competitors far away. GPL is a perfect tool for such tricks.

In contrast to GPL which suppresses commercial developments in it's market segment, BSD encourages them.

For the same reason, Apple slowly gets rid of GCC in favour of clang. It is perfectly OK from legal point of view to include BSD-licensed compiler into commercial closed-source IDE, unlike GPL-licensed compiler.

  • sigzero 14 years ago

    I actually think the license had very little to do with its popularity. The fact that MySQL was a lot simpler than Pg made it popular.

jacques_chester 14 years ago

IMO the short answer is: no.

Most opensource software that uses MySQL is riddled with MySQLisms.

gierach 14 years ago

In my mind, replication has historically been a pretty big deciding factor in favor of MySQL. PostgreSQL didn't support it natively until version 9.0 (which still isn't part of most Linux distro repositories). You'd have to install Slony or something similar. MySQL made master/slave or master/master replication simple to configure and a breeze to maintain.

ruffdevOP 14 years ago

thanks for all these points. I am doing a presentation tomorrow on the same and all these points will help!

pablospr 14 years ago

When you’re choosing a database, you’re making a long-term decision, because changing your mind later is difficult and expensive. You want to get it right the first time

Here's a comparison in a tabluar format http://database-management-systems.findthebest.com/compare/1...

  • Halienja 14 years ago

    that's a good comparison. We use MySQL and PostGre both in our organization and the deciding factor is mainly cost.

    • masklinn 14 years ago

      > PostGre

      It has never been named "PostGre".

      Its original name was Postgres, the current one is PostgreSQL with Postgres the accepted nickname.

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection