Talk Schedule
Day 1: Friday 2nd July 2010
Replication Features in Core PostgreSQL
The conference kicks off with a review of the major features in PostgreSQL, Streaming Replication and Hot Standby, plus some details of the features likely to arrive in 9.1 and later.
Speaker:
Simon Riggs
Making master/slave systems work better with pgpool-II
PostgreSQL 9.0 will ship with a long-awaited feature: a built-in replication system (Streaming Replication/Hot Standby). Though pgpool-II has its own synchronous replication system, it is known to work together with other replication systems, for example Slony-I. By combining pgpool-II with Streaming Replication/Hot Standby, we can expect many benefits from the system, including high performance, ease of use, automatic failover and transparent replication. In this presentation we explain the architecture of the combined system and the technique for using it.
Speaker:
Tatsuo Ishii
Living the Easy Life with Rules-Based Autonomic Database Clusters
Our business at Continuent is the development of database clusters with highly simplified management and the ability to operate unattended for prolonged periods of time. As part of our Tungsten Clustering product we developed an extensible, low-latency, fault-tolerant management framework for database clusters built around a core of group communications and business rules. We have found that our system is easy to maintain and to extend. For example, a recent extension to switch virtual IP addresses in the event of a database node failure was implemented in an afternoon as a set of two rules and a single bash script. In our talk we will cover the following:
* Basic architecture of a rules-based management framework for databases
* Introduction to business rules, with code examples showing how they work to repair problems ranging from simple process failures to network partitions
* A quick demo of business rules in operation
* Finally, some thoughts on the benefits of the approach and our experiences (good and bad) with autonomic management of database clusters
This is an approach to management that we believe will be of interest to anyone who cares about keeping important data highly available, as well as anyone interested in learning about rules technology.
Speaker:
Linas Virbalas
Tuning Your Hot Standby Deployment
The new Hot Standby feature in PostgreSQL 9.0 allows running read-only queries on systems whose only communication with the master is the log files shipped to them. This feature is applicable to many situations, but setting the related server parameters depends heavily on the type of workload you intend to deploy it on. Running queries against a fully decoupled standby is perfect for some applications, but not ideal for others. Tuning for fast replication and for long-running reports is also difficult to accomplish at the same time. Understanding how Hot Standby works and how its parameters interact with
Speaker:
Greg Smith
Parallel pg_dump - maxing out hardware resources for database copy, backup and restore
PostgreSQL's pg_dump tool is currently single-threaded, which is a big drawback for anyone who wants to run a backup as fast as possible with the tool. The talk discusses solutions for the current issues of a parallel version of pg_dump and also includes a demo of a proof-of-concept implementation. Benchmark data for this new version will be presented and compared to the current version, and further enhancements will be shown. For example, it is possible to run a parallel version of "pg_dump | psql", i.e. copying a database in parallel to a new server without saving any data to disk.
Speaker:
Joachim Wieland
Juggling Petabytes
Gigabytes spin easy. Terabytes are harder. Petabytes are impossible without a carefully designed architecture for parallel computing. Luke will give a live demonstration of how to juggle large quantities of data and then explain the key architectural differences between PostgreSQL and the Greenplum shared-nothing parallel database system. Benchmarking results will be presented.
Speaker:
Gavin Sherry
Middle-R: A Middleware for Scalable Database Replication
This talk will present the current status of Middle-R, a middleware for scalable database replication. Middle-R pioneered database replication at the middleware level, and did so with performance close to that of in-kernel implementations. The talk will focus on how the different bottlenecks for scalability have been overcome in Middle-R. In particular, it will explain how the isolation and full database replication bottlenecks have been successfully addressed. The talk will also pay special attention to the autonomic capabilities of Middle-R for self-provisioning and self-healing, and to the current plans to deploy on cloud environments.
Speaker:
Ricardo Jimenez-Peris
Speaker:
Simon Riggs
Day 2: Saturday 3rd July 2010
Read/Write scalability and transaction management in Postgres-XC
Postgres-XC is a PostgreSQL-based database cluster which provides both read and write scalability based on a shared-nothing architecture. The talk will explain the background and potential of its scalability, transaction management, transparency to applications and benchmark results, as well as its relationship with other PostgreSQL cluster efforts, potential use cases and future plans. At present, PG-XC provides 6.4 times the throughput of vanilla PostgreSQL in a DBT-1 based benchmark with ten servers. PG-XC also provides synchronous multi-master capability. Applications can connect to any master, and all masters provide uniform capability to applications. Any update from a master is immediately visible to transactions running on any other master.
Speaker:
Koichi Suzuki
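As a rough reading of the benchmark figure quoted in the abstract above, the following sketch works out per-server scaling efficiency. It assumes the 6.4 times figure is a speedup relative to a single vanilla PostgreSQL server; the function name is illustrative and not part of Postgres-XC.

```python
def scaling_efficiency(speedup, servers):
    """Fraction of ideal linear scaling achieved by a cluster.

    Assumption: `speedup` is cluster throughput divided by the throughput
    of one standalone server, and ideal scaling would equal `servers`.
    """
    if servers <= 0:
        raise ValueError("servers must be positive")
    return speedup / servers


# Figures from the abstract: 6.4 times throughput with ten servers.
efficiency = scaling_efficiency(6.4, 10)
print(f"{efficiency:.0%} of linear scaling")
```

Under that assumption, each added server contributes a little under two thirds of a standalone server's throughput.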
Scaling PostgreSQL with pgmemcache
Easy-to-use caching for your PostgreSQL setup today! Have you ever felt the need to scale reads from your database and gotten annoyed by the complexity of it all? Ever hit cache revocation issues in your architecture? Or have you just heard of people using this memcached thing and just want to know what all the cool kids have been talking about? pgmemcache (http://pgfoundry.org/projects/pgmemcache/) is a PostgreSQL memcached library allowing you to call memcached directly from SQL. It allows you to easily add memcached caching to your PostgreSQL DB application. The talk goes over the history and background of the pgmemcache project. It also covers current status and some future plans for it. Also some general use cases of where you'd actually want to use the technology will be covered. And last but not least, pgmemcache usage will be demoed.
Speaker:
Hannu Valtonen
Scaling PostgreSQL the Skype way
Setting up an infinitely scalable PostgreSQL cluster, accessed exclusively via function calls, scaled via PL/Proxy and maintained using PgBouncer, PgQ and Londiste.
Speaker:
Hannu Krosing
How a dating site scaled with londiste and django
The overall theme of the talk is that specific database knowledge cannot be thrown away. It is not possible to abstract the database away, only to make using it faster, quicker and easier. ORMs that attempt to abstract away the database implementation will sting you when you scale.
Speaker:
Nic Ferrier
Using MVCC for Clustered Database Systems
Multi-version concurrency control is a well-known and established key technology for high-performance database systems. A lot of open source projects as well as commercial products offer implementations of varying success; however, its potential for clustering solutions is still underestimated. Postgres-R is one of the very few projects that leverage MVCC for use in a cluster. This talk shows its many benefits and explains the following key techniques of Postgres-R: concurrent, optimistic application of change sets, conflict detection during normal operation, early application of change sets during recovery, and transparent support for sub-transactions, rules, triggers and stored procedures.
Speaker:
Markus Wanner
Replication & Database Security
Distributed system security is an essential though often overlooked aspect of replication and clustering.
Speaker:
Magnus Hagander
Bottom-Up Database Cluster Benchmarking
Ideal clustered database deployments usually presume some number of equal nodes that are functionally equivalent, to spread work across. In the real world, server hardware changes so fast, and has so many parts, that you might not even get two systems of the same speed even if you order them in the same batch. Using a systematic benchmarking approach for all servers you add, starting at the hardware level and moving up as they are validated, improves several aspects of clustered deployment. You can find vendor errors before putting things into production. It allows speed grading to adjust load balance based on predicted server capacity when using a mixed set. And it can provide insight into where the true bottlenecks are in complicated deployments.
Speaker:
Greg Smith
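The speed grading idea in the abstract above can be sketched as follows: benchmark scores for each server are turned into load-balancer weights proportional to predicted capacity. This is a minimal illustration, not the speaker's method; the host names and scores are hypothetical, and a real deployment would feed the weights into whatever load balancer it uses.

```python
def load_balance_weights(scores):
    """Convert per-server benchmark scores into weights that sum to 1.

    `scores` maps a host name to a benchmark result where higher means
    faster (for example transactions per second). Hypothetical input.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("benchmark scores must sum to a positive number")
    return {host: score / total for host, score in scores.items()}


# Hypothetical benchmark results for a mixed set of servers.
scores = {"db1": 1200.0, "db2": 800.0, "db3": 2000.0}
weights = load_balance_weights(scores)

for host, weight in sorted(weights.items()):
    print(f"{host}: {weight:.2f}")
```

Proportional weighting is the simplest choice; it assumes the benchmark predicts capacity under the real workload, which is the kind of assumption bottom-up validation is meant to check.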
Managing Slony with slony1-ctl
slony1-ctl is a Slony manager written in Bash. It allows easy installation and configuration of multiple replication sets. This tool lets you initiate and start a replication (even a complex one, with cascading) in a snap. After that, you will be able to modify your schema and manage your switchovers even if you are not a Slony expert.
Speaker:
Cedric Villemain
Workshop on Synchronous Replication
Round-table workshop on Synchronous Replication

