Talk Schedule

Day 1: Friday 2nd July 2010

09:35 Duration: 60 minutes

Replication Features in Core PostgreSQL

The conference kicks off with a review of the major features in PostgreSQL, Streaming Replication and Hot Standby, plus some details of the features likely to arrive in 9.1 and later.

Speaker:
Simon Riggs

Simon leads the 2ndQuadrant team as Managing Consultant. Simon has contributed major features in each of the last 5 versions of PostgreSQL. His work includes recovery and replication, performance and monitoring as well as designs for many other features. Simon has worked as a Database Architect for 20 years, with high-end solutions experience and certifications on Oracle, Teradata and DB2.
10:35 Duration: 45 minutes

Making master/slave systems work better with pgpool-II

PostgreSQL 9.0 will ship with a long-awaited feature: a built-in replication system (Synchronous replication/Hot standby). Although pgpool-II has its own synchronous replication system, it is known to work together with other replication systems, for example Slony-I. By combining pgpool-II with Synchronous replication/Hot standby, we can expect many benefits, including high performance, ease of use, automatic failover and transparent replication. In this presentation we explain the architecture of the combined system and the techniques for using it.

Speaker:
Tatsuo Ishii

Tatsuo Ishii is the author of pgpool and has been involved in PostgreSQL since 1996. He is a co-founder of the Japan PostgreSQL User's Group (JPUG). He has written many PostgreSQL books and articles in Japan and works for SRA OSS, Inc. Japan, which provides various PostgreSQL-related services.
11:40 Duration: 40 minutes

Living the Easy Life with Rules-Based Autonomic Database Clusters

Our business at Continuent is the development of database clusters with highly simplified management and the ability to operate unattended for prolonged periods of time. As part of our Tungsten Clustering product we developed an extensible, low-latency, fault-tolerant management framework for database clusters built around a core of group communications and business rules. We have found that our system is easy to maintain and to extend. For example, a recent extension to switch virtual IP addresses in the event of a database node failure was implemented in an afternoon as a set of two rules and a single bash script. In our talk we will cover the following:

* Basic architecture of a rules-based management framework for databases
* Introduction to business rules, with code examples showing how they work to repair problems ranging from simple process failures to network partitions
* A quick demo of business rules in operation
* Finally, some thoughts on the benefits of the approach and our experiences (good and bad) with autonomic management of database clusters

This is an approach to management that we believe will be of interest to anyone who cares about keeping important data highly available, as well as anyone interested in learning about rules technology.

Speaker:
Linas Virbalas

Linas Virbalas is an engineer for Continuent, Inc., a leading provider of database availability and scaling solutions. Linas is responsible for the Tungsten implementation of clustering for PostgreSQL using warm standby/PITR as well as the upcoming log streaming and hot standby features. Linas has an extensive background in Java programming including replication and group communications. He has worked directly on numerous customer implementations of clustering.
12:20 Duration: 40 minutes

Tuning Your Hot Standby Deployment

The new Hot Standby feature in PostgreSQL 9.0 allows running read-only queries on systems whose only communication with the master is the log files shipped to them. This feature is applicable to many situations, but setting the related server parameters depends heavily on the type of workload you intend to deploy it on. Running queries against a fully decoupled standby is perfect for some applications, but not ideal for others. Tuning for both fast replication and long-running reports at the same time is also difficult. Understanding how Hot Standby works and how its parameters interact with

Speaker:
Greg Smith

Greg Smith has been writing business applications in PostgreSQL since adopting V7.0 as the only sensible open-source database in 2000. After a few years of fighting database scaling issues, he has contributed code to every release of the core PostgreSQL project starting in V8.3, primarily in the area of performance tuning. Greg is Principal Consultant for 2ndQuadrant in the United States, providing training, consulting, and support to clients all over the country.
14:00 Duration: 40 minutes

Parallel pg_dump - maxing out hardware resources for database copy, backup and restore

PostgreSQL's pg_dump tool is currently single-threaded, which is a big drawback for anyone who wants to run a backup as fast as possible with the tool. The talk discusses solutions to the current issues with a parallel version of pg_dump and includes a demo of a proof-of-concept implementation. Benchmark data for this new version will be presented and compared to the current version, and further enhancements will be shown. For example, there is the possibility to run a parallel version of "pg_dump | psql", i.e. copying a database in parallel to a new server without saving any data to disk.

Speaker:
Joachim Wieland

Joachim got in contact with PostgreSQL about 10 years ago and was impressed by the stability and the flexibility of the database. He started to contribute to the project about 5 years ago and now does that whenever time permits. His 9.0 contribution was the rewrite of the LISTEN/NOTIFY subsystem and version 9.1 will hopefully see a parallel version of pg_dump.
14:45 Duration: 45 minutes

Juggling Petabytes

Gigabytes spin easy. Terabytes are harder. Petabytes are impossible without a carefully designed architecture for parallel computing. Luke will give a live demonstration of how to juggle large quantities of data and then explain the key architectural differences between PostgreSQL and the Greenplum shared-nothing parallel database system. Benchmarking results will be presented.

Speaker:
Gavin Sherry

Gavin Sherry is a core developer at Greenplum. He builds parallel database systems powering the world's biggest databases.
15:45 Duration: 45 minutes

Middle-R: A Middleware for Scalable Database Replication

This talk will present the current status of Middle-R, a middleware for scalable database replication. Middle-R pioneered database replication at the middleware level, and did so with performance close to that of in-kernel implementations. The talk will focus on how the different bottlenecks for scalability have been overcome in Middle-R. In particular, it will explain how the isolation and full database replication bottlenecks have been successfully addressed. The talk will also pay special attention to the autonomic capabilities of Middle-R for self-provisioning and self-healing, and to the current plans to deploy on cloud environments.

Speaker:
Ricardo Jimenez-Peris

Prof. Ricardo Jimenez-Peris is co-director of the Distributed Systems Lab (LSD) at Universidad Politecnica de Madrid. After finishing his PhD at UPM, he spent one year as a postdoc researcher at ETH Zurich (Switzerland) during 2000. There, he became interested in scalable database replication and has been working on the topic ever since. He has been one of the main architects of Middle-R. After his postdoc stay, he became an associate professor at UPM, where he founded the LSD lab. He has published over 90 research papers in international conferences and journals. He is also very active in European projects, and is currently coordinating two projects on cloud computing.
16:50 Duration: 85 minutes

Speaker:
Simon Riggs

Simon leads the 2ndQuadrant team as Managing Consultant. Simon has contributed major features in each of the last 5 versions of PostgreSQL. His work includes recovery and replication, performance and monitoring as well as designs for many other features. Simon has worked as a Database Architect for 20 years, with high-end solutions experience and certifications on Oracle, Teradata and DB2.

Day 2: Saturday 3rd July 2010

09:30 Duration: 60 minutes

Read/Write scalability and transaction management in Postgres-XC

Postgres-XC is a PostgreSQL-based database cluster which provides both read and write scalability based on a shared-nothing architecture. The talk will explain the background and potential of its scalability, transaction management, transparency to applications and benchmark results, as well as its relationship with other PostgreSQL cluster efforts, potential use cases and future plans. At present, PG-XC provides 6.4 times the throughput of vanilla PostgreSQL in a DBT-1 based benchmark with ten servers. PG-XC also provides synchronous multi-master capability. Applications can connect to any master, and all masters provide uniform capability to applications. Any update from one master is immediately visible to any other transaction running on any master.

Speaker:
Koichi Suzuki

Koichi Suzuki is a fellow at NTT DATA Intellilink Corp. and the leader of the Postgres-XC project. He is also involved in several PostgreSQL development efforts such as WAL optimization and recovery acceleration. He has written a book on database internals and translated several books on databases and networking. Before joining PostgreSQL development, he was involved in the development of the UniSQL Object-Relational database engine. He was also involved in the development of EUC (Extended Unix Code) and its standardization.
10:30 Duration: 45 minutes

Scaling PostgreSQL with pgmemcache

Easy-to-use caching for your PostgreSQL setup today! Have you ever felt the need to scale reads from your database and gotten annoyed by the complexity of it all? Ever hit cache revocation issues in your architecture? Or have you just heard of people using this memcached thing and just want to know what all the cool kids have been talking about? pgmemcache (http://pgfoundry.org/projects/pgmemcache/) is a PostgreSQL memcached library allowing you to call memcached directly from SQL. It allows you to easily add memcached caching to your PostgreSQL DB application. The talk goes over the history and background of the pgmemcache project. It also covers current status and some future plans for it. Also some general use cases of where you'd actually want to use the technology will be covered. And last but not least, pgmemcache usage will be demoed.

Speaker:
Hannu Valtonen

Hannu Valtonen took over the maintainership of pgmemcache in early 2009. As his day job he works as a Senior Software Engineer at Reputation Systems / F-Secure Labs (http://www.f-secure.com/weblog/) designing distributed systems and large scale databases. He's an occasional contributor to various Open Source projects.
11:35 Duration: 40 minutes

Scaling PostgreSQL the Skype way

Setting up an infinitely scalable PostgreSQL cluster, accessed exclusively via function calls, scaled via pl/proxy and maintained using pgbouncer, pgq and londiste.

Speaker:
Hannu Krosing

Hannu started programming a few years before PCs became available. He has worked extensively on scaling the PostgreSQL database, designing a new partitioning language, pl/proxy, which, together with the queueing system pgQ, enables infinite database scalability. Hannu balances his free time between his family and sports, mountaineering and reading. Hannu studied architecture and applied mathematics.
12:15 Duration: 45 minutes

How a dating site scaled with londiste and django

The overall theme of the talk is that specific db knowledge cannot be thrown away. It is not possible to abstract the database, only to make using it faster/quicker/easier. ORM stuff that attempts to abstract away the db implementation gets you stung when you scale.

Speaker:
Nic Ferrier

CTO, Woome. Woome is a social dating startup. We began in 2007 and have had to learn the hard way how to scale "sexy" but complex webapp frameworks.
14:00 Duration: 45 minutes

Using MVCC for Clustered Database Systems

Multi-version concurrency control is a well-known and established key technology for high performance database systems. Many open source projects as well as commercial products offer implementations of varying success; however, its potential for clustering solutions is still underestimated. Postgres-R is one of the very few projects that leverage MVCC for use in a cluster. This talk shows its many benefits and explains the following key techniques of Postgres-R: concurrent, optimistic application of change sets, conflict detection during normal operation, early application of change sets during recovery and transparent support for sub-transactions, rules, triggers and stored procedures.

Speaker:
Markus Wanner

Markus Wanner has been a professional Postgres user for more than 10 years and picked up development of Postgres-R in 2002. As the lead developer of Postgres-R, he adapted it to use MVCC, added recovery features, implemented support for sequences and optimistic change set application among other features. He maintains an up-to-date patch and strives to create a scalable and highly available clustering solution for Postgres.
14:45 Duration: 30 minutes

Replication & Database Security

Distributed system security is an essential though often overlooked aspect of replication and clustering.

Speaker:
Magnus Hagander

PostgreSQL Committer and integration expert, Magnus is an integral part of the PostgreSQL project's web team and a member of the Security team.
15:45 Duration: 30 minutes

Bottom-Up Database Cluster Benchmarking

Ideal clustered database deployments usually presume some number of equal nodes that are functionally equivalent, to spread work across. In the real world, server hardware changes so fast, and has so many parts, that you might not even get two systems of the same speed even if you order them in the same batch. Using a systematic benchmarking approach for all servers you add, starting at the hardware level and moving up as they are validated, improves several aspects of clustered deployment. You can find vendor errors before putting things into production. It allows speed grading to adjust load balance based on predicted server capacity when using a mixed set. And it can provide insight into where the true bottlenecks are in complicated deployments.
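As a rough illustration of the speed grading idea mentioned above, the following minimal sketch turns per-server benchmark results into load-balancing weights. It is not taken from the talk: the function name, the node names and the scores are all hypothetical, and it assumes that a single benchmark number per server is a fair proxy for capacity.

```python
def speed_grade_weights(benchmark_scores):
    """Return load-balancing weights proportional to each node's benchmark score.

    benchmark_scores maps a node name to a positive score (higher is faster).
    The returned weights sum to 1.0.
    """
    if not benchmark_scores:
        raise ValueError("no benchmark scores given")
    if any(score <= 0 for score in benchmark_scores.values()):
        raise ValueError("benchmark scores must be positive")
    total = sum(benchmark_scores.values())
    return {node: score / total for node, score in benchmark_scores.items()}


# Hypothetical scores, e.g. transactions per second from the same benchmark run
# on each server. These numbers are made up for illustration.
scores = {"node-a": 1200.0, "node-b": 900.0, "node-c": 900.0}
weights = speed_grade_weights(scores)
for node, weight in sorted(weights.items()):
    print(f"{node}: {weight:.2f}")
```

A load balancer that supports per-backend weights could then be configured with these values, so that a faster server in a mixed set receives a proportionally larger share of the work.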

Speaker:
Greg Smith

Greg Smith has been writing business applications in PostgreSQL since adopting V7.0 as the only sensible open-source database in 2000. After a few years of fighting database scaling issues, he has contributed code to every release of the core PostgreSQL project starting in V8.3, primarily in the area of performance tuning. Greg is Principal Consultant for 2ndQuadrant in the United States, providing training, consulting, and support to clients all over the country.
16:15 Duration: 30 minutes

Managing Slony with slony1-ctl

slony1-ctl is a Slony manager written in Bash. It allows easy installation and configuration of multiple replication sets. This tool lets you initiate and start a replication (even a complex one, with cascading) in a snap. After that, you will be able to modify your schema and manage your switchovers even if you are not a Slony expert.

Speaker:
Cédric Villemain

Cédric Villemain is a member of the PostgreSQL community and a professional PostgreSQL consultant in France. Interested in several parts of the project, he contributes in the areas of monitoring, administration and derivative products like Slony.
16:50 Duration: 85 minutes

Workshop on Synchronous Replication

Round-table workshop on Synchronous Replication

Speaker:
Chair: Greg Smith

Greg Smith has been writing business applications in PostgreSQL since adopting V7.0 as the only sensible open-source database in 2000. After a few years of fighting database scaling issues, he has contributed code to every release of the core PostgreSQL project starting in V8.3, primarily in the area of performance tuning. Greg is Principal Consultant for 2ndQuadrant in the United States, providing training, consulting, and support to clients all over the country.