Upgrade from GA to Heap

Learn How
skip to content
Loading...
    • The Digital Insights Platform Transform your digital experience
    • How Heap Works A video guide
    • How Heap Compares Heap vs. competitors
    • The Future of Insights A comic book guide
  • Data Insights

    • Session Replay Complete context with a single click
    • Illuminate Data science that pinpoints unknown friction
    • Journeys Visual maps of all user flows

    Data Analysis

    • Segments User cohorts for actionable insights
    • Dashboards Share insights on critical metrics
    • Charts Analyze everything about your users
    • Playbooks Plug-and-play templates and analyses

    Data Foundation

    • Capture Automatic event tracking and apis
    • Mobile Track and analyze your users across devices
    • Enrichment Add context to your data
    • Integrations Connect bi-directionally to other tools

    Data Management

    • Governance Keep data clean and trusted
    • Security & Privacy Security and compliance made simple
    • Infrastructure How we build for scale
    • Heap Connect Send Heap data directly to your warehouse
  • Solutions

    • Funnel Optimization Improve conversion in user flows
    • Product Adoption Maximize adoption across your site
    • User Behavior Understand what your users do
    • Product Led Growth Manage PLG with data

    Industries

    • SaaS Easily improve acquisition, retention, and expansion
    • eCommerce Increase purchases and order value
    • Financial Services Raise share of wallet and LTV

    Heap For Teams

    • Product Teams Optimize product activation, conversion and retention
    • Marketing Teams Optimize acquisition performance and costs
  • Pricing
  • Support

    • Heap University Video Tutorials
    • Help Center How to use Heap
    • Heap Plays Tactical how-to guides
    • Heap Updates
    • Professional Services

    Resources

    • Blog A community for digital builders
    • Content Library Ebooks, whitepapers, videos, guides
    • Press News from and about Heap
    • Webinars & Events Virtual and live events
    • Careers Join us

    Ecosystem

    • Customer Community Join the conversation
    • Partners Technology and Solutions Partners
    • Developers
    • Customers Over 8,000 successful companies
  • Free TrialRequest Demo
  • Log In
  • Free Trial
  • Request Demo
  • Log In

All Blogs

Engineering

Optimizing Postgres Queries at Scale

Matt Dupree
December 21, 20214 min read
  • Facebook
  • Twitter
  • LinkedIn

Heap is a product analytics tool that automatically captures web and mobile behavior like page views, clicks, and taps. We're operating at a scale of billions of events per day, which we store across a distributed Postgres cluster.

Heap's thousands of customers can build queries in the Heap UI to answer almost any question about how users are using their product. Optimizing all of these queries across all our customers presents special challenges you wouldn't typically encounter if you were optimizing the performance of a small set of queries within a typical app.

This post is about why this scale requires us to conduct performance experiments to optimize our SQL, and it details how we conduct those experiments.

Why it’s hard to optimize Heap queries

We want the querying experience within Heap to feel interactive, so we need most queries to return in just a few seconds. There are multiple reasons why this is particularly difficult to do with Heap queries, but the most important one is that Heap's query builder allows customers to issue a wide variety of queries:

blog-query-builder

As you can see, we support filters, group bys, and comparing different groups of users and time periods. While using these features may only result in a small change to the SQL, it can have a big impact on performance.

Here's a recent example of this.

One of our customers created a query that typically hits one of the millions of indexes we have across our cluster. More specifically, the query resulted in an index-only scan, a special kind of (speedy!) operation in which the database can satisfy a query with just the information that’s in the index, without having to read the table’s actual rows.

Then the user made a slight modification to their query by changing a simple value in a dropdown:

blog-change-dropdown

This change resulted in SQL that could not use an index-only scan, which made the query take much longer to execute.

Because of the wide variety of questions customers want to ask about their data, we have many examples like this. This means that when we're assessing the performance of a set of queries that support a new feature within Heap, we can't simply tweak some SQL, look at a few query plans from EXPLAIN ANALYZE, and call it a day. There may be a small, but common tweak to the query we’ve optimized that could tank the overall performance of the query. (This has happened to us multiple times.)

Here’s a histogram of some of our queries that helps show the variability of performance:

blog-query-perf-variability

Experiments to optimize SQL

One of the things we do to combat this variability issue is to construct 95% confidence intervals around our estimates of performance. These confidence intervals help us identify the range of plausible values our p90 may take once we launch a new Heap query for all of our customers. This range takes into account the variability we introduce into our query performance by supporting a wide variety of ways customers can tweak and/or slice and dice their data.

We typically visualize new Heap query performance like this:

blog-confidence-interval

The width of the confidence bounds helps us see how much query performance could vary across all our customers and across all the various tweaks they make to their queries.

To test if a SQL change makes a query faster, we first generate experimental and control SQL for a few hundred Heap queries, which are stored as JSON. Then, for each JSON query, we generate the control SQL and the experimental SQL that we hope will improve performance.

Next, we run the queries in counterbalanced order, alternating which SQL statement gets run first for a particular JSON query. For example, given JSON queries A and B, we’d run our queries in the following order:

  1. Control SQL for Query A

  2. Experimental SQL for Query A

  3. Experimental SQL for Query B

  4. Control SQL for Query B

We do this control for the effect of caching. If experimental SQL is always run after the control SQL for all queries, the experimental SQL may look faster merely because those queries benefit from caching.

Why run experimental and control SQL on the same queries instead of a true experiment where we randomly assign queries to experimental and control groups? Because of the variability of query performance, a randomized experiment would require us to have a large sample of queries, which slows down the speed with which we can conduct experiments and puts additional load on our systems.

Once we’ve executed these queries, we plot the performance estimates with confidence intervals of the experimental and control queries.

The plot looks like this:

blog-experimental-query-perf

If the confidence intervals of the queries don't overlap (as in the above picture), we can have some confidence that we've actually improved performance with our tweak.

Here’s a recap of the above process as a sequence diagram:

blog-perf-rig-sequence-diag

If you like nerding out about Postgres performance, @-me on Twitter @philosohacker, and if you like working on large data systems, we’re hiring! Check out our team and open roles.

I’d like to thank John Krauss for his help with reviewing early drafts of this post.

Matt Dupree

Was this helpful?
PreviousNext

Related Stories

See All

  • How to Ship 41 Product Updates, Improve Usability, and 20x Your Team Engagement in 2 Weeks

    Product Updates

    How to Ship 41 Product Updates, Improve Usability, and 20x Your Team Engagement in 2 Weeks

    October 27, 2021

  • Product Insights

    Heap delivers superior customer experiences and drives revenue with Snowflake Marketplace

    January 24, 2023

  • Product Updates

    Heap Launches Integration with HubSpot

    December 13, 2022

Subscribe

Sign up to stay on top of the latest posts.

Better insights. Faster.

Request Demo
  • Platform
  • Capture
  • Enrichment
  • Integrations
  • Governance
  • Security & Privacy
  • Infrastructure
  • Illuminate
  • Segments
  • Charts
  • Dashboards
  • Playbooks
  • Use Cases
  • Funnel Optimization
  • Product Adoption
  • User Behavior
  • Product Led Growth
  • Customer 360
  • SaaS
  • eCommerce
  • Financial Services
  • Why Heap
  • The Digital Insights Platform
  • How Heap Works
  • How Heap Compares
  • The Future of Insights
  • Resources
  • Blog
  • Content Library
  • Events
  • Topics
  • Heap University
  • Community
  • Professional Services
  • Company
  • About
  • Partners
  • Press
  • Careers
  • Customers
  • Support
  • Request Demo
  • Help Center
  • Contact Us
  • Pricing
  • Social
  • Twitter
  • Facebook
  • LinkedIn
  • YouTube

© 2023 Heap Inc. All Rights Reserved.

  • Legal
  • Privacy Policy
  • Status
  • Trust