NEW: Heap for mobile. Track every interaction, on every platform.

Learn more
skip to content
Loading...
    • The Digital Insights Platform Transform your digital experience
    • How Heap Works A video guide
    • How Heap Compares Heap vs. competitors
    • The Future of Insights A comic book guide
  • Data Insights

    • Session Replay Complete context with a single click
    • Illuminate Data science that pinpoints unknown friction
    • Journeys Visual maps of all user flows

    Data Analysis

    • Segments User cohorts for actionable insights
    • Dashboards Share insights on critical metrics
    • Charts Analyze everything about your users
    • Playbooks Plug-and-play templates and analyses

    Data Foundation

    • Capture Automatic event tracking and apis
    • Mobile Track and analyze your users across devices
    • Enrichment Add context to your data
    • Integrations Connect bi-directionally to other tools

    Data Management

    • Governance Keep data clean and trusted
    • Security & Privacy Security and compliance made simple
    • Infrastructure How we build for scale
    • Heap Connect Send Heap data directly to your warehouse
  • Solutions

    • Funnel Optimization Improve conversion in user flows
    • Product Adoption Maximize adoption across your site
    • User Behavior Understand what your users do
    • Product Led Growth Manage PLG with data

    Industries

    • SaaS Easily improve acquisition, retention, and expansion
    • eCommerce Increase purchases and order value
    • Financial Services Raise share of wallet and LTV

    Heap For Teams

    • Product Teams Optimize product activation, conversion and retention
    • Marketing Teams Optimize acquisition performance and costs
    • Data Teams Optimize behavioral data without code
  • Pricing
  • Support

    • Heap University Video Tutorials
    • Help Center How to use Heap
    • Heap Plays Tactical how-to guides
    • Heap Updates
    • Professional Services

    Resources

    • Blog A community for digital builders
    • Content Library Ebooks, whitepapers, videos, guides
    • Press News from and about Heap
    • Webinars & Events Virtual and live events
    • Careers Join us

    Ecosystem

    • Customer Community Join the conversation
    • Partners Technology and Solutions Partners
    • Developers
    • Customers Over 8,000 successful companies
  • Free TrialRequest Demo
  • Log In
  • Free Trial
  • Request Demo
  • Log In

All Blogs

Engineering

PostgreSQL's Powerful New Join Type: LATERAL

Dan Robinson
December 2, 20145 min read
  • Facebook
  • Twitter
  • LinkedIn

PostgreSQL 9.3 has a new join type! Lateral joins arrived without a lot of fanfare, but they enable some powerful new queries that were previously only tractable with procedural code. In this post, I’ll walk through a conversion funnel analysis that wouldn’t be possible in PostgreSQL 9.2.

Interested in learning more about Heap Engineering? Meet our team to get a feel for what it’s like to work at Heap!

What is a LATERAL join?

The best description in the documentation comes at the bottom of the list of FROM clause options:

The LATERAL key word can precede a sub-SELECT FROM item. This allows the sub-SELECT to refer to columns of FROM items that appear before it in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently and so cannot cross-reference any other FROM item.)

When a FROM item contains LATERAL cross-references, evaluation proceeds as follows: for each row of the FROM item providing the cross-referenced column(s), or set of rows of multiple FROM items providing the columns, the LATERAL item is evaluated using that row or row set’s values of the columns. The resulting row(s) are joined as usual with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s).

This is a bit dense. Loosely, it means that a LATERAL join is like a SQL foreach loop, in which PostgreSQL will iterate over each row in a result set and evaluate a subquery using that row as a parameter.

What can we do with this?

Consider a table of click events with the following schema:

Each event is associated with a user and has an ID, a timestamp, and a JSON blob with the event’s properties. At Heap, these properties might include the DOM hierarchy of a click, the window title, the session referrer, and so forth.Let’s say we want to optimize our landing page to increase signups. The first step is to figure out where we’re losing users in our conversion funnel.

An example conversion funnel between four steps in a signup flow.

An example conversion funnel between four steps in a signup flow. We’ll assume that we’ve instrumented our frontend to log events along this flow and that all of the data lives in the event table specified above.*

As an initial question, let’s figure out how many people view our homepage and what percentage of them enter a credit card within two weeks of that initial homepage view. If we were using an older version of PostgreSQL, we might write some custom functions in PL/pgSQL, PostgreSQL’s builtin procedural language. But, in 9.3, we can use a lateral join to compute this in one efficient query, with no extensions or PL/pgSQL.

Nobody likes 30-line SQL queries, so let’s break this down into pieces. The first chunk of this is vanilla SQL:

That is, get the initial time each user did a view_homepage event. Then our lateral join allows us to iterate over each resulting row and perform a parametrized version of the next subquery. This is equivalent to taking the query below and running it for each resulting row:

I.e., for each user, get the first time he or she performed the enter_credit_card event within two weeks of view_homepage_time. Because this is a lateral join, our subquery can make reference to the view_homepage_time results from the previous subquery. Otherwise, the subqueries would be evaluated independently and there would be no way to access results from one when evaluating the other.Then we wrap the whole thing in a select, which returns something like this:

Because this is a LEFT JOIN, the query still produces result rows for users with no matching enter_credit_card event, as long as there is a view_homepage event. If we aggregate over the numerical columns, we get a tidy summary of this conversion funnel:

… which produces:

We can add intermediate steps to this funnel with more lateral joins to evaluate which portions of our flow we should focus on improving.** Let’s add a use_demo step between viewing the homepage and entering a credit card.

This yields:***

This gives us the three-step conversion funnel from viewing the homepage to using the demo within a week to entering the credit card within a week of that. From here, the expressive power of PostgreSQL allows us to drill down on these results and thoroughly analyze the performance of our website. We might follow up with:

  • Does using the demo increase the likelihood of a signup?

  • Do users who discover our homepage via an ad convert with the same likelihood as users from other sources?

  • How do these conversion rates change over time or with different A/B test variants?

The answers to these questions apply directly to product changes and can be determined in PostgreSQL now that it supports lateral joins.Without lateral joins, we would need to resort to PL/pgSQL to do this analysis.

Or, if our data set were small, we could get away with complex, inefficient queries. In an exploratory data science use case, you might just pull your data out of PostgreSQL and analyze it with your scripting language of choice. But there is considerable power in being able to express these questions in SQL, especially if you’re wrapping it all in an approachable UI and exposing the functionality to nontechnical users.

Note that these queries can be tuned so as to be very efficient. In this example, if we create a btree index on (user_id, (data->>’type’), time), we can evaluate each funnel step for each user with a single indexed lookup. If you’re using SSDs, on which seeks are cheap, this might be good enough. If not, you might need to schematize your data a bit differently, but I’ll leave the details of that for another post.

Have a favorite new PostgreSQL feature or a neat lateral join use case? Ping me @danlovesproofs.If you find this kind of work interesting, we’re hiring! Check out our engineering team and open roles.

* Or we can use Heap, which captures everything for us as soon as we install it! No need to write any logging code, and no risk of forgetting to log something that you’ll want to analyze later.

** Note that adding additional steps to a conversion funnel would be particularly easy if we were using product analytics like Heap, since we would already have the relevant data.

*** The number of users with enter_credit_card events is lower in this query than in the previous one, as this query only returns enter_credit_card events for users who do so after doing the use_demo event, and 17 of the users signed up without using the demo.

Additional resources:

  • What is product analytics?

  • Google analytics alternatives

Dan Robinson

Was this helpful?
PreviousNext

Related Stories

See All

  • Google Analytics 4

    Product Insights

    Google Analytics 4: What it promises, and what that really means

    April 28, 2022

  • Heap.io

    How to

    The 3 key first steps to improving CRO

    March 29, 2023

  • Heap.io

    Data Stories

    Celebrating H&R Block as the inaugural winner of the Digital Innovator Award

    March 22, 2023

Subscribe

Sign up to stay on top of the latest posts.

Better insights. Faster.

Request Demo
  • Platform
  • Capture
  • Enrichment
  • Integrations
  • Governance
  • Security & Privacy
  • Infrastructure
  • Illuminate
  • Segments
  • Charts
  • Dashboards
  • Playbooks
  • Use Cases
  • Funnel Optimization
  • Product Adoption
  • User Behavior
  • Product Led Growth
  • Customer 360
  • SaaS
  • eCommerce
  • Financial Services
  • Why Heap
  • The Digital Insights Platform
  • How Heap Works
  • How Heap Compares
  • The Future of Insights
  • Resources
  • Blog
  • Content Library
  • Events
  • Topics
  • Heap University
  • Community
  • Professional Services
  • Company
  • About
  • Partners
  • Press
  • Careers
  • Customers
  • Support
  • Request Demo
  • Help Center
  • Contact Us
  • Pricing
  • Social
  • Twitter
  • Facebook
  • LinkedIn
  • YouTube

© 2023 Heap Inc. All Rights Reserved.

  • Legal
  • Privacy Policy
  • Status
  • Trust