Chapter 9. Adding joins to resultsets


Specifying columns used in joins, and multiple joins to the same table
Nested resultsets
Prefetching rows from joined tables

A join element in the schema file defines a join to another table in the schema file. It usually corresponds to a FOREIGN KEY constraint in the actual schema, but doesn't have to be.

  <table name="accounts">
    <column name="account_id" datatype="int64_t" primarykey='1' />
    <column name="code" />

    <join class="ledger_entries">

  <table name="ledger_entries">
    <column name="ledger_entry_id" datatype="int64_t" primarykey='1' />
    <column name="account_id" datatype="int64_t"/>
    <column name="ledger_date" datatype="x::ymd"/>
    <column name="amount" datatype="double"/>

    <join class="accounts" type="inner join">

This is a schema for two tables. Only the ledger_entries table has a foreign key. The above schema translates, approximately, to the following SQL:

create table accounts(
    account_id integer not null,
    code varchar(255) not null,
    primary key(account_id)

create table ledger_entries(
    ledger_entry_id integer not null,
    account_id integer not null,
    ledger_date date not null,
    amount numeric(11,2) not null,
    primary key(ledger_entry_id),
    foreign key(account_id) references accounts(account_id)

The actual column types vary, depending on the SQL database. In this schema, a record in the accounts table may have zero or more records in the ledger_entries table with the same account_id. accounts's join to ledger_entries, consequently, specifies a default join type of a left join. However, a record in ledger_entries must have a record with the same account_id in accounts. This is really an inner join, and the join, therefore, gives an explicit inner join in its type attribute.

join's required class attribute specifies the destination table for the join, the joined-to table. The attribute's value specifies the name of a class, not the name of a table. They're usually the same, but sometimes they're different. So, if the schema file reads, instead:

<table name="ledger_entries" class="ledger">

The first join must read:

<join class="ledger">

joins declare additional methods in the corresponding resultset. Calling those methods adds the corresponding join to the resultset, and the joined-to table's fields can be referenced in the resultset's constraints:

auto ledger_entries_rs=ledger_entries::create(conn);

accounts::base::joins accounts_join=ledger_entries_rs->join_accounts();

ledger_entries_rs->search("code", "=", "Acct1");

for (const auto &row: *ledger_entries)
    std::cout << x::tostring(row->ledger_date.value())
              << ": "
              << row->amount.value()
              << std::endl;

This example executes SQL that's equivalent to SELECT fields FROM ledger_entries INNER JOIN accounts ON ledger_entries.account_id=accounts.account_id WHERE code='Acct1', joining the accounts table to the resultset that iterates over the contents of the ledger_entries table. It's important to note that the fields from accounts do not get selected in addition to the fields from ledger_entries. This is a ledger_entries resultset, and the ledger_entries::base::row always returns values of columns from the ledger_entries table.

Declaring a join in the schema file creates a join_class() method to the resultset's referenced object, where class is the joined-to resulset class. The join from ledger_entries to accounts creates join_accounts() in the ledger_entries reference-counted resultset.

Each join_class() returns a class::base::joins. It is a reference to an object that serves several purposes:

The above example makes no particular use for the joins object, but if both tables in this example had a code column, then ledger_entries->search("code", "=", "Acct1");, becomes code = ? SQL, resulting in an error, due to the ambiguous column name.

It's always possible to explicitly specify the table: ledger_entries->search("ledger_entries.code", "=", "Acct1");. This, however, relies on each table's alias, in the query, defaulting to the name of the table.

This is no longer the case when the same table ends up appearing more than once in the resultset's joins. In this case, each occurence of the table has a unique alias. A formal, documented way to reference the correct table is to use the get_table_alias() method defined in the joined-to class. For consistency, the resultset class also implements a get_table_alias() that gives the resultset table's alias:

auto ledger_entries_rs=ledger_entries::create(conn);

accounts::base::joins accounts_join=ledger_entries_rs->join_accounts();

ledger_entries_rs->search(ledger_entries->get_table_alias() + ".code", "=", accounts_join->get_table_alias() + ".code");

This creates WHERE ledger_entries.code = accounts.code, presuming that both ledger_entries's and accounts's alias is the same as the table name; or the appropriate alias, as specified in the corresponding FROM or JOIN table name. At this time, an alias for the resultset table always ends up to be the same as the table's name. This is subject to change in the future. Always use get_table_alias() for proper results.

The join object implements additional methods that create subsequent joins from the join object's table. So, if, for example, the accounts table had an additional join to an account_types table:

auto ledger_entries_rs=ledger_entries::create(conn);

accounts::base::joins accounts_join=ledger_entries_rs->join_accounts();

account_types::base::joins account_types_join=accounts_join->join_account_types();

Calling ledger_entries's join_accounts() adds a join from the ledger_entries to the accounts table and returns a accounts::base::joins object reference. Calling this object's join_account_types() method adds a join from the accounts to the account_types table.

classname and classname::base::joins always implement the same join_classname() methods (they reference objects where the former is a subclass of the latter).

The above example executes the equivalent of SELECT fields FROM ledger_entries INNER JOIN accounts ON ledger_entries.account_id=accounts.account_id LEFT JOIN account_types ON join condition from accounts to account_types (presuming that the join to the account_types table is a LEFT JOIN). Note that this is different from the following (assuming that the schema file specifies this join relationship):

auto ledger_entries_rs=ledger_entries::create(conn);

accounts::base::joins accounts_join=ledger_entries_rs->join_accounts();

account_types::base::joins account_types_join=ledger_entries->join_account_types();

This results in a slightly different join: SELECT fields FROM ledger_entries INNER JOIN accounts ON ledger_entries.account_id=accounts.account_id LEFT JOIN account_types ON join condition from ledger_entries to account_types.

Specifying columns used in joins, and multiple joins to the same table

A join element consists of one or more column elements, that define which columns join the two tables:

<join class="categories">
  • The creates a method name join_categories which returns a categories::base::joins handle for the join.

  • The resulting SQL, added to the resultset's query, executes the equivalent of LEFT JOIN categories ON table.account_type_id = categories.account_type_id AND table.account_category_id = categories.account_category_id. The two tables get joined using columns with the same name in each table.

Use the foreigncolumn attribute to define a join on columns whose names are different, in each table:

<join class="ledger_entries" type="inner join">
   <column foreigncolumn='ledger_entry_id'>source_ledger_id</column>
  • The creates a method name join_ledger_entries which returns a ledger_entries::base::joins handle for the join.

  • The resulting SQL, added to the resultset's query, executes the equivalent of LEFT JOIN ledger_entries ON table.source_ledger_id = ledger_entries.ledger_entry_id.

Using the same schema given in the beginning of this chapter, adding the following table schema:

<table name="payments">
  <column name="payment_id" datatype="int64_t" primarykey='1' />
  <column name="source_ledger_id" datatype="int64_t" />
  <column name="dest_ledger_id" datatype="int64_t" />

  <join class="ledger_entries" type="inner join" name="source_ledger_id">
    <column foreigncolumn='ledger_entry_id'>source_ledger_id</column>

  <join class="ledger_entries" type="inner join" name="dest_ledger_id">
    <column foreigncolumn='ledger_entry_id'>dest_ledger_id</column>

This adds a payments table to the schema. It has two fields, source_ledger_id and dest_ledger_id. Each one of them is a foreign key to ledger_id in the ledger_entries table.

By default, the join method's name is join_class. With two different joins, to the same ledger_entries table, returning a ledger_entries::base::joins, this doesn't work. An optional name attribute must specify a non-default name for the join. This example creates join_source_ledger_id() and join_dest_ledger_id(), each returning a ledger_entries::base::joins.

auto payments_rs=payments::create(conn);

auto source_account=payments_rs->join_source_ledger_id()->join_accounts();

auto dest_account=payments_rs->join_dest_ledger_id()->join_accounts();

payments_rs->search(source_account->get_table_alias() + ".code", "=", "Acct1",
                    dest_account->get_table_alias() + ".code", "=", "Acct2");

for (const payments_rs::base::row &row: *payments_rs)
    std::cout << row->payment_id.value()
              << " "
              << row->source_ledger_id.value()
              << " "
              << row->dest_ledger_id.value();

This example joins the payments table to the ledger_entries on both of the foreign keys, and each one is subsequently joined to the accounts table. The resultset automatically sets a unique alias for each occurence of a table, in a join. This example uses get_table_alias() to retrieve the alias of each occurence of accounts, and use it in the resultset's constraint.

This example results in the following SQL (with some whitespace added for readability):

SELECT payments.payment_id, payments.source_ledger_id,
       payments.dest_ledger_id FROM payments AS payments
    INNER JOIN ledger_entries AS ledger_entries
       ON payments.source_ledger_id=ledger_entries.ledger_entry_id
       INNER JOIN accounts AS accounts
           ON ledger_entries.account_id=accounts.account_id
    INNER JOIN ledger_entries AS ledger_entries_2
       ON payments.dest_ledger_id=ledger_entries_2.ledger_entry_id
       INNER JOIN accounts AS accounts_2
           ON ledger_entries_2.account_id=accounts_2.account_id
WHERE (accounts.code = 'Acct1' AND accounts_2.code = 'Acct2')