Overview
This is the 2nd post in a series about databases originating in the Clojure ecosystem. The first article on Datomic is linked here.
In this post, we'll be looking at XTDB version 2. It's a bitemporal SQL database that runs on top of Postgres.
Initially, I was excited about giving XTDB a try as I'm familiar with bitemporal models from using the Australian AEMO MSATS repository. I've also used Postgres in several production contexts.
As a quick reminder, the dataset I'll be using for the investigation is the daily energy offers published by the Electricity Authority in New Zealand. To describe the set in short, it allows sellers of energy and reserves to make offers to supply the NZ wholesale market.
Kicking Off
The quickstart guide on the XTDB site was easy to follow. Using the native psql client, everything appeared like standard SQL. For example, from the first example in the tutorial, the inserts were of the form:
INSERT INTO people (_id, name) VALUES (6, 'fred');
I also reviewed the Clojure/JDBC guide and noticed that the syntax differed. It was demonstrated as below using nested JSON. I'll return to this later.
(with-open [conn (jdbc/get-connection "jdbc:xtdb://localhost/xtdb")]
(jdbc/execute! conn ["INSERT INTO users RECORDS {_id: 'jms', first_name: 'James'}"])
(jdbc/execute! conn ["INSERT INTO users RECORDS ?" {:xt/id "joe", :first-name "Joe"}]))
My local setup was straightforward. Here is my local tmux session, with start-xtdb.sh just being a wrapper around docker run:
docker run --name xtdb -p 5432:5432
ghcr.io/xtdb/xtdb
root: /home/gcode/dev/arachnid/clojure/energynz
windows:
- db:
layout: main-vertical
panes:
- /home/gcode/dev/arachnid/clojure/energynz/bin/start-xtdb.sh
- datagrip
- code:
layout: even-horizontal
panes:
- idea /home/gcode/dev/arachnid/clojure/energynz
Inserting Data - Take 1
Since I already had the parsing logic from the first post using Datomic, I thought this would be a quick exercise, but I was wrong, so very very wrong. I knew that XTDB didn't require a schema upfront, so I figured a straight insert with some bind parameters would be enough.
(def offers-insert "
INSERT INTO offers (
trading_date,
trading_period,
tranche,
participant,
point_of_connection,
submission_order,
product_type,
unit,
product_description,
product_class,
reserve_type,
utc_submission_date,
utc_submission_time,
is_latest,
max_ramp_up_mw_per_hour,
max_ramp_down_mw_per_hour,
plsr_percent,
max_output_mw,
forecast_generation_potential_mw,
megawatts,
dollars_per_mwh
)
VALUES (
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?
)
")
(defn insert-csv-records!
[connection seq-m-csv-records]
(let [ps (jdbc/prepare connection [offers-insert])
values-vec (mapv (fn [record]
[(:trading_date record)
(:trading_period record)
(:tranche record)
(:participant record)
(:point_of_connection record)
(:submission_order record)
(:product_type record)
(:unit record)
(:product_description record)
(:product_class record)
(:reserve_type record)
(:utc_submission_date record)
(:utc_submission_time record)
(:is_latest record)
(:max_ramp_up_mw_per_hour record)
(:max_ramp_down_mw_per_hour record)
(:plsr_percent record)
(:max_output_mw record)
(:forecast_generation_potential_mw record)
(:megawatts record)
(:dollars_per_mwh record)])
seq-m-csv-records)]
;; For large datasets, process in smaller batches to avoid memory issues
(if (> (count values-vec) 10000)
(do
(t/log! :info ["Processing" (count values-vec) "records in batches..."])
(doseq [batch (partition-all 5000 values-vec)]
(jdbc/execute-batch! ps batch)))
(jdbc/execute-batch! ps values-vec))))
;; =====
;; Error
;; =====
Execution error (PSQLException) at org.postgresql.core.v3.QueryExecutorImpl/receiveErrorResponse (QueryExecutorImpl.java:2733).
ERROR: Missing types for args - client must specify types for all non-null params in DML statements
Detail: {"query":"INSERT INTO offers (\n trading_date,\n trading_period,\n tranche,\n
participant,\n point_of_connection,\n submission_order,\n product_type,\n unit,\n
product_description,\n product_class,\n reserve_type,\n utc_submission_date,\n
utc_submission_time,\n is_latest,\n max_ramp_up_mw_per_hour,\n max_ramp_down_mw_per_hour,\n
plsr_percent,\n max_output_mw,\n forecast_generation_potential_mw,\n megawatts,\n
dollars_per_mwh\n )\n VALUES (\n $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11,
$12, $13, $14, $15, $16, $17, $18, $19, $20, $21\n )",
"param-oids":[0,23,23,1043,1043,23,1043,0,1043,1043,1043,0,1043,16,0,0,0,0,0,1700,1700],
"category":"cognitect.anomalies\/incorrect","code":"xtdb.pgwire\/missing-arg-types","message":
"Missing types for args - client must specify types for all non-null params in DML statements"}
Given the error, I thought that having a table definition might help with the inserts. I have since found out that XTDB doesn't support user-defined tables, but let's follow through with the errors below anyway.
- First, I attempted to create a table and received the error message "Error 1" below. line 1:7 missing 'USER' at 'TABLE'. It kind of sounds like XTDB knows about tables from this, but I'd just missed a user prefix.
- I tried both an xtdb and postgres prefix on the create table statement after which I got the even more helpful message recorded in "Error 2" below. All arguments to `concat` must be of the same type. At this stage, I realised I was in common Clojure territory, where the error messages I was getting had nothing to do with the underlying issue.
(def offers-create-table "
CREATE TABLE IF NOT EXISTS offers (
id BIGSERIAL PRIMARY KEY,
trading_date TIMESTAMP NOT NULL,
trading_period INTEGER NOT NULL CHECK (trading_period >= 1 AND trading_period <= 50),
tranche INTEGER NOT NULL,
participant VARCHAR(20) NOT NULL,
point_of_connection VARCHAR(20) NOT NULL,
submission_order INTEGER NOT NULL,
product_type VARCHAR(20) NOT NULL,
unit VARCHAR(20),
product_description VARCHAR(100),
product_class VARCHAR(20),
reserve_type VARCHAR(10),
utc_submission_date DATE,
utc_submission_time VARCHAR(20),
is_latest BOOLEAN NOT NULL DEFAULT false,
max_ramp_up_mw_per_hour DECIMAL(10,3),
max_ramp_down_mw_per_hour DECIMAL(10,3),
plsr_percent DECIMAL(5,2),
max_output_mw DECIMAL(10,3),
forecast_generation_potential_mw DECIMAL(10,3),
megawatts DECIMAL(10,3) NOT NULL,
dollars_per_mwh DECIMAL(10,2) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
")
;; =========
;; Error 1
;; =========
Execution error (PSQLException) at org.postgresql.core.v3.QueryExecutorImpl/receiveErrorResponse (QueryExecutorImpl.java:2733).
ERROR: Errors parsing SQL statement:
- line 1:7 missing 'USER' at 'TABLE'
Detail: {"category":"cognitect.anomalies\/incorrect","code":"xtdb\/sql-error","message":"Errors parsing SQL statement:\n - line 1:7 missing 'USER' at 'TABLE'"}
;; =========
;; Error 2
;; =========
Failed to retrieve routines in xtdb.public.
[08P01] ERROR: All arguments to `concat` must be of the same type.
Detail: {"types":["utf8","i32"],"category":"cognitect.anomalies\/incorrect","code":"xtdb.expression\/type-error","message":"All arguments to `concat` must be of the same type."}.
Failed to retrieve user types in xtdb.public.
[08P01] ERROR: Errors parsing SQL statement:
- line 1:183 mismatched input '(' expecting {<EOF>, ';'}
Detail: {"category":"cognitect.anomalies\/incorrect","code":"xtdb\/sql-error","message":"Errors parsing SQL statement:\n - line 1:183 mismatched input '(' expecting {<EOF>, ';'}"}.
Inserting Data - Take 2
Now that I was aware that XTDB didn't support user-defined tables (Even though the errors might lead you to believe otherwise), I returned to trying to encode type information into the INSERT statement. My first attempt was to use inline type annotations in the statement itself. I then tried CASTs and, lastly, the jdbc.next.types and prepared statement setParameter methods.
All ended up with the error: ERROR: Missing types for args - client must specify types for all non-null params in DML statements
;; ==================================
;; Inline SQL Statement type suffixes
;; ==================================
(def offers-insert "
INSERT INTO offers (
trading_date,
trading_period,
tranche,
participant,
point_of_connection,
submission_order,
product_type,
unit,
product_description,
product_class,
reserve_type,
utc_submission_date,
utc_submission_time,
is_latest,
max_ramp_up_mw_per_hour,
max_ramp_down_mw_per_hour,
plsr_percent,
max_output_mw,
forecast_generation_potential_mw,
megawatts,
dollars_per_mwh
)
VALUES (
?::timestamp, ?::integer, ?::integer, ?::varchar, ?::varchar, ?::integer, ?::varchar,
?::varchar, ?::varchar, ?::varchar, ?::varchar, ?::date, ?::varchar, ?::boolean,
?::decimal, ?::decimal, ?::decimal, ?::decimal, ?::decimal, ?::decimal, ?::decimal
)
")
;; ==================================
;; Inline SQL Statement CASTS
;; ==================================
(def offers-insert "
INSERT INTO offers (
trading_date,
trading_period,
tranche,
participant,
point_of_connection,
submission_order,
product_type,
unit,
product_description,
product_class,
reserve_type,
utc_submission_date,
utc_submission_time,
is_latest,
max_ramp_up_mw_per_hour,
max_ramp_down_mw_per_hour,
plsr_percent,
max_output_mw,
forecast_generation_potential_mw,
megawatts,
dollars_per_mwh
)
VALUES (
CAST(? AS TIMESTAMP),
CAST(? AS INTEGER),
CAST(? AS INTEGER),
CAST(? AS VARCHAR),
CAST(? AS VARCHAR),
CAST(? AS INTEGER),
CAST(? AS VARCHAR),
CAST(? AS VARCHAR),
CAST(? AS VARCHAR),
CAST(? AS VARCHAR),
CAST(? AS VARCHAR),
CAST(? AS DATE),
CAST(? AS VARCHAR),
CAST(? AS BOOLEAN),
CAST(? AS DECIMAL),
CAST(? AS DECIMAL),
CAST(? AS DECIMAL),
CAST(? AS DECIMAL),
CAST(? AS DECIMAL),
CAST(? AS DECIMAL),
CAST(? AS DECIMAL)
)")
;; =====================================
;; Prepared Statement setX parameters
;; =====================================
(def offers-insert "
INSERT INTO offers (
trading_date,
trading_period,
tranche,
participant,
point_of_connection,
submission_order,
product_type,
unit,
product_description,
product_class,
reserve_type,
utc_submission_date,
utc_submission_time,
is_latest,
max_ramp_up_mw_per_hour,
max_ramp_down_mw_per_hour,
plsr_percent,
max_output_mw,
forecast_generation_potential_mw,
megawatts,
dollars_per_mwh
)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
")
(defn record->params [r]
[ (types/as-timestamp (:trading_date r))
(types/as-integer (:trading_period r))
(types/as-integer (:tranche r))
(types/as-varchar (:participant r))
(types/as-varchar (:point_of_connection r))
(types/as-integer (:submission_order r))
(types/as-varchar (:product_type r))
(types/as-varchar (:unit r))
(types/as-varchar (:product_description r))
(types/as-varchar (:product_class r))
(types/as-varchar (:reserve_type r))
(types/as-date (:utc_submission_date r))
(types/as-varchar (:utc_submission_time r))
(types/as-boolean (:is_latest r))
(types/as-numeric (:max_ramp_up_mw_per_hour r))
(types/as-numeric (:max_ramp_down_mw_per_hour r))
(types/as-numeric (:plsr_percent r))
(types/as-numeric (:max_output_mw r))
(types/as-numeric (:forecast_generation_potential_mw r))
(types/as-numeric (:megawatts r))
(types/as-numeric (:dollars_per_mwh r))])
(defn insert-csv-records!
[connection seq-m-csv-records]
;; Process records one by one to avoid batch issues with XTDB's type requirements
(t/log! :info ["Processing" (count seq-m-csv-records) "records..."])
(with-open [ps (jdbc/prepare connection [offers-insert])]
(doseq [r seq-m-csv-records]
(jdbc/execute-batch! ps [(record->params r)]))))
Give the AI a crack
It was time to give the AI overloads a crack, first I fired up Claude Code on the Opus model. I love Claude Code and use it most days at work but it failed to make any progress. It basically went around in circles using different next.jdbc/bind parameter incantations but never actually solved the problem.
I started an OpenAI session on the 03 model and provided it with the full namespace as input, along with the error. It generated a number of suggestions, but again, none of them worked.
It seemed that the SQL bind parameter approach I know and love was no longer viable. Next, I decided to attempt the nested 'RECORDS' syntax from the Clojure driver documentation. It deviates from standard JDBC, and XTDB is sold as a SQL database, but ah well *shrug*.
Nested Record Inserts
Well, now the syntax was going to be much simpler, I started with:
(def offers-insert "INSERT INTO offers RECORDS ?")
And was ready to go with a plain Clojure map, executed it and....
Execution error (PSQLException) at org.postgresql.jdbc.PgPreparedStatement/setMap (PgPreparedStatement.java:547).
No hstore extension installed.
By this stage, I'd run into so many errors that I was determined to get something to work. All that remained was the insertion of a nested JSON doc.
(defn map->xtdb-record-string
"Convert a Clojure map to XTDB record format string"
[m]
(str "{"
(str/join ", "
(map (fn [[k v]]
(str (name k) ": "
(cond
(nil? v) "NULL"
(string? v) (str "'" (str/replace v "'" "''") "'")
(instance? java.sql.Timestamp v) (str "TIMESTAMP '" v "'")
(instance? java.sql.Date v) (str "DATE '" v "'")
(instance? Boolean v) (str v)
:else (str v))))
m))
"}"))
(defn insert-csv-records!
[connection seq-m-csv-records]
;; Use XTDB's RECORDS syntax with string concatenation
(t/log! :info ["Processing" (count seq-m-csv-records) "records..."])
(doseq [[idx r] (map-indexed vector seq-m-csv-records)]
(let [r-with-id (assoc r :_id (inc idx))
record-str (map->xtdb-record-string r-with-id)
sql (str "INSERT INTO offers RECORDS " record-str)]
(jdbc/execute! connection [sql]))))
In the end, it was a hack using string concatenation, but I was able to insert the records; this process took over an hour for ~270k records with no batching or overarching transaction.
This allowed me to run some entry-level analytic queries, but even then, I hit parsing problems when trying to issue a query that used the ROUND operation.
SELECT COUNT(*) FROM offers
/
+---------+
|_column_1|
+---------+
|271734 |
+---------+
SELECT participant,
point_of_connection,
trading_date,
trading_period,
tranche,
is_latest,
megawatts,
dollars_per_mwh,
reserve_type
FROM OFFERS o
WHERE participant = 'CTCT'
AND point_of_connection = 'CYD2201'
AND trading_period = 24
AND tranche = 1
AND is_latest = true
AND reserve_type IS NULL
/
+-----------+-------------------+--------------------------+--------------+-------+---------+---------+---------------+------------+
|participant|point_of_connection|trading_date |trading_period|tranche|is_latest|megawatts|dollars_per_mwh|reserve_type|
+-----------+-------------------+--------------------------+--------------+-------+---------+---------+---------------+------------+
|CTCT |CYD2201 |2024-06-10 00:00:00.000000|24 |1 |true |185 |0.01 |null |
+-----------+-------------------+--------------------------+--------------+-------+---------+---------+---------------+------------+
-- With Auto Generated BiTemporal Fields
SELECT _id,
_system_from,
_system_to,
_valid_from,
_valid_to,
participant,
point_of_connection,
trading_date
FROM OFFERS o
WHERE participant = 'CTCT'
AND point_of_connection = 'CYD2201'
AND trading_period = 24
AND tranche = 1
AND is_latest = true
AND reserve_type IS NULL;
+-----+---------------------------------+----------+---------------------------------+---------+-----------+-------------------+--------------------------+
|_id |_system_from |_system_to|_valid_from |_valid_to|participant|point_of_connection|trading_date |
+-----+---------------------------------+----------+---------------------------------+---------+-----------+-------------------+--------------------------+
|60733|2025-07-20 08:42:25.836001 +00:00|null |2025-07-20 08:42:25.836001 +00:00|null |CTCT |CYD2201 |2024-06-10 00:00:00.000000|
+-----+---------------------------------+----------+---------------------------------+---------+-----------+-------------------+--------------------------+
-- Aggregate Doesn't support ROUND appearently
SELECT o.participant,
ROUND(SUM(o.megawatts), 2) megawatts_offered,
MIN(o.dollars_per_mwh) min_offer_dollar,
MAX(o.dollars_per_mwh) max_offer_dollar,
ROUND(AVG(o.dollars_per_mwh), 2) avg_offer_dollar
FROM offers o
WHERE o.is_latest = true
AND o.reserve_type IS NULL
GROUP BY o.participant
ORDER BY 1
[2025-07-21 20:36:07] [08P01] ERROR: Errors parsing SQL statement:
[2025-07-21 20:36:07] - line 2:12 mismatched input '(' expecting {<EOF>, ';'}
[2025-07-21 20:36:07] Detail: {"category":"cognitect.anomalies\/incorrect","code":"xtdb\/sql-error","message":"Errors parsing SQL statement:\n - line 2:12 mismatched input '(' expecting {<EOF>, ';'}"}
-- Without ROUND
SELECT o.participant,
SUM(o.megawatts) megawatts_offered,
MIN(o.dollars_per_mwh) min_offer_dollar,
MAX(o.dollars_per_mwh) max_offer_dollar,
AVG(o.dollars_per_mwh) avg_offer_dollar
FROM offers o
WHERE o.is_latest = true
AND o.reserve_type IS NULL
GROUP BY o.participant
ORDER BY 1
+-----------+------------------+----------------+----------------+---------------------+
|participant|megawatts_offered |min_offer_dollar|max_offer_dollar|avg_offer_dollar |
+-----------+------------------+----------------+----------------+---------------------+
|ALNT |3072 |0 |0 |0 |
|CHHE |2016 |0 |5 |2.8 |
|CNIR |22519 |0 |5005 |775.3752719298184 |
|CTCT |86208.57899999991 |0 |6500 |1822.3974270833132 |
|GENE |74258.40000000007 |0 |4999.07 |1118.5210448717783 |
|KING |468 |0 |325 |61.208333333333336 |
|LODE |2304 |0 |0.01 |0.0020000000000000013|
|MERI |132192.00000000023|0 |500 |107.60357142857268 |
|MRPL |71734.20000000001 |0 |5000.03 |1047.6626517857144 |
|NAPJ |6603 |0 |5000.03 |4000.0133333333424 |
|NGAG |1488 |0.01 |1000 |217.0020000000002 |
|NPEL |15.343999999999996|0 |1000 |109.80920833333327 |
|NZWF |2328 |0 |0.01 |0.002000000000000001 |
|PUNZ |616.28 |0 |0 |0 |
|TAOM |983 |0 |200 |46.4 |
|TODD |7456.499999999997 |0 |1500 |300.0040000000003 |
|TUAR |3883 |0 |5000.03 |4000.012000000008 |
+-----------+------------------+----------------+----------------+---------------------+
Summary
Unfortunately, working with XTDB proved to be a frustrating experience. A SQL database should be far easier to populate than this. There are so many strong relational databases including plain Postgres itself that XTDB is a hard sell.
The automatically added bitemporal fields are nice to have, but they're hardly a game changer. At Red Energy, we have been using these bitemporal models for 20 years.
I really need to mention the error messages again. They were misleading and burned developer time trying to interpret them. In the end, it was better to pretend there was no error message and to just reason about what the problem might be.
There are only so many times I can slowly shake my head in frustration when another Clojure-based offering prints some unintelligible gibberish instead of a helpful error message.