CockroachDB TIL: Volume 6 – DZone Database

This is my series of articles covering short “Today I learned” topics as I work with CockroachDB.

Topic 1: Cockroach Init Container

Use init container to initialize Cockroach and exit, you no longer need to explicitly run init when you bring up CockroachDB.

version: '3.9'

services:

  roach-0:
    container_name: roach-0
    hostname: roach-0
    image: cockroachdb/cockroach:v21.2.4
    command: start --logtostderr=WARNING --log-file-verbosity=WARNING --insecure --join=roach-0,roach-1,roach-2 --listen-addr=roach-0:26257 --advertise-addr=roach-0:26257 --max-sql-memory=.25 --cache=.25
    ports:
      - 26257:26257
      - 8080:8080
    healthcheck:
      test: ["CMD", "curl", "http://roach-0:8080/health?ready=1"]
      interval: 2s
      timeout: 1m
      retries: 5
      start_period: 10s

  roach-1:
    container_name: roach-1
    hostname: roach-1
    image: cockroachdb/cockroach:v21.2.4
    command: start --logtostderr=WARNING --log-file-verbosity=WARNING --insecure --join=roach-0,roach-1,roach-2 --listen-addr=roach-1:26257 --advertise-addr=roach-1:26257 --max-sql-memory=.25 --cache=.25
    ports:
      - 26258:26257
      - 8081:8080
    healthcheck:
      test: ["CMD", "curl", "http://roach-1:8080/health?ready=1"]
      interval: 2s
      timeout: 1m
      retries: 5
      start_period: 10s
    
  roach-2:
    container_name: roach-2
    hostname: roach-2
    image: cockroachdb/cockroach:v21.2.4
    command: start --logtostderr=WARNING --log-file-verbosity=WARNING --insecure --join=roach-0,roach-1,roach-2 --listen-addr=roach-2:26257 --advertise-addr=roach-2:26257 --max-sql-memory=.25 --cache=.25
    ports:
      - 26259:26257
      - 8082:8080
    healthcheck:
      test: ["CMD", "curl", "http://roach-2:8080/health?ready=1"]
      interval: 2s
      timeout: 1m
      retries: 5
      start_period: 10s

  init:
    image: cockroachdb/cockroach:v21.2.4
    command: init --insecure --host=roach-0

When docker-compose comes up, you can observe the following:

oach-1   | *
roach-1   | * INFO: initial startup completed.
roach-1   | * Node will now attempt to join a running cluster, or wait for `cockroach init`.
roach-1   | * Client connections will be accepted after this completes successfully.
roach-1   | * Check the log file(s) for progress. 
roach-1   | *
5-init-1  | Cluster successfully initialized
roach-0   | W220121 20:03:55.536091 72 2@gossip/gossip.go:1486 ⋮ [n?] 4  no incoming or outgoing connections
5-init-1 exited with code 0
roach-2   | W220121 20:03:55.781539 48 2@gossip/gossip.go:1486 ⋮ [n?] 4  no incoming or outgoing connections
roach-0   | CockroachDB node starting at 2022-01-21 20:03:55.7840506 +0000 UTC (took 1.1s)
roach-0   | build:               CCL v21.2.4 @ 2022/01/10 18:50:15 (go1.16.6)
roach-0   | webui:               http://roach-0:8080
roach-0   | sql:                 postgresql://root@roach-0:26257/defaultdb?sslmode=disable
roach-0   | sql (JDBC):          jdbc:postgresql://roach-0:26257/defaultdb?sslmode=disable&user=root
roach-0   | RPC client flags:    /cockroach/cockroach <client cmd> --host=roach-0:26257 --insecure

More specifically:

5-init-1  | Cluster successfully initialized
5-init-1 exited with code 0

Topic 2: Chain Column’s Schema Changes Into a Single Statement

Considering the following two schemas:

CREATE TABLE tbl (
    key UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    col1 STRING,
    col2 STRING
);

With change statements, it is more efficient to execute the two statements on a single line. In that way, CockroachDB will execute the statements in a single hop.

ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL;
ALTER TABLE tbl ALTER COLUMN col1 SET DEFAULT '';

You may wrap the two statements in an explicit transaction and gain some efficiency in terms of executing the batch as a single block.

BEGIN;
ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL;
ALTER TABLE tbl ALTER COLUMN col1 SET DEFAULT '';
COMMIT;

Perhaps there’s a better way, we can rewrite the schema change statement as a single statement and vary add the subsequent column changes including columns separated by a comma.

ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL, ALTER COLUMN col1 SET DEFAULT '', ALTER COLUMN col2 SET NOT NULL;

Topic 3: Turn Off fsync (Unsafe)

I was on the fence about whether or not to include the next tip, as there are severe implications in production environments. However, for local development where data is not relevant, it can yield some efficiency and performance. CockroachDB relies on fsync to flush data on disk, so no surprise it is also slow. Considering the example from topic 2:

ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL, ALTER COLUMN col1 SET DEFAULT '', ALTER COLUMN col2 SET NOT NULL;

Output of the command:

root@:26257/test> ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL, ALTER COLUMN col1 SET DEFAULT '', ALTER COLUMN col2 SET NOT NULL;
ALTER TABLE

Time: 382ms total (execution 382ms / network 0ms)

The execution stats for this statement do not indicate any slowness. Considering I’m using a Macbook, our engineering suggested turning off fsyncconsidering the implications!

SET CLUSTER SETTING kv.raft_log.disable_synchronization_unsafe = true;
root@:26257/test> SET CLUSTER SETTING kv.raft_log.disable_synchronization_unsafe = true;
SET CLUSTER SETTING

Time: 111ms total (execution 111ms / network 0ms)

root@:26257/test> ALTER TABLE tbl ALTER COLUMN col1 SET NOT NULL, ALTER COLUMN col1 SET DEFAULT '', ALTER COLUMN col2 SET NOT NULL;
ALTER TABLE

Time: 29ms total (execution 29ms / network 0ms)

Things are instantly much faster. Again, please use this carefully!

Topic 4: Use .pgpass With Java pgjdbc Driver

Pgjdbc release 42.3.0 added an option to read the contents of .pgpass file and or environment variable PGPASSFILE. That makes me very happy, as I think any chance I can secure a password from prying eyes is cool in my book.

What we will need is to set up .pgpass file. You can find instructions in my 2nd TIL article, Topic 2. At this point, we need a Java program. Let’s use the sample Java app in our docs. Specifically, we’re going to use the JDBC/Serverless steps. These steps will also walk you through setting up a serverless cluster we are going to use for this topic.

git clone https://github.com/cockroachlabs/example-app-java-jdbc/
cd example-app-java-jdbc
git checkout cockroachcloud

Go to the initialize the database step and run the command pointing to your cluster

cat app/src/main/resources/dbinit.sql | cockroach sql --url "<connection-string>"

I saved the connection string in an environment variable $COCKROACH_URL for simplicity, as described in the TIL Volume 2, Topic 1.

cat app/src/main/resources/dbinit.sql | cockroach sql --url $COCKROACH_URL

Before we run the code, we need to update the connection parameters. Unlike the tutorial in the docs, we’re going to intentionally leave out the password property.

ds.setServerNames(new String[]{"free-tier14.aws-us-east-1.cockroachlabs.cloud"});
        ds.setPortNumbers(new int[]{26257});
        ds.setDatabaseName("artem-serverless-480.bank");
        ds.setSsl(true);T
        ds.setUser("artem");
        //ds.setPassword("{password}");

Finally, we need to update the dependency file to use the latest pgjdbc driver, which at the time of writing is 42.3.3.

You can find the file in ./example-app-java-jdbc/app/build.gradle.

 dependencies {
     // This dependency is used by the application.
     implementation 'com.google.guava:guava:29.0-jre'
-    implementation 'org.postgresql:postgresql:42.2.18'
+    implementation 'org.postgresql:postgresql:42.3.3'
 }```

At this point, we're ready to run the code. The driver will either read the contents of the `.pgpass` file, property `-Dorg.postgresql.pgpassfile=filename` or `PGPASSFILE=` environment variable. Let me show you each way:

This is a gradle project, passing an environment variable to the project looks like so:

```bash
PGPASSFILE=/tmp/.pgpass ./gradlew run

I’m using a temporary directory to illustrate the point, by default, the driver will read the location of $HOME/.pgpass.

> Task :app:run

com.cockroachlabs.BasicExampleDAO.updateAccounts:
    'INSERT INTO accounts (id, balance) VALUES ('fc4da00e-1240-4578-9bdb-aaf873c61ba8', 250)'

com.cockroachlabs.BasicExampleDAO.updateAccounts:
    'INSERT INTO accounts (id, balance) VALUES ('e469ce3a-2352-47a7-b79e-a8f7fa3eb6ab', 1000)'
BasicExampleDAO.updateAccounts:
    => 2 total updated accounts
main:
    => Account balances at time '17:32:20.237880':
    ID 1 => $1000
    ID 2 => $250

com.cockroachlabs.BasicExampleDAO.transferFunds:
    'UPSERT INTO accounts (id, balance) VALUES('2d89869e-f274-4d21-b538-b2440c4f94fd', ((SELECT balance FROM accounts WHERE id = '2d89869e-f274-4d21-b538-b2440c4f94fd') - 100)),('75473622-9889-4cfb-9c32-30dba19ae116', ((SELECT balance FROM accounts WHERE id = '75473622-9889-4cfb-9c32-30dba19ae116') + 100))'

So far, so good. Let’s run this with the Java argument. For that we’re going to edit that same build.gradle file and add a new line.

 application {
     // Define the main class for the application.
     mainClass="com.cockroachlabs.BasicExample"
     applicationDefaultJvmArgs = ['-Djdk.tls.client.protocols=TLSv1.2']
+    applicationDefaultJvmArgs = ['-Dorg.postgresql.pgpassfile=/tmp/.pgpass']
 }

Run again to verify.

./gradlew run

Finally, remove the argument and place the .pgpass file back in $HOME. Once done, run the code again without passing any arguments. It should complete successfully.

For posterity, this is what it would look like if we were unable to read the password from the file:

> Task :app:run
BasicExampleDAO.runSQL ERROR: { state => 08004, cause => null, message => The server requested password-based authentication, but no password was provided by plugin null 

Topic 5: Avoiding Cruft With cockroach demo

Cockroach demo is an invaluable tool assisting in the demo of CockroachDB capabilities. Another method of spinning up an in-memory instance of CockroachDB without persisting any data is using --store=type=mem command. My gripe with both options is that when you spin up either, they leave residual files in the directory. With the demo, you will find inflight_trace_dump and in case of --store=type=mem you will additionally find heap_profiler and goroutine_dump. The way to avoid the creation of those directories is to set your directory read-only and both methods will still work.

mkdir readwrite && cd readwrite

cockroach start-single-node --insecure --store=type=mem,size=0.25 --advertise-addr=localhost --background

In another terminal, issue ls -ltra readwrite:

drwx------@ 464 artem  staff  14848 Mar  7 15:28 ..
drwxr-x---    2 artem  staff     64 Mar  7 16:43 inflight_trace_dump
drwxr-x---    2 artem  staff     64 Mar  7 16:43 goroutine_dump
drwxr-xr-x    5 artem  staff    160 Mar  7 16:43 .
drwxr-x---    6 artem  staff    192 Mar  7 16:44 heap_profiler

You can try the same with demo, and you will notice inflight_trace_dump directory.

stop the process, pkill cockroach and let’s now try with a read-only directory:

mkdir readonly
chmod -w readonly
cd readonly
cockroach start-single-node --insecure --store=type=mem,size=0.25 --advertise-addr=localhost --background

In another terminal, issue ls -ltra readonly:

➜  readonly ls -ltra
total 0
dr-xr-xr-x    2 artem  staff     64 Mar  7 15:26 .
drwx------@ 464 artem  staff  14848 Mar  7 15:28 ..

.

Leave a Comment