Unlocking Scalability Magic: Run Postgres Read-Write Nodes on Docker and Implement it with Python’s SQLAlchemy

Fahad Ahammed
10 min read · Dec 16, 2023

Scaling your PostgreSQL database for high-performance read and write operations is crucial for modern applications. When you use a managed Relational Database Service like Aurora RDS (PostgreSQL) from AWS, you get one primary node and the option of multiple reader nodes.

For better high availability, AWS ensures that a failure of the primary node automatically fails over to one of the reader nodes, promoting it to primary.

A simplified visualization of Aurora's reader and primary endpoints

An AWS Aurora cluster offers two endpoints in FQDN form. Those DNS records list the nodes the cluster contains and carry short TTL values (5 seconds). Internally, Aurora handles

  • DNS entries for added and removed reader nodes.
  • DNS changes when a primary node fails over to a reader node.
  • DNS entries of reader nodes, served in round-robin order (see the resolution sketch below).
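To see this behavior from a client's point of view, you can resolve the endpoints yourself. Below is a minimal sketch, assuming a hypothetical cluster endpoint name (replace it with your own); note that OS-level resolver caching can mask the 5-second TTL.

import socket

# Hypothetical Aurora reader endpoint; replace with your cluster's FQDN.
READER_ENDPOINT = "mycluster.cluster-ro-abc123xyz.us-east-1.rds.amazonaws.com"

# With a 5-second TTL, repeated lookups over time may return the IPs of
# different reader instances as Aurora rotates the DNS answers.
for _ in range(3):
    try:
        infos = socket.getaddrinfo(READER_ENDPOINT, 5432, proto=socket.IPPROTO_TCP)
        print(sorted({info[4][0] for info in infos}))
    except socket.gaierror as ex:
        print("Resolution failed:", ex)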

Agenda

  • How to create a mock Aurora cluster using docker-compose?
  • How to split the read-write queries in SQLAlchemy?

Database Architecture

I will be creating:

  • Read-Write Node: This primary server handles all write operations and keeps the database in sync. It can be scaled vertically by increasing resources, or horizontally by adding more replicas.
  • Reader Nodes: These secondary servers handle read-only queries, offloading pressure from the primary node and improving read performance. They can be scaled horizontally without affecting writes.

As I will be working on my workstation and do not want to depend on DNS, I am choosing HAProxy to load-balance between the reader nodes with the “leastconn” algorithm.

An approach to mock the Aurora endpoints, using HAProxy endpoints instead of DNS

How to create a mock Aurora Cluster using docker-compose?

The directory structure follows:

$ tree
.
├── docker-compose.yaml
└── haproxy.cfg

0 directories, 2 files

In the same directory, there is a haproxy.cfg file needed for our load-balanced workload.

docker-compose.yaml

version: '3.9'

volumes:
  postgresql_rw_data:
    driver: local

services:
  postgresql-writer:
    image: bitnami/postgresql:16
    restart: always
    ports:
      - '5432:5432'
    volumes:
      - postgresql_rw_data:/bitnami/postgresql
    environment:
      - POSTGRESQL_PGAUDIT_LOG=READ,WRITE
      - POSTGRESQL_LOG_HOSTNAME=true
      - POSTGRESQL_REPLICATION_MODE=master
      - POSTGRESQL_REPLICATION_USER=repl_user
      - POSTGRESQL_REPLICATION_PASSWORD=repl_user
      - POSTGRESQL_USERNAME=postgres
      - POSTGRESQL_PASSWORD=postgres
      - POSTGRESQL_DATABASE=development_database
      - ALLOW_EMPTY_PASSWORD=yes
      - TZ=Asia/Dhaka

  postgresql-reader-1:
    image: bitnami/postgresql:16
    restart: always
    ports:
      - '25432:5432'
    depends_on:
      - postgresql-writer
    environment:
      - POSTGRESQL_PASSWORD=postgres
      - POSTGRESQL_MASTER_HOST=postgresql-writer
      - POSTGRESQL_PGAUDIT_LOG=READ
      - POSTGRESQL_LOG_HOSTNAME=true
      - POSTGRESQL_REPLICATION_MODE=slave
      - POSTGRESQL_REPLICATION_USER=repl_user
      - POSTGRESQL_REPLICATION_PASSWORD=repl_user
      - POSTGRESQL_MASTER_PORT_NUMBER=5432
      - ALLOW_EMPTY_PASSWORD=yes
      - TZ=Asia/Dhaka

  postgresql-reader-2:
    image: bitnami/postgresql:16
    restart: always
    ports:
      - '35432:5432'
    depends_on:
      - postgresql-writer
      - postgresql-reader-1
    environment:
      - POSTGRESQL_PASSWORD=postgres
      - POSTGRESQL_MASTER_HOST=postgresql-writer
      - POSTGRESQL_PGAUDIT_LOG=READ
      - POSTGRESQL_LOG_HOSTNAME=true
      - POSTGRESQL_REPLICATION_MODE=slave
      - POSTGRESQL_REPLICATION_USER=repl_user
      - POSTGRESQL_REPLICATION_PASSWORD=repl_user
      - POSTGRESQL_MASTER_PORT_NUMBER=5432
      - ALLOW_EMPTY_PASSWORD=yes
      - TZ=Asia/Dhaka

  haproxy:
    image: haproxy:2.8
    depends_on:
      - postgresql-writer
      - postgresql-reader-1
      - postgresql-reader-2
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    ports:
      - 5433:5433
      - 5434:5434
      - 8404:8404
    environment:
      - TZ=Asia/Dhaka

The docker-compose file contains hard-coded database credentials, which is not at all suitable for production use.

A brief about the “docker-compose.yaml” file:

  • It uses a volume block for the writer node's data.
  • It uses an image from Docker Hub called “bitnami/postgresql:16”, which denotes that the image is prepared by Bitnami and the PostgreSQL version is 16.
  • A “haproxy” configuration is created on the host and mounted into the “haproxy” container (a replication sanity check for the running stack follows this list).
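Once the stack is up (see the Run section below), a quick sanity check is to ask each node whether it is in recovery mode. A minimal sketch using psycopg2, assuming the host port mappings from the compose file above:

import psycopg2

# Host ports as mapped in docker-compose.yaml
nodes = {"writer": 5432, "reader-1": 25432, "reader-2": 35432}

for name, port in nodes.items():
    conn = psycopg2.connect(
        host="127.0.0.1",
        port=port,
        user="postgres",
        password="postgres",
        dbname="development_database",
    )
    with conn.cursor() as cur:
        # Standbys (replicas) report true; the primary reports false.
        cur.execute("SELECT pg_is_in_recovery()")
        print(f"{name}: in_recovery={cur.fetchone()[0]}")
    conn.close()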

haproxy.cfg

defaults
    mode http
    timeout client 24h
    timeout connect 24h
    timeout server 24h
    timeout http-request 10s
    log global

frontend stats
    bind *:8404
    stats enable
    stats uri /
    stats refresh 10s

frontend postgresql_ro
    bind *:5434
    mode tcp
    default_backend postgresql_ro_hosts

backend postgresql_ro_hosts
    mode tcp
    balance leastconn
    option tcp-check
    server postgresql-reader-1 postgresql-reader-1:5432 check inter 3s fall 3
    server postgresql-reader-2 postgresql-reader-2:5432 check inter 3s fall 3
    server postgresql-writer postgresql-writer:5432 check backup inter 3s fall 3

frontend postgresql_rw
    bind *:5433
    mode tcp
    default_backend postgresql_rw_hosts

backend postgresql_rw_hosts
    mode tcp
    balance leastconn
    option tcp-check
    server postgresql-writer postgresql-writer:5432 check inter 3s fall 3

A brief about “haproxy” configuration:

  • Used two “frontend” sections as endpoints: one for the writer container, one for the reader containers.
  • Each “frontend” points to its respective backend, which addresses the nodes by container name. Both frontends and backends use TCP mode.
  • There are only very basic checks on the “backends”: whether the port is accessible and alive.
  • Not using a PostgreSQL-specific health check for the nodes.
  • In the reader backend, the writer host is used as a backup for a better failover story (a client-side balancing check follows this list).
  • This configuration needs more careful optimization and tweaks for production. It is not suggested to run it in production as-is; it may cause closed-connection issues, among others.
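To observe the balancing from the client side, you can open several connections through the reader frontend and ask each backend for its own address. A small sketch, assuming psycopg2 and the host port mappings from the Run section below; with sequential short-lived connections the distribution is only approximate, since “leastconn” really matters under concurrent load.

import psycopg2

# HAProxy reader frontend, as exposed on the host (port 5434)
addresses = []
for _ in range(6):
    conn = psycopg2.connect(
        host="127.0.0.1",
        port=5434,
        user="postgres",
        password="postgres",
        dbname="development_database",
    )
    with conn.cursor() as cur:
        # inet_server_addr() returns the address of the backend we landed on
        cur.execute("SELECT inet_server_addr()")
        addresses.append(str(cur.fetchone()[0]))
    conn.close()

print(addresses)  # should alternate between the two reader containers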

Run

$ docker compose up -d
$ docker compose ps
NAME                        IMAGE                   COMMAND                                                                                  SERVICE               CREATED          STATUS          PORTS
ddb-haproxy-1 haproxy:2.8 "docker-entrypoint.sh haproxy -f /usr/local/etc/haproxy/haproxy.cfg" haproxy 36 minutes ago Up 35 minutes 0.0.0.0:5433-5434->5433-5434/tcp, 0.0.0.0:8404->8404/tcp
ddb-postgresql-reader-1-1 bitnami/postgresql:16 "/opt/bitnami/scripts/postgresql/entrypoint.sh /opt/bitnami/scripts/postgresql/run.sh" postgresql-reader-1 36 minutes ago Up 35 minutes 0.0.0.0:25432->5432/tcp
ddb-postgresql-reader-2-1 bitnami/postgresql:16 "/opt/bitnami/scripts/postgresql/entrypoint.sh /opt/bitnami/scripts/postgresql/run.sh" postgresql-reader-2 36 minutes ago Up 35 minutes 0.0.0.0:35432->5432/tcp
ddb-postgresql-writer-1 bitnami/postgresql:16 "/opt/bitnami/scripts/postgresql/entrypoint.sh /opt/bitnami/scripts/postgresql/run.sh" postgresql-writer 36 minutes ago Up 35 minutes 0.0.0.0:5432->5432/tcp

From this, we get the following endpoints:

  • postgresql-writer: 0.0.0.0:5432
  • postgresql-reader-1: 0.0.0.0:25432
  • postgresql-reader-2: 0.0.0.0:35432
  • haproxy writer: 0.0.0.0:5433
  • haproxy reader: 0.0.0.0:5434

There is another port opened by “haproxy” (8404), which serves the HAProxy statistics web view; that deserves an article of its own.

How to split the read-write queries in SQLAlchemy?

I am going to use a simplified Python code for the testing.

read_write_splits_sqlalchemy.py

from sqlalchemy import create_engine
from sqlalchemy.orm import declarative_base
from sqlalchemy import Column, Integer, String, DateTime, func

db_configs = {
    "writer": {
        "host": "127.0.0.1",
        "username": "postgres",
        "password": "postgres",
        "db": "development_database",
        "port": 5433,
    },
    "reader": {
        "host": "127.0.0.1",
        "username": "postgres",
        "password": "postgres",
        "db": "development_database",
        "port": 5434,
    },
}

# Define connection strings for the write and read endpoints (via HAProxy)
writer_uri = f'postgresql://{db_configs.get("writer").get("username")}:{db_configs.get("writer").get("password")}@{db_configs.get("writer").get("host")}:{db_configs.get("writer").get("port")}/{db_configs.get("writer").get("db")}'
reader_uri = f'postgresql://{db_configs.get("reader").get("username")}:{db_configs.get("reader").get("password")}@{db_configs.get("reader").get("host")}:{db_configs.get("reader").get("port")}/{db_configs.get("reader").get("db")}'

# Create one engine per endpoint
primary_engine = create_engine(writer_uri, echo=True)
reader_engine = create_engine(reader_uri, echo=True)


# base of schema
Base = declarative_base()


# movie table schema
class Movie(Base):
    __tablename__ = "movies"

    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String(255), nullable=False, unique=True)
    year = Column(Integer, nullable=False)
    created_at = Column(DateTime, nullable=False, default=func.now())
    updated_at = Column(DateTime, nullable=False, default=func.now(), onupdate=func.now())

    def __repr__(self):
        return f"<Movie(id={self.id}, name={self.name}, year={self.year}, created_at={self.created_at}, updated_at={self.updated_at})>"


def create_schema():
    # DDL must go through the writer endpoint
    Base.metadata.create_all(primary_engine)


if __name__ == "__main__":
    create_schema()

    from sqlalchemy.orm import sessionmaker

    writer_session = sessionmaker(autocommit=False, autoflush=False, bind=primary_engine)
    reader_session = sessionmaker(autocommit=False, autoflush=False, bind=reader_engine)

    try:
        with writer_session() as session:
            # Create new movies through the writer endpoint
            movie1 = Movie(name="The Matrix", year=1999)
            movie2 = Movie(name="Avatar", year=2009)
            session.add(movie1)
            session.add(movie2)
            session.commit()
    except Exception as ex:
        print("Writer Session: ", ex)

    try:
        with reader_session() as session:
            # Try to create movies using the reader session/endpoint
            # (expected to fail: replicas are read-only)
            movie2 = Movie(name="Frozen", year=2013)
            movie3 = Movie(name="Titanic", year=1997)
            session.add(movie2)
            session.add(movie3)
            session.commit()
    except Exception as ex:
        print("Reader Session: ", ex)

    with reader_session() as session:
        # Retrieve the movies through the reader endpoint
        retrieved_movie = session.query(Movie).all()
        print(f"Retrieved movies: {retrieved_movie}")
        session.commit()

Brief:

  • Created two sessions using two engines bound to the endpoints obtained from HAProxy.
  • Using “primary_engine”, the schema is created.
  • Using primary_engine, created writer_session; using reader_engine, created reader_session.
  • In the first session, added two movies.
  • In the second session, tried to use reader_session to insert two more movies, expecting to get an error.
  • In the third session, simply read the data back from the table.

Try 1

$ python3 read_write_splits_sqlalchemy.py 
2023-12-17 02:24:55,418 INFO sqlalchemy.engine.Engine select pg_catalog.version()
2023-12-17 02:24:55,418 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:24:55,433 INFO sqlalchemy.engine.Engine select current_schema()
2023-12-17 02:24:55,433 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:24:55,434 INFO sqlalchemy.engine.Engine show standard_conforming_strings
2023-12-17 02:24:55,434 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:24:55,435 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2023-12-17 02:24:55,437 INFO sqlalchemy.engine.Engine SELECT pg_catalog.pg_class.relname
FROM pg_catalog.pg_class JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace
WHERE pg_catalog.pg_class.relname = %(table_name)s AND pg_catalog.pg_class.relkind = ANY (ARRAY[%(param_1)s, %(param_2)s, %(param_3)s, %(param_4)s, %(param_5)s]) AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != %(nspname_1)s
2023-12-17 02:24:55,438 INFO sqlalchemy.engine.Engine [generated in 0.00017s] {'table_name': 'movies', 'param_1': 'r', 'param_2': 'p', 'param_3': 'f', 'param_4': 'v', 'param_5': 'm', 'nspname_1': 'pg_catalog'}
2023-12-17 02:24:55,440 INFO sqlalchemy.engine.Engine
CREATE TABLE movies (
id SERIAL NOT NULL,
name VARCHAR(255) NOT NULL,
year INTEGER NOT NULL,
created_at TIMESTAMP WITHOUT TIME ZONE NOT NULL,
updated_at TIMESTAMP WITHOUT TIME ZONE NOT NULL,
PRIMARY KEY (id),
UNIQUE (name)
)


2023-12-17 02:24:55,440 INFO sqlalchemy.engine.Engine [no key 0.00010s] {}
2023-12-17 02:24:55,449 INFO sqlalchemy.engine.Engine COMMIT
2023-12-17 02:24:55,452 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2023-12-17 02:24:55,453 INFO sqlalchemy.engine.Engine INSERT INTO movies (name, year, created_at, updated_at) SELECT p0::VARCHAR, p1::INTEGER, p2::TIMESTAMP WITHOUT TIME ZONE, p3::TIMESTAMP WITHOUT TIME ZONE FROM (VALUES (%(name__0)s, %(year__0)s, now(), now(), 0), (%(name__1)s, %(year__1)s, now(), now(), 1)) AS imp_sen(p0, p1, p2, p3, sen_counter) ORDER BY sen_counter RETURNING movies.id, movies.created_at, movies.updated_at, movies.id AS id__1
2023-12-17 02:24:55,453 INFO sqlalchemy.engine.Engine [generated in 0.00006s (insertmanyvalues) 1/1 (ordered)] {'year__0': 1999, 'name__0': 'The Matrix', 'year__1': 2009, 'name__1': 'Avatar'}
2023-12-17 02:24:55,456 INFO sqlalchemy.engine.Engine COMMIT
2023-12-17 02:24:55,470 INFO sqlalchemy.engine.Engine select pg_catalog.version()
2023-12-17 02:24:55,470 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:24:55,471 INFO sqlalchemy.engine.Engine select current_schema()
2023-12-17 02:24:55,471 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:24:55,472 INFO sqlalchemy.engine.Engine show standard_conforming_strings
2023-12-17 02:24:55,472 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:24:55,472 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2023-12-17 02:24:55,473 INFO sqlalchemy.engine.Engine INSERT INTO movies (name, year, created_at, updated_at) SELECT p0::VARCHAR, p1::INTEGER, p2::TIMESTAMP WITHOUT TIME ZONE, p3::TIMESTAMP WITHOUT TIME ZONE FROM (VALUES (%(name__0)s, %(year__0)s, now(), now(), 0), (%(name__1)s, %(year__1)s, now(), now(), 1)) AS imp_sen(p0, p1, p2, p3, sen_counter) ORDER BY sen_counter RETURNING movies.id, movies.created_at, movies.updated_at, movies.id AS id__1
2023-12-17 02:24:55,473 INFO sqlalchemy.engine.Engine [generated in 0.00006s (insertmanyvalues) 1/1 (ordered)] {'year__0': 2013, 'name__0': 'Frozen', 'year__1': 1997, 'name__1': 'Titanic'}
2023-12-17 02:24:55,475 INFO sqlalchemy.engine.Engine ROLLBACK
Reader Session: (psycopg2.errors.ReadOnlySqlTransaction) cannot execute INSERT in a read-only transaction

[SQL: INSERT INTO movies (name, year, created_at, updated_at) SELECT p0::VARCHAR, p1::INTEGER, p2::TIMESTAMP WITHOUT TIME ZONE, p3::TIMESTAMP WITHOUT TIME ZONE FROM (VALUES (%(name__0)s, %(year__0)s, now(), now(), 0), (%(name__1)s, %(year__1)s, now(), now(), 1)) AS imp_sen(p0, p1, p2, p3, sen_counter) ORDER BY sen_counter RETURNING movies.id, movies.created_at, movies.updated_at, movies.id AS id__1]
[parameters: {'year__0': 2013, 'name__0': 'Frozen', 'year__1': 1997, 'name__1': 'Titanic'}]
(Background on this error at: https://sqlalche.me/e/20/2j85)
2023-12-17 02:24:55,476 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2023-12-17 02:24:55,476 INFO sqlalchemy.engine.Engine SELECT movies.id AS movies_id, movies.name AS movies_name, movies.year AS movies_year, movies.created_at AS movies_created_at, movies.updated_at AS movies_updated_at
FROM movies
2023-12-17 02:24:55,477 INFO sqlalchemy.engine.Engine [generated in 0.00012s] {}
Retrieved movies: [<Movie(id=1, name=The Matrix, year=1999, created_at=2023-12-16 20:24:55.454137, updated_at=2023-12-16 20:24:55.454137)>, <Movie(id=2, name=Avatar, year=2009, created_at=2023-12-16 20:24:55.454137, updated_at=2023-12-16 20:24:55.454137)>]
2023-12-17 02:24:55,478 INFO sqlalchemy.engine.Engine COMMIT

Try 2

$ python3 read_write_splits_sqlalchemy.py 
2023-12-17 02:33:40,351 INFO sqlalchemy.engine.Engine select pg_catalog.version()
2023-12-17 02:33:40,351 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:33:40,366 INFO sqlalchemy.engine.Engine select current_schema()
2023-12-17 02:33:40,366 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:33:40,367 INFO sqlalchemy.engine.Engine show standard_conforming_strings
2023-12-17 02:33:40,367 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:33:40,368 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2023-12-17 02:33:40,370 INFO sqlalchemy.engine.Engine SELECT pg_catalog.pg_class.relname
FROM pg_catalog.pg_class JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace
WHERE pg_catalog.pg_class.relname = %(table_name)s AND pg_catalog.pg_class.relkind = ANY (ARRAY[%(param_1)s, %(param_2)s, %(param_3)s, %(param_4)s, %(param_5)s]) AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != %(nspname_1)s
2023-12-17 02:33:40,370 INFO sqlalchemy.engine.Engine [generated in 0.00016s] {'table_name': 'movies', 'param_1': 'r', 'param_2': 'p', 'param_3': 'f', 'param_4': 'v', 'param_5': 'm', 'nspname_1': 'pg_catalog'}
2023-12-17 02:33:40,373 INFO sqlalchemy.engine.Engine COMMIT
2023-12-17 02:33:40,374 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2023-12-17 02:33:40,375 INFO sqlalchemy.engine.Engine INSERT INTO movies (name, year, created_at, updated_at) SELECT p0::VARCHAR, p1::INTEGER, p2::TIMESTAMP WITHOUT TIME ZONE, p3::TIMESTAMP WITHOUT TIME ZONE FROM (VALUES (%(name__0)s, %(year__0)s, now(), now(), 0), (%(name__1)s, %(year__1)s, now(), now(), 1)) AS imp_sen(p0, p1, p2, p3, sen_counter) ORDER BY sen_counter RETURNING movies.id, movies.created_at, movies.updated_at, movies.id AS id__1
2023-12-17 02:33:40,375 INFO sqlalchemy.engine.Engine [generated in 0.00005s (insertmanyvalues) 1/1 (ordered)] {'name__0': 'The Matrix', 'year__0': 1999, 'name__1': 'Avatar', 'year__1': 2009}
2023-12-17 02:33:40,378 INFO sqlalchemy.engine.Engine ROLLBACK
Writer Session: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "movies_name_key"
DETAIL: Key (name)=(The Matrix) already exists.

[SQL: INSERT INTO movies (name, year, created_at, updated_at) SELECT p0::VARCHAR, p1::INTEGER, p2::TIMESTAMP WITHOUT TIME ZONE, p3::TIMESTAMP WITHOUT TIME ZONE FROM (VALUES (%(name__0)s, %(year__0)s, now(), now(), 0), (%(name__1)s, %(year__1)s, now(), now(), 1)) AS imp_sen(p0, p1, p2, p3, sen_counter) ORDER BY sen_counter RETURNING movies.id, movies.created_at, movies.updated_at, movies.id AS id__1]
[parameters: {'name__0': 'The Matrix', 'year__0': 1999, 'name__1': 'Avatar', 'year__1': 2009}]
(Background on this error at: https://sqlalche.me/e/20/gkpj)
2023-12-17 02:33:40,392 INFO sqlalchemy.engine.Engine select pg_catalog.version()
2023-12-17 02:33:40,392 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:33:40,393 INFO sqlalchemy.engine.Engine select current_schema()
2023-12-17 02:33:40,393 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:33:40,394 INFO sqlalchemy.engine.Engine show standard_conforming_strings
2023-12-17 02:33:40,394 INFO sqlalchemy.engine.Engine [raw sql] {}
2023-12-17 02:33:40,394 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2023-12-17 02:33:40,395 INFO sqlalchemy.engine.Engine INSERT INTO movies (name, year, created_at, updated_at) SELECT p0::VARCHAR, p1::INTEGER, p2::TIMESTAMP WITHOUT TIME ZONE, p3::TIMESTAMP WITHOUT TIME ZONE FROM (VALUES (%(name__0)s, %(year__0)s, now(), now(), 0), (%(name__1)s, %(year__1)s, now(), now(), 1)) AS imp_sen(p0, p1, p2, p3, sen_counter) ORDER BY sen_counter RETURNING movies.id, movies.created_at, movies.updated_at, movies.id AS id__1
2023-12-17 02:33:40,395 INFO sqlalchemy.engine.Engine [generated in 0.00005s (insertmanyvalues) 1/1 (ordered)] {'name__0': 'Frozen', 'year__0': 2013, 'name__1': 'Titanic', 'year__1': 1997}
2023-12-17 02:33:40,396 INFO sqlalchemy.engine.Engine ROLLBACK
Reader Session: (psycopg2.errors.ReadOnlySqlTransaction) cannot execute INSERT in a read-only transaction

[SQL: INSERT INTO movies (name, year, created_at, updated_at) SELECT p0::VARCHAR, p1::INTEGER, p2::TIMESTAMP WITHOUT TIME ZONE, p3::TIMESTAMP WITHOUT TIME ZONE FROM (VALUES (%(name__0)s, %(year__0)s, now(), now(), 0), (%(name__1)s, %(year__1)s, now(), now(), 1)) AS imp_sen(p0, p1, p2, p3, sen_counter) ORDER BY sen_counter RETURNING movies.id, movies.created_at, movies.updated_at, movies.id AS id__1]
[parameters: {'name__0': 'Frozen', 'year__0': 2013, 'name__1': 'Titanic', 'year__1': 1997}]
(Background on this error at: https://sqlalche.me/e/20/2j85)
2023-12-17 02:33:40,397 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2023-12-17 02:33:40,398 INFO sqlalchemy.engine.Engine SELECT movies.id AS movies_id, movies.name AS movies_name, movies.year AS movies_year, movies.created_at AS movies_created_at, movies.updated_at AS movies_updated_at
FROM movies
2023-12-17 02:33:40,398 INFO sqlalchemy.engine.Engine [generated in 0.00009s] {}
Retrieved movies: [<Movie(id=1, name=The Matrix, year=1999, created_at=2023-12-16 20:24:55.454137, updated_at=2023-12-16 20:24:55.454137)>, <Movie(id=2, name=Avatar, year=2009, created_at=2023-12-16 20:24:55.454137, updated_at=2023-12-16 20:24:55.454137)>]
2023-12-17 02:33:40,399 INFO sqlalchemy.engine.Engine COMMIT
  • The writer session now also fails, with a duplicate key error, because the same movies are inserted a second time.
  • The reader error persists, as that block is still using reader_session (an optional client-side guard is sketched below).
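As an optional client-side guard, the psycopg2 dialect of SQLAlchemy can mark an engine's transactions as READ ONLY, so accidental writes fail fast even if the endpoint ever routes to a writable node. A minimal sketch, reusing reader_uri from the script above:

from sqlalchemy import create_engine

# Every transaction on this engine is begun as READ ONLY by the driver,
# mirroring the protection the replicas already enforce server-side.
reader_engine = create_engine(
    reader_uri,
    echo=True,
    execution_options={"postgresql_readonly": True},
)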

Conclusion

So, from this perspective, a code-level change is needed to split the queries. Aurora MySQL has an advantage here: MySQL has a tool called “ProxySQL” that can expose a single endpoint and split the workload according to the type of query.
For SQLAlchemy, by designing the architecture deliberately and implementing practices like the ones above, one can achieve good performance and availability for the application (a routing-session sketch follows).
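Instead of choosing a session by hand at every call site, the split can also be automated with SQLAlchemy's documented custom vertical partitioning recipe: a Session subclass whose get_bind() routes flushes and DML to the writer engine and everything else to the reader engine. A minimal sketch, reusing primary_engine, reader_engine, and the Movie model from the script above:

from sqlalchemy import Delete, Insert, Update
from sqlalchemy.orm import Session, sessionmaker


class RoutingSession(Session):
    def get_bind(self, mapper=None, clause=None, **kw):
        # Flushes and explicit DML go to the writer; plain reads go to the reader.
        if self._flushing or isinstance(clause, (Insert, Update, Delete)):
            return primary_engine
        return reader_engine


RoutedSession = sessionmaker(class_=RoutingSession, autoflush=False)

with RoutedSession() as session:
    session.add(Movie(name="Inception", year=2010))
    session.commit()  # the flush behind this commit hits primary_engine
    print(session.query(Movie).all())  # this SELECT hits reader_engine

Keep in mind that a read issued right after a commit may hit a replica before replication catches up, so this pattern suits workloads that tolerate a little read lag.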

The Python code shared here is not for production use, and neither are the other configurations. The purpose of this article is to give an idea of how read-write splitting can be handled, not to prescribe a hard-bound solution.

If there are things you oppose or have better ideas to improve the plan, please share them with me. I would love to learn more.

References

  1. https://repost.aws/questions/QUBZYxMjCQRI6E8C7RSUSdRg/slight-delays-between-actual-round-robining-of-aurora-reader-instances
