In work projects in the past, my team would try to avoid dealing with the complexity of SQLAlchemy database sessions by making one global session that every module referenced. It made things easy and straightforward and it played nice with FactoryBoy - whose factories seem to work quite well under those conditions.
Going into building lobit.io, I tried to do things the same way, but as the code base and the number of unit tests grew, I kept facing what seemed to be an unquashable avalanche of OperationalErrors griping about “Too many connections”.
I implemented an overhaul of the codebase, wrapping every piece of ORM code in a separate session and creating a DbUtils class to handle sessions consistently without too much boilerplate:
from contextlib import contextmanager from sqlalchemy import create_engine from sqlalchemy.orm import sessionmaker from sqlalchemy.orm.session import Session from config import config class DbUtils: Session = sessionmaker(bind=create_engine(config.db_connection_url)) @classmethod @contextmanager def session_scope(cls) -> Session: """Provide a transactional scope around a series of operations.""" session = cls.Session() try: yield session session.commit() except: session.rollback() raise finally: session.close()
Fixing the test factories
When it came to unit tests, I was at a loss for how to best handle all of the factories I was using. They were all tied into the global session. Furthermore, factories took in the session at the time of class creation, not instantiation, e.g.
import factory from models.post import Post from tests.factories.post_thread_factory import PostThreadFactory class PostFactory(factory.alchemy.SQLAlchemyModelFactory): class Meta: model = Post sqlalchemy_session_persistence = "commit" sqlalchemy_session = session text = factory.Faker("text", max_nb_chars=100) external_id = factory.Faker("text", max_nb_chars=50) post_thread = factory.SubFactory(PostThreadFactory)
Wrapping factory class creation in a function did the trick:
import factory from models.post import Post from tests.factories.post_thread_factory import PostThreadFactory def PostFactory(session): class _PostFactory(factory.alchemy.SQLAlchemyModelFactory): class Meta: model = Post sqlalchemy_session_persistence = "commit" sqlalchemy_session = session text = factory.Faker("text", max_nb_chars=100) external_id = factory.Faker("text", max_nb_chars=50) post_thread = factory.SubFactory(PostThreadFactory(session)) return _PostFactory
While simple wrapping worked, it made my tests look God-awful. Manually managing sessions had already added boilerplate overhead to every unit test that hit the database.
Before refactoring, I could create a post using
PostFactory.create(). Now it was
PostFactory(session).create(). Most of the unit tests hit multiple factories several times each, so it was also less efficient to keep creating the same classes over and over.
I took to creating a single instance of each required factory at the top of each session:
def test_post_stuff(self): with DbUtils.session_scope() as session: post_factory = PostFactory(session) post_thread_factory = PostThreadFactory(session) ...
Cleaning it up
A given test could have nearly a dozen factories defined at the top - which looked terrible even if it didn’t trigger a “too-many-locals” warning from Pylint (which it almost always did).
So I made one more change and hired a manager to keep all the factories running smoothly.
from tests.factories.post_factory import PostFactory from tests.factories.post_thread_factory import PostThreadFactory ... class FactoryManager: known_factories = [ PostFactory, PostThreadFactory, ... ] def __init__(self, session): self.session = session for factory_func in self.known_factories: setattr(self, factory_func.__name__, factory_func(self.session))
The FactoryManager didn’t require a lot of code, but let me leave my tests largely unchanged. Instead of initializing several objects at the top of each test, I could initialize one.
def test_post_stuff(self): with DbUtils.session_scope() as session: fm = FactoryManager(session)
Now I could prefix all factory calls with
fm. and get something almost as clean as before. At least,
fm.PostFactory.create() looks better to me than
PostFactory(session).create() or littering each test with a bunch of locals.