Hey everyone,
I just wanted to share a few recent learnings and musings from the coding trenches. Sometimes, the smallest bugs cause the biggest headaches, and sometimes, a new language makes you question old habits!
All my professional life, I've worked with object-oriented programming (OOP) languages (JavaScript, Java, Ruby, Python, etc.). My brain is wired to think in terms of classes, inheritance, polymorphism, and encapsulation. So, my mindset is pretty aligned with this paradigm.
But now that I'm working on an extensive Golang project, I've encountered an interesting battle in my head between traditional OOP and Go's composition-focused approach. The Go FAQ is quite direct on this: "Is Go an object-oriented language? Yes and no." It clarifies that while Go has types and methods and allows an object-oriented style of programming, it doesn't have type inheritance.
Instead, Go strongly encourages composition over inheritance through embedding.
Initially, this felt... limiting. My instinct, when faced with a need to share behavior or create a "specialized" version of a type, was to look for an extends keyword. How do I make a PremiumUser that is a User but with extra features? In Go, you'd typically embed a User struct within your PremiumUser struct:
```go
type User struct {
	ID   int
	Name string
}

func (u *User) HasPermission(p string) bool {
	// Base permission logic
	return false
}

type PremiumUser struct {
	User              // Embedding User type
	SubscriptionLevel string
}

// PremiumUser automatically gets User's fields and methods.
// We can also override or add new methods.
func (pu *PremiumUser) HasPermission(p string) bool {
	if p == "premium_feature" {
		return true
	}
	return pu.User.HasPermission(p) // Call embedded type's method
}
```
The "battle" for me has been retraining my brain. Instead of thinking, "A is a B," I'm learning to think, "A has a B and thus gains its capabilities." It pushes towards smaller, more focused interfaces and can avoid the complexities and tight coupling that deep inheritance hierarchies sometimes create (the "gorilla-banana problem" – you wanted a banana but got a gorilla holding the banana and the entire jungle).
I'm still navigating this shift. There are moments I miss the explicitness of super() or a clear extends relationship. However, I'm also starting to appreciate the simplicity and explicitness that Go's embedding offers. It forces a different kind of design thinking, often leading to more decoupled and flexible systems. It's a journey, and Go is definitely making me re-evaluate some deeply ingrained OOP habits!
Don't Forget to Stop All Patchers (The Pytest Cache Mystery)
I encountered a truly strange issue while running unit tests with pytest recently. After running the test command, all the tests were passing, but the execution itself would fail at the very end with an error related to the pytest cache folder:
error: [Errno 2] No such file or directory: 'project/.pytest_cache/README.md'
I was stumped. I couldn't find anything relevant on Google, Claude, or our internal Amazon wikis. The error message seemed completely disconnected from my actual test logic. So, I resorted to the tried-and-true method: reviewing my changes line by line.
And there it was! A patched object that wasn't being stopped after the test's execution. In my setUp method I diligently start all my patchers, but I'd missed stopping one of them in tearDown.
The fix was simple, but finding it was the hard part:
```python
import unittest
from unittest.mock import patch


class TestUtilHelpers(unittest.IsolatedAsyncioTestCase):
    def setUp(self):
        self.patcher_datetime = patch('your_module.datetime')
        self.mock_datetime = self.patcher_datetime.start()

        self.patcher_mkdir = patch('os.mkdir')  # The culprit!
        self.mock_mkdir = self.patcher_mkdir.start()
        # ... other patchers

    def tearDown(self):
        self.patcher_datetime.stop()
        self.patcher_mkdir.stop()  # The missing line!
```
Why did this happen? My best guess is that the unstopped patcher_mkdir meant that pytest itself, during its cleanup or cache management phase after my tests ran, tried to call os.mkdir (or a similar function that internally uses it). Since my mock was still active and likely not behaving like the real os.mkdir (perhaps raising an error or just doing nothing), pytest couldn't create or access its .pytest_cache directory structure as expected, leading to the "No such file or directory" error when it tried to find/create README.md within it.
The takeaway: Always ensure your patches are stopped in your tearDown method (or by using with patch(...) context managers, which handle this automatically). It's easy to miss one, especially when adding new tests or refactoring, but the side effects can be baffling and lead you down rabbit holes far removed from your actual test logic.
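That takeaway is also why I've started leaning on patterns where the stop is impossible to forget. Here's a minimal sketch, reusing the hypothetical your_module from the snippet above: addCleanup registers the stop call the moment the patcher starts, and a with patch(...) block undoes itself automatically, even when the test raises.

```python
import unittest
from unittest.mock import patch


class TestUtilHelpersSafer(unittest.IsolatedAsyncioTestCase):
    def setUp(self):
        # Pattern 1: register the stop immediately after starting the patcher,
        # so it runs even if setUp or the test itself fails.
        patcher_mkdir = patch('os.mkdir')
        self.mock_mkdir = patcher_mkdir.start()
        self.addCleanup(patcher_mkdir.stop)

    async def test_with_context_manager(self):
        # Pattern 2: the with block stops the patch automatically on exit.
        with patch('your_module.datetime') as mock_datetime:
            mock_datetime.now.return_value = "2024-01-01"
            # ... exercise the code under test here ...
```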
Async Code in Python is Tricky
Python's asyncio library, along with the async and await keywords, has been a game-changer for writing high-performance, I/O-bound applications. The ability to handle many concurrent operations without the overhead of threads is incredibly powerful. However, stepping into the world of asynchronous programming in Python isn't always smooth sailing.
A prime example of this challenge, and something I'm actively working through, is building a custom Python logging.Handler. The standard logging framework in Python is synchronous: when your code calls logger.info("message"), it expects that call, including the handler's emit() method, to process the log record and complete relatively quickly. My goal for this custom handler is to send log messages to a cloud service. Naturally, the library for interacting with that cloud service is, and should be, async to avoid blocking I/O.
This immediately presents a conundrum:
- The handler's emit(self, record) method is called synchronously by the logging framework.
- Inside emit(), I need to call an async function (e.g., async_send_to_cloud(record)) to perform the network I/O to the cloud service (see the sketch just below).
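To make the conundrum concrete, here's a stripped-down sketch of the kind of handler I'm describing. The client object and async_send_to_cloud are hypothetical stand-ins for the real cloud SDK, not the actual code:

```python
import logging


class CloudLogHandler(logging.Handler):
    """Skeleton of the custom handler; `client` is a hypothetical async cloud SDK."""

    def __init__(self, client):
        super().__init__()
        self.client = client  # assumed to expose an async send(message) coroutine

    async def async_send_to_cloud(self, record: logging.LogRecord) -> None:
        # The cloud library is async, so this has to be awaited somewhere.
        await self.client.send(self.format(record))

    def emit(self, record: logging.LogRecord) -> None:
        # Called synchronously by the logging framework. Somewhere in here I
        # need to actually run self.async_send_to_cloud(record), but how?
        ...
```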
How do you bridge this sync/async gap effectively? This specific problem has thrown several general async challenges into sharp relief for me:
Bridging the sync/async divide (the "How do I even run this?" problem):
If I try to use asyncio.run(self.async_send_to_cloud(record)) inside the synchronous emit() method, I hit a couple of snags (a tiny repro follows this list).
- Firstly, if the main application using this logger is already running an asyncio event loop, asyncio.run() will raise a RuntimeError: asyncio.run() cannot be called from a running event loop.
- Secondly, even if the application is synchronous, asyncio.run() creates a new event loop for each log message. This is highly inefficient, can lead to significant overhead, and might even cause lost logs if the program exits before all these temporary loops complete their background work.
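The repro, with fake_send standing in for the real cloud call:

```python
import asyncio


async def fake_send(message: str) -> None:
    await asyncio.sleep(0)  # stand-in for the real network I/O


def emit(message: str) -> None:
    # Naive bridge: spin up a brand-new event loop for every log call.
    asyncio.run(fake_send(message))


async def main() -> None:
    # Inside an already-running loop this raises:
    # RuntimeError: asyncio.run() cannot be called from a running event loop
    emit("hello from an async app")


asyncio.run(main())
```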
Simply calling self.async_send_to_cloud(record) from emit() without await or asyncio.run() just creates a coroutine object; it doesn't actually execute the network call. The log message is effectively dropped.
This leads directly to understanding that the event loop is key: asynchronous code needs an event loop to manage its execution, its await points, and its tasks. Synchronous code, by default, isn't running one in a way that your new async function can readily use.
asyncio.run() - the standard, but sometimes insufficient, entry point: As mentioned, asyncio.run() is the recommended high-level function for running an async coroutine from a top-level synchronous context. It handles creating a new event loop, running your coroutine, and cleanly closing the loop. But its design makes it unsuitable for being called repeatedly from within an already async-managed context, or for scenarios like my logging handler where efficiency and integration with an existing (or non-existent) loop are critical.
Exploring deeper for a solution: The logging handler problem has pushed me to learn more about asyncio.get_event_loop(), loop.run_until_complete(), and asyncio.create_task(). For the logging handler, I'm now considering strategies like the following (a rough sketch comes after the list):
- Checking if an event loop is already running. If so, use asyncio.create_task() to schedule the async_send_to_cloud coroutine on the existing loop.
- If no loop is running (i.e., the application is purely synchronous), I might need to manage a dedicated event loop in a background thread, using a thread-safe queue to pass log records from the synchronous emit() method to this async worker thread. This ensures logs are sent without blocking the main application and without the overhead of asyncio.run() per log.
- This also involves careful consideration of shutdown: ensuring that any pending logs in the queue or tasks on the event loop are flushed before the application exits.
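To make that direction concrete, here's a rough, untested sketch combining the first two strategies. The client and async_send_to_cloud are the same hypothetical stand-ins as before, and instead of an explicit queue I'm using asyncio.run_coroutine_threadsafe(), which performs the same thread-safe hand-off to the background loop:

```python
import asyncio
import logging
import threading


class CloudLogHandler(logging.Handler):
    def __init__(self, client):
        super().__init__()
        self.client = client
        # Dedicated loop in a daemon thread for purely synchronous applications.
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    async def async_send_to_cloud(self, record: logging.LogRecord) -> None:
        await self.client.send(self.format(record))

    def emit(self, record: logging.LogRecord) -> None:
        try:
            # Strategy 1: the caller is already inside an event loop, so
            # schedule the send as a fire-and-forget task on that loop.
            loop = asyncio.get_running_loop()
            loop.create_task(self.async_send_to_cloud(record))
        except RuntimeError:
            # Strategy 2: no running loop in this thread, so hand the work
            # to the dedicated background loop (thread-safe).
            asyncio.run_coroutine_threadsafe(
                self.async_send_to_cloud(record), self._loop
            )

    def close(self) -> None:
        # Flush-on-shutdown is the part I'm still thinking through; at a
        # minimum, stop the background loop cleanly before closing.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join(timeout=5)
        super().close()
```

An explicit queue (as in the second bullet above) would make batching and a proper flush-on-shutdown easier; run_coroutine_threadsafe is simply the shortest path to the same hand-off for this sketch.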
Thanks for reading,
Wil