RFC: Introducing a Standardized Interface for SQL Database Drivers in JavaScript by halvardssm · Pull Request #6 · halvardssm/stdext

Sorry everyone for the delayed answer, I've been sick for the past few weeks. There are some great thoughts and insights shared by everyone, so let's see how we can improve the interfaces. Let me answer everything in one big comment, and structure it a bit (and sorry if I missed anything, just remind me if I did).

Thanks @lovasoa @rkistner @skybrian @samwillis for joining the conversation and for bringing all these great points!

Separation of Low and High level interfaces

I think they should implement a much much smaller API that looks like the one I propose above, and the implementation of the functions like execute, queryOnce, and so on should be common between all drivers, and could be updated and improved without breaking compatibility with the drivers.

The other concern I have is the lack of a wrapper object around the result rows. With PGlite we have a Results object that included metadata about the query:

@lovasoa @samwillis After thinking about it, I see what you mean: the connection class could, and maybe should, be a lot smaller and only expose one iterator-based query method returning a results object. As long as the result and the input are standardized, using utility functions to get the right output for the high-level interfaces should not be an issue.
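
As a rough illustration of that split (not part of the proposal), the sketch below shows a low-level connection with a single iterator-based query method and two helpers built on top of it. All names here (`LowLevelConnection`, `QueryResult`, `queryAll`, `queryOne`, `FakeConnection`) are hypothetical, and the in-memory connection only stands in for a real driver.

```typescript
// Hypothetical shapes; none of these names come from the current spec draft.
type Row = Record<string, unknown>;

interface QueryResult {
  // Metadata about the query, available before any row is consumed.
  columns: string[];
  // Rows are exposed through an async iterator.
  rows: AsyncIterable<Row>;
}

interface LowLevelConnection {
  query(sql: string, params?: unknown[]): Promise<QueryResult>;
}

// Utility: collect every row into an array.
async function queryAll(
  conn: LowLevelConnection,
  sql: string,
  params?: unknown[],
): Promise<Row[]> {
  const result = await conn.query(sql, params);
  const out: Row[] = [];
  for await (const row of result.rows) out.push(row);
  return out;
}

// Utility: return the first row, or undefined when there are no rows.
async function queryOne(
  conn: LowLevelConnection,
  sql: string,
  params?: unknown[],
): Promise<Row | undefined> {
  const result = await conn.query(sql, params);
  for await (const row of result.rows) return row;
  return undefined;
}

// In-memory stand-in for a real driver; it ignores the SQL text entirely.
class FakeConnection implements LowLevelConnection {
  private data: Row[];
  constructor(data: Row[]) {
    this.data = data;
  }
  async query(_sql: string, _params?: unknown[]): Promise<QueryResult> {
    const data = this.data;
    async function* rows(): AsyncGenerator<Row> {
      for (const r of data) yield r;
    }
    return { columns: Object.keys(data[0] ?? {}), rows: rows() };
  }
}
```

The helpers only depend on the standardized input and result shapes, so they could be shared between drivers and changed without touching the drivers themselves.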

@rkistner @lovasoa As you mentioned, a separation between a low- and high-level interface allows for some interesting use cases and implementations, where multiple interchangeable low-level drivers can sit beneath the high-level wrappers that implement the high-level interfaces. I don't believe there can be one high-level wrapper to rule them all, since each database does things slightly differently, but at least there can be a more shared structure and some interoperability where possible.

Explicit resource management

Your low-level driver API seems nice and simple ! But maybe we should avoid requiring Symbol.dispose in it ? The nice user-facing API can still use it, but it would be nice if drivers could be defined and used in today's javascript.

@lovasoa I am a bit conflicted about relying on Symbol.dispose. On one hand, this is something that database drivers could really use; on the other hand, I agree that it would only work on newer runtime versions that support it. We would need more feedback from the community on this.
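
One possible middle ground, sketched below under assumptions, is for the low-level driver to expose only an explicit `close()` and for a shared helper to guarantee cleanup with `try`/`finally`, which works in today's JavaScript. The names `Closable` and `withResource` are hypothetical; a user-facing wrapper could additionally offer `Symbol.asyncDispose` on runtimes that support it.

```typescript
// Hypothetical minimal driver surface: an explicit close(), no Symbol.dispose.
interface Closable {
  close(): Promise<void>;
}

// Runs `fn` with the resource and always closes it afterwards,
// whether `fn` resolves or throws.
async function withResource<T extends Closable, R>(
  resource: T,
  fn: (resource: T) => Promise<R>,
): Promise<R> {
  try {
    return await fn(resource);
  } finally {
    await resource.close();
  }
}
```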

Versioning and changes to specs

Of course it could be added to the spec right now, but what about the next nice language feature or other new ideas that could improve the APIs?

Re: upgrades, I'm wondering whether a driver needs to import anything from a common base library. If so, where do they import it from, and what happens if the base library gets upgraded, or there are drivers depending on different versions?

@rkistner @skybrian This is one of my concerns: how can we make sure that additions (non-breaking changes) to the specs preserve backwards compatibility? I am considering writing in the specs that new additions must be added as optional properties and will only be added on major changes. What are your thoughts on this?

Generally, my idea was to have the specs follow semantic versioning, meaning:

Major version changes: Can (and most likely will) contain breaking changes like removal or change in signature of a method or property
Minor version changes: Add new methods and properties, or extend existing ones, as optional properties
Patch version changes: Improvements in documentation or utilities that do not change signatures or add anything
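
To illustrate how an optional addition in a minor version could stay backwards compatible, here is a small sketch. The interfaces, the `ping` method, and the `isAlive` helper are all hypothetical and only serve as an example of feature detection by consumers.

```typescript
// Hypothetical spec 1.0 surface.
interface ConnectionV1_0 {
  query(sql: string): Promise<unknown[]>;
}

// Hypothetical spec 1.1: `ping` is added as an optional member,
// so drivers written against 1.0 still satisfy the interface.
interface ConnectionV1_1 extends ConnectionV1_0 {
  ping?(): Promise<boolean>;
}

// Consumer code feature-detects the optional member and falls back
// to something every 1.x driver is guaranteed to have.
async function isAlive(conn: ConnectionV1_1): Promise<boolean> {
  if (typeof conn.ping === "function") {
    return conn.ping();
  }
  await conn.query("SELECT 1");
  return true;
}
```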

High level interface

The main things that remain for drivers to implement are connection management, and prepared statements. Even if actual prepared statements are not used by the driver, it could still expose the same APIs with minimal overhead.

@rkistner Although I am not aware of any SQL-based database which does not support transactions or prepared statements, I am not willing to rule out the possibility (take this as a challenge to convince me otherwise, as I love to be wrong when it can simplify things). In the case that prepared statements or transactions are not supported, it can be documented in the respective drivers and a simple error can be thrown at runtime. The challenge when creating a standardisation is to account for flexibility while keeping things simple, so I am happy to discuss this and maybe compare deeper with Go's database/sql. More input here would be good.

Provide non-type code

Some alternate ways to make implementation easier without baking it into the standard would be to provide a skeleton implementation, a base library, or common utility functions. A common test suite might be quite useful, too.

@skybrian Adding a test suite was actually discussed internally before, and the feeling was that it would be too much for the first version. But if there is a consensus from the community (mainly the library maintainers) that this is wanted during the first iteration, we should add it.

Regarding the skeleton implementation, I am not sure exactly what you mean. Do you mean to have some kind of wrapper class that can take the low-level connection class? If so, I would be very hesitant to introduce this, as it would make everything heavily dependent on this library. I would rather make it possible for these interfaces to be implemented even without importing the types, just by following the specs. Again, some more nuance and middle ground can be found here, but I would be very hesitant to introduce something that would make it required to import code.

I've now published the APIs I've been working on here: https://github.com/powersync-ja/sqlite-js

@rkistner This is really great! We can probably use a lot of your learnings. The main thing that I saw while looking through the code and documentation is that it is very specific to SQLite. It would unfortunately not translate in all cases to MySQL, PostgreSQL and potentially other databases. Would you see any changes that would be needed in the current specs from your learnings (besides simplifying as mentioned above)?

In my experience, using instanceof anywhere in consuming code relies on a single implementation of the class being used across all drivers, which will often not be the case in practice. There should rather be common properties on error and event interfaces that can be checked. For utility functions, you can have multiple incompatible versions (used by different drivers) in the same application bundle without issues.

I agree, and I am therefore also a bit concerned about the base Error class that is provided in this proposal. Some more insight from the community here would be great. Utility functions should be provided IMO: if we simplify the connection class, then having utility functions for getting all rows and one row would be very helpful.
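
A sketch of the property-based alternative to instanceof could look like the following. The `isSqlError` marker property and the two driver error classes are hypothetical; the point is only that the check works across unrelated class implementations, for example when two drivers bundle different versions of a base library.

```typescript
// Hypothetical common shape; the proposal does not define this property.
interface SqlErrorLike {
  readonly isSqlError: true;
  message: string;
}

// Type guard that checks a shared property instead of a class identity.
function isSqlError(e: unknown): e is SqlErrorLike {
  return (
    typeof e === "object" &&
    e !== null &&
    (e as { isSqlError?: unknown }).isSqlError === true
  );
}

// Two unrelated error classes, as two independent drivers might ship them.
class DriverAError extends Error {
  readonly isSqlError = true as const;
}

class DriverBError extends Error {
  readonly isSqlError = true as const;
}
```

Neither class extends the other, so an `instanceof DriverAError` check would miss errors from the second driver, while the guard accepts both.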

Connection pools

The question is really whether we want pools to be explicit or implicit. After reading through and looking at the existing drivers again, I have to say I'm conflicted, and even leaning a bit towards having this implicit and handled by a config property and the respective drivers. It would simplify general usage, but when a connection is required for more than a single call (as you mentioned with prepared statements), I can see this being non-trivial to account for (but hey, I could be wrong). Maybe my worries are simply solved by some kind of "acquire mutex"? It might also be the case that drivers could keep track of prepared statements per connection in a hashmap or similar. Thoughts?
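
To make the "acquire mutex" idea a bit more concrete, here is a minimal sketch of a pool that is implicit for single calls but can pin one connection for multi-step work. `Conn`, `SimplePool`, and `withConnection` are hypothetical names, and the sketch ignores timeouts, errors during acquisition, and connection health.

```typescript
// Hypothetical minimal connection surface for this sketch.
interface Conn {
  query(sql: string): Promise<string>;
}

class SimplePool {
  private idle: Conn[];
  private waiters: Array<(c: Conn) => void> = [];

  constructor(conns: Conn[]) {
    this.idle = [...conns];
  }

  // Hands out an idle connection, or waits until one is released.
  acquire(): Promise<Conn> {
    const c = this.idle.pop();
    if (c) return Promise.resolve(c);
    return new Promise<Conn>((resolve) => this.waiters.push(resolve));
  }

  // Passes the connection to the oldest waiter, or marks it idle.
  release(c: Conn): void {
    const next = this.waiters.shift();
    if (next) next(c);
    else this.idle.push(c);
  }

  // Explicit use: pins one connection for the duration of `fn`,
  // e.g. for a prepared statement that is executed several times.
  async withConnection<T>(fn: (c: Conn) => Promise<T>): Promise<T> {
    const c = await this.acquire();
    try {
      return await fn(c);
    } finally {
      this.release(c);
    }
  }

  // Implicit use: a single call borrows a connection and returns it.
  query(sql: string): Promise<string> {
    return this.withConnection((c) => c.query(sql));
  }
}
```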

Generics

I don't have a strong argument on this, just a feeling that it may be better to focus more on simplicity here. Individual implementations may add options by overriding the implemented method definitions, without needing to use generics for the most part.

I have indeed been worried that generics introduce too much complexity. I will look more at your code and see if simplifying it can be done.

Constructor type

I think it may be a mistake for the specification to prescribe the constructor.

@samwillis I think one of the main benefits of standardising the constructor is that developers can bring the knowledge they have from one driver to another. For the cases where a connection URL is not needed, it can simply be advised to give an empty string, and the driver can ignore the field value entirely. I do however think that a connection URL is needed in most cases anyway (either for a local SQLite file path, or a pg URL), and it would be up to the drivers to decide if options should only be allowed in the connectionOptions or also as URL parameters. When considering simplicity first, providing a single URL string as the first argument to connect to a database, versus having to dig through documentation to see which property is called what, makes the development experience a lot nicer.
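
As a small sketch of what a shared constructor shape buys, the code below types the constructor once and then creates clients generically. `ClientConstructor`, `ConnectionOptions`, `MemoryClient`, and `createClient` are hypothetical names used only for illustration.

```typescript
// Hypothetical option bag; real drivers would define their own keys.
interface ConnectionOptions {
  [key: string]: unknown;
}

// Hypothetical standardized constructor shape: URL first, options second.
interface ClientConstructor<C> {
  new (url: string, options?: ConnectionOptions): C;
}

// Example driver that does not need a URL and simply stores what it is given.
class MemoryClient {
  readonly url: string;
  readonly options: ConnectionOptions;

  constructor(url: string, options: ConnectionOptions = {}) {
    this.url = url;
    this.options = options;
  }
}

// Generic code can construct any conforming driver the same way.
function createClient<C>(
  Ctor: ClientConstructor<C>,
  url: string,
  options?: ConnectionOptions,
): C {
  return new Ctor(url, options);
}
```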

Also, in a browser environment you may be connecting to a database instance in another tab, or in a web worker, this would require a very different constructor interface.

Can you expand on this? Maybe with examples?

Explicit connect

The explicit .connect() is also redundant for most embedded databases as the constructor will immediately create the database instance on the current thread.

Most databases require us to connect asynchronously, so we won't get away with doing this in the constructor, as it's not async. I am afraid we are kind of locked into this if we want to build this interface for most databases. This is also why all the interfaces are async even though SQLite can be sync in most cases (and performs better that way too).
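
The constraint can be sketched as follows: a JavaScript constructor cannot be awaited, so it only stores configuration, and the asynchronous work moves into `connect()`. `ExampleClient` is a hypothetical class, and the awaited promise merely stands in for a real network handshake.

```typescript
class ExampleClient {
  readonly url: string;
  private connected = false;

  // Synchronous: stores configuration only, performs no I/O.
  constructor(url: string) {
    this.url = url;
  }

  // Asynchronous: this is where a real driver would open its socket or file.
  async connect(): Promise<void> {
    await Promise.resolve(); // stand-in for the actual handshake
    this.connected = true;
  }

  async query(sql: string): Promise<string[]> {
    if (!this.connected) {
      throw new Error("connect() must be called before query()");
    }
    return [sql]; // echoes the SQL text in place of real rows
  }
}
```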

Template literals for querying

On the tagged template literal interface ... it's a very new feature of these database drivers and I think there is a lot more exploration and experimentation thats needed first before tightly specifying what it should look like. It's essentially a query builder, and I don't think this spec wants to cover that.

I think this would be up to the drivers themselves to implement, but from a typing perspective, it is not that advanced. I see your concern, but ultimately the signature of the function can't really be different in any aspect as far as I know, just the implementation (again, let me know if I'm wrong).
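
For reference, the signature in question is the standard tagged-template one. The sketch below is one possible implementation, assuming PostgreSQL-style `$1`, `$2` placeholders; the `sql` tag and the `SqlFragment` shape are hypothetical, and a driver could produce something entirely different behind the same signature.

```typescript
// Hypothetical result shape: query text with placeholders plus bound values.
interface SqlFragment {
  text: string;
  params: unknown[];
}

// Tag function: `strings` holds the literal parts, `values` the interpolations.
function sql(strings: TemplateStringsArray, ...values: unknown[]): SqlFragment {
  let text = strings[0];
  for (let i = 0; i < values.length; i++) {
    // Each interpolated value becomes a numbered placeholder, never inline text.
    text += `$${i + 1}` + strings[i + 1];
  }
  return { text, params: values };
}
```

Because the values travel separately from the text, a call like sql`SELECT * FROM users WHERE id = ${id}` never concatenates user input into the query string.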