What do fintech startups use on their back end?
What tech stack would you recommend for something that constantly monitors share prices from an API feed and then actions real-time alerts?

I think it depends on the details and performance requirements. Otherwise the general sentiment of "use any of the major technologies, one you're familiar with" stands. For example, is this software aimed at the end user, or will the startup use it themselves? If the end user will use it (like alerts for their portfolio/watch list), then the definition of "real-time" is going to be lax because you need human interaction. If it's used by an institution for trading, then you will need to choose a highly performant language, architecture, and hardware so that your algorithm can place its orders before other institutions. You would also likely need to integrate with SWIFT (the financial network).

Any major tech stack will work fine. I assume "real-time alerts" will need an integration with a third-party notification provider (email, SMS, etc.), and the bulk of your business logic will be defined by whoever is providing your share price API. So the tech doesn't really matter; pick something simple and mainstream so you can get help as needed. Ruby/Rails, Python/Django, Node.js, Java, whatever. Postgres for a database. You likely won't need any exotic cool tech, novel databases, etc.

Take into consideration not to use floats for money; use a `decimal` data format. I'd also advise against using JavaScript, since it has problems dealing with big numbers. There are libraries to handle that, of course. Just, if you use JS, make sure to research how to do money calculations accurately.

I see this statement a lot and would like to see it better illustrated or quantified. I worked at a place that used doubles and there was no fallout. Also, when dealing with very small quantities, fixed point might not work so well either. https://stackoverflow.com/questions/3730019/why-not-use-doub...

Personally, as a general rule I use integers and convert for display only (as mentioned in the above SO post). This removes the accuracy issue and works in every programming language in common use today (at least that I know of). I even do this in Postgres (and other DBs), which keeps everything consistent and removes the chance that a driver between the DB and client jacks up the decimal conversion.

It works until someone forgets to divide by 100. If your language and your database both have decimal, why not use them?

Primarily, forgetting to divide by 100 is an obvious error and easy to debug, compared to a database driver/interface update that tweaks the decimal conversion, which is much more subtle and hard to debug. Your point is valid, though: in a situation where you only use one database system and one development language that both support decimal properly, it isn't "bad" to use decimal, though I'd still personally use integers to future-proof the data. Secondly (and why I'd still use integers regardless), every language supports integers and division; not every language supports decimals properly or consistently. So for guaranteed accuracy it is best to use integers. The web being so JavaScript-driven is a primary consideration here. IMO this is similar to dealing with time in distributed/complex systems: sure, there are types in Postgres (and most DBs) that handle timezones, but it is still best to store everything as UTC and only do timezone conversions at the display layer when necessary.
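To make the integers-and-convert-for-display approach concrete, here's a minimal Python sketch (Python chosen only for illustration; the helper names are my own, not from any library):

```python
from decimal import Decimal

# Store money as integer cents everywhere (DB, APIs, business logic);
# convert to a decimal string only at the display layer.

def cents_to_display(cents: int, currency: str = "USD") -> str:
    # Decimal avoids binary-float artifacts like 0.1 + 0.2 != 0.3.
    amount = Decimal(cents) / Decimal(100)
    return f"{amount:.2f} {currency}"

def display_to_cents(text: str) -> int:
    # Parse user input through Decimal, then quantize to whole cents.
    return int((Decimal(text) * 100).to_integral_value())

price = display_to_cents("19.99")   # 1999
total = price * 3                   # exact integer arithmetic: 5997
print(cents_to_display(total))      # "59.97 USD"
```

The same pattern maps to any mainstream language, which is the portability argument made above.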
You shouldn't store future dates as UTC; that's a bit of a recipe for disaster. Timezones change, and far more frequently than you'd think, so storing in UTC is only a good idea for dates in the past. You'd be in for a world of hurt when the EU or UK finally decide to get rid of moving their clocks forwards/back; they've both been debating it for years. Or Turkey decides to move the start of summer time again with a couple of weeks' notice.

The most interesting case of a time format for storage I've seen was the capture time of uploaded photos. The best answer was effectively to save it as a local string; together with the location, it fully "captures the moment". This became apparent when showing New Year's celebration photos with a date-time in the viewer's timezone.

I probably should've just skipped trying to make the time analogy, as it is more nuanced, as you point out. Or maybe I should've pointed out that storing in UTC doesn't mean you shouldn't also store the offset/region/sub-region, etc. I assumed when I said "conversion for the display" that it would be obvious you need those to make it happen, but I guess people could think I meant to use the browser's tz setting for display/actions (yikes). To give an example: Postgres (and most major DBs today) stores all timestamptz values as UTC internally, with the region code used for offset calculations. At least for Postgres, the region code and offset are taken from the IANA DB.

Essentially what I have done, many times, on complex systems is the same thing Postgres is doing, just under our application's control and not relying on a specific database's implementation of conversions. This made it easier for us to ship data around between systems without having to care whether those systems had good timezone support. We then had libraries in each language that would handle all the proper conversions/lookups. Meaning (to use your example), if we needed to update future dates for Turkey, we could issue a query to do so and be done. This eliminates having to wait for Postgres to get updates from IANA, update, test, and release, and then doing the same ourselves. Not to mention you'd need to multiply that by X other systems doing their own TZ conversions if you weren't handling it yourself; and then what happens if one gets it wrong or is late to update? There are many times this just isn't necessary, of course: small systems, systems that are only regional, systems where time isn't a critical component, etc. But for systems where time crosses many regions and is critical, or is a legal record, or triggers an action, it becomes more and more important.

FWIW too: a number of US states have passed laws to go to DST only, so no more switching in those either. They are just waiting on the federal government to pass the resolution allowing it (apparently states can opt for standard time, but not DST, without approval). When that happens, a number of America/XXX regions will need to be split up in IANA. For example, America/New_York currently covers Florida, but that would have to change.
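A small Python sketch of the "store wall time plus IANA zone, resolve late" idea being discussed (the event time and zone here are made-up examples):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # stdlib since Python 3.9; backed by IANA tzdata

# For a future event, store the wall-clock time and the IANA zone name,
# not a precomputed UTC instant. If Europe/Istanbul's rules change later,
# re-resolving against updated tzdata still yields the intended local time.
stored_wall_time = "2030-06-15 09:00"   # what the user actually meant
stored_zone = "Europe/Istanbul"         # IANA region, not a fixed offset

def resolve_to_utc(wall_time: str, zone: str) -> datetime:
    local = datetime.strptime(wall_time, "%Y-%m-%d %H:%M")
    local = local.replace(tzinfo=ZoneInfo(zone))  # attach current tz rules
    return local.astimezone(ZoneInfo("UTC"))

# Recomputed on demand, so a tzdata update changes the answer automatically.
print(resolve_to_utc(stored_wall_time, stored_zone))
```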
JVM or Python, just for the ecosystem: Kafka, Debezium, workflow engines (Camunda), ORMs, etc. Java libraries might not be as nice to use as Ruby gems, but some are really rock solid for your use case. That being said, anything really; Rails/Node etc. would mostly work just fine for your use case.

I didn't find Debezium production-ready when I tried it about a year ago. If I recall, no cascading deletes, and it can't handle 0000-00-00 dates due to "some Java limitation" (maybe in the Kafka producer? I honestly forget), and when a dev on my team made a PR to the Debezium project it was not merged because the team felt it was outside the scope of Debezium. Also, because of implementation details, it will never do cascades. The end result was a lot of wasted effort. It worked for one team who didn't have those requirements, and even then they spent more time than they should have, then ripped out Debezium and went with standard MySQL replication, as it doesn't require the entire Kafka stack. My experience with it reaffirmed my standard data-migration technique: write to both, and move reads over when ready.

I'm a fan of Kafka for streaming systems, so I'd recommend fetching data via the API, adding it to Kafka, and then consuming from Kafka to monitor the price and fire alerts (assuming you don't really need a real-time system with very low latency); there's a sketch of this below. What language you use is up to you/your team. If you like Python, I'd look into Airflow or Celery as frameworks to build your workflows. For Java/Scala I'd take a look at Google Dataflow or Flink or Kafka Streams, depending on how heavy the "monitor share prices" part is. I think the major issue you're going to encounter is not building a system that works, but building a process that allows your team to upgrade the system, test new features, debug, etc. Dataflow, for example, lets you run more or less the same code both on a stream of data and on batches of historical data.

Generic answer: the best tech stack is the one you/your team are most comfortable using.

Specific answer: if you were using Elixir, you could have one process (GenServer) per stock ticker that calls the API for just that stock. GenServers are lightweight processes that run concurrently in the BEAM. That process could then have a set of subscribing user processes to notify, which would then alert a user. This assumes you can call an API for one stock vs. having to parse a feed of data. If I were doing this I wouldn't store any of the stock data in any way; it's ephemeral, and by the time a DB commit happens the price has changed.

Caveats: you mention an API, so I assume that means you are calling an API, parsing data (dealing with rate limits), then alerting. I'd caution against the term "real-time", as you are farther away from trading data than the big-name brokers. My intent isn't to downplay the idea, just to think it through based on the info in the one-sentence question.

You can assume that a price comparison needs to be made between the saved price and the current price from the API.

I would use the new k(9) if I had a small team and we could pick. Otherwise .NET 5. We use this for all our fintech stuff and it is very good and "enterprise approved". We use it with Postgres and Redis.

What's k(9)?

I work in fintech; we use a mix of Scala and Node: Scala for heavy data processing pipelines, Node for our client API, due to Node's very fast cold-start times in Lambda.
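Circling back to the Kafka suggestion above, a minimal consumer sketch in Python using the kafka-python package (the broker address, topic name, and message shape are all assumptions for illustration):

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

# A poller elsewhere fetches prices from the vendor API and produces them
# to a "share-prices" topic; this side consumes and fires alerts.
consumer = KafkaConsumer(
    "share-prices",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    group_id="alert-service",
)

ALERT_THRESHOLDS = {"AAPL": 20000}  # integer cents, per the money advice above

for message in consumer:
    tick = message.value  # e.g. {"symbol": "AAPL", "price_cents": 20150}
    threshold = ALERT_THRESHOLDS.get(tick["symbol"])
    if threshold is not None and tick["price_cents"] >= threshold:
        # Hand off to the notification provider (email/SMS) here.
        print(f"ALERT: {tick['symbol']} crossed {threshold}")
```

Putting Kafka between the API poller and the alerting logic also gives you the replay-over-history property mentioned above, since consumers can re-read the topic from an earlier offset.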
Realistically, anything you're already comfortable with will work for that. I would pick Python, at least for a quick prototype. You can start thinking about an efficient implementation when you've got a defined feature set and are seriously thinking about delivering an actual product. And when that happens, you can just use whichever tech works best for each type of service anyway.

For a system I built, which had some similarity, we used Go and Postgres, with Interactive Brokers running in a headless Docker container (with virtual X) to provide the ability to trade. It was pretty cool for something that took a couple of days, and I believe it still works that way.

I would recommend Node.js for this use case, and if other use cases come along, such as machine learning or distributed transactions, you can split the business logic into separate microservices using Python or Java as needed.

What languages are you really good at? As another commenter said, unless you're doing ML, most general-purpose languages can monitor an API and make API calls.

AWS SQS & Lambda.

Lambda is not the best option for real-time alerts since it has a cold start. But I guess it depends on the non-functional requirements.

Hey, is this essentially provisioning a bunch of background workers to constantly scan a database table, e.g. one worker for each alert (to look for a price)? How would someone incorporate this in Rails?
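One common shape for this, sketched in Python for consistency with the other examples (the table schema, column names, and stub price feed are all illustrative assumptions; in Rails the equivalent would be a scheduled job via cron/Sidekiq running the same kind of query):

```python
import sqlite3
import time

# A single polling worker scans all active alerts with one query per cycle,
# rather than provisioning one worker per alert.

def fetch_prices() -> dict[str, int]:
    # Stand-in for the real API client; prices in integer cents.
    return {"AAPL": 20150, "MSFT": 31000}

def check_alerts(conn: sqlite3.Connection) -> None:
    prices = fetch_prices()
    rows = conn.execute(
        "SELECT id, symbol, threshold_cents FROM alerts WHERE active = 1"
    ).fetchall()
    for alert_id, symbol, threshold in rows:
        price = prices.get(symbol)
        if price is not None and price >= threshold:
            # Deactivate before notifying so a crash can't fire duplicates.
            conn.execute("UPDATE alerts SET active = 0 WHERE id = ?", (alert_id,))
            print(f"alert {alert_id}: {symbol} at {price} >= {threshold}")
    conn.commit()

# In-memory DB so the sketch is self-contained and runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alerts (id INTEGER PRIMARY KEY, symbol TEXT, "
             "threshold_cents INTEGER, active INTEGER)")
conn.execute("INSERT INTO alerts (symbol, threshold_cents, active) "
             "VALUES ('AAPL', 20000, 1)")

for _ in range(3):   # in production this would loop forever
    check_alerts(conn)
    time.sleep(1)    # poll interval; tune to the feed's rate limits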