It’s a tool like any other, appropriate under some circumstances and inappropriate in others.
Blindly rejecting it without considering whether it’s appropriate in the context is honestly just as bad as choosing it without considering whether it’s appropriate in the context, fwiw.
As an operator, this who thread reads like a bunch of devs who don’t understand networking and refuse to learn.
Sure, for smaller applications or small dev teams it doesn’t make sense. But for so many other things it does.
The problem is that all projects start small, and frankly most of them die small. Aiming for microservices architecture right away is a surefire way to get the project killed before anyone would benefit off of the microservices.
The other angle is the majority of Devs are just… Not good…
A good Dev in the situation you mention will design the solution needed now not the one you hope for later.
I’m saying this as someone who has been mired in scope creep and over engineering solutions many times in my life because “what if 5000 people need to use it at once?!”
In most cases all you need is a database, a single thread and a simple API. Build it and then as the problems come grow with them. Try to take into account issues of scale but realize you can’t and shouldn’t solve every scenario because there are too many variables which may never exist.
A good Dev in the situation you mention will design the solution needed now not the one you hope for later.
Maintainability is one of the most important if not the most important goal when programming. If a dev only designs a solution that fits for exactly the current situation but doesn’t allow any changes, it’s not a good dev.
But yeah, if you start small, a solution that’s made for that is preferable. You can still refactor things when you get magnitudes larger and have the budget.The tricky thing about software development is this balance: you don’t want to hobble your system by designing only for today, because that could waste a whole lot of time later when needs change, but you also mustn’t paralyze the project by designing for all possible tomorrows. Finding a middle path is the art, and the only proof that you got it somewhat right is that things get done with relatively few crises.
Microservice from the start may be a lot of overhead, but it should at least be made with that scalability in mind. In practice to me, that just means simple things like make sure you can configure it via environment vars, run it out of docker compose or something because you need to be able install it on all your dev systems and your prod server. That basic setup will let you scale if/when you need to, and doesn’t add anything extra when planned from the start.
Allocating infrastructure on a cloud service with auto scaling is the hard part imo. But making the app support the environment from the start isn’t as hard.
Often a simple solution is the most maintainable solution. In my experience, the code that’s most difficult to maintain are often made by devs who tried to plan ahead. The code turns over engineered to allow for stuff that never materialized.
If a dev only designs a solution that fits for exactly the current situation but doesn’t allow any changes, it’s not a good dev.
I don’t think anybody is arguing this. Nobody (in my decade-plus in this industry) actively codes in a way to not allow any changes.
You evidently haven’t met my colleagues. There are always people who go for the quickest hack despite the trouble it stores up for later, and they’re usually encouraged by management.
I always lump microservices architecture as premature optimization, one that should be used when you’re maxed out of resources or everything is too spaghetti.
I love the idea. And I even pitched it for a specific project. But I had to eat humble pie when the devops nerds threw more servers at the problem and it worked better than I expected.
Sounds like that thing was already designed to be horizontally scalable. Was your error related to not realizing that? Or what was the problem being solved by microservices?
Most software developers have no concept of real world limitations and issues like tolerances, failure, probability, latency, hysteresis, ramp-up etc. because they’re not engineers.
Normally they’d be expected to at least account for software-specific aspects like ACID or CAP or some vague awareness of the fact that when you’re dealing with multiple systems the data may not always arrive as you expect it, when you expect it. But even that is a crapshoot.
Networking has little to do with it.
Microarchitecture needs to be broken up into components. Those components need to send messages to each other. Components sending messages to each other is arguably the core of what object oriented design was trying to do all along. If your devs were bad at classifying components in an object oriented design, then they’ll probably be bad at it in a microarchitecture, too. Worse, the actual system is now likely spread amongst many different version control repositories, and teams stake out ownership of their repositories. Now you get more blockers spread amongst several teams.
Where the network layer comes into play is to replace something that used to be running in a single CPU core, or sometimes different cores on the same motherboard. Unless you can exploit parallelization for your use case, the kind where you have more threads than the number of CPU threads a single motherboard can handle (which can be several hundred on current systems), this will always be slower.
Being an older system admin this is how we worked. Would generally have an A and B side and maybe C on our stacks. Most of the time what I supported would be A and B clusters in two data centers and then do either 100% on our primary site or do a mix of traffic between our two sites
But why? Microservices do have some good advantages in some scenarios
Problem is that companies are using them for all scenarios. It’s often their entire tech stack now, with kubernetes.
It’s similar to the object oriented hype that came before it, where developers had to write all their programs in a way so they could be extended and prepared for any future changes.
Everything became complex and difficult to work with. And almost none of those programs were ever extended in any significant way where object oriented design made it easier. On the contrary, it made it far more difficult to understand the program since you had to know which method was called in which object due to polymorphism when you looked at the code. You had to jump around like crazy to see what code was actually running.
Now with kubernetes, it’s all about making the programs easier to scale and easier to develop for the developers, but it shifts the complexity to the infrastructure needed to support the networking requirements.
All these programs now need to talk over the network instead of simply communicating in the same process. And with that you have to think about failure scenarios, out of order communication, missing messages, separate databases and data storage for different services etc.
You can have the best tool in the world and still find people just hitting their own face with it.
I don’t think people have a choice. If you join a company where they use kubernetes, you have to use that technology for everything. You can’t escape the complexity even if you just want to make a simple program. It still needs to run in kubernetes.
Depends on who you think the people are.
CTOs, technical team leads and such can make those decisions. And devs can also suggest migrating to simpler solutions.
If a tech giant like Amazon can do it like they did with Prime Video, I don’t think it’s impossible other companies can do so too.
Yes but in practice, companies don’t want to replace their entire tech stacks, and specially if it’s a large company. It costs an enormous amount of money (because of the time and effort it takes) and means the entire company has to relearn how to work with that stack instead.
It’s not impossible and it can happen, but in my experience from working at probably 20 companies now, there is almost always a strong resistence to change.
People don’t even change their default search engine or browser most of the time.
if you just want to make a simple program. It still needs to run in kubernetes.
“hello OPS-team. Here is my simple program. Have fun running it on your kubernetes”
If object oriented design is fundamentally about components sending messages to each other, then microservices are a different route to OO design. If people are bad at OO design, then they’re likely bad at designing microservices, as well. The two aren’t so separate.
All these programs now need to talk over the network instead of simply communicating in the same process.
This is where things go really wrong. Separating components over the network can be useful, but needs careful consideration. The end result can easily be noticeably slower than the original, and I’m surprised anybody thought otherwise.
It’s absolutely slower. There is no way to make a network request faster than a function call. It’s slower by probably thousands of times.
I have to look it up every time, but this is always worth reading once a year to remind yourself:
Yeah I’ve seen it before. It’s a very good reminder for everyone to keep in mind isn’t it. :)
Since this is from 12 years ago, have any of these numbers changed much? Especially the SSD numbers.
Not in any order of magnitude
By that chart, 1MB read from an SSD is only 4 times slower than 1MB read from RAM. Wouldn’t have to be an order of magnitude improvement to have an important affect there.
There is no way to make a network request faster than a function call.
Apologies in advance if this it too pedantic, but this isn’t necessarily true. If you’re talking about an operation call that takes ~seconds to run, then the network overhead is negligible. And if you need specialized hardware for it, then it definitely could be delegate it out to a separate machine over the network. Examples could include requiring a GPU, more RAM, or even a faster CPU if your main application is running on more power-efficient CPUs.
I’m not saying that this is true in every case - they are definitely niche cases. But I definitely wouldn’t say that network requests are never faster than local function calls.
Well put. And this is a generic pattern; for example, GPUs are only faster than CPUs if the cost of preparing the GPU and retrieving the result is faster than directly evaluating the algorithm on the CPU. This also applies to main memory! Anything outside of the CPU can incur a latency/throughput/scaling tradeoff.
Think you’re understating it there. Network call takes milliseconds at best. Function call, if the CPU has correctly predicted the indirect branch, is basically free, but even if it hasn’t then you’re talking nanoseconds. It’s slower by millions of times.
Yeah it’s insane. But of course if scaling different parts of the application, I guess micro services are the way to do it. But otherwise one could scale the entire app by just putting more of the entire app on servers. No need for micro services. It just needs to be written to be able to listen to message queues and you can have any number of app instances.
I don’t disagree with there being tradeoffs in terms of speed, like function vs network requests. But eventually your whole monolith gets so fuckin damn big that everything else slows down.
The whole stack sits in a huge expensive VM, attached to maybe 3 or 4 large database instances, and dev changes take forever to merge in or back out.
Every time a dev wants to locally test their build, they type a command and have to wait for 15-30 minutes. Then troubleshoot any conflicts. Then run over 1000 unit tests. Then check that they didn’t break coverage requirements. Then make a PR. Which triggers the whole damn process all over again except it has to redownload the docker images, reinstall dependencies, rerun 1000+ unit tests, run 1000+ integration tests, rebuild the frontend, which has to happen before running end to end UI tests, pray nothing breaks, merge to main, do it ALL OVER AGAIN FOR THE STAGING ENVIRONMENT, QA has to plan for and execute hundreds of manual tests, and we’re not even at prod yet. The whole way begging for approvals from whoever gets impacted by anything from a one line code change to thousands.
When this process gets so large that any change takes hours to days, no matter how small the change is, then you’re fucked. Because unfucking this once it gets too big becomes such a monstrous effort that it’s equivalent to rebuilding the whole thing from scratch.
I’ve done this song and dance so many times. If you want your shit to be speedy on request, great, just expect literally everything else to drag down. When companies were still releasing software like once a quarter this made sense. It doesn’t anymore.
I agree with you, and that is a hellish environment to work in.
There must be a better middle ground for all of this.
In theory, it can be faster with parallelization. Of course, all the usual caveats about parallelization apply, and you’re most likely going to create a slower system if you don’t think it through.
On the contrary, it made it far more difficult to understand the program since you had to know which method was called in which object due to polymorphism when you looked at the code. You had to jump around like crazy to see what code was actually running.
I agree with this point, but polymorphism is often the better alternative.
Using switch statements for the same thing still have the problem that you need to jump around like crazy just to find where the variable was once set. It also tends to make the code more bloated.
Same with using function references, except this time it can be any function in the entire program.
The solution is to only use polymorphism when it’s absolutely needed. In my experience, those cases are actually quite rare. You don’t need to use it everywhere.
Yeah I agree. With experience you know where to use it and where it really shines, and when not to use it because it will just make everything harder to reason about.
But a lot of devs are not that experienced when they make these decisions. All of us learn from mistakes, and those mistakes stay in the code base. :)
They add a lot of overhead and require extra tooling to stay up to date in a maintainable way. At a certain scale that overhead becomes worth it, but it takes a long time to reach that scale. Lots of new companies will debate which architecture to adopt to start a project, but if you’re starting a brand new project it’s probably too early to benefit from the extra overhead of micro architectures.
Of course there are pros and cons to everything, don’t rely on memes for making architecture decisions.
I guess I’m not sure how others build with micro services, but using AWS SAM is stupid simple, and the only maintenance we’ve ever had to do is update a Node version. 🤷
SAM is serverless but that doesn’t necessarily mean micro services.
but they have a lot more disadvantages for most scenarios (if you’re not a faang scale company, you probably don’t need them)
The problem is that they become a buzz word for at scale companies that need them because they have huge complex architects, but then non at scale companies blindly follow the hype when they were created out of necessity for giant tech stacks that are a totally different use case.
What the fuck is a “lamba server less”, and why is my cloud bill so fucked? 🚒 🔥 💰 💸
😵😢😥🤯
I was looking up lambda functions for rust because i needed it for something and didn’t know how, what, etc. But searching anything lambda now only shows results for fucking amazon lambda bullshit! Really pisses me off… its fucked 😠
If you mean lambdas like in python where you say
lambda x: x+1
, they are calledclosure
s in rust, try searching for that instead.Thankyou so much! It can be hard to figure out when all results give me amazon lambda. Rust and their silly names…
AWS Fargate, amazing, AWS ServerLess,… Da fuk!
deleted by creator
The problem is that i didn’t know that so unless someone tells me, I’m stuck searching for lambdas :(
“You dont have to care about infrastructure…” is the bigest lie of those microservice hosting providers.
“You didn’t pay to have it across 47 different data-centres, of course you won’t get 100% uptime!”
Hockey stick go!
Dude just start with a monolith and part it out as you scale. Of course microservices are a waste of time if you build them right off the bat.
It’s just not worth it until your monolith reaches a certain size and complexity. Micro services always require more maintenance, devops, tooling, artifact registries, version syncing, etc. Monoliths eventually reach a point where they are so complicated that it becomes worth it to split it up and are worth the extra overhead of micro services, but that takes a while to get there, and a company will be pretty successful by the time they reach that scale.
The main reason monoliths get a bad rap is because a lot of those projects are just poorly structured and designed. Following the micro service pattern doesn’t guarantee a cleaner project across the entire stack and IMO a poorly designed micro service architecture is harder to maintain than a poorly designed monolith because you have wildly out of sync projects that are all implemented slightly differently making bugs harder to find and fix and deployments harder to coordinate.
I still have to find a name for this disease, but it’s somewhat like “you’re neither Google nor Netflix”.
Everything has to be Scalable™ even if a raspberry pi could serve 200 times your highest load.
I’m currently involved with a “micro service system”, that has very clear, legal requirements, so we know exactly, how much load to expect. At most, a few thousand users, never more than 100 working at the same time on very simple business objects. Complex business logic, but technically almost trivial. But we have to use a super distributed architecture for scalability…
I’m guessing you already got an answer for that though when you asked about it.
Could be either “oh you’re right let’s not do that”, or “because we want to design for horizontal scalability rather than vertical in case the demand grows later”, or “the client has requested and it’s paying for this feature” and so on.
It’s because they think it’s what you’re doing for a large project. Simple as that. There’s no future demand, the client doesn’t care, and I’m not right because they said so.
Micro-services and monoliths sit at opposite extremes though. There are other takes in-between, like multiple services (not micro) for example.
Micro services always require more maintenance, devops, tooling, artifact registries, version syncing, etc.
The initial transition is so huge too. Like, going from 20 to 21 services is no big deal, but going from 1 service to 2 is a big jump in the complexity of your operations.
One of our customers recently had tasked us with building a microservices thing. And I already thought that was kind of bullshit, because they had only vague plans for actually scaling it, but you know, let’s just start the project, figure out what the requirements really are and then recommend a more fitting architecture.
Well, that was 3 months ago. We were working on it with 2 people. Then at the end of last month, they suddenly asked us to “pause” work, because their budget situation was dire (I assume, they had to re-allocate budget to other things).
And like, fair enough, they’re free to waste their money as they want. But just why would you start a microservice project, if you can’t even secure funding for it for more than a few months?
Because some marketing asshole told them that they better be prepared to scale to a bazillion users.
In this case, the colleague who had talked to the customers told me, they wanted microservices, because they’d have different target systems which would need differing behavior in places.
So, I’m guessing, what they really needed is:
- a configuration file,
- maybe a plugin mechanism, and
- a software engineer to look at it and tell them the behavior is actually quite similar.
Typical issue of the corportate programming world being a hivemind. Just because many big tech companies use it you can’t blindly implement it for your 5 developer team.
And it for sure has its usecases - like if you run something with constant load swings that does n’t need to be 100 percent accurate like Youtube it makes sense. You can have a service for searches, comments, transcoding, recommendations, … which all scale independently trading in some accuracy. Like when you post a comment another person doesn’t need to see it within 1 second on another comment service instance.
In comparison with… and examples?
From my perspective the corporate obsession with microservices is a natural evolution from their ongoing obsession with Agile. One of the biggest consequences of Agile adoption I’ve seen has been the expectation of working prototypes within the first few months of development, even for large projects. For architects this could mean honing in on solutions in weeks that we would have had months to settle on in the past. Microservices are attractive in this context because they buy us flexibility without holding up development. Once we’ve identified the services that we’ll need, we can get scrum teams off and running on those services while working alongside them to figure out how they all fit together. Few other architectures give us that kind of flexibility.
All this is to say that if your current silver bullet introduces a unique set of problems, you shouldn’t be surprised if the solutions to those problems start to also look like silver bullets.
Microservices can be useful, but yeah working in a codebase where every little function ends up having to make a CAP Theorem trade-off is exhausting, and creates sooo many weird UX situations.
I’m sure tooling will mature over time to ease the pain of representing in-flight, rolling-back, undone, etc. states across an entire system, but right now it feels like doing reactive programming without observables.
And also just… not everything needs to scale like whoa. And they can scale in different ways: queue up-front, data replication afterwards, syncing ledgers of CRDTs… Scaling in-flight operations is often the worst option. But it feels familiar, so it’s often the default choice.
Do you feel gitops tools like fleet/argocd/flux and kubernetes don’t cover most of the deployment/rollback and system state management problems so far?
I’m talking about user interactions, not deployments.
In a monolith with a transactional data store, you can have a nice and clean atomic state transition from one complete, valid state to the next in a single request/response.
With a distributed system, you’ll often have scenarios where the component which receives the initial request can’t guarantee the final state of the system by the time it needs to produce a response.
If it did, it would spend most of its effort orchestrating other components. That would couple them together and be no more useful than a monolith, just with new and exciting failure modes. So really the best it can do is tell the client “Here’s a token you can use to check back on the state of this operation later”.
And because data is often partitioned between different services, you can end up having partially-applied state changes. This leaves the data in an otherwise-invalid state, which must be accounted for – simply because of an implementation detail, not because it’s semantically meaningful to the client.
In operations that have irreversible or non-idempotent external side-effects, this can be especially difficult to manage. You may want to allow the client to resume from immediately before or after the side-effect if there is a failure later on. Or you may want to schedule the side-effect, from the perspective of an earlier component in the chain, so that it happens even if a middle component fails (like the equivalent of a catch or finally block).
If you try to cut corners by representing these things as special cases where the later components send data back to earlier ones, you end up introducing cycles in the data flow of your microservices. And then you’re in for a world of hurt. It’s better if you can represent it as a finite state machine, from the perspective of some coordinator component that’s not part of the data flow itself. But that’s a ton of work.
It complicates every service that deals with it, and it gets really messy to just manage the data stores to track the state. And if you have queues and batching and throttling and everything else, along with granular permissions… Things can break. And they can break in really horrible ways, like infinitely sending the same data to an external service because the components keep tossing an event back to each other.
There are general patterns – like sagas, distributed transactions, and event-sourcing – which can… kind of ease this problem. But they’re fundamentally limited by the CAP Theorem. And there isn’t a universally-accepted clean way to implement them, so you’re pretty much doing it from scratch each time.
Don’t get me wrong. Sometimes “Here’s a token to check back later” and modeling interactions as a finite state machine rather than an all-or-nothing is the right call. Some interactions should work that way. But you should build them that way on purpose, not to work around the downsides of a cool buzzword you decided to play around with.
I struggle to think of any code with real-world applications that doesn’t have to make CAP tradeoffs. The reason that often doesn’t seem to be the case is willful ignorance.
The only way you can have ACID in a monolith is if the data store is incorporated into the RAM when the code runs and the code is not interacting with any exterior system.
Typically the data store is external though and yes you can have atomic states in the data store but that doesn’t translate to your code. That’s CAP right there, between your code and your data running on two distinct systems.
I mean I made be a novice on this but multi-state service in general sounds like a bad time. Isn’t the accepted best practice for a micro service program stateless operations and one state at most per service?
I mean its true for anything beyond a single threaded monolith right? Otherwise you just get apps that prentend to be asynchronous waiting on locks so they act totally synchronousaly.
Who at what company is having the conversation “let’s do (generic pattern)” without facing some kind of problem or inherent design need that can be solved by (generic pattern). Do these companies need software developers or did they just notice that all of the other companies have them? Surely some sort of inherent needs are driving their software.
Edited to make the generic pattern clearer
Yeah, I work for one of these companies. Some senior executive quotes some stupid thing Jeff Bezos said about everything being an API and is like “This! We need to do this!”
Nevermind the fact that we’re not AWS and our business has zero overlap with theirs. Nevermind that this mindset turns every service we design into a bloated, unmaintainable nightmare. And, forget the fact that our software division is completely unprofitable due to the checks notes shitty business decisions made by senior management.
No no, we’re going to somehow solve this by latching onto whatever buzzword is all the rage right. Turns out having an MBA doesn’t mean you know shit about running a business.
That sounds disgusting. This kind of thing is why I never move jobs.
- Cloud providers have financial incentive to push microservice architectures
- Cloud providers give corporate consultants statistics like “microservice architectures are proven to be X% more likely to succeed than monolithic architectures”
- Cloud providers offer subscription-based tools and seminars to help companies transition to microservice architectures
- Companies invest in these tools and seminars and mandate that all new projects adopt microservice architectures
This is how it went down with Agile at my company 10 years ago, and some process certifications and database technologies before that. Based on what I’m hearing from upper management microservice are probably next.
redundancy, rolling updates or byzantine fault tolerance in a monolith > naïve assumptions that one part of your system going down won’t mess up it’s overall usability by and large just because you’ve used microservices
Micro services alone aren’t enough. You have to have proper observability and automation to be able to gracefully handle the loss of some functionality. Microservice architecture isn’t a silver bullet, but one piece of the puzzle to reliable highly available applications that can handle faults well
We have openshift and all the works but only teams that want to use it, use it. Once your project is dockerized it goes swell.
Pretty sure microservice architecture was invented by enterprise architects as a way to justify their existence, and by dev teams in general to explain why adding new features now takes so long (we have adopted best practice!).
Removed by mod