GraphQL kinda sucks

Hacker News - Sat Aug 6 10:25

Having worked in big tech and small startups, I think GraphQL is a brilliant way to solve an organizational problem that massive tech companies have.

It's that the team maintaining the API is different from the team that needs changes to the API. Due to the scale of the organization the latter doesn't have the access or know-how to easily add fields to that API themselves, so they have to wait for the maintainers to add the work to their roadmap and get back to it in a few quarters. Relevant Krazam: https://www.youtube.com/watch?v=y8OnoxKotPQ

At a small start-up, if the GET /foo/:fooId/bar/ endpoint is missing a field baz you need, you can usually just add it yourself and move on.

From what I've read GraphQL makes the most sense in the context of large scale teams and large databases. It introduces a large amount of overhead to your backend to parse queries, but allows those queries to be more flexible and not think about the database's schema which is a tremendous boon when you get to the scale where communication between teams is more expensive than implementation time. For junior and midlevel devs watching youtubes and reading blogs it's obviously an exciting technology because of what it promises (in this sense GraphQL is hardly unique), but there's a practical cost to production workloads.

That said, if I'm not mistaken GraphQL is almost explicitly designed with a versionless paradigm in mind. Whether or not that's a good decision is up for debate, but it's less "no clear path" and more like it's the responsibility of the backend to add support for new access patterns without causing old patterns to fail.

Some libraries offer enterprise features for versioning.

Like you said it makes sense in larger companies. As a consumer of a well documented API it’s great, as someone who had to stitch together 2 microservices to provide the required data it sucks.

Whether it was the intention or not, I find GraphQL solved a fascinating problem: it let front end developers move faster by greatly decoupling their data needs from the backend developers.

Backend developers describe the data model, expose it via graphql. Front end developers, often ones who never met those backend developers, can see the data model and just use it. They can change what they're querying on the fly, get more or less as they see fit.

It lets everyone move faster.

But as a backend developer, I actually fucking hate it, myself.

I thought it was supposed to do this, but then discovered that it has no way to express joins.

Has this been addressed? I don't see how you can decouple the back-end data from front-end queries without that.


I get specialization, but are there any other good reasons to divide product teams between frontend and backend? I guess it also helps establish patterns and contracts, but I think those are only helpful above a critical mass that I haven’t reached in my career yet.


In a small organization there isn't generally any reason to divide the teams between front and back end. As you've alluded - once you have many clients you'll want to separate responsibilities in order to increase velocity.


From a management perspective, the fiction of the full stack developer that is equally skilled at everything is the easiest. You stick with that until you complicate your architecture (wisely or not) to the point where having specialists outweighs having to manage multiple queues of work and dependencies.

> good reasons to divide product teams between frontend and backend?

People specialize in different things. A great React developer may not be a great Java developer, and vice-versa.


It sounds like it is promoting a siloed cogs in the machine type of work ethic. Where you are either front end or back end and no one is thinking end-to-end about the system.

That's generally true for sizable companies. Small companies can and do use full stack devs.

Segmentation makes some sense but the industry is lacking end to end thinking as you point out.


I think the causality runs the other way. Once the frontend had gotten so complex that it required a specialized team, solutions arose to reduce the back and forth necessary between frontend and backend teams.


Curious as to why you hate it specifically. Because what you could be doing is exposing every table / field automatically based on permissions (which you could set up a system where you don't even have to be involved).

It's not an unpopular opinion: it's true. Graphql is a terrible piece of software/paradigm.

I've completely avoided it for years. If a potential new job contacts me and they use graphql, it's an immediate no from me. It's an immediate red flag that the engineering culture at the company is poorly run and would be a nightmare to work in.

Run away, as fast as possible.

> It's an immediate red flag that the engineering culture at the company is poorly run and would be a nightmare to work in.

Well, at least I'm glad I know I'll never be working with you!

I'm not giving you shit because you don't like GraphQL, I'm giving you shit because of your asinine "if you use GraphQL your company is poorly run and a nightmare" comment.

Whenever I see a comment from any developer that says "If you use technology X then you are an idiot", then I know that developer is either incredibly junior and doesn't understand the tradeoffs in choosing any technology, or they're showing typical "if you don't like what I like then you don't know what you're doing" arrogance that inevitably always makes it a pain in the ass to work with that person.


You are right that this arrogant attitude is juvenile, but it is prevalent amongst humans in all areas. Unfortunately, there would be few developers to work with if you apply the policy of not working with arrogant, dogmatic developers.

> Graphql is a terrible piece of software/paradigm.

Says who, exactly? What data are you basing this on? Or is it a completely subjective opinion dictated by frustration likely caused by the lack of understanding of it?

Did you try reading the thread? What do you mean by data?

> Or is it a completely subjective opinion dictated by frustration likely caused by the lack of understanding of it?

You can ask this very same question when people say graphql is an amazing piece of software/paradigm, except s/frustration/hype/

Be careful with anyone with a take that says some technology is 100% bad always. Given enough experience / skill you can make any technology fairly enjoyable so I've only ever seen mixed reactions at worst from people giving things a fair try.

GraphQL is a way to describe not only your API but also the entities and relationships in it. This enables certain useful things for client heavy applications, like cache normalization. If you look at clients like URQL they enable high quality features in your app that are otherwise extremely difficult.

You can also do this with JSONAPI but the GraphQL ecosystem is more developed.

Setting up GraphQL to minimize its rough edges is incredibly difficult. I've currently landed on a combination of Pothos + Genql + URQL to enable me to do everything in typescript instead of untyped strings.

It takes very high skill to use GraphQL well. Few teams get there because they don't have the upfront time to do all the research.

But if you pull it off it can be an incredibly productive system that is friendly to iteration and refactoring. I can send you some content we've produced on this if you're interested.

That said, if I'm not working on a client heavy app, I'd just use a less featureful RPC framework.

Agree with everything here, but something that often gets missed is that you don't have to use all that GraphQL enables from day one.

It is perfectly fine to start with an early implementation that treats GraphQL as mostly an RPC, with only resolvers for Query & Mutation types. You still benefit from GraphQL's type-safety, batching and code-generation.

Once you have more familiarity with dataloaders, query complexity etc. update your output objects to have links to other output objects building the graph model.
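
For anyone who hasn't met dataloaders yet, the idea is roughly: resolvers ask for keys one at a time, and the loader coalesces everything requested in the same event-loop tick into a single backend call. Below is a minimal stdlib-only Python sketch of that idea. `BatchLoader`, `fetch_authors` and `resolve_post_author` are hypothetical names made up for illustration, not the API of any real dataloader library, and the sketch does no cross-request caching.

```python
import asyncio


class BatchLoader:
    """Coalesces keys requested in the same event-loop tick into one batch call."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn  # async fn: list of keys -> list of values, same order
        self._pending = {}         # key -> Future awaiting that key's value
        self._task = None          # keep a reference so the dispatch task is not collected

    def load(self, key):
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._task is None:
            self._task = loop.create_task(self._dispatch())
        return self._pending[key]

    async def _dispatch(self):
        # Yield once so every resolver already scheduled can register its key.
        await asyncio.sleep(0)
        pending, self._pending = self._pending, {}
        self._task = None
        keys = list(pending)
        try:
            values = await self._batch_fn(keys)
        except Exception as exc:
            for future in pending.values():
                future.set_exception(exc)
            return
        for key, value in zip(keys, values):
            pending[key].set_result(value)


calls = []  # records each batch so we can see how many backend calls happened


async def fetch_authors(ids):
    calls.append(list(ids))  # stand-in for one "WHERE id IN (...)" query
    return [{"id": i, "name": f"author-{i}"} for i in ids]


async def resolve_post_author(loader, post):
    # What a per-post "author" resolver would do: ask for one key.
    return await loader.load(post["author_id"])


async def main():
    loader = BatchLoader(fetch_authors)
    posts = [
        {"id": 1, "author_id": 10},
        {"id": 2, "author_id": 11},
        {"id": 3, "author_id": 10},
    ]
    return await asyncio.gather(*(resolve_post_author(loader, p) for p in posts))


authors = asyncio.run(main())
print(calls)
```

Three resolver calls end up as one batch with the duplicate key requested only once, which is the difference between N+1 queries and a single lookup.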

The issue is that too many people get fascinated with GraphQL early, then build deep & complex topologies and expose it in inefficient and potentially insecure way.

> Be careful with anyone with a take that says some technology is 100% bad always.

THIS 100%.

> It takes very high skill to use GraphQL well. But if you pull it off it can be an incredibly productive system that is friendly to integration and refactoring.

I could not agree more. It's like any other piece of tech: once you internalize the mental model and are able to translate those abstractions in your language of choice everything clicks. And then it's hard to imagine going back to something more "primitive" (i.e. what's conventionally called "REST").

After building "RESTful" APIs for years I can confidently say GraphQL (with a decent implementation) is a step up across almost every possible dimension (performance aside because of the additional parsing).

> Be careful with anyone with a take that says some technology is 100% bad always. Given enough experience / skill you can make any technology fairly enjoyable so I've only ever seen mixed reactions at worst from people giving things a fair try.

Completely agreed. Without knowing the team experience, greenfield project or not or in general more information about the task at hand, how can anyone say GraphQL is good or not?

One thing I've noticed among some people who've failed to move up in their career is that they carry these extreme opinions due to a lack of proper understanding or a bad experience. Right tool for the job and all that.

In any fast paced business, APIs constantly change. Versioning is not straightforward. Often corners are cut, and deadlines have to be met. There just is one version of the API, the one in production!

In my experience the most pain around GraphQL was due to a lack of care/time. Too often schemas are not strictly defined, are too generic (type: any) and fields are not documented. Errors are poorly defined etc. Combine that with untyped code, and it's almost the same as sending blobs of arbitrary JSON around. In the short term it works, the feature is live, business is happy. But it takes double the effort to untangle the code when you need to change something....

GraphQL also requires effort to make tooling and monitoring work well with it as it deviates from a traditional REST like model.

> There just is one version of the API, the one in production!

Unless there are multiple in production, i.e /v1/api, /v2/api, etc

Comparing GQL to REST is like comparing a framework to a protocol. Of course it's easier to build up from REST, but if you're lucky enough that your project survives it won't be long before you're reimplementing the same GQL features you rubbished in a half-arsed way.

In an ideal world, we can grow into the tech pulling in features as needed, but REST (in its modern form) is all about day 1 productivity, not day 100.

> if you're lucky enough that your project survives it won't be long before you're reimplementing the same GQL features you rubbished in a half-arsed way.

Unless you don't need those features to begin with? I also think with this mentality you'd end up with a lot of teams saying "lets do GQL in case we need it later" and it ends up being a ton of work for no benefit later.

> It can save you bandwidth. Get what you ask for and no more

I feel like this is a false truth.

Most people are building a web app, or a mobile app and consuming an api and displaying all the data they retrieve.

If you have a REST API which returns an object with 6 properties. And a GraphQL schema which returns those same 6 properties. You’re not saving anything.

Now if you have a website and a mobile app, where the mobile app needs 3 fields and the website needs 6 fields. You will obviously save on bandwidth with the mobile app.

The problem here is most of us are not building Facebook. The data saved is peanuts and the bandwidth cost is probably going to be far less than the total cost of doing the work to support graphql.

For a company like Facebook which has many different integrations as well as 3rd parties integrating, graphql is obviously a godsend as integrators and integrations can consume only what they require and save Facebook millions of dollars in bandwidth.

> If you have a REST API which returns an object with 6 properties. And a GraphQL schema which returns those same 6 properties. You’re not saving anything.

You aren't thinking big enough. We have a graphql API where we have a bunch of enterprise users all wanting to pull out different types of data. They can decide what they want to get, and pull exactly that data. They want access to different tables, fields, and for different purposes and with different filters. We don't have to be involved, we just give them the schema.

> We have a graphql API where we have a bunch of enterprise users all wanting to pull out different types of data.

That’s a great use for graphql. My issue is toooo many people are building web apps with graphql where they are the only consumer. They are not getting any of the benefits of graphql, especially bandwidth savings.

If you have many people integrating and you envision the types of integrations needing very different queries for data then graphql is great.

But most companies are a website, maybe some mobile stuff, and not enough traffic to warrant the complexity or benefits of graphql.

Some other bad things:

- Makes caching more challenging since there are now more possible permutations of the data depending on what query the client uses. A hacker could just spam your server's memory with cache entries by crafting many variations of queries.

- Makes access control a lot more complicated, slower and error-prone since the query needs to be analyzed in order to determine which resources are involved in any specific query in order for the server to decide whether to allow or block access to a resource. It's not like in REST where the request tells you exactly and precisely what resource the client wants to access.

- Adds overhead on the server side. It requires additional resources to process a query rather than just fetching resources by ID or fetching simple lists of resources. A lot of work may need to happen behind the scenes to fulfill a query and GraphQL hides this from the developer; this can lead to inefficient queries being used. I have a similar complaint about database ORMs which generate complex queries behind the scenes; this makes it difficult to identify performance issues in the underlying queries (since these are often completely hidden from the developer). Hiding necessary complexity is not a good idea... Maybe worse than adding unnecessary complexity.

> - Makes caching more challenging since there are now more possible permutations of the data depending on what query the client uses. A hacker could just spam your server's memory with cache entries by crafting many variations of queries.

You could use something like https://stellate.co/.

> - Makes access control a lot more complicated, slower and error-prone since the query needs to be analyzed in order to determine which resources are involved in any specific query in order for the server to decide whether to allow or block access to a resource.

Hasura and Postgraphile can do this - in the case of Postgraphile it obviously requires Postgres.

I'm glad the tide seems to be turning against GraphQL.

Your company is not Facebook, you don't have an impossibly large graph dataset that needs querying

I've used it 3 times, each time it was a nightmare. REST with some additional params to query extra data is so much easier and safer ... that's just not very cool I guess.

Yes, graphql does indeed suck. Or rather it is not the best solution for all client-server communication that people treat it as, so it ends up being used in a lot of situations where it does suck.

Backend to backend communication is almost never graphql (is that changing in a big way?) which would indicate the main reason for graphql in a client-server situation is data saving but at the cost of complexity and other downsides. Almost certainly the data savings in many cases is not worth it.

Oh and then you have things like Apollo having cache bugs that result in incident level problems.

Add the fact that the cache performance is unpredictable and inconsistent.

GraphQL queries select specific fields for “performance reasons” so each request is custom and you can't cache at the edge. This is ridiculous because SQL joins are costly but fetching a few extra textual fields is basically never a bottleneck.

So in order to save a few bytes or KBs (which nobody cares about anyway), you ruin your caching abilities...

...Well unless you manually cache your GraphQL queries with more fields than needed but then what's the point of using GraphQL at all?

That's very often a terrible compromise.

I also had a bad experience with graphQL, or perhaps more accurately the Apollo client for it.

I hated the namespaces on the code-generated classes and ended up manually wrapping them, obviating the benefit.

It was also super brittle, when the back end would change something it would break the clients.


IME GraphQL is kinda great on the consuming side, but the complexity and maintenance overhead of query resolvers can outweigh the benefits. I agree that it makes sense to look elsewhere first. For example, react-query can solve many of the problems GraphQL aims to solve (eg overfetching), without the downsides.

I tried graphql in a microservices environment and had huge headaches over schema stitching.

Turns out, if you have lots of objects from disparate sources, graphql don’t like that. Now you must stitch these schemas together into an über schema roll query and use with graphql. Maybe we missed a step. Maybe we misunderstood. Maybe it was a bad decision.

I used graphql at one place and our stitching was a disaster. I'm thinking bad decision.

We also had micro services. It was the most unstable backend I've ever worked with.

Yet the architects who picked all these techs were mostly lauded and didn't stick around to deal with the mess.


And good luck doing that in a language that isn’t JS. All our GQL services were Go, but were forced to use JS for any stitching needs.

"Very Senior Dev" here (though it amuses me to call myself that). I had managed to avoid GraphQL for a while, but recently had to actually look at it and use it. I was appalled that in this day and age, this hyped silver bullet basically requires me to build queries using strings.

I was honestly surprised by this: when I first heard about the idea behind GraphQL, I was certain that I'd pass in nested data structures.

I use it because I have to (external services jumped on the hype bandwagon), but I don't consider it significant progress. And this "save bandwidth" advantage is totally oversold, I really can't see how the savings can be significant in a practical way.

> I was appalled that in this day and age, this hyped silver bullet basically requires me to build queries using strings.

That's like saying SQL is crap because you need to build queries by hand. Conflating a protocol/spec with its implementation and/or the way it's used is not something a "very senior dev" should do (not questioning you personally, just the validity of what you wrote in this specific comment).

There are plenty of good GraphQL libraries that make writing queries/mutations and the whole schema a breeze. I've been using graphene for Python for a couple of years and, although it has some rough edges, it's actually pretty decent. And GraphQL is quite a good mental model to work with that both backend and frontend can share.

<rant>Honestly I'm getting tired of seeing these comments on HN, it's the same for Kubernetes or other technologies. Often written by someone who didn't take the time to actually study and understand the tech and use it for actual projects. Often with no data to back it up whatsoever. The quality of posts and comments here used to be a lot higher but it's slowly turning into a plaintext version of dev.to </rant>

The GP is right. GraphQL is especially annoying because it looks so close to JS/JSON, yet since no thought was put into how it might integrate into existing typed languages it's actually surprisingly difficult to build a type-safe API around it.

And yes, SQL is "bad" because you write query strings. The funny bit is that GraphQL may be just as hard to model in a type-safe way as SQL is, if not a little harder. At least with SQL we have a reasonably good way to model it with methods (see LINQ). LINQ was built 15 years ago, so this problem was well understood back then.


Yeah, if you use something like Postgres and Hasura for a new project it's pretty simple. I doubt you could make a REST API much easier. Django + Django Admin is close, but that's not really an equivalent per se.


Not sure what you mean, I added the expected resulting GraphQL queries and they're about the same size. They aren't really normal queries, because I'm trying to exercise all of the features at once in the tests :)


Hmm, there’s really nothing preventing you from writing a library which allows you to pass a data skeleton to an async function and get a full body back.
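
As a rough illustration of the "data skeleton" idea, here is a minimal Python sketch that renders a nested dict as a GraphQL selection set. `to_query` is a hypothetical helper invented for this example. It assumes leaves are marked with `True`, and it ignores arguments, aliases, variables and fragments. The async part (posting the string and returning the body) is left out.

```python
def to_query(skeleton, indent=0):
    """Render a nested dict as a GraphQL selection set (hypothetical helper).

    A value of True marks a leaf field; a nested dict becomes a nested selection.
    """
    pad = "  " * indent
    lines = []
    for field, sub in skeleton.items():
        if isinstance(sub, dict):
            lines.append(f"{pad}{field} {{")
            lines.append(to_query(sub, indent + 1))
            lines.append(f"{pad}}}")
        else:
            lines.append(f"{pad}{field}")
    return "\n".join(lines)


skeleton = {
    "user": {
        "name": True,
        "addresses": {"street": True, "zip": True},
    }
}

# Wrap the selection set in braces to get an anonymous query document.
query = "{\n" + to_query(skeleton, 1) + "\n}"
print(query)
```

The string still exists, but it is generated from a plain data structure that the caller can build and type-check in their own language.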


I was psyched to hear about GraphQL, but when I looked at it and found no apparent way to do joins... I wondered what the big deal is.

> As far as the versioning goes, the prevailing wisdom seems to be that it just isn't needed on graphql apis

In reality, of course API versioning is needed.

The "prevailing wisdom" is people who like the tool, don't like its limitations, and want to pretend it isn't a problem.

.

> That said, versioning of a graphql api has been done at scale before. Shopify has done it

Perl people will happily point out that object systems exist for Perl.

Having it not be a standard, uniform part of the core tool is a severe limitation. Shopify needed this rudimentary feature, and tried to give it away because others need it too, and the tool vendors aren't willing (or able) to deliver.

> In reality, of course API versioning is needed.

I’d recommend not picking a technology that is versionless by design if you think you’re going to need it. Also I’d dispute that versioning is needed as a rule.


Ignore all the noise and just use an RPC model between your backend and frontend. All these stupid trends and overengineered abstractions will come and go, but people will still be using plain RPC in 5, 10, 100, and 1000 years.

At the cost of reducing cacheability though, no?

It is cheaper for me to put my react app in front of a cdn, split out my app into an api and front end than for me to have my site be entirely uncacheable.

I can also cache certain endpoints behind the cdn that are mostly invariant for users. And, the network egress of json is much less than the egress of markup.

There are patterns that can get you the same benefits without having to use GraphQL.

Even on a REST API, you can achieve the same pros

> - It makes working with describing the data you want easy
> - It can save you bandwidth. Get what you ask for and no more

You can describe the fields you need (and I assume that is what reduces the bandwidth)

GET /users?fields=name,addresses.street,addresses.zip
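
Server-side, handling that parameter can be a small projection step. A minimal Python sketch follows, assuming dotted paths select nested keys and lists are projected element by element. `project`, `parse_fields` and the sample user record are made up for illustration and do not come from any particular framework.

```python
def parse_fields(query_value):
    """Split the raw `fields` query-string value into a list of dotted paths."""
    return [p.strip() for p in query_value.split(",") if p.strip()]


def project(record, paths):
    """Keep only the requested dotted paths from a record (hypothetical helper)."""
    # Build a tree of wanted keys, e.g.
    # {"name": {}, "addresses": {"street": {}, "zip": {}}}
    tree = {}
    for path in paths:
        node = tree
        for part in path.split("."):
            node = node.setdefault(part, {})
    return _apply(record, tree)


def _apply(value, tree):
    # An empty tree means "take the whole value as-is".
    if not tree:
        return value
    if isinstance(value, list):
        return [_apply(item, tree) for item in value]
    if isinstance(value, dict):
        return {key: _apply(value[key], sub) for key, sub in tree.items() if key in value}
    return value


user = {
    "id": 7,
    "name": "Ada",
    "email": "ada@example.com",
    "addresses": [
        {"street": "1 Main St", "zip": "12345", "country": "US"},
        {"street": "2 Side St", "zip": "67890", "country": "US"},
    ],
}

slim = project(user, parse_fields("name,addresses.street,addresses.zip"))
print(slim)
```

Here `id`, `email` and each address's `country` are dropped from the response. Note this only trims the payload; unless the field list is also pushed down into the database query, the server still fetches everything.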

> - It makes documentation for data consumers easy

I don't think so in practice. You can see Shopify's GraphQL documentation [1]. If anything it is more complex than their REST API docs

> - It can make subscription easier for you to use

Not too different from using something like SSE or even websockets and every decent web framework seems to have a decent implementation

> - Can let you federate API calls

So many ways to achieve this at the application layer (which is what GraphQL federation does with a Router). In the python world this could be separate WSGI apps or racks in ruby? And makes no difference if done at the load balancer level.

[1] https://shopify.dev/api/admin-graphql/2022-07/enums/localiza...


GraphQl has a set of trade-offs. REST also has trade-offs. Bespoke RPC calls have trade offs. It's as if the entire discipline is an exercise in selecting trade offs or something.


Well put. Problem is we can get emotionally invested and tie our identities to these systems in a way that we couldn't if we were comparing the trade-offs between 18/10 and 18/12 inox steel.

> A regular api poorly implemented will have all the same cons and none of the pros.

Okay, so don't make a poorly implemented one?

Also, no, it kind of won't. Let's look at what they are again

.

"It is actually a pain to use"

Not an API characteristic

.

"you'll have to manage two or more type systems if there are no code first generates in your language"

Not an API characteristic

.

"It doesn't support map/tables/dictionaries."

Not an API characteristic

.

"No clear path for Api versioning"

Not an API characteristic

.

Looks like it's actually zero for four


In order to build a properly implemented one, one must first learn to build one. I find most people jump onto graphql because they see the volume of work to create crud endpoints for all of their models and decided to punt.


It’s a false promise. You’re just moving the complexity elsewhere, into wiring this ridiculous graphql infrastructure together and making it actually do what you want in all but the simplest/tutorial-like scenarios.


Seriously. I sometimes think I'm visiting a satirical alter realm when I see these obtuse systems that end up requiring more cognitive energy to glue together than the regular version in <lang of your choice here> being praised rather than derided.

> I sometimes think I'm visiting a satirical alter realm when I see these obtuse systems that end up requiring more cognitive energy to glue together than the regular version in <lang of your choice here> being praised rather than derided.

You're lucky, you're only visiting. I'm working in that alternate reality :-)

First, s/opinoin/opinion/

Second, there's a reason DBA (Database Administrator, something like that) is a formal, separate role.

If you don't have a DBA it's not because that's a good idea, it's because you (or your bosses) are too cheap to do things well. That's okay, a lot of times cheap is more important than correct.

You're still going to need schema.

Both NoSQL and GraphQL seem to throw the baby out with the bath water. It was all done before, and Relational Model won. Gotta know your history kids: https://en.wikipedia.org/wiki/Database#History

- - - -

Tidbit of lore: SQL is not the original language for relational model databases, Codd had a language he called Alpha: https://en.wikipedia.org/wiki/Alpha_(programming_language)

SQL is the JS of DBs!

> It can save you bandwidth. Get what you ask for and no more

I don't, and never have believed this argument. I would bet that JSON with Gzip or Zstd is just as small, or close enough that it doesn't matter.
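
This is easy enough to measure rather than bet on. A rough Python sketch, using entirely synthetic records (the field names and the repeated filler text are assumptions, and repetitive filler compresses far better than real data would), so treat it as a template for measuring your own payloads and not as evidence either way:

```python
import gzip
import json


def sizes(records):
    """Return (raw_bytes, gzipped_bytes) for a compact JSON encoding of records."""
    raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Synthetic "wide" records with a text field the client may not need.
full = [
    {
        "id": i,
        "name": f"user-{i}",
        "email": f"user-{i}@example.com",
        "bio": "lorem ipsum " * 20,
    }
    for i in range(1000)
]

# The same records trimmed to the two fields a client actually asked for.
slim = [{"id": r["id"], "name": r["name"]} for r in full]

full_raw, full_gz = sizes(full)
slim_raw, slim_gz = sizes(slim)

print(f"full: {full_raw} bytes raw, {full_gz} bytes gzipped")
print(f"slim: {slim_raw} bytes raw, {slim_gz} bytes gzipped")
```

Swapping in a real response body for `full` and the fields a client actually uses for `slim` gives the numbers that matter for a given API.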


If you have any data sets with an open text field that you don't care about, or where you really only need to get the ID from a wide table, it will definitely save a ton of bandwidth. How often that really matters on the other hand...


And in this case usually you can easily opt to exclude the troublesome field with a param, or an endpoint that returns a minimum version of the entity.