this post was submitted on 12 Jun 2023

387 points (99.2% liked)

How is lemmyworld so stable? (monero.house)

submitted 1 year ago by maltfield@monero.house to c/lemmyworld@lemmy.world

126 comments

At the time of writing, Lemmyworld has the second highest number of active users (compared to all lemmy instances)

https://lemmy.fediverse.observer/list

Also at the time of writing, Lemmyworld has >99% uptime.

By comparison, other lemmy instances with as many users as Lemmyworld keep going down.

What optimizations has Lemmyworld made to its hosting configuration that have made it more resilient than other instances' hosting configurations?

https://lemmy.ml/post/1221008

[–] PriorProject@lemmy.world 167 points 1 year ago (46 children)

I'm not an admin, but have followed the sizing discussions around the lemmyverse as closely as I can from my position of lacking first-hand knowledge:

lemmy.ml is the biggest instance by user count, but runs on incredibly modest 8-cpu hardware. Their cloud provider doesn't provide any easy scale up options for them, so they can't trivially restart on a bigger VM with their db and disk in place. I suspect this means that instance is going to suffer for a bit as they figure out what to do next.
lemmy.world on the other hand was running on a box at least twice as big as lemmy.ml at last count, and I believe they can go quite a bit bigger if they need to.
The lemmy.world admins also run mastodon.world and lived through the twitterpocalypse, seeing peak user registration rates of 4k per hour. So this is not their first rodeo in terms of explosive growth, and I'm sure that experience gives them some tricks up their sleeve.
The admin team is pretty clearly technically strong. If I recall correctly, ruud is a professional database admin. One of the spooky parts of Lemmy performance-wise is the db. If ruud or others on the admin team custom-tuned their pg setup based on their own analysis of how/why it's slow, they may be getting more performance per CPU cycle than other instances running more stock configs or that are cargo-culting tweaks that aren't optimal for their setup without understanding what makes them work.

I'm surprised that sh.itjust.works isn't growing faster. They also have a hefty hardware setup and seemingly the technical admins to handle big user counts. I wonder if it's a branding problem, where lemmy.world sounds inviting and plausibly serious while sh.itjust.works sounds like clowntown even though it's run by a capable and serious team.

[–] maltfield@monero.house 7 points 1 year ago (11 children)

Right, but if you don't have a cache set up, then the DB gets taxed. At a certain point a cache loses its benefit, but an enormous amount of savings can be made (to backend DB calls, for example) by just caching all API reads for ~60 seconds.
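
To make the "cache all reads for ~60 seconds" idea concrete, here is a minimal sketch of a time-based read cache. It is only an illustration under assumptions, not anything Lemmy or lemmy.world is known to run: `TTLCache`, `get_or_fetch`, and the injectable `clock` are hypothetical names made up for this example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory read cache; an entry is served until it is ttl seconds old."""

    def __init__(self, ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: the backend is not called
        value = fetch()  # missing or stale: exactly one backend call
        self._store[key] = (now, value)
        return value
```

With a 60 second TTL, a read that arrives many times per second reaches the backend roughly once a minute; the trade-off is that readers can see data up to 60 seconds old.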

[–] andrew@radiation.party 8 points 1 year ago* (last edited 1 year ago) (10 children)

Ensuring there's no data leakage in those cached calls can be tricky, especially if any API calls return anything sensitive (login tokens, authentication information, etc.), but I can see caching all read-only endpoints that return the same data regardless of permissions for a second or two being helpful for the larger servers.
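
As a rough sketch of that rule, a cache could refuse anything that is not a plain, credential-free read of an explicitly allow-listed endpoint. Everything named here is an assumption for illustration: the paths in `PUBLIC_READ_PATHS` are placeholders rather than real Lemmy routes, and the credential header and parameter names are guesses that would need checking against the actual API.

```python
from typing import Mapping

# Hypothetical allow-list: read-only paths whose response is identical for every caller.
PUBLIC_READ_PATHS = frozenset({"/example/posts", "/example/communities"})
# Assumed places where credentials might travel; verify against the real API.
CREDENTIAL_HEADERS = frozenset({"authorization", "cookie"})
CREDENTIAL_PARAMS = frozenset({"auth", "jwt"})


def is_cacheable(method: str, path: str,
                 headers: Mapping[str, str], query: Mapping[str, str]) -> bool:
    """Return True only for anonymous GETs of allow-listed paths."""
    if method.upper() != "GET":
        return False  # writes are never cached
    if any(name.lower() in CREDENTIAL_HEADERS for name in headers):
        return False  # response may depend on who is asking
    if any(name.lower() in CREDENTIAL_PARAMS for name in query):
        return False  # a token in the URL must not become a shared cache entry
    return path in PUBLIC_READ_PATHS
```

An allow-list errs on the side of not caching: a new endpoint stays uncached until someone has confirmed that its response does not vary by user.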

It's also worth noting that postgres does its own query-level caching, quite aggressively too. I've worked in some places where we had to add a SELECT RANDOM() to a query to ensure it was pulling the latest data.

[–] maltfield@monero.house 4 points 1 year ago (1 children)

In my experience, the biggest gains from caching come from caches that sit in front of the backend and keep responses in RAM, so the request never even reaches those services at all. I've used Varnish for this (which is also what the big CDN providers use). In Lemmy, I imagine that would be the nginx proxy that sits in front of the backend.

[–] PriorProject@lemmy.world 3 points 1 year ago (3 children)

I haven't heard admins discussing web-proxy caching, which may have something to do with the fact that the Lemmy API is currently pretty much entirely over websockets. I'm not an expert in web-sockets, and I don't want to say that websockets API responses absolutely can't be cached... but it's not like caching a restful API. They are working on moving away from websockets, btw... but it's not there yet.

The comments from Lemmy devs in https://github.com/LemmyNet/lemmy/issues/2877 make me think that there's a lot of database query optimization low-hanging fruit to be had, and that admins are frequently focusing on app configs like worker counts and db configs to maximize the effectiveness of db-level caches, indexes, and other optimizations.

Which isn't to say there aren't gains in the direction you're suggesting, but I haven't seen evidence that anyone's secret sauce is in effective web-proxy caches.

[–] maltfield@monero.house 5 points 1 year ago (1 children)

Yeah, that's exactly why I'm asking this question. All the effort seems to be going into the DB -- but you can have a horribly shitty DB and backend but still have a massively performant webserver by just caching away the reads to RAM.

I didn't see any tickets about this on the GitHub, which is why I'm asking around to see if there's actually some very low-hanging fruit for improving all the instances with a frontend RAM cache.

[–] PriorProject@lemmy.world 5 points 1 year ago

> Yeah, that's exactly why I'm asking this question. All the effort seems to be going into the DB -- but you can have a horribly shitty DB and backend but still have a massively performant webserver by just caching away the reads to RAM.

Much of your post seemed to focus on the techniques employed by lemmy.world; caching websocket responses in the web-proxy does not seem to feature prominently among those techniques.

If you're interested in advancing the state of the discussion around web-proxy caching, I'd consider standing up an instance to experiment with it and report your own findings. You wouldn't necessarily have to take on the ongoing expense and moderation headache of a public instance, you could set up with new user registrations closed, create your own test users, and write a small load generator powered by https://join-lemmy.org/api/ to investigate the effect of caching common API queries.
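
A small load generator along those lines could look like the sketch below. It is deliberately generic: `run_load` and `http_get` are hypothetical names, the URL is whatever read endpoint of your own test instance you choose (check the API documentation for real paths), and requests are sent one after another rather than concurrently.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def http_get(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()


def run_load(url: str, requests: int,
             fetch: Callable[[str], bytes] = http_get) -> Dict[str, float]:
    """Issue `requests` sequential GETs and summarize the observed latencies."""
    if requests < 1:
        raise ValueError("requests must be at least 1")
    latencies: List[float] = []
    for _ in range(requests):
        start = time.perf_counter()
        fetch(url)  # the body is discarded; only timing matters here
        latencies.append(time.perf_counter() - start)
    return {
        "requests": float(len(latencies)),
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }
```

Running it once against the instance with the proxy cache disabled and once with it enabled, using the same URL and request count, gives a first rough comparison. A serious test would also need concurrent clients and a realistic mix of endpoints.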

[–] s900mhz@beehaw.org 3 points 1 year ago (1 children)

I may be wrong, but there is a branch in the works (UI repo) that pulls the web socket out and replaces it all with HTTP calls. So the web socket may not be here for long.

[–] PriorProject@lemmy.world 1 points 1 year ago (1 children)

You're correct, the devs are already committed to deprecating the websocket API. This may make caching easier in the future and people may use it more as a result. I'm a little bit skeptical, as most of the heavy requests are from authenticated users, and web-proxy caching authenticated requests without risking serving them up to the wrong user is also non-trivial. But caching is not my area of expertise; there may be straightforward solutions here.
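
One common way to avoid serving a cached response to the wrong user is to make the caller's identity part of the cache key, so entries are never shared between users. The sketch below only illustrates that idea under assumptions: `cache_key` is a hypothetical helper, not something Lemmy or any particular proxy is known to use, and it hashes the token so the raw credential never appears in the key.

```python
import hashlib
from typing import Optional


def cache_key(method: str, path_and_query: str, auth_token: Optional[str]) -> str:
    """Build a cache key that is shared by anonymous callers but private per token."""
    if auth_token:
        # Hash the token so the credential itself is not stored in the cache index.
        identity = hashlib.sha256(auth_token.encode("utf-8")).hexdigest()
    else:
        identity = "anon"
    return f"{method.upper()} {path_and_query} user={identity}"
```

The cost is a low hit rate: each authenticated user only ever hits their own entries, so most of the saving still comes from anonymous traffic.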

But my comment was in reference to current releases in use on real world Lemmy servers.

[–] s900mhz@beehaw.org 1 points 1 year ago (1 children)

Yes, I didn’t intend to downplay your comment. Caching at the proxy layer with auth is something I am not familiar with. I never had to implement it in my career. (So far 😅) I just wanted to make it known that the web socket may be a thing of Lemmy's past, for anyone unaware.

[–] PriorProject@lemmy.world 2 points 1 year ago (1 children)

> Yes, I didn’t intend to downplay your comment.

I never interpreted it that way. Your comment was helpful, and I was expanding on it with more context. Lemmy on, friend.

[–] s900mhz@beehaw.org 1 points 1 year ago

Good to hear! Lemmy on 🐭✊

[–] yourstruly@dataterm.digital 3 points 1 year ago* (last edited 1 year ago)

I work on nginx cache modules for a CDN provider.

While websockets can be proxied, they're impractical to cache. There are no turnkey solutions for this that I'm aware of, but an interesting approach might be to build something on top of NChan with some custom logic in ngx_lua.

I agree with you that web proxy caches aren't the silver bullet solution. They need to be part of a more holistic approach, which should start with optimizing the database queries.

Caching with auth is possible, but it's a whole can of worms that should be a last resort, not a first one.
