Yup, in my experience E2E tests have been super successful at catching bugs not surfaced by other kinds of tests.
It is fairly relevant to lemmy as is. Quite a few instances have ram constraints and are hitting swap. Consider how much worse it would be in python.
Currently most of the issues are architectural and can be fixed by tweaking how certain things are done (e.g., hosting images on an object store instead of locally).
On the other hand, Rust is fairly resilient. The issues Lemmy is experiencing wouldn't be fixed by switching to Python or Java; they're more of an architectural constraint, and experienced devs can fix them mostly regardless of language.
Sorry if I was curt! No reason to be sorry for throwing out a decent idea
Caddy is not going to fix anything; on the contrary, it consumes more RAM. Generally the instances slow down when the DB starts hitting swap, so lowering RAM usage and optimizing that should be the first priority.
It's often useful to have a Discord or something to throw around approaches and discussions more conversationally before formalizing an issue or RFC, IMO, but happy to do it via GitHub too.
I would think it helps newer people to get set up and hacking on it as well
The issues I've seen are more around images. Hosting the images on an object store (Cloudflare R2, S3) and linking there would reduce a lot of federation bandwidth, as that's probably causing higher RAM/swap usage too.
pict-rs supports storing in object stores, but IIRC it still serves images through the instance when getting/serving them, which becomes the bottleneck. Serving them straight from the object store would do quite a bit to free up resources and lower the overall IO needed by the server.
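To make that concrete, here's a rough sketch (placeholder CDN host and key, not pict-rs's actual API): instead of proxying image bytes through the instance, the handler just points clients at the object store / CDN URL.

```rust
/// Hypothetical sketch: build the public object-store/CDN URL for an image key
/// so the instance can redirect clients instead of streaming the bytes itself.
/// The host below is a made-up placeholder.
const CDN_BASE: &str = "https://images.example-instance.org";

fn public_image_url(image_key: &str) -> String {
    // Real code would sanitize/percent-encode the key first.
    format!("{}/{}", CDN_BASE, image_key)
}

fn main() {
    // An HTTP handler would answer with a 302 to this URL rather than fetching
    // the object and proxying it, saving instance bandwidth, RAM, and IO.
    println!("Location: {}", public_image_url("pictrs/abc123.webp"));
}
```

The win is that the instance only hands out a small redirect (or pre-built link) and the object store/CDN does the heavy lifting of serving bytes.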
I will be working on this when I get cycles. Beyond the issues already mentioned above, there are a lot of areas for optimization, for instance how images are handled (e.g., through object storage like Cloudflare R2 to decrease bandwidth/RAM costs). Some of it is more dev-ops around how common instances are set up; other parts are code changes to make things more efficient.
Perhaps we should start a community or communication group for this?
+1, lemmy/kbin/mastodon are communities users can shape and contribute to (literally to the code as well) far more than reddit. That alone, along with the recent influx of users, makes it a far more interesting place than reddit.
That latter part is the same reason we mainly just use it in unit tests, and not much else. It’s such a baffling situation when there’s an assert, but the code still executes despite the assert. Debugging an incident or issue just becomes super annoying when that can be the case.
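A loose Rust analog of that failure mode, just as an illustration (I'm assuming the same "assert that doesn't actually stop anything" pattern): `debug_assert!` is only checked when debug assertions are on, so a release binary runs straight past it.

```rust
fn apply_vote(score: i64, delta: i64) -> i64 {
    // With debug assertions disabled (the default for release builds) this check
    // is compiled out, so execution continues even when the invariant is broken.
    debug_assert!(delta == 1 || delta == -1, "unexpected vote delta: {delta}");
    score + delta
}

fn main() {
    // `cargo run --release` prints 47 instead of panicking: the assert is right
    // there in the source, yet the code runs straight past it.
    println!("{}", apply_vote(42, 5));
}
```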
Was playing around with a small in-memory cache as well as materialized views to prevent the swap hits. Hard to prevent the inbound traffic though; maybe a CDN could help, but need to see what the traffic patterns look like.
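A minimal sketch of the in-memory cache idea (illustrative only, not actual Lemmy code): a tiny TTL cache in front of an expensive query, so repeated reads within the window skip the database entirely.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Minimal in-memory TTL cache sketch. Entries older than `ttl` count as misses
/// and get recomputed; everything else is served from memory.
struct TtlCache<V> {
    ttl: Duration,
    entries: HashMap<String, (Instant, V)>,
}

impl<V: Clone> TtlCache<V> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    fn get_or_insert_with(&mut self, key: &str, compute: impl FnOnce() -> V) -> V {
        if let Some((stored_at, value)) = self.entries.get(key) {
            if stored_at.elapsed() < self.ttl {
                return value.clone(); // fresh hit: no DB round trip
            }
        }
        let value = compute(); // miss or stale: hit the DB (or materialized view)
        self.entries.insert(key.to_string(), (Instant::now(), value.clone()));
        value
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_secs(30));
    // Stand-in for an expensive query like the front-page "hot" listing.
    let posts = cache.get_or_insert_with("hot_posts_page_1", || vec!["post a", "post b"]);
    println!("{posts:?}");
}
```

Even a short TTL on the hottest listings can shave a lot of repeated DB reads, which is exactly the load that pushes things into swap.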
Context size is huge, as is the ability to context switch effectively. It can mean the difference between solving something in a day versus weeks.
I like problem decomposition a lot as a discrete step. There's a huge tendency to go: I have problem A, let's just solve it with B. Many times the nuance of why A occurred, whether it's a symptom of something else, and what different subproblems comprise A gets skipped.
This often produces solutions which don't actually solve the problem, or just mask it. That extra effort up front leads to the proper solution and, as you said, very tactical fixes instead of huge unnecessary ones.
Definitely agree there! Communication is super underrated, especially with how difficult it can be to align people and teams across organizations.
Jargon is great for consolidating complexity into just a few words, reducing the things you have to think about. It can be equally valuable though to poke into implicit assumptions that are commonly made.
It's definitely a balance, and being inclusive in conversations is super important as you mentioned. It lets newer folks get up to speed much faster and invites more engagement from everyone in the discussion.
Oh absolutely! Looking to get it running locally so I can start building something out and trialing ideas. Trial-and-error iteration would work well here, I think.
What API are you using? Where is the data stored? It would be easier to answer with more detail on how this is set up.
The tricky part is defining an algorithm for "hot". It's a bit different from reddit, since some instances may have many people and local communities may have fewer, so there should probably be a balance such that posts from larger instances don't overwhelm the local ones.
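Something like this rough sketch, maybe (arbitrary weights, not Lemmy's actual ranking code): keep the usual time decay, but scale the vote count by the size of the post's home instance so big instances don't drown out small ones.

```rust
/// Illustrative "hot" rank sketch: time-decayed score, with votes scaled by the
/// active-user count of the post's home instance. The weights are guesses.
fn hot_rank(score: i64, instance_active_users: u64, hours_since_post: f64) -> f64 {
    // Normalize: 100 upvotes on a 100k-user instance counts for less than
    // 100 upvotes on a 500-user instance.
    let normalized = score as f64 / (instance_active_users as f64).max(1.0).ln().max(1.0);
    // Standard-ish gravity decay so older posts sink over time.
    normalized / (hours_since_post + 2.0).powf(1.8)
}

fn main() {
    let big = hot_rank(500, 100_000, 3.0); // popular post from a large instance
    let small = hot_rank(40, 500, 3.0);    // decent post from a small instance
    println!("big instance: {big:.3}, small instance: {small:.3}");
}
```

The log normalization is just one knob; the real question is what ratio of large-instance to local content feels right in the feed.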
Indirect prompt injection will make this worse. Plugins lead to scraping insecure websites (e.g., searching for docs on a particular topic), which can result in malicious context being embedded and suggested during a PR or code output.
Combine that with the above, faking commonly recommended inputs, and it becomes very difficult to just trust and use LLM output. One argument is that experienced devs can catch this, but security is often about the weakest link; one junior dev's mistake with this could open a hole.
There are guard rails to put in place for some of these things (e.g., auditing new libraries, only scraping from "reliable" websites), but I suspect most enterprises/startups implementing this stuff don't have such guard rails in place.
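For the "only scrape from reliable websites" rail, even a simple allowlist check before anything enters the model's context would help. A hypothetical sketch (made-up domain list):

```rust
use std::collections::HashSet;

/// Hypothetical guard rail: only let content from explicitly trusted domains
/// into the LLM's context. The domain list is illustrative.
fn is_allowed_source(url: &str, allowlist: &HashSet<&str>) -> bool {
    // Very rough host extraction; real code would use a proper URL parser.
    url.strip_prefix("https://")
        .and_then(|rest| rest.split('/').next())
        .map(|host| allowlist.contains(host))
        .unwrap_or(false)
}

fn main() {
    let allowlist: HashSet<&str> = ["docs.rs", "doc.rust-lang.org"].into_iter().collect();
    // Content from unknown domains gets dropped (or flagged for audit) instead of
    // being embedded into the prompt, cutting off one indirect-injection path.
    println!("{}", is_allowed_source("https://docs.rs/serde/latest/serde/", &allowlist)); // true
    println!("{}", is_allowed_source("https://evil.example.com/fake-docs", &allowlist));  // false
}
```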
Related
Other types of tests.
For instance, integration with other services, performance regressions, etc.