Is preloading/caching data before the actual method call an (anti)pattern?

@Cyno · edited 7 months ago

Is preloading/caching data before the actual method call an (anti)pattern?

booooop [any] · 7 months ago

If testing this properly is your problem, you should invest time in integration tests; running them against an in-memory database is an option as well. I think retrieving all the data up front and “caching” it, as you call it, has some negative consequences. For example, what if the validation for some action fails and you never needed whatever you preloaded? That’s a wasted call to the db.
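A minimal sketch of the in-memory option, assuming Python and SQLite purely for illustration (the thread doesn’t name a stack). The `users` table, `load_user_names`, and `make_test_db` are hypothetical names, not anything from the original code:

```python
import sqlite3


def load_user_names(conn):
    """Hypothetical data-access function under test."""
    rows = conn.execute("SELECT name FROM users ORDER BY id").fetchall()
    return [name for (name,) in rows]


def make_test_db():
    """Build a throwaway database that lives only in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("ana",), ("bo",)]
    )
    conn.commit()
    return conn


def test_load_user_names():
    conn = make_test_db()
    try:
        # Real SQL runs here, but nothing touches disk or a shared server.
        assert load_user_names(conn) == ["ana", "bo"]
    finally:
        conn.close()


test_load_user_names()
```

Each test gets a fresh database, so tests can’t leak state into each other. The usual caveat is that an in-memory engine may behave differently from your production database.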

@pohart · edited 7 months ago

You’re right that this could introduce regressions, but it sounds like it’s making the code more testable.

My biggest concern would be introducing db contention with locks being held for too long, and introducing race conditions because the records aren’t locked while they sit in the cache.

Edit: your->you’re

@Cyno · 7 months ago

Validation is usually the first step, so of course I only start preloading after it’s done, but you are right - you can easily end up loading more data than is necessary.

However, it can also result in fewer queries overall: if I load all relevant entities at the beginning, I won’t have to make 2+ separate calls later to get the data I need. For example, if I’m processing weather for 3 users, I know to preload all 3 users and the weather data for the 3 locations where they live. The old implementation could end up loading the 3 users, then going into a loop and eventually into a method that processes their weather data, doing a separate weather db hit for each of the 3 users (this is a simplified example, but it’s something I’ve definitely seen happen in more subtle ways).
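To make the query counts concrete, here is a rough sketch in Python. `CountingDb` is a hypothetical stand-in for a real database that only counts round trips, and all the method names are made up for illustration:

```python
class CountingDb:
    """Hypothetical fake database that counts round trips."""

    def __init__(self, users, weather):
        self._users = users      # user_id -> location
        self._weather = weather  # location -> temperature
        self.queries = 0

    def get_users(self, user_ids):
        self.queries += 1
        return {uid: self._users[uid] for uid in user_ids}

    def get_weather(self, location):
        self.queries += 1
        return self._weather[location]

    def get_weather_many(self, locations):
        self.queries += 1
        return {loc: self._weather[loc] for loc in locations}


def process_lazily(db, user_ids):
    """Old style: one weather lookup per user inside the loop."""
    users = db.get_users(user_ids)
    return {uid: db.get_weather(loc) for uid, loc in users.items()}


def process_preloaded(db, user_ids):
    """Preloading style: fetch everything first, then only read from dicts."""
    users = db.get_users(user_ids)
    weather = db.get_weather_many(set(users.values()))
    return {uid: weather[loc] for uid, loc in users.items()}
```

With 3 users in 3 locations, `process_lazily` makes 4 round trips (1 for the users, 3 for the weather) while `process_preloaded` makes 2, and both return the same result.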

I guess I’m just trying to find a way to keep it a pure method with only “actual logic” in it, without depending on a database. Forcing developers to think ahead about what data they actually need in advance also seems like a good thing maybe.
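Roughly what I have in mind, as a Python sketch. `Preloaded` and the frost-warning rule are invented for the example; the point is only that the logic takes a snapshot of the data and never touches the database:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preloaded:
    """Hypothetical snapshot of everything the logic needs."""
    user_locations: dict  # user_id -> location
    temperatures: dict    # location -> temperature in degrees Celsius


def users_needing_frost_warning(data, threshold_c=0.0):
    """Pure logic: reads only the preloaded snapshot, no database access."""
    return sorted(
        uid
        for uid, loc in data.user_locations.items()
        if data.temperatures[loc] < threshold_c
    )


# A unit test needs no database or mocks, just plain data.
snapshot = Preloaded(
    user_locations={1: "oslo", 2: "rome", 3: "oslo"},
    temperatures={"oslo": -4.0, "rome": 12.0},
)
assert users_needing_frost_warning(snapshot) == [1, 3]
```

The trade-off is that whoever calls this has to know up front which data to put in the snapshot.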

@pohart · 7 months ago

> Forcing developers to think ahead about what data they actually need in advance also seems like a good thing maybe.

It does.