Hi everyone!

Intro

It's been a long ride since I started my first Docker container 3 years ago. I've learned a lot: how to build my own custom image with a Dockerfile, loading my own configuration files into the container, getting along with docker-compose, Traefik and YAML syntax… and so on!

However, while tinkering with vaultwarden’s config and switching to PostgreSQL, there’s something that’s really bugging me…

Questions


  • How do you/devs choose which database to use for your/their application? Are there any specific things to take into account before choosing one over another?

  • Does consistency in database containers make sense? I mean, changing all my containers to use ONLY Postgres (or MariaDB, whatever)?

  • Does it make sense to update the database image regularly? Or is the application bound to a specific version that will break after any update?

  • Can I switch from one database to another even if you/the devs chose to use e.g. MariaDB? Or is it baked/hardcoded into the application image, and does switching to another database require extra programming skills?

Maybe not directly related to databases, but this one has also been bugging me for some time now:

  • What’s Redis’ role in all of this? I can’t for the hell of me understand what it does and how it’s linked between the application and the database. I know it’s supposed to give faster access to resources, but if I remember correctly, while playing around with Nextcloud, the Redis container logs were dead silent. It seemed very “useless” or inactive from my perspective. I’m always wondering “Hmm, Redis… what are you doing here?”.

Thanks :)

  • @towerful
    29 days ago

    These days, I just use postgres for my projects.
    It’s rare that it doesn’t do what I need, or that an extension doesn’t provide the functionality. Postgres just feels like cheating, to be honest.

    As for flavour, it’s up to you.
    You can start with an official image. If it is missing features, you can always just patch on top of their docker image or dockerfile.
    There are projects that build additional features in, like automatic backups, streaming replication with automatic failover, connection pooling, built-in web management, etc.

    Most of the time, the database is hard-coded.
    Some projects will use an ORM that supports multiple databases (database agnostic).
    Some projects only use basic SQL features, so they can theoretically work with any SQL database; other projects use extended features of their chosen database, so they are more closely tied to that database.
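    To make that concrete, here’s a rough sketch of the “database agnostic via ORM” idea in Python, assuming SQLAlchemy as the ORM; the table, credentials and hostnames are made up for illustration:

    ```python
    # The model and query code stay the same; only the connection URL decides the backend.
    from sqlalchemy import create_engine, String
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, Session

    class Base(DeclarativeBase):
        pass

    class User(Base):
        __tablename__ = "users"
        id: Mapped[int] = mapped_column(primary_key=True)
        name: Mapped[str] = mapped_column(String(100))

    # Swap the URL and the same code runs against another backend, e.g.
    # "postgresql+psycopg://app:secret@db:5432/app" or "mariadb+pymysql://app:secret@db:3306/app".
    engine = create_engine("sqlite:///app.db")
    Base.metadata.create_all(engine)

    with Session(engine) as session:
        session.add(User(name="towerful"))
        session.commit()
    ```

    Apps written against raw SQL or database-specific extensions don’t get that kind of portability for free.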

    With versions, again, some features get deprecated. Established databases try to stay stable, and projects try to use databases sensibly. Why use hacky behaviour when dealing with the raw data?!
    Most databases will have an LTS version, so stick to that and update regularly.

    As for redis, it’s a cache.
    If “top 10 files” is a regular query, instead of hitting the database for that, the application can cache the result, and the application can query redis for the value. When a new file is added, the cache entry for “top 10 files” can be invalidated/deleted. The next time “top 10 files” is requested by a user, the application will “miss” the cache (because the entry has been invalidated), query the database, then cache the result.
    Redis has many more features and many more uses, but is most commonly used for caching. It’s a NoSQL database, supports pub/sub, can be distributed, all sorts of cool stuff. At the point you need Redis, you will understand why you need Redis (or NoSQL, or pub/sub).
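    Here’s a bare-bones sketch of that cache-aside pattern, assuming Python with the redis-py client and a made-up “top 10 files” query; the names and expiry time are just illustrative:

    ```python
    # Cache-aside: check Redis first, fall back to the database on a miss,
    # and invalidate the cached entry whenever the underlying data changes.
    import json
    import redis

    r = redis.Redis(host="redis", port=6379, decode_responses=True)
    CACHE_KEY = "top_10_files"

    def query_top_10_from_db():
        return ["a.txt", "b.txt"]  # placeholder for the real (expensive) SQL query

    def get_top_10_files():
        cached = r.get(CACHE_KEY)
        if cached is not None:                        # cache hit: no database round-trip
            return json.loads(cached)
        result = query_top_10_from_db()               # cache miss: hit the database...
        r.set(CACHE_KEY, json.dumps(result), ex=300)  # ...and cache the result for 5 minutes
        return result

    def add_file(name):
        # ... insert the new file into the database here ...
        r.delete(CACHE_KEY)                           # invalidate so the next read is recomputed
    ```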

    For my projects, I just use a database per project or even per service (depending on interconnectedness).
    If it’s for personal use, it’s nice not to worry about destroying other personal projects by messing up the database.
    If it’s for others, it’s data isolation without much thought.

    But I’ve never done anything at extremely large scales.
    The last big project was 5k concurrent users, and I ended up using Firebase for it due to a bunch of specific requirements.