Happy Labor Day! I labored over a Taylor Swift Acronym Lookup app, and I'd love some feedback

charles@poptalk.scrubbles.tech · 2 years ago

Happy Labor Day! I labored over a Taylor Swift Acronym Lookup app, and I'd love some feedback

devdad · 2 years ago

I have no skin in the game for the app itself, I just saw your post on the “front page” while scrolling and shitting….

there are 30,000 files

I’m more intrigued why you appear to be managing this with files. Why not use a database?

Edit: you’re also not handling misses very well. I just tried a random string and got the below error. That doesn’t tell me if it was my fault or a server error.

An error occurred while processing your request. Please try again later.

charles@poptalk.scrubbles.tech · 2 years ago

I appreciate the response, even from a non-swiftie.

Yeah, that error message is left over from an earlier version where I sharded by acronym length instead. At that time, there would always be a file. The problem was the files were getting to be huge at the 50 character length (20MB) and performance went to shit on poor mobile connections. So I refactored to shard by first 4, and files dropped down to a few K each and became a lot snappier.
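For readers following along, the "shard by first 4" idea might look roughly like the sketch below. The `shardPath` name, the `data/` prefix, and the `.json` naming are assumptions for illustration, not the app's actual layout.

```javascript
// Hypothetical helper: map an acronym to the static shard file that would hold it.
// The "data/" directory and "<PREFIX>.json" naming are assumed, not taken from the app.
function shardPath(acronym, prefixLength = 4) {
  // Keep only A-Z so that punctuation or spaces never end up in a file name.
  const key = acronym.toUpperCase().replace(/[^A-Z]/g, "");
  if (key.length === 0) return null; // nothing usable to look up
  // Every acronym sharing the same first four letters lands in the same small file.
  return `data/${key.slice(0, prefixLength)}.json`;
}
```

With a layout like this, a prefix that has no matching lyrics simply has no file, so the host answers with a 404 rather than an empty result.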

As for why files, there are a few reasons:

This way, the data are statically hosted, which means I can take advantage of a number of different free hosting services. In this case, I used Firebase.
The data doesn’t change frequently. There’s gonna be two more re-records released, and then we’ll probably get back to the old cadence of an album every year or two.
Why would I use a whole ass database if I could avoid it? From the client perspective, the request looks basically the same: “hey server, give me this data”, but this way it’s 100% static with a CDN, so the response will be sub-10ms for a lot of people
Not doing any processing server side means I don’t have to worry about it going viral and breaking under the load.

devdad · 2 years ago

Yeah, that error message is left over from an earlier version where I sharded by acronym length instead. At that time, there would always be a file. The problem was the files were getting to be huge at the 50 character length (20MB) and performance went to shit on poor mobile connections. So I refactored to shard by first 4, and files dropped down to a few K each and became a lot snappier.

I don’t think that’s necessarily true. I just tried a random string, and I got the correct 404 response back, but it doesn’t look like the app handles that case and it just prints that error on any error.

            .catch(error => {
                console.error("Error fetching or processing the JSON file:", error);
                displayError("An error occurred while processing your request. Please try again later.", tableBody);
            });

Anyway, I wasn’t trying to shit on it. Good job :)

charles@poptalk.scrubbles.tech · 2 years ago

Yeah I sincerely appreciate the feedback.

The 404 should just say the same message as “acronym not found”. It just means the first 4 letters didn’t match a file on the backend; I didn’t enumerate all the blank json for A-Z*4.

It was a really challenging project to process all the data. As with most large datasets, there are tons of pain points. Like 60% of the time spent was parsing out the song name from the janky first line of metadata. Some pain I dealt with over the project off the top of my head:

Song titles with special characters (Question…?)
Song titles that start with special characters (…Ready For It?)
Song titles without capitalization (all of folklore/evermore albums)
Inconsistent apostrophes ' vs ’
Lyric words that start with special characters
The god damn ZWJs (zero-width joiners) littered throughout everything
For some reason the Cyrillic “e” is used everywhere, which leads to some dupe lyrics
How to treat numbers (I decided the acronym should be the first letter of the number; search for “OTTF”)
The source (Genius) littering random strings throughout (“you might also like…”)
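A few of the cleanup steps in that list can be sketched as below. This is a guess at the kind of normalization involved, not the project's real pipeline: `normalizeLyric`, `initialOf`, `toAcronym`, and the `DIGIT_INITIALS` table are all hypothetical names, and only single-digit numbers are handled.

```javascript
// Assumed mapping from a single digit to the first letter of its English name.
const DIGIT_INITIALS = {
  "0": "Z", "1": "O", "2": "T", "3": "T", "4": "F",
  "5": "F", "6": "S", "7": "S", "8": "E", "9": "N",
};

// Clean up the character-level problems before splitting into words.
function normalizeLyric(text) {
  return text
    .replace(/[\u200B-\u200D\uFEFF]/g, "") // zero-width space, ZWNJ, ZWJ, BOM
    .replace(/[\u2018\u2019]/g, "'")       // curly apostrophes to straight ones
    .replace(/\u0435/g, "e")               // Cyrillic small "е" to Latin "e"
    .replace(/\u0415/g, "E");              // Cyrillic capital "Е" to Latin "E"
}

// Pick the acronym letter for one word (leading punctuation already removed).
function initialOf(word) {
  const digits = word.match(/^\d+/);
  if (digits) {
    // Only single digits are mapped; longer numbers are skipped here because
    // they would need a proper number-to-words step.
    return digits[0].length === 1 ? DIGIT_INITIALS[digits[0]] : "";
  }
  return word[0].toUpperCase();
}

function toAcronym(text) {
  return normalizeLyric(text)
    .split(/\s+/)
    .map(word => word.replace(/^[^\p{L}\p{N}]+/u, "")) // drop leading "…", quotes, etc.
    .filter(word => word.length > 0)
    .map(initialOf)
    .join("");
}
```

Under these assumptions, `toAcronym("1, 2, 3, 4")` gives `"OTTF"` and `toAcronym("…Ready For It?")` gives `"RFI"`.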

devdad · 2 years ago

The 404 should just say the same message as “acronym not found”. It just means the first 4 letters didn’t match a file on the backend; I didn’t enumerate all the blank json for A-Z*4.

Yeah, but it doesn’t translate to the site. That’s what I’m trying to say :) Your catch above doesn’t distinguish between a 404 and anything else (5xx) and displays “An error occurred while processing your request. Please try again later.” for all eventualities. So, regardless of whether the acronym wasn’t found or there was a genuine server error, the same error message is displayed.
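One way to separate the cases is to check the response status before parsing, as in the minimal sketch below. The message strings, the `messageForStatus` and `loadShard` names, and the injectable `fetchImpl` parameter are assumptions for illustration; the app's real code may be structured differently.

```javascript
// Sketch only: translate an HTTP status into a user-facing message.
function messageForStatus(status) {
  if (status === 404) return "Acronym not found.";                 // no shard for this prefix
  if (status >= 500) return "Server error. Please try again later."; // genuine server failure
  return "An error occurred while processing your request.";       // anything else unexpected
}

// fetchImpl defaults to the global fetch; passing it in keeps the function easy to test.
async function loadShard(url, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(url);
    if (!response.ok) {
      // fetch resolves normally on 4xx/5xx, so the status has to be checked by hand.
      return { ok: false, message: messageForStatus(response.status) };
    }
    return { ok: true, data: await response.json() };
  } catch (error) {
    // fetch rejected (or the body was not valid JSON): report it separately.
    return { ok: false, message: "Could not load the data. Check your connection and try again." };
  }
}
```

The point is that `fetch` does not reject on a 404, so the existing `.catch` is most likely being reached from a later step (for example, parsing the 404 page as JSON), which is why every failure ends up with the same message.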

It was a really challenging project to process all the data. As with most large datasets, there are tons of pain points. Like 60% of the time spent was parsing out the song name from the janky first line of metadata. Some pain I dealt with over the project off the top of my head:

I honestly have no idea what you had to drag it all out from, but it looks well implemented from the small amount I played around with it. I’ve never used Firebase, but it looks like you got it working so that’s a good job too.

It’s probably just my old man brain that saw you were doing this all with files and it felt odd. That’s not to say it’s wrong, it’s just different to what I would have done.

There’s a bunch of advantages to databases, like indexes and partial/fuzzy text matching - but I can certainly understand why you went this route if you needed to keep costs down and didn’t want to bother with any DB maintenance.

Well done :)

charles@poptalk.scrubbles.tech · 2 years ago

As a fellow old man, at least relative to this fanbase, I fully understand, and this is exactly the kind of feedback I was hoping to get. Thanks!

As a devops engineer, sometimes the most efficient server is the one that doesn’t exist; next best is the one that someone else pays for. If the Heroku free tier still existed, I’d consider using that to handle queries server side and aggressively cache them in a CDN.

charles@poptalk.scrubbles.tech · 2 years ago

Forgot to include the link in the body: https://taylorswiftacronyms.web.app/

Acronym Lookup for Taylor Swift Songs