- cross-posted to:
- programmer_humor
- [email protected]
We have Unicode these days: blåhaj
Unicode in filenames? Are you crazy?!
Okay that was /s to some extent but I gotta rant, I’m totally convinced that there’s still new software today that completely trips over itself when files or paths have non-ASCII characters, or sometimes even a space. Incompetence didn’t go anywhere.
I still use underscores for filenames, basically muscle memory at this point
Spaces in file names will always be fiddly though. It’ll work, but it’ll still be wrong, because arguments are space separated, and having spaced file names totally messes with that.
I try to just always put file names or paths into quotes in CLI or tie them to a variable in programming. This way it also accepts spaces and knows how to separate them from arguments.
Yeah. It’s a good idea to guard against it, but I would still never put spaces in filenames that I myself choose.
Unicode in filenames can be a bad idea, since there is more than one way to achieve what looks like the same character. So matching patterns could fail if you think it’s one way, but it’s actually another representation in Unicode.
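To make the quoting point concrete, here is a minimal Python sketch. The file name is made up for illustration, and the child process is just the current Python interpreter counting the arguments it receives, so nothing here depends on a particular shell.

```python
import shlex
import subprocess
import sys

filename = "my blahaj notes.txt"  # hypothetical file name containing spaces

# Naive whitespace splitting tears the name into three separate arguments.
naive_args = f"cat {filename}".split()

# shlex.quote adds shell-style quoting, so a shell-like parse keeps the name whole.
quoted_args = shlex.split(f"cat {shlex.quote(filename)}")

# Passing a list to subprocess bypasses the shell, so no quoting is needed:
# each list element arrives as exactly one argument.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)", filename],
    capture_output=True,
    text=True,
    check=True,
)
arg_count = int(result.stdout.strip())
```

Here `naive_args` ends up with four elements, while `quoted_args` has two and the child process sees a single argument.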
Good point. Do filesystems use a normal form to at least prevent having two files with effectively the same name?
I should point out the flip side though, that there’s no avoiding Unicode in filenames. Users in languages that don’t use the Latin alphabet (such as Japanese, Chinese, Korean, Hebrew, Arabic, Greek and Russian, and the list could go on) can reasonably expect to be able to give a file a name they can read and understand with no extra effort. All the software woes that come with it - too bad, software needs to deal with it.
I’m not sure. A few years ago I remember that OpenBSD expected ASCII for files, but I think Linux expects utf-8. I could be wrong though.
I’m assuming Unicode anyway, and UTF-8 is by far the most natural because most files will be in ASCII. A “normal form” (see link above), you might think of it as a canonical form, is a way to check if two strings are equivalent, even if they encoded the text differently. Like the example mentioned on Wikipedia:
For example, the distinct Unicode strings “U+212B” (the angstrom sign “Å”) and “U+00C5” (the Swedish letter “Å”) are both expanded by NFD (or NFKD) into the sequence “U+0041 U+030A” (Latin letter “A” and combining ring above “°”) which is then reduced by NFC (or NFKC) to “U+00C5” (the Swedish letter “Å”).
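The Wikipedia example can be checked with Python's standard `unicodedata` module. This is only a sketch of comparing names after normalizing; the helper name `same_name` is made up here, and whether a given filesystem normalizes at all is a separate question.

```python
import unicodedata

angstrom = "\u212b"     # ANGSTROM SIGN
a_ring = "\u00c5"       # LATIN CAPITAL LETTER A WITH RING ABOVE
decomposed = "A\u030a"  # "A" followed by COMBINING RING ABOVE

def same_name(a, b):
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The raw strings differ code point by code point...
raw_equal = angstrom == a_ring
# ...but they are equivalent once both are normalized.
normalized_equal = same_name(angstrom, a_ring)

# The same applies to a file name typed with a precomposed or a combining "å".
precomposed_name = "bl\u00e5haj"
combining_name = "bla\u030ahaj"
names_match = same_name(precomposed_name, combining_name)
```

A pattern match on the raw strings would miss one of the two spellings, which is the failure mode described above.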
Incompetence didn’t go anywhere.
Now that’s certainly true, but the beauty of open source software is that we can fix bugs when we encounter them.
I’m too lazy to memorize alt codes
Use a compose key
Why you torture blahaj?
Why are we still kink shaming?
Blahaj cannot speak, therefore Blahaj cannot give consent.
You don’t necessarily need speech for consent since non-verbal/mute people exist.
It can’t say no either /s
What? That shouldn’t be any basis for consent! Only if someone is able to consent, i.e. emphatically say ‘yes’ (or otherwise agree), should you start thinking about doing anything sexual involving them. If you do anything sexual involving someone who cannot say ‘no’ then that is a sexual violation of them.
This was a satire
Hard to spot and people might take it at face value.
blahaj.exe.tar.gz
blahaj.elf.tar.gz.part
blåhaj.squashfs
I feel like unicode in the filename is heavily against the spirit of using squashfs, or at least the ways I’ve seen it used.
Ok, what kind of monster names their executables .elf?
Well, a.out doesn’t make much sense these days. Gotta move to .elf
Pi Pico SDK does. Well, the version for debugging symbols, anyway. Regular executable is .uf2.
I reserve .elf for executables for other platforms, like microcontroller firmware.
wii homebrew developer maybe?
mv blahaj.elf.tar.gz.part ./rivendell
Speaking of which, it blew my mind when I discovered that .EXEs are just zip files (edit: compressed archives). Same goes for .DLLs, and a lot of other common Windows file extensions as well (.DOC too, for example, IIRC). They all open in your favorite archiver software (I like NanaZip, which is a fork of 7-Zip with a modern UI).
I don’t think that’s true for .exe or .dll files, but it’s definitely true for .docx files and other Office files ending with x. Some .exe’s are self-extracting archives or have other files embedded in them, so maybe that’s what you’ve been seeing.
You are actually correct. They can contain archived files or resources that can be unpacked with an archive program (including on Linux btw), but they aren’t just a zip file. That’s why my Linux archive manager (ark I think) offers to open one, but won’t execute it. It can see the extra content even if it can’t execute the file as intended.
Thanks for the backup :)
Mate I saw the blind leading the blind and had to step in. You could have actually opened some exes on Linux as the other guy suggests. In fact I am surprised you never noticed your system presenting that option. It just isn’t actual proof of what they said, even if it appears like it. In fact I am a bit lost how neither of you realized something weird was going on. On what planet would an executable format being a zip file make any sense? Exes actually can include several executable formats.
There are things like self extracting archives that make this all more confusing. They are basically an archive with an extraction program in the same file. Installer exes work in a similar way too. Not all exes can be extracted since not all of them contain secret hidden archives or extra resources.
There actually are tools to show you the contents of an executable file, and you could probably learn a lot by using one. They contain more than just a blob of machine code like one might assume. Often they contain data as well, and instructions and information on how to load the executable like what memory layout to use.
I am annoyed that people upvoted the other guy without double checking as well. Now we have more people walking around spreading misinformation just because of some guy on Lemmy. This is why things like climate change become contentious issues. People come to their own conclusions based on partial information, and since it appears to make sense without proper investigation it gets spread around like wildfire. It’s only when you actually know what’s going on at a deeper level that it becomes possible to spot the flaws in the reasoning.
Aren’t the x-suffixed files just an xml format?
It’s a zip file that includes a bunch of things, including embedded images and a bunch of other junk, but yes - the most important and central files in the zip are XML-based.
Why don’t you just try it and see for yourself?
Remind me in about 5 hours and I’ll upload a screenshot as proof when I get home.
I’m not on Windows.
Let me know when you have the screenshot!
You could always download a random exe even in Linux, you know. But I’ll handle it. Commuting home now.
Well, I did get my hands on an exe file (some game on Steam) and opened it with Archive Manager. It does show some files, but the file properties say Type: application/x-ms-dos-executable (as opposed to application/zip). So it’s not an actual archive file, the archive manager is just displaying it as such to be helpful.
The “files” I can see are:
/.text
/.reloc
/.rsrc/version.txt
/.rsrc/ICON/2.ico
/.rsrc/ICON/3.ico
/.rsrc/ICON/4.ico
/.rsrc/GROUP_ICON/32512.ico
I tried to create a zip file and rename it to .exe, but Archive Manager failed to open it at all which I found strange. You’d think it would look at the actual file contents to figure out what type of archive it is, and not rely on the extension.
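For what it's worth, checking the contents rather than the extension is easy to sketch in Python. This is not how Archive Manager works internally (no idea what it does); `sniff` is a made-up helper that only looks at the "MZ" bytes at the start of DOS/PE executables and asks the standard `zipfile` module whether the data looks like a zip.

```python
import io
import zipfile

def sniff(data):
    """Report what the bytes look like, ignoring any file name (hypothetical helper)."""
    return {
        "starts_with_MZ": data[:2] == b"MZ",  # DOS/PE executables begin with "MZ"
        "is_zip": zipfile.is_zipfile(io.BytesIO(data)),
    }

# Build a small zip in memory; renaming it to .exe would not change these bytes.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")
zip_bytes = buf.getvalue()

zip_report = sniff(zip_bytes)

# A stand-in for an executable: just the MZ signature followed by padding.
exe_like_report = sniff(b"MZ" + b"\0" * 128)
```

The zip is recognized as a zip whatever it is called, and the MZ stand-in is not. A self-extracting archive would be expected to tick both boxes, which is probably where a lot of the confusion comes from.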
Okay that’s actually slightly different from what I was expecting. Does the .text file contain machine code or assembly language by any chance? It seems the archive program can pull out the executable code as well, similar to the binary analysis tools I have worked with.
.reloc is probably the relocation table used by the OS to load the program into an address space.
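Those "files" look like the section names from the PE section table. A rough sketch of reading them with only the standard library is below. The offsets follow the published PE/COFF layout, but `pe_section_names` and `fake_pe` are made-up names, and `fake_pe` only builds a synthetic header for demonstration, not a runnable executable.

```python
import struct

def pe_section_names(data):
    """Return the section names listed in a PE file's section table."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF header follows the 4-byte signature.
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (optional_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table = pe_offset + 24 + optional_size
    names = []
    for i in range(num_sections):
        # Each section header is 40 bytes; the first 8 hold a NUL-padded name.
        raw = data[table + 40 * i : table + 40 * i + 8]
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names

def fake_pe(section_names):
    """Build a synthetic, header-only PE image for demonstration purposes."""
    dos_header = b"MZ" + b"\0" * 58 + struct.pack("<I", 64)  # PE signature at offset 64
    coff = struct.pack("<HHIIIHH", 0x8664, len(section_names), 0, 0, 0, 0, 0)
    sections = b"".join(
        name.encode("ascii").ljust(8, b"\0") + b"\0" * 32 for name in section_names
    )
    return dos_header + b"PE\0\0" + coff + sections

demo_names = pe_section_names(fake_pe([".text", ".rsrc", ".reloc"]))
```

Running `pe_section_names` on the bytes of a real .exe should list its actual sections, the same kind of names the archive manager displayed.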
Well fair enough. You clearly have more knowledge on the subject than I do.
FWIW, by “zip file”, I meant that the file is a compressed archive. Apologies for implying a specific file format. That wasn’t my intention.
Just because they open in 7-Zip or whatever doesn’t mean they are just a zip file. There are several kinds of archives. EXEs are a special case as well. They aren’t archives at all. Rather they can contain archives or extra content along with being an executable. One reason is self extracting archives. Here an archive is packaged with an extraction program as an exe all in one. The other case is exes that have extra resources like images, videos, graphics textures, etc. Either way it’s an executable plus some extra stuff, not a zip archive. DLLs I am not sure about, but I suspect something similar is happening here.
Next time you should research stuff before posting it on Lemmy. Things are sometimes more complicated than they appear.
docx you are correct about though. Specifically it’s a zip file that contains XML files and resources.
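A quick way to see the docx structure for yourself, sketched in Python. To keep it self-contained, this builds a tiny stand-in archive in memory instead of opening a real document: it is not a complete, valid .docx, and the `[Content_Types].xml` body is just a placeholder. The part name `word/document.xml` and the namespace are the ones real files use; the helper names are made up.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def make_stand_in_docx(text):
    """Build a minimal docx-like zip in memory (stand-in, not a valid document)."""
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f"<w:t>{text}</w:t></w:r></w:p></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")  # placeholder content
        zf.writestr("word/document.xml", document_xml)
    return buf.getvalue()

def docx_text(data):
    """Pull the text runs out of word/document.xml inside the zip."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    return "".join(t.text or "" for t in root.iter(f"{{{W_NS}}}t"))

data = make_stand_in_docx("blåhaj")
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    member_names = zf.namelist()
extracted = docx_text(data)
```

Pointing `docx_text` at the bytes of a real .docx should work the same way, since the main text lives in that same XML part.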
Edit: I actually found an article on self extracting archives, it’s quite an interesting technology to be fair even if it causes confusion: https://en.m.wikipedia.org/wiki/Executable_compression
By “zip file”, I meant a compressed archive. I’m not as nerdy as you guys are so I see now that there is a difference. I appreciate the correction.
That said, you have to admit that it’s still cool that these different file formats are nothing more than archives. Maybe not to you but it blew my mind when I first learned this.
Bruh an exe is not an archive. Some just happen to contain an archive, not all. As me and the other guy discovered some archive utilities can read them, but what they are doing is closer to a binary analysis tool than unpacking an actual archive. It’s not about being nerdy, it’s about getting your facts right.
Man, even when I try to be diplomatic I still get berated.
Should have just said fuck you and called it a day. (kidding)
You’re still trying to weasel out of being wrong. It’s not an archive nor is it compressed. Go read what a Portable Executable is. It’s not about being diplomatic or whatever. Just admit you’re wrong and go and read about how it actually works. You might learn something.
blahaj.zone
free him
You don’t need to tape archive it, it’s one thing
Yeah but you can
I feel so compressed.