If a recording of someone’s very rare voice can be represented as an MP4 or whatever, could monkeys randomly typing out code eventually reproduce its exact timbre, tone, and overall sound?
I don’t get how we can get rocks to think and to transcribe reality as exactly as they do!
Edit: I don’t get how audio can be fossilized/reified into plaintext.
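To make the "audio as plaintext" idea concrete, here is a minimal sketch, assuming uncompressed 16-bit samples. The sample values are made up for illustration and are not from any real recording. It packs a few numbers into bytes, writes those bytes out as hex text, and reads the text back into the same numbers.

```python
import struct

# Hypothetical 16-bit sample values, invented for this illustration.
samples = [0, 420, 421, -32768, 32767]

# Pack each sample as a little-endian signed 16-bit integer (2 bytes each).
raw = struct.pack(f"<{len(samples)}h", *samples)

# The same bytes written out as plain hexadecimal text.
text = raw.hex()

# Anyone (or any monkey) typing exactly this text recreates the same bytes.
restored = list(struct.unpack(f"<{len(samples)}h", bytes.fromhex(text)))

# Number of distinct hex strings of this length a random typist could produce.
possible_texts = 16 ** len(text)

print(text)
print(restored == samples)
```

So in principle random typing could land on the right text, but the count of possible texts grows exponentially with the length, and a real recording is millions of characters long rather than twenty.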
When you talk about a sample, what does that actually mean? I recognize that the frequency of oscillation tells me the pitch of something, but how does that actually translate into a chunk of data that is useful?
You mention a sample being stored as a number, which makes sense, but how is that number used? Again assuming uncompressed audio, if my sample “value” comes up as 420, does that include all of the necessary components of that sound bite for that 1/44,100th of a second? How would a sample with value 421 compare? Is this like an RGB-type situation where you’d have multiple values corresponding to different attributes of the sample (amplitude, frequency, and I’m sure other things)? Is a single sample actually intelligible in isolation?
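One way to poke at these questions is a small sketch, assuming uncompressed mono 16-bit PCM at 44,100 samples per second. The 440 Hz test tone and the helper names `make_tone` and `estimate_pitch` are hypothetical, chosen only for illustration. In that format each sample is a single number, the amplitude at one instant, so 421 is just a very slightly larger displacement than 420. Pitch is not stored in any one sample. It only shows up in how the numbers change across many samples.

```python
import math

SAMPLE_RATE = 44_100  # samples per second, matching the 1/44,100 s in the question
PEAK = 32_767         # largest value of a signed 16-bit sample

def make_tone(freq_hz, seconds):
    """Return a sine tone as a list of 16-bit integer amplitudes."""
    n = int(SAMPLE_RATE * seconds)
    return [
        round(PEAK * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(n)
    ]

def estimate_pitch(samples):
    """Estimate frequency by counting upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * SAMPLE_RATE / len(samples)

samples = make_tone(440.0, 1.0)   # hypothetical 440 Hz test tone, one second long

one_sample = samples[1]           # a lone number: one amplitude, no pitch in it
step = 1 / PEAK                   # going from 420 to 421 moves this fraction of full scale

print(one_sample)
print(step)
print(estimate_pitch(samples))    # comes out close to 440, but needs many samples
```

A single value such as 420 carries no frequency information on its own, while the sequence as a whole recovers a pitch close to the 440 Hz that went in. Stereo files interleave one such number per channel, but there is still no per-sample "frequency" field.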