50 million rendered polygons vs one spicy 4.2MB boi

seaQueue@lemmy.world · 1 year ago

50 million rendered polygons vs one spicy 4.2MB boi

themoonisacheese@sh.itjust.works · 1 year ago

Maybe it’s time we invent JPUs (json processing units) to equalize the playing field.

seaQueue@lemmy.world · 1 year ago

The best I can do is an ML model running on an NPU that parses JSON in subtly wrong and impossible to debug ways

Aceticon@lemmy.world · 1 year ago

Just make it a LJM (Large JSON Model) capable of predicting the next JSON token from the previous JSON tokens and you would have massive savings in file storage and network traffic from not having to store and transmit full JSON documents all in exchange for an “acceptable” error rate.

seaQueue@lemmy.world · 1 year ago

Hardware accelerated JSON Markov chain operations when?

knorke3@lemm.ee · 1 year ago

Did you know? By indiscriminately removing every 3rd letter, you can ethically decrease input size by up to 33%!

Terrasque@infosec.pub · 1 year ago

So you’re saying it’s already feature complete with most json libraries out there?

NigelFrobisher@aussie.zone · edit-2 1 year ago

Latest Nvidia co-processor can perform 60 million curly brace instructions a second.

I Cast Fist · 1 year ago

Finally, something to process “databases” that ditched excel for json!

JackbyDev · 1 year ago

60 million CLOPS? No way!

Trailblazing Braille Taser@lemmy.dbzer0.com · 1 year ago

Until then, we have simdjson https://github.com/simdjson/simdjson

ApeNo1@lemm.ee · 1 year ago

JSON and the Argonaut RISC processors

Randelung@lemmy.world · 1 year ago

Well, do you have dedicated JSON hardware?

<optimized out> :v_trans: :v_bi:@social.lizzy.rs · 1 year ago

@Randelung @seaQueue well, i have dedicated JavaScript hardware (https://developer.arm.com/documentation/dui0801/h/A64-Floating-point-Instructions/FJCVTZS)

Randelung@lemmy.world · 1 year ago

The R in ARM and RISC is a lie.

ChaoticNeutralCzech@feddit.de · edit-2 1 year ago

The website title says “Arm Developer”, not “ARM Developer”, in a clearly non-acronym way so it’s a guide for making prosthetic hardware. Of course you want a cyborg arm to parse JS natively, why else even get one?

Reddfugee42@lemmy.world · 1 year ago

Lie starts with L, dummy

barsoap@lemm.ee · 1 year ago

Nope it’s still a register-register op, that’s very much load-store architecture.

It’s reduced, not minimalist, otherwise every RISC CPU out there would only have one instruction like decrement and branch if nonzero. RISC-V would not have an extension mechanism. The instruction exists because it makes things faster because you don’t have to do manual bit-fiddling over 10 instructions to achieve a thing already-existing ALU logic can do in a single cycle. A thing that isn’t even javascript-specific (or terribly relevant to json), it’s a specific float to int cast with specific rounding and overflow mode. Would it be more palatable to your tastes if the CPU were to do macro-op fusion on 10(!) instructions to get the same result?
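For anyone wondering what that "specific rounding and overflow mode" looks like in software, here is a rough Python sketch of the ECMAScript ToInt32 conversion, which is what the instruction's name (JavaScript convert, round toward zero) suggests it targets. The function name `js_to_int32` is made up for illustration, and this models the language-level semantics only, not the instruction's flag behaviour.

```python
import math


def js_to_int32(x: float) -> int:
    """Sketch of ECMAScript ToInt32: truncate, then wrap modulo 2**32."""
    # NaN and the infinities convert to 0.
    if math.isnan(x) or math.isinf(x):
        return 0
    # Round toward zero (drop the fractional part).
    n = math.trunc(x)
    # Wrap into the range [0, 2**32) instead of saturating.
    n &= 0xFFFFFFFF
    # Reinterpret the top half of that range as negative (two's complement).
    if n >= 0x80000000:
        n -= 0x100000000
    return n


print(js_to_int32(3.7))           # 3
print(js_to_int32(-3.7))          # -3
print(js_to_int32(2.0**32 + 5))   # 5 (wraps instead of clamping)
```

The wrap-around step is the part that costs extra instructions when done by hand, since an ordinary float-to-int conversion typically clamps out-of-range values instead.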

DumbAceDragon@sh.itjust.works · 1 year ago

At this point ARM is a CISC architecture

frezik@midwest.social · edit-2 1 year ago

No, that’s not what RISC is about. There were some early attempts to keep the number of instructions low–originally, ARM didn’t have a multiply instruction, and there’s still a bunch of microcontrollers you can buy that don’t have a divide instruction–but that was quickly abandoned as it’s just not that useful. It only holds back instructions that optimize common cases. Your compiler can implement multiplication by doing addition in a loop, but that’s not very efficient.
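To make the "addition in a loop" point concrete, here is a small Python sketch. The first function is the naive loop described above; the second is shift-and-add, which is my own addition for comparison and not something claimed in this thread. Both function names are hypothetical and both assume a non-negative multiplier.

```python
def mul_repeated_add(a: int, b: int) -> int:
    """Multiply by adding `a` to itself `b` times (b must be >= 0)."""
    result = 0
    for _ in range(b):      # b iterations: cost grows with the value of b
        result += a
    return result


def mul_shift_add(a: int, b: int) -> int:
    """Multiply using shifts and adds, one step per bit of b (b >= 0)."""
    result = 0
    while b:
        if b & 1:           # lowest bit set: add the current shifted a
            result += a
        a <<= 1             # a doubles each round
        b >>= 1             # move on to the next bit of b
    return result


print(mul_repeated_add(6, 7))   # 42
print(mul_shift_add(6, 7))      # 42
```

The loop version takes as many additions as the value of the multiplier, which is why a hardware multiply (or at least a smarter software routine) is worth having.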

What really worked about it was keeping a separation between how memory is accessed. You don’t have an ADD instruction that can fetch from both registers or main memory. You have a MOV instruction that can fetch from memory into a register, and you have an ADD instruction that can work on registers.

ARM still does this just fine.

DumbAceDragon@sh.itjust.works · edit-2 1 year ago

I’m a computer engineering major (still a student tbf), I’m well aware of the difference between CISC and RISC, I was making a joke.

Also, I understand your point, but you should know though that a load-store architecture and a RISC instruction set are not the same thing. The vast majority of RISC ISAs are load-store, but not all load-store architectures are RISC.

frezik@midwest.social · 1 year ago

http://www.quadibloc.com/arch/sriscint.htm

The RISC architecture contains several common elements. Some of them are no longer present in most chips that still call themselves RISC:

All instructions execute in a single cycle.

Floating-point operations, specifically, are therefore excluded.

But most of the defining characteristics of RISC do remain in force:

All instructions occupy the same amount of space in memory.

Only load, store, and jump instructions directly address memory. Calculations are performed only between operands in registers.

https://groups.google.com/g/comp.arch/c/IZP5KUJprHw?pli=1

MOST RISCs:
3a) Have 1 size of instruction in an instruction stream
3b) And that size is 4 bytes
3c) Have a handful (1-4) addressing modes) (* it is VERY hard to count these things; will discuss later).
3d) Have NO indirect addressing in any form (i.e., where you need one memory access to get the address of another operand in memory)
4a) Have NO operations that combine load/store with arithmetic, i.e., like add from memory, or add to memory. (note: this means especially avoiding operations that use the value of a load as input to an ALU operation, especially when that operation can cause an exception. Loads/stores with address modification can often be OK as they don’t have some of the bad effects)
4b) Have no more than 1 memory-addressed operand per instruction
5a) Do NOT support arbitrary alignment of data for loads/stores
5b) Use an MMU for a data address no more than once per instruction
6a) Have >=5 bits per integer register specifier
6b) Have >= 4 bits per FP register specifier

Note that none of this has to do with reducing the number of instructions, which is what people tend to think of when they hear the name.

barsoap@lemm.ee · 1 year ago

All instructions occupy the same amount of space in memory.

Both ARM and RISC-V have compressed instructions. Dunno how ARM works but with RISC-V the 16-bit instruction set is freely interspersable with the 32 bit one, which also get their alignment reduced to 16 bits. Gets like 95% of the space reduction possible with full variable-width instructions without overcomplicating the insn decoder.

As to addressing and loads and arithmetic: No such instructions, but every CPU but the tiniest ones are expected to do macro-op fusion for things like indexed loads. Here’s an overview.

The MMU thing… well the vector extension can do gather/scatter, I guess it could stay within the letter of “use the MMU once” but definitely not the spirit.

areyouevenreal@lemm.ee · 1 year ago

Someone confusing load-store with RISC again.

WanderingCat@lemm.ee · 1 year ago

You don’t?

frezik@midwest.social · 1 year ago

There were XML DOM accelerators for a while. Might still be out there.

XPost3000@lemmy.ml · 1 year ago

Everybody gangsta till we invent hardware accelerated JSON parsing

Overtheveloper@lemmy.world · 1 year ago

https://ieeexplore.ieee.org/document/9912040 “Hardware Accelerator for JSON Parsing, Querying and Schema Validation” “we can parse and query JSON data at 106 Gbps”

ByteJunk@lemmy.world · 1 year ago

I’m so impressed that this is a thing

enleeten@discuss.online · 1 year ago

Coming soon, JSPU

vvvvv@lemmy.world · 1 year ago

106 Gbps

They get to this result on 0.6 MB of data (paper, page 5)

They even say:

Moreover, there is no need to evaluate our design with datasets larger than the ones we have used; we achieve steady state performance with our datasets

This requires an explanation. I do see the need - if you promise 100Gbps you need to process at least a few Tbs.

neatchee@lemmy.world · 1 year ago

Imagine you have a car powered by a nuclear reactor with enough fuel to last 100 years and a stable output of energy. Then you put it on a 5 mile road that is comprised of the same 250 small segments in various configurations, but you know for a fact that it starts and ends at the same elevation. You also know that this car gains exactly as much performance going downhill as it loses going uphill.

You set the car driving and determine that it takes 15 minutes to travel 5 miles. You reconfigure the road, same rules, and do it again. Same result, 15 minutes. You do this again and again and again and always get 15 minutes.

Do you need to test the car on a 20 mile road of the same configuration to know that it goes 20mph?

JSON is a text-based, uncompressed format. It has very strict rules and a limited number of data types and structures. Further, it cannot contain computational logic on its own. The contents can be interpreted after being read to extract logic, but the JSON itself cannot change its own computational complexity. As such, it’s simple to express every possible form and complexity a JSON object can take within just 0.6 MB of data. And once they know they can process that file in however-the-fuck-many microseconds, they can extrapolate to Gbps from there
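The extrapolation step is just arithmetic. A back-of-the-envelope Python sketch, assuming decimal units (1 MB = 10^6 bytes, 1 Gbps = 10^9 bits per second), which the quoted figures don't spell out; the helper names are made up:

```python
def seconds_to_process(size_bytes: float, rate_bits_per_s: float) -> float:
    """Time needed to push size_bytes through at a steady bit rate."""
    return size_bytes * 8 / rate_bits_per_s


def throughput_gbps(size_bytes: float, seconds: float) -> float:
    """Steady-state throughput implied by one timed run."""
    return size_bytes * 8 / seconds / 1e9


# Figures quoted in this thread: a 0.6 MB dataset and a 106 Gbps claim.
t = seconds_to_process(0.6e6, 106e9)
print(f"{t * 1e6:.1f} microseconds per pass")    # roughly 45 microseconds
print(f"{throughput_gbps(0.6e6, t):.0f} Gbps")   # back to 106
```

So the headline number rests on runs lasting a few tens of microseconds each, which is the crux of the disagreement here: whether steady-state behaviour at that scale can be trusted at larger scales.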

trolololol@lemmy.world · 1 year ago

That’s why Le Mans exists, to show that 100m races with muscle cars are a farce

vvvvv@lemmy.world · 1 year ago

Based on your analogy they drive the car for 7.5 inches (614.4 Kb by 63360 inches by 20 divided by 103179878.4 Kb) and promise based on that that car travels 20mph which might be true, yes, but the scale disproportion is too considerable to not require tests. This is not maths, this is a real physical device - how it would behave on larger real data remains to be seen.

neatchee@lemmy.world · edit-2 1 year ago

Except we know what the lifecycle of physical storage is, its rate of performance decay (virtually none for solid state until failure), and that the computers performing the operations have consistent performance for the same operations over time. And again, while for a car such a small amount can’t be reasonably extrapolated, for a computer processing an extremely simple format like JSON, when it is designed to handle FAR more difficult tasks on the GPU involving billions of floating point operations, it is absolutely, without a doubt enough.

You don’t have to believe me if you don’t want but I’m very confident in my understanding of JSON’s complexity relative to typical GPU workloads, computational analysis, computer hardware durability lifecycles, and software testing principles and best practices. 🤷

trolololol@lemmy.world · 1 year ago

But to write such a file you need a few quantum computers map reducing the data in alternative universes

UnfortunateShort@lemmy.world · edit-2 1 year ago

There is acceleration for text processing in AVX iirc

uis@lemm.ee · 1 year ago

No. Verilogjson.

nickwitha_k (he/him)@lemmy.sdf.org · 1 year ago

Personally, now that I have a machine capable of running the toolchains, I want to explore hardware accelerated compilation. Not all steps can be done in parallel but I bet a lot before linking can.

2deck@lemmy.world · 1 year ago

Render the json as polygons?

Dasnap@lemmy.world · 1 year ago

It’s time someone wrote a JSON shader.

ApeNo1@lemm.ee · 1 year ago

Ray TraSON

Anarki_@lemmy.blahaj.zone · 1 year ago

Rayson

perishthethought@lemm.ee · 1 year ago

I just added this to my linked in profile. Thanks!

BlessedDog@lemmy.world · 1 year ago

GLTF

Lumisal@lemmy.world · 1 year ago

That just results in an image of JSON Bourne.

I Cast Fist · 1 year ago

JSON Sphere

refalo · 1 year ago

yea we need multithreaded json parsers

HauntedCupcake@lemmy.world · 1 year ago

CUDA accelerated JSON parser is sorely needed

AdrianTheFrog@lemmy.world · 1 year ago

Doesn’t a 3070 have less than 7k cores? A UHD 750 (relatively recent iGPU) only has 256.

And I don’t know the structure of JSON that well, but can’t tokens be made of multiple chars?
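On the multi-character question: yes, strings, numbers and the literals true/false/null all span several characters, so a parser can't treat each character as an independent token. A minimal Python tokenizer sketch to show it (regex-based, which fits this thread; the names `TOKEN_RE` and `tokenize` are made up, and it doesn't validate escape sequences or nesting):

```python
import re

# One alternative per token kind; whitespace is matched and then dropped.
TOKEN_RE = re.compile(r'''
    (?P<punct>[{}\[\]:,])
  | (?P<string>"(?:[^"\\]|\\.)*")
  | (?P<number>-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?)
  | (?P<literal>true|false|null)
  | (?P<ws>\s+)
''', re.VERBOSE)


def tokenize(text):
    """Split JSON text into (kind, lexeme) pairs, skipping whitespace."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if m.lastgroup != "ws":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


for kind, lexeme in tokenize('{"a": -1.5e3, "b": [true, null]}'):
    print(kind, lexeme)
```

Where a token ends depends on the characters before it (for example, whether you are inside a string), which is the awkward part for one-thread-per-character approaches.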

Xyloph@lemmy.ca · 1 year ago

That is sometimes the issue when your code editor is a disguised web browser 😅

icesentry@lemmy.ca · 1 year ago

No, if you’re struggling to load 4.2 mb of text the issue is not electron.

voxel@sopuli.xyz · edit-2 1 year ago

there are simd accelerated json decoders

manmachine@lemmy.world · 1 year ago

every day we stray further from god

xmunk@sh.itjust.works · 1 year ago

Don’t worry, they still make extensive use of regexes.

ByteJunk@lemmy.world · 1 year ago

Can you at least wait for me to die before taking me to hell, Satan?

xmunk@sh.itjust.works · 1 year ago

Satan? There is no Satan here…

ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ

dan@upvote.au · 1 year ago

I didn’t think any JSON parsers used regex given how simple the grammar is… but I’ve seen some horrors, so I shouldn’t rule it out.

nickwitha_k (he/him)@lemmy.sdf.org · 1 year ago

That’s actually ok with me.

model_tar_gz@lemmy.world · 1 year ago

Would you rather have 100,000 kg of tasty supreme pizza, or 200 kg of steaming manure?

Choose wisely.

PrettyFlyForAFatGuy@feddit.uk · 1 year ago

200kg of steaming manure would be pretty sweet if you had a vegetable garden

ByteJunk@lemmy.world · 1 year ago

Not sure if I’m just missing a reference here, but if you choose the pizza you can have both.

PrettyFlyForAFatGuy@feddit.uk · 1 year ago

Yeah but then that would be human manure which is a little (although not much) more dangerous to use on stuff you’re going to eat

MehBlah@lemmy.world · 1 year ago

Not a day or two from harvest.

PrettyFlyForAFatGuy@feddit.uk · 1 year ago

you gotta rot it down a bit anyway. stick it in the corner of the garden and leave it until the following year.

Otherwise you’ll get weeds all up in your veggies

AeonFelis@lemmy.world · 1 year ago

Not sure I’d choose to use the word “sweet” here…

bloubz@lemmygrad.ml · 1 year ago

The pizza can be used to feed some people but you really have to go fast and find hungry people

Manure can be sold easily

Buttons · 1 year ago

Careful, the 100,000 kg of pizza will turn into manure.

model_tar_gz@lemmy.world · 1 year ago

I figure I can probably convert about 10 kg into manure before it autoconverts into compost. Which is maybe even a worse problem.

lustyargonian@lemm.ee · 1 year ago

CPU vs GPU tasks I suppose.

Potatos_are_not_friends@lemmy.world · 1 year ago

GPU, render my 4.2 MB json file!

pipe01 · 1 year ago

I’m afraid I can’t do that, Dave

jballs@sh.itjust.works · 1 year ago

I have the same problem with XML too. Notepad++ has a plugin that can format a 50MB XML file in a few seconds. But my current client won’t allow plugins to be installed. So I have to use VS Code, which chokes on anything bigger than what I could do myself manually if I was determined.

seaQueue@lemmy.world · 1 year ago

Time to train an LLM to format XML and hope for the best

PsychedSy@lemmy.dbzer0.com · 1 year ago

Do we need a “don’t parse xml with LLM” copypasta?

QuazarOmega@lemy.lol · 1 year ago

L arge  
L regex  
M odel

PsychedSy@lemmy.dbzer0.com · 1 year ago

I don’t wish death on anyone.

BluesF@lemmy.world · 1 year ago

Wait, it’s all regex?

Always has been

expr · 1 year ago

Meanwhile, I can open a 1GB file in (stock) vim without any trouble at all.

Formatting is what xmllint is for.

Caveman@lemmy.world · 1 year ago

I use vim macros. You can do some crazy formatting with it

Johanno@feddit.de · 1 year ago

Just install python and format it. Then
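A minimal sketch of what that could look like, using only the standard library's `xml.dom.minidom`. The function name `pretty_xml` is made up, and it assumes the whole document fits comfortably in memory, since minidom builds a full DOM:

```python
from xml.dom import minidom


def pretty_xml(text: str, indent: str = "  ") -> str:
    """Re-indent an XML document given as a string."""
    # Parse the whole document, then serialize it with one element per line.
    return minidom.parseString(text).toprettyxml(indent=indent)


print(pretty_xml("<a><b>1</b><c/></a>"))
```

One caveat: if the input is already indented, the existing whitespace is kept as text nodes and the output picks up extra blank lines, so this works best on the single-line files that editors choke on.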

bloubz@lemmygrad.ml · 1 year ago

You don’t need to open a file in a text editor to format it

MacN'Cheezus@lemmy.today · 1 year ago

Someone just needs to make a GPU-accelerated JSON decoder

fmstrat@lemmy.nowsci.com · 1 year ago

Works fine in vim

bjornsno@lemm.ee · 1 year ago

Except if it’s a single line file, only god can help you then. (Or running prettier -w on it before opening it or whatever.)

xavier666@lemm.ee · 1 year ago

cat file.json | jq also works

expr · edit-2 1 year ago

https://porkmail.org/era/unix/award#cat

jq < file.json

cat is for concatenating multiple files, not redirecting single files.

jimbolauski@lemm.ee · 1 year ago

Render Media works the best

rm file.json

xavier666@lemm.ee · 1 year ago

Yes, Render Media is the best. It’s hard to believe that not many people know about this tool. It’s also natively installed in all Linux distros.

fl42v@lemmy.ml · 1 year ago

4.2 megs on one line? Vim probably can handle it fine, although syntax won’t be highlighted past a certain point

bjornsno@lemm.ee · 1 year ago

I’ve accidentally opened enormous single line json files more than once. Could be lsp config or treesitter or any number of things but trying to do any operations after opening such a file is not a good time.

fl42v@lemmy.ml · 1 year ago

Yeah, very well may be. LSPs always slow down opening big files, so I usually inspect those with an empty/different config

BaldProphet@kbin.social · 1 year ago

Technically every JSON file is a single line, with line break characters here and there

expr · 1 year ago

:syntax off and it works just fine.

Andrew@mander.xyz · 1 year ago

Reject MB, embrace MiB.

HopFlop@discuss.tchncs.de · 1 year ago

Reject MiB, call it “MB” like it originally was.

Andrew@mander.xyz · edit-2 1 year ago

If you’re not aware, it was called MB because of JEDEC when IEC units weren’t invented. IEC units were introduced because they remove the double meaning of JEDEC units — decimal and binary. IEC units only carry the binary meaning, hence why they’re superior. If you convert 1000 kB to 1 MB then use MB, but in case of 1024 KiB to 1 MiB you should be using MiB. It’s all about getting the point across, and JEDEC units aren’t good at it.

HopFlop@discuss.tchncs.de · 1 year ago

I’m failing to understand why we would need decimal units at all. What’s the point of them? And why do the original units have to change name to something as ridiculous as “Gibibyte” while the unnecessary decimal units get the binary’s old name?

Andrew@mander.xyz · edit-2 1 year ago

You poor innocent soul… I can try to explain why decimal is even mentioned, but it would probably take a lot of time, and I’m not sure if I will be able to clear things up.

I can at least say this: 2 TB HDD drive is indeed 2*10^12 B, but suddenly shindow$ in its File Explorer will show you that in fact the drive is only 1.82 TB. But WHY? Everyone asks, feeling scammed. Because HDD spec uses decimal units (SI; MB) and Window$ uses binary units (JEDEC; MB), i.e., 1.82 TiB (IEC; MiB). And macOS also uses JEDEC units, AFAIK.
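The 2 TB versus 1.82 "TB" gap is easy to reproduce. A quick Python sketch (helper names are made up for illustration):

```python
def to_decimal_tb(n_bytes: int) -> float:
    """Bytes to terabytes, SI meaning: 1 TB = 1000**4 bytes."""
    return n_bytes / 1000**4


def to_binary_tib(n_bytes: int) -> float:
    """Bytes to tebibytes, IEC meaning: 1 TiB = 1024**4 bytes."""
    return n_bytes / 1024**4


drive = 2 * 10**12  # the "2 TB" printed on the box
print(f"{to_decimal_tb(drive):.2f} TB")    # 2.00 TB
print(f"{to_binary_tib(drive):.2f} TiB")   # 1.82 TiB
```

Same number of bytes, two divisors: the drive hasn't lost anything, the label on the second figure is just ambiguous when it is written as "TB".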

More and more FOSS software uses IEC units, and KDE Plasma is a good example: its file manager, package manager etc. use IEC units. Simply put, JEDEC added the binary meaning to decimal units, so at first MB (and now) only carried decimal meaning (until JEDEC shit out their standard). And the only reason why “gibibyte” is ridiculous, is because we all grew up with JEDEC interpretation of SI units. So it will take many generations for everyone to adopt xxbibyte words into daily conversations. I’m (already) doing my part. It’s just the legacy that we have to deal with.

All international bodies (BIPM, NIST, EU) agree that the SI prefixes “refer strictly to powers of 10” and that the binary definitions “should not be used” for them.

https://en.wikipedia.org/wiki/Binary_prefix#IEC_1999_Standard

https://en.wikipedia.org/wiki/Binary_prefix#Other_standards_bodies_and_organizations

https://en.wikipedia.org/wiki/JEDEC_memory_standards#JEDEC_Standard_100B.01

HopFlop@discuss.tchncs.de · 1 year ago

Well, thank you for taking the time to write this detailed explanation!

Windows and MacOS use the abbreviation “MB” referring to the binary units, correct? How come that these big OS’s use another unit than these large international bodies recognize?

On a side note, I’ve always found it weird why HDDs or SSDs are/were sold with 128GB, 256GB, 512GB etc. when they are referring to decimal units.

Andrew@mander.xyz · 1 year ago

Windows and MacOS use the abbreviation “MB” referring to the binary units, correct?

Yez. I’m only sure about the first one, but didn’t test myself whether the macOS is using power of 2 or 10 under the hood (of MB). You can open properties of something big and try converting raw number of bytes with /1024^n and /1000^n and compare the end results.
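That check can be scripted. A small Python sketch, where `guess_prefix_base` and the example numbers are hypothetical: give it the raw byte count, the value the OS displays, and the prefix power n (1 for K, 2 for M, 3 for G), and it reports which divisor lands closer.

```python
def guess_prefix_base(raw_bytes: int, shown_value: float, power: int) -> str:
    """Return 'decimal' or 'binary' depending on which conversion of
    raw_bytes is closer to the value the OS displays."""
    decimal = raw_bytes / 1000**power   # SI: kB, MB, GB
    binary = raw_bytes / 1024**power    # IEC: KiB, MiB, GiB
    if abs(decimal - shown_value) < abs(binary - shown_value):
        return "decimal"
    return "binary"


# Hypothetical file of 5,000,000,000 bytes that a file manager shows as "4.66 GB".
print(guess_prefix_base(5_000_000_000, 4.66, 3))   # binary
print(guess_prefix_base(5_000_000_000, 5.00, 3))   # decimal
```

The bigger the file, the further apart the two results drift, so a multi-gigabyte file makes the answer unambiguous even with rounded display values.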

How come that these big OS’s use another unit than these large international bodies recognize?

Legacy, legacy everywhere (IMO). And of course they don’t want to confuse their precious users that don’t know any better. And this also would break some scripts that rely on that specific output. GNU C library also uses JEDEC units by default, hence flatpak and other software.

On a side note, I’ve always found it weird why HDDs or SSDs are/were sold with 128GB, 256GB, 512GB etc. when they are referring to decimal units.

It is weird for everyone, because we mainly only count with multiples of 2 when it comes to digital size of information. I didn’t investigate why they use power of 10, but I’ve seen that some other hardware also uses decimal units (I think at least in RAM, but JEDEC is used intentionally or not for CPU cache memory). I had a link where the RAM thingy is lightly addressed, but I couldn’t find it.

P.S. it’s “OSes” and “macOS” BTW.

tomi000@lemmy.world · 1 year ago

Maybe people would listen to you if you weren’t such a prick

Andrew@mander.xyz · 1 year ago

Ok, show me what I did wrong and what should I do instead to not be a prick, please.

tomi000@lemmy.world · 1 year ago

Don’t start a comment with ‘you poor innocent soul’

georgette@lemmy.world · 1 year ago

You’ve got them confused, MiB is the one misusing metric

TechieDamien@lemmy.ml · 1 year ago

It isn’t misusing metric, it just simply isn’t metric at all.

Ironfacebuster@lemmy.world · 1 year ago

Rockstar making GTA online be like: “Computer, here is a 512mb json file please download it from the server and then do nothing with it”

veni_vedi_veni@lemmy.world · 1 year ago

Let it be known that heat death is not the last event in the universe