I should have guessed.
Thank you 🤣
Bash scripts are rarely the best choice for large, complicated programs or for software that requires complex data structures. (Git is very much in both categories.)
In bash there are so many ways to accidentally shoot yourself in the foot that it’s absurd. That can lead to bizarre-seeming behavior, which may break your script or even lead to a security vulnerability.
There are things that make it a bit more manageable, like “[[ ]]” instead of “[ ]”, but a frustrating number of such things are bash-specific, and this is written for the subset that is POSIX shell, meaning you don’t even get those token niceties.
Where you generally “want” to use POSIX sh is for relatively simple scripts where the same file needs to run on Linux with bash, Linux with only BusyBox sh, and macOS with zsh (plus an outdated version of bash which will never be updated because Apple refuses to ship any GPLv3 software in macOS).
This is not that, and so one would expect that:
The developer of this git implementation has poor / foolish judgement.
Shit will be buggy
Shit will be insecure
Shit will be a PITA to try to troubleshoot or fix
And shit will be slow, because bash is slow and this isn’t a job where you can just hand off all of the heavy lifting to grep / sed / awk*, because the data structures don’t lend themselves to that.
* You could write the entire program in awk, and maybe even end up with something almost as fast as a Python implementation done in ⅒ the time, but that would be terrible in other ways.
Either way, this is a rule that you as a human are required to follow, and if you fail the compiler is allowed to do anything, including killing your cat.
It’s not a rule that the compiler enforces by failing to build code with undefined behavior.
That is a fundamental, and extremely important, difference between C and Rust.
Also, C compilers do make optimization decisions by assuming that you as a human programmer have followed these strict aliasing rules.
https://gist.github.com/shafik/848ae25ee209f698763cffee272a58f8
Has a few examples where code runs “properly” without optimizations but “improperly” with optimizations.
I put “improperly” in quotes because the C spec says that a compiler can do whatever it wants if you as a human invoke undefined behavior. Safe Rust does not have undefined behavior, because if you write code which would invoke UB, rustc will refuse to build it.
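To make the contrast concrete: a classic strict aliasing violation in C is reading a float through an integer pointer, something like `*(uint32_t *)&f`. Here is a minimal sketch of the same bit-level reinterpretation in safe Rust, where no pointer cast is involved at all. The helper names `float_bits` and `float_from_bits` are made up for this example; `f32::to_bits` and `f32::from_bits` are the real standard library methods they wrap.

```rust
/// Returns the raw IEEE 754 bit pattern of a 32-bit float.
/// (Hypothetical helper; it only wraps the std method `f32::to_bits`.)
fn float_bits(x: f32) -> u32 {
    x.to_bits()
}

/// Rebuilds a float from a raw bit pattern.
/// (Hypothetical helper; it only wraps the std function `f32::from_bits`.)
fn float_from_bits(bits: u32) -> f32 {
    f32::from_bits(bits)
}

fn main() {
    let bits = float_bits(1.0);
    // Prints the bit pattern of 1.0 in hexadecimal.
    println!("1.0f32 = {:#010x}", bits);
    // Converting back gives the original value.
    assert_eq!(float_from_bits(bits), 1.0);
}
```

There is no optimization level at which this behaves differently, because nothing here depends on two pointers of different types referring to the same memory.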
It has already been translated into Rust. Python was never intended to be used in the “real” driver, but I thought it was a fun anecdote nonetheless.
To put it another way:
Strict aliasing is an invariant that C compilers assume you as a developer will not violate. They use that assumption to make optimization choices, and if you as the developer have failed to follow the strict aliasing rules, the result is undefined behavior. So it’s an invariant that the compiler expects, but doesn’t enforce at compile time.
I guess it is possible to just disable all such optimizations (GCC and Clang have -fno-strict-aliasing for this) to get a C compiler that doesn’t create UB just because strict aliasing rules were broken, but there are still many ways that you can trigger UB in C, while safe Rust that compiles successfully theoretically has no UB at all.
By refusing to compile any code that has undefined behavior. This is what Rust’s compiler does, and is simply not possible for a C compiler to do.
I am myself a crustacean, and we crabs know that lack of memory leaks is not one of the guarantees of safe Rust.
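For anyone who wants to see it, here is a small sketch of a leak in entirely safe Rust, using a reference cycle between two `Rc` values. The `Node` type, the `DROPPED` counter and the function names are invented for illustration; the counter only exists so that the leak is observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many `Node` values have actually been dropped.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

struct Node {
    // Optional strong reference to another node; this is what allows a cycle.
    next: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Builds two nodes that hold strong references to each other, then lets
/// both local handles go out of scope. Each node still has a strong count
/// of 1 afterwards, so neither destructor ever runs.
fn leak_a_cycle() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));
}

/// Number of nodes dropped so far.
fn dropped_nodes() -> usize {
    DROPPED.load(Ordering::SeqCst)
}

fn main() {
    leak_a_cycle();
    println!("nodes dropped: {}", dropped_nodes());
}
```

No `unsafe` anywhere, and the two allocations are never freed. `Box::leak` and `std::mem::forget` are also safe functions, for the same reason: leaking is considered memory safe.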
Yes: https://github.com/avr-rust
When you’re writing code involving global state and interrupts, and any access to an integer larger than a u8 needs to be surrounded by cli() and sei() just for guaranteed atomicity, then you will truly come to value Rust’s statically enforced thread / memory safety.
For years I wrote embedded C for 8-bit microcontrollers used in industrial controls.
Never again.
Rust is by far a better language for embedded. The only times I would consider it reasonable to write embedded code in C are if you’re doing it for fun, or if you depend on an existing and well tested / audited codebase or library and your application logic is less complicated than the Rust-to-C FFI would be.
Even then, you won’t find me contributing to that effort.
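The point above about multi-byte globals shared with interrupt handlers can be sketched on a desktop target. This is only an analogy under stated assumptions: ordinary threads stand in for interrupt handlers, and `std::sync::Mutex` stands in for whatever critical-section wrapper you would use on a real AVR (where `std` is not available). The names `TICKS` and `simulate` are made up for the example.

```rust
use std::sync::Mutex;
use std::thread;

// Shared global counter. A plain `static mut TICKS: u32` could only be
// touched inside `unsafe`; wrapping it in a Mutex makes every access
// go through a lock, and the compiler enforces that.
static TICKS: Mutex<u32> = Mutex::new(0);

/// Spawns `handlers` threads (stand-ins for interrupt handlers) that each
/// increment the shared counter `ticks_each` times, then returns the total.
fn simulate(handlers: u32, ticks_each: u32) -> u32 {
    *TICKS.lock().unwrap() = 0;

    let workers: Vec<_> = (0..handlers)
        .map(|_| {
            thread::spawn(move || {
                for _ in 0..ticks_each {
                    // The lock plays the role that cli()/sei() plays in C,
                    // except that forgetting it is a compile error.
                    *TICKS.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for worker in workers {
        worker.join().unwrap();
    }

    let total = *TICKS.lock().unwrap();
    total
}

fn main() {
    println!("total ticks: {}", simulate(4, 1000));
}
```

In C the equivalent bug (touching the counter without disabling interrupts) compiles cleanly and fails once in a blue moon. Here there is no way to reach the integer without going through the lock.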
Unless I’m missing something, the most recent example there is from 2002 which, to my own horror, was more than 20 years ago.
… Which compilers don’t consistently enforce, much like most undefined behavior in C.
Leaks aren’t usually security critical though, and I’ve never heard of sudo triggering the OOM killer.
Also, no general purpose language that I’m aware of can guarantee a lack of memory leaks.
Fun fact!
The Asahi Linux drivers for the Apple M1 GPU were originally written in Python: https://asahilinux.org/2022/11/tales-of-the-m1-gpu/
GPU drivers in Python?!
Since getting all these structures right is critical for the GPU to work and the firmware to not crash, I needed a way of quickly experimenting with them while I reverse engineered things. Thankfully, the Asahi Linux project already has a tool for this: The m1n1 Python framework! Since I was already writing a GPU tracer for the m1n1 hypervisor and filling out structure definitions in Python, I decided to just flip it on its head and start writing a Python GPU kernel driver, using the same structure definitions. Python is great for this, since it is very easy to iterate with! Even better, it can already talk the basic RTKit protocols and parse crash logs, and I improved the tools for that so I could see exactly what the firmware was doing when it crashes. This is all done by running scripts on a development machine which connects to the M1 machine via USB, so you can easily reboot it every time you want to test something and the test cycle is very fast!
Maybe don’t try to make programming language jokes out of the concept of systematic rape?
One key problem with forced arbitration clauses is that the company chooses and pays the “neutral” arbitrator, who is inevitably biased against the consumer.