• 0 Posts
  • 44 Comments
Joined 2 years ago
Cake day: June 13th, 2023

  • So maybe we’re kinda staring at two sides of the same coin, because yeah, you’re not misrepresenting my point.

    But wait, there’s a deeper point I’ve been trying to make.

    You’re right that I am also saying it’s all bullshit - even when it’s “right”. And the fact that we’d consider artificially generated, completely made-up text libellous indicates to me that we (as a larger society) have failed to understand how these tools work. If anyone takes what they say to be factual, they are mistaken.

    If our feelings are hurt because a “make shit up machine” makes shit up… well we’re holding the phone wrong.

    My point is that we’ve been led to believe they are something more concrete, more exact, more stable, much more factual than they are - and that is worth challenging and holding these companies to account for. I hope cases like these are a forcing function for that.

    That’s it. Hopefully my PoV is clearer (not saying it’s right).


  • Ok hear me out: the output is all made up. In that context everything is acceptable as it’s just a reflection of the whole of the inputs.

    Again, I think this stems from a misunderstanding of these systems. They’re not like a search engine (though, again, the companies would like you to believe that).

    We can find the output offensive, off-putting, gross, etc., but there is no real right and wrong with LLMs the way they are now. There is only statistical probability that a) we’ll understand the output and b) it approximates some currently held truth.

    Put another way: LLMs convincingly imitate language - and therefore also convincingly imitate facts. But it’s all facsimile.



  • Surely you jest, because it’s so clearly not if you understand how LLMs work (at the core it’s a statistical model - and therefore everything it produces is an approximation, to varying degrees).

    But something great can come out of this case if it gets far enough.

    Imagine the ilk of OpenAI, Google, Anthropic, xAI, etc. being forced to admit that an LLM can’t actually do anything but generate approximations of language. That these models (again, LLMs in particular) produce approximations of language that are so good they’re often indistinguishable from the versions our brains approximate.

    But at the core they cannot produce facts, because the way they are made includes artificially injected randomness layered on top of mathematically encoded values that merely get expressed as tiny pieces of language (tokens) - ones that happen to be close to each other in a massively multidimensional vector space.
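
    To make that concrete, here’s a minimal sketch of the sampling step - the tokens and scores are made-up toy values, not from any real model, but temperature sampling of this general shape is the standard mechanism:

    ```python
    import numpy as np

    # Toy example: four made-up candidate tokens with made-up scores.
    # A real model scores tens of thousands of tokens at every step.
    tokens = ["cat", "dog", "banana", "the"]
    logits = np.array([4.1, 3.9, 1.2, 0.3])

    def sample_next(logits, temperature=0.8):
        # Temperature rescales the scores; higher = more random output.
        scaled = logits / temperature
        # Softmax turns scores into probabilities (subtract max for stability).
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        # The next token is drawn at random from that distribution.
        # This is the injected randomness: same prompt, different output.
        return np.random.choice(len(tokens), p=probs)

    print(tokens[sample_next(logits)])  # usually "cat" or "dog", but not always
    ```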

    TLDR - they’d be forced to admit the emperor has no clothes and that’s a win for everyone (except maybe this one guy).

    Also it’s worth noting I use LLMs for work almost daily and have studied them quite a bit. I’m not a hater on the tech. Only the capitalists trying to force it down everyone’s throat in such a way that we blindly adopt it for everything.





  • I see Jellyfin suggested as an alternative to Plex here. I hope it is one day.

    At the moment it’s nowhere close.

    I’ve been running Jellyfin side-by-side with Plex for two years and it’s still not a viable replacement for anyone but me. For my parents and my partner, none of the possible solutions comes anywhere close to the usability of Plex and its ecosystem of apps for various devices.

    That will likely change, because Plex is getting worse every day while folks can contribute their own solutions to Jellyfin’s playback issues. With Plex it’s more noise, more useless features. So one gets better (Jellyfin) and one gets worse (Plex).

    But at the moment it really isn’t close for most folks who are familiar with the slickness of commercial apps.

    Even from the administrative side, Jellyfin takes massively more system resources and it doesn’t reliably work with all my files.

    Again, Jellyfin will get there; it’s just not a drop-in replacement for most folks yet.

    And for context, I started my DIY streaming / hosting life with a first-gen Apple TV (pretty much a Mac mini with component video outs) that eventually got XBMC and then Boxee installed on it. I even have the forksaken Boxee Box.



  • We use NGINX’s 444 on every LLM crawler we see.

    Caddy has a similar “close connection” option called “abort” as part of the static response.

    HAProxy has the “silent-drop” option which also closes the TCP connection silently.

    I’ve found crawling attempts end more quickly using this option - especially attacks - but my sample size is relatively small.

    Edit: we do this because too often we’ve seen them ignore robots.txt. They believe all data is theirs. I do not.
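
    For anyone wanting to do the same, here are minimal sketches of all three approaches (the user-agent patterns are illustrative, not our actual list):

    ```
    # nginx (http context) - 444 closes the connection with no response.
    map $http_user_agent $llm_crawler {
        default 0;
        "~*(GPTBot|ClaudeBot|CCBot|Google-Extended)" 1;
    }
    server {
        if ($llm_crawler) {
            return 444;
        }
    }

    # Caddyfile (inside a site block) - abort closes the connection.
    @llm_crawler header_regexp User-Agent (?i)(GPTBot|ClaudeBot|CCBot)
    abort @llm_crawler

    # haproxy.cfg (frontend) - silent-drop closes the TCP connection
    # without telling the client, so the crawler just hangs.
    acl llm_crawler req.hdr(User-Agent) -m reg -i (gptbot|claudebot|ccbot)
    http-request silent-drop if llm_crawler
    ```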


  • I think that depends on what you’re doing. I find Claude miles ahead of the pack on practical but fairly nuanced coding issues - particularly as a pair programmer with strongly typed FP patterns.

    It’s almost as if it’s better in real-world situations than artificial benchmarks.

    And their new CLI client is pretty decent - it seems to really take advantage of the hybrid CoT/standard auto-switching model Claude gained with this week’s update.

    I don’t use it often anymore, but when I reach for a model for coding, it’s Claude first. It’s the most likely to grasp the core architectural patterns in a codebase (like a consistent monadic structure for error handling or consistently well-defined architectural layers).

    I just recently cancelled my one-month trial of Gemini - it was pretty useless; easy to get stuck in a dumb loop even with project files as context.

    And GPT-4/o1/o3 seem to really suck at being prescriptive - often providing walls of multiple solutions that all somehow narrowly miss the point - even with tons of context.

    That said, Claude sucks - SUCKS - at statistics, being completely unreliable where GPT-4 is often pretty good and provides code (Python) for verification.




  • thatsnothowyoudoit@lemmy.ca to Technology@lemmy.world: ChatGPT is down

    Depends what you’re doing.

    4o is way better at analytical work. Think big datasets and statistics. It’ll provide the Python it used for analysis so you can double-check.

    Claude is far superior for more challenging development tasks. For example, I found ChatGPT pretty useless for a lot of Scala troubleshooting and rubber-ducking.

    Claude 3.5 Sonnet is much better, though far from error-free. It’s also not free, if I remember correctly.

    Both get stuck in weird loops, make stuff up and leave things out when taken at face value.

    Ultimately they have their own strengths and either can be a force multiplier.


  • Apple’s MacBook Pro includes HDMI and a third USB-C/Thunderbolt port alongside an SDXC card slot and a headphone jack (the latter of which is on all their laptops, albeit on the other side). This seems like the perfect balance for most users.

    It’s nonsense they don’t include HDMI on the Air, but then “it’s kinda thin and kinda light”.

    I was not sad to see FireWire and Mini DisplayPort replaced with USB-C/Thunderbolt.

    Current port lineup on “pro” machines:

    - 3x Thunderbolt/USB-C
    - HDMI
    - SDXC card slot
    - MagSafe 3
    - 3.5mm headphone jack



  • I can’t imagine that being the case for most users. I’m absolutely a power user, and I keep being surprised at how consistently high the performance of my base-model M1 Air w/16GB is, even compared to another Mac workstation of mine with 64GB.

    I can run two VMs, a ton of live-reloading development tooling, several JVM programs and so much more on that little Air, and it won’t even sweat.

    I’m not an Apple apologist - lots of poor decisions these days and software quality has taken a real hit. While 16GB means everyone’s getting a machine that should last much longer, I can’t see a normal user needing more any time soon, especially when Apple is optimizing their local machine learning models for their 8GB iOS platforms first and foremost.