• 0 Posts
  • 142 Comments
Joined 5 years ago
Cake day: October 2nd, 2020


  • tar pits target the scrapers.

    were you talking also about poisoning the training data?

    two distinct (but imo highly worthwhile) things

    tar pits are a bit like turning the tap off (or to a useless trickle). fortunately it’s well understood how to do it efficiently and it’s difficult to counter.

    poisoning is a whole other thing. i’d imagine if nothing comes out of the tap the poison is unlikely to prove effective. there could perhaps be some clever ways to combine poisoning with tarpits in series, but in general they’d be deployed separately or at least in parallel.

    bear in mind that to meaningfully deploy a tar pit against scrapers you usually need some permissions on the server, so it may not help much with the exact problem in the article (except for some short-term fuckery perhaps). poisoning, otoh, is probably important for this problem



  • anywhere shit gets cliquey it gets toxic real fast - and that goes for ANY and ALL organisations.

    safe-space concepts often inherently deal with an “us/them” dichotomy, which is unfortunately fertile ground for things getting cliquey.

    it’s not that one must lead to the other, it’s just that the foundation is there, so the risk is higher if it’s not managed properly.

    this is why safe-spaces need to be protected from within and without. regardless of whether you’re in the clique or out of it, it hurts everyone in the end.




  • ganymede@lemmy.ml to Privacy@lemmy.ml · Is Signal messaging really private?
    ↑12 ↓1 · edited 12 days ago

    Imo signal protocol is mostly fairly robust, signal service itself is about the best middle ground available to get the general public off bigtech slop.

    It compares favorably against whatsapp while providing comparable UX/onboarding/rendezvous, which is pretty essential to get your non-tech friends/family out of meta’s evil clutches.

    For the sheer number of people signal’s helped to protect from eg. meta alone, you gotta give it praise.

    It is lacking in core features which would bring it to the next level of privacy, anonymity and safety. But it’s not exactly trivial to provide ALL of the above in one package while retaining accessibility to the general public.

    Personally, I’d be happier if signal began to offer these additional features as options, maybe behind a consent checkbox like “yes i know what i’m doing (if someone asked you to enable this mode & you’re only doing it because they told you to, STOP NOW -> ok -> NO REALLY, STOP NOW IF YOU ARE BEING ASKED TO ENABLE THIS BY ANYONE -> ok -> alright, here ya go…)”.







  • no, they steal everything.

    why do we keep letting them steal

    ‘free speech’ has always been about the freedom of the oppressed to fight upwards against their oppressor with language - but now they’ve stolen it & are trying to make it mean their freedom to oppress minorities.

    same for ‘woke’ - it used to mean basic human decency; once again they stole it & warped its meaning by pretending they’re the victims and it’s preventing their freedom (ie. their freedom to be a bigot).

    same for ‘political correctness’, which was originally a criticism of using fake concern over moral issues for political agenda (sounds familiar), now warped beyond use.

    swastika - used for THOUSANDS of years before the fucking nazis came along & stole it. now the cultures it actually belongs to get hate for practicing their ancient beliefs.

    pepe and many others are a long list of things they steal and ruin.

    why do we keep letting them steal?





  • good points on the training order!

    i was mostly thinking of intentionally introduced stochastic processes during training, eg. quantisation noise, which is pretty broadband when uncorrelated; and even correlated real-world datasets will inevitably contain non-determinism, though some constraints re. language “rules” could possibly shape that in interesting ways for LLMs.

    and especially the use of stochastic functions for convergence & stochastic rounding in quantisation etc. not to mention intentionally introduced randomisation in training set augmentation. so i think for most purposes, and with few exceptions they are mathematically definable as stochastic processes.

    where that overlaps with true theoretical determinism certainly becomes fuzzy without an exact context. afaict most kernel-backed random seeds on x86 since 2015 (via the RDSEED instruction) will have an asynchronous thermal-noise-based NIST 800-90B approved entropy source within the silicon and a NIST 800-90C Non-deterministic Random Bit Generator (NRBG).

    on other more probable architectures (GPU/TPU) I think that is going to be a lot rarer, and from a cryptographic perspective hardware implementations of even stochastic rounding are going to be a deterministic circuit under the hood for a while yet.
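    fwiw, stochastic rounding itself is simple to sketch in software (illustrative python only -- hardware versions operate on fixed-point bit patterns, not floats):

```python
import math
import random

def stochastic_round(x: float) -> int:
    """Round x down or up with probability equal to its fractional part,
    so the result is unbiased in expectation (unlike round-to-nearest,
    which loses small gradient contributions in low-precision training)."""
    lo = math.floor(x)
    return lo + (1 if random.random() < (x - lo) else 0)

# the mean of many stochastic roundings converges on the original value
random.seed(0)
samples = [stochastic_round(2.3) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to 2.3
```

    the point re. determinism: replace random.random() with a deterministic pseudo-random bit stream and the circuit is fully reproducible, which is exactly what most hardware does today.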

    but given the combination of overwhelming complexity, trade secrets and classical high entropy sources, I think most serious attempts at formal proofs would have to resign to stochastic terms in their formulation for some time yet.

    there may be some very specific and non-general exceptions, and i do believe this is going to change in the future as both extremes (highly formal AI models, and non-deterministic hardware backed instructions) are further developed. and ofc overcoming the computational resource hurdles for training could lead to relaxing some of the current practical requirements for stochastic processes during training.

    this is ofc only afaict, i don’t work in LLM field.



  • ganymede@lemmy.ml to Asklemmy@lemmy.ml · What is Lemmy's problem with AI?
    ↑42 ↓3 · edited 2 months ago

    ignoring the hate-brigade, lemmy users are probably a bit more tech savvy on average.

    and i think many people who know how “AI” works under the hood are frustrated because, unlike most of its loud proponents, they have real-world understanding of what it actually is.

    and they’re tired of being told they “don’t get it”, by people who actually don’t get it. but instead they’re the ones being drowned out by the hype train.

    and the thing fueling the hype train is dishonest greedy people, eager to over-extend the grift at the expense of responsible and well-engineered “AI”.

    but, and this is the real crux of it, they’re keeping the amazing true potential of “AI” technology in the hands of the rich & powerful, rather than using it to liberate society.