• 0 Posts
  • 30 Comments
Joined 3 years ago
Cake day: July 6th, 2021

  • One you don’t wanna join ;) (Google). I’m still on the free tier of what’s now Workspace and intend to move, but I’m dreading the work that comes with it.

    A year or so ago Google almost killed the free tier (look up gsuite legacy if you want to know more). Back then I prepared to move away and settled on Zoho as my replacement, but in the end Google responded to the community’s backlash and kept the free tier free for personal use (although there are some other restrictions put in place, so eventually a move is inevitable). Zoho might also give you the features you want.




  • jochem@lemmy.ml to Lemmy@lemmy.ml · Selfhosting single person instance? · 1 year ago

    I would not recommend running your own email server. Major email providers like Gmail only accept mail from servers that have measures like SPF, DKIM, DMARC, and reverse DNS in place to make them as trustworthy as possible. Getting all of that right is hard and probably not possible on a home internet connection.
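    For a rough idea of what those measures look like: SPF and DMARC, for example, are just DNS TXT records that receiving servers look up before accepting your mail. A small sketch using the dnspython package (the domain is a placeholder, and a missing record simply raises an exception):

```python
import dns.resolver  # assumes the dnspython package is installed

domain = "example.com"  # placeholder domain

# SPF: a TXT record listing which servers may send mail for the domain
for record in dns.resolver.resolve(domain, "TXT"):
    text = record.to_text()
    if "v=spf1" in text:
        print("SPF:", text)

# DMARC: a TXT record describing what to do with mail that fails SPF/DKIM
for record in dns.resolver.resolve(f"_dmarc.{domain}", "TXT"):
    print("DMARC:", record.to_text())
```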

    Filtering incoming spam is also a pain in the ass.

    It’s nice as an exercise to learn how email works, but I would not rely on it.




  • Correct: it’s not just regurgitating words, it’s predicting which token comes next. A token is sometimes a whole word, but longer words get split into several tokens (the exact splits depend on how the tokenizer was built).
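    As a rough illustration, this is what tokenization looks like with OpenAI’s tiktoken library (assuming it is installed; the exact splits depend on the chosen encoding):

```python
import tiktoken  # OpenAI's tokenizer library

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by recent OpenAI models
tokens = enc.encode("Tokenization splits text into pieces.")
print(tokens)                             # a list of integer token ids
print([enc.decode([t]) for t in tokens])  # short words stay whole, long ones get split
```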

    How it knows which token comes next is why the current generation of LLMs is so impressive. It seems to have learned the rules that underpin our languages, to the point that it appears to even understand the content. It doesn’t just know the grammar rules (nobody told it; it picked up the patterns on its own), it also knows which words belong together in which context.

    The context the model sees is your prompt plus some preset instructions (e.g. that it is an OpenAI LLM). So being able to predict a token correctly is one part; having a good context is the other. This is why prompt engineering quickly became a thing, and why supporting bigger context windows matters (but a larger context requires way more processing power, so there’s a trade-off there).
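    As a hypothetical sketch (the exact preset instructions are not public), the context fed to the model is basically a list like the one below, mirroring common chat-style LLM APIs, and the model predicts the tokens of the reply one by one:

```python
# Illustrative only: the roles and structure mimic common chat-style LLM APIs,
# not OpenAI's actual internal prompt.
messages = [
    # preset context supplied by the provider or application, not the user
    {"role": "system", "content": "You are a helpful assistant made by OpenAI."},
    # the user's prompt
    {"role": "user", "content": "Explain how tokenization works."},
]
# A client library would send this list to the model, which then predicts
# the next token of the "assistant" reply over and over until it stops.
```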

    By the way, it’s not just the trained model + context that gives you the output of ChatGPT. I’m pretty sure there are layers before and after, possibly using other ML models, that filter content or make it more fit for processing. This is why you can’t ask it how to make bombs, even though such recipes are in its training data and it could very likely produce one based on that.
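    Purely as a sketch of that idea (none of this is OpenAI’s actual pipeline), such filtering layers could look like a simple wrapper around the model call:

```python
# Hypothetical pre/post filters around an LLM call, for illustration only.

def violates_policy(text: str) -> bool:
    # stand-in for a separate moderation model or rule-based filter
    banned_topics = ["bomb recipe"]  # illustrative list
    return any(topic in text.lower() for topic in banned_topics)

def answer(prompt: str, generate) -> str:
    if violates_policy(prompt):       # filter before the model sees the prompt
        return "Sorry, I can't help with that."
    completion = generate(prompt)     # the actual LLM call
    if violates_policy(completion):   # filter the model's output as well
        return "Sorry, I can't help with that."
    return completion
```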





  • Markdown is notoriously under-standardized, so there are lots of unofficial extensions. This is a major downside of markdown, as you cannot trust a renderer to properly show the formatting beyond the basics (see the sketch after this list).

    It’s still really nice, because of two great features:

    • it’s super easy to learn. Just look at a few examples and off you go.
    • even when it’s not rendered, it’s still easy to read (which I think contributes to making it easy to learn).
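    On the standardization point above: here is a quick illustration using the Python-Markdown package (assuming it is installed). Tables are a common but non-standard extension, so the same source renders differently depending on what the renderer supports:

```python
import markdown  # the Python-Markdown package

text = """\
| name | value |
|------|-------|
| a    | 1     |
"""

# Without the extension the pipes are just treated as plain paragraph text...
print(markdown.markdown(text))
# ...with the (non-standard) tables extension enabled, you get a real <table>.
print(markdown.markdown(text, extensions=["tables"]))
```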

  • If I had to guess, it has to do with making long lines fit in a window without requiring horizontal scrolling.

    Markdown is used a lot in the context of software development. Code is usually accompanied by a readme detailing what it does, how to set up your environment for development, how to contribute, etc.

    The de facto standard is to write this in markdown. Since it’s written in the same program you write code in (an IDE or editor), text isn’t soft-wrapped: lines simply continue past the edge of the window, because wrapping would make the code unreadable. Most code can be kept to fairly short lines, so horizontal scrolling is rarely needed. A long sentence in a readme, however, easily grows much longer than a line of code. So being able to break a line anywhere in the source without producing an actual line break in the rendered output is super useful.

    This is, by the way, also how HTML behaves, and markdown gets rendered to HTML.
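    As a small demonstration (again using the Python-Markdown package, assuming it is installed): a single newline in the source does not become a line break in the rendered output.

```python
import markdown  # the Python-Markdown package

source = "This sentence is wrapped across\ntwo source lines to keep them short."
print(markdown.markdown(source))
# -> <p>This sentence is wrapped across
#    two source lines to keep them short.</p>
# The newline survives inside the <p>, but HTML collapses it to a single
# space when displayed, so the reader sees one unbroken sentence.
```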