this post was submitted on 12 Oct 2024
193 points (89.7% liked)

Google's AI may replace traditional websites and content creators leading to potential monopolization and diminishing user experience - Mrwhosetheboss

asdfasdfasdf@lemmy.world 5 points 3 weeks ago* (last edited 3 weeks ago)

The search engine magic isn't just about caching pages. It's also extremely expensive and complex to:

  • maintain an index of all the websites in the world. That alone is an extremely high cost.
  • refresh that index in almost real time. How long will your self-hosted crawler take to find new content on every website in the world?
  • weigh the results. Getting the order and relevance of results right is not easy at all. How many times a word appears on a page is a terrible metric.
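To illustrate the last point, here is a minimal sketch of why a raw word count is easy to fool. The pages, the query, and both scoring functions are made up for illustration. The "coverage" score is only a slightly less naive contrast, not what any real search engine uses.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def raw_count_score(query, page):
    """Naive metric: how many times the query words appear on the page."""
    counts = Counter(tokenize(page))
    return sum(counts[term] for term in set(tokenize(query)))


def coverage_score(query, page):
    """Contrast metric: how many distinct query words the page contains."""
    words = set(tokenize(page))
    return sum(1 for term in set(tokenize(query)) if term in words)


# Hypothetical pages: one keyword-stuffed, one actually relevant.
stuffed = "laptop laptop laptop laptop laptop buy now"
review = "an in-depth laptop review comparing battery life and keyboard quality"
query = "laptop review"

print(raw_count_score(query, stuffed), raw_count_score(query, review))
print(coverage_score(query, stuffed), coverage_score(query, review))
```

The stuffed page wins on raw count even though it contains no review at all, while the coverage score prefers the real review. Real ranking needs far more signals than either.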

A hosted search is also a lot more environmentally friendly - that gigantic search index and all the energy poured into the work is something that can be shared by everyone. If everyone did that themselves at home, you'd spend that same amount of energy again for every single household.

rottingleaf@lemmy.world 2 points 3 weeks ago

"Almost real time" is not what I'd call necessary.

Weighting results - I'd expect user feedback (good result or bad result, combined with the keywords from the request) to be good enough. Similar to how ed2k handles file reputation.
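A rough sketch of what that feedback weighting could look like, assuming votes are stored per (keyword, URL) pair. The class name, the smoothing formula, and the example URLs are all hypothetical. This is not how ed2k actually works, and it ignores vote manipulation entirely.

```python
from collections import defaultdict


class FeedbackRanker:
    """Ranks URLs for a query using good/bad votes keyed by query keyword."""

    def __init__(self):
        # (keyword, url) -> [good_votes, bad_votes]
        self.votes = defaultdict(lambda: [0, 0])

    def vote(self, query, url, good):
        """Record one user's verdict against every keyword in the query."""
        for keyword in set(query.lower().split()):
            self.votes[(keyword, url)][0 if good else 1] += 1

    def score(self, query, url):
        """Average smoothed approval ratio; unseen pairs score 0.5."""
        keywords = set(query.lower().split())
        if not keywords:
            return 0.5
        total = 0.0
        for keyword in keywords:
            good, bad = self.votes.get((keyword, url), (0, 0))
            total += (good + 1) / (good + bad + 2)
        return total / len(keywords)

    def rank(self, query, urls):
        """Return the URLs sorted from best to worst score."""
        return sorted(urls, key=lambda u: self.score(query, u), reverse=True)


ranker = FeedbackRanker()
ranker.vote("rust tutorial", "a.example", good=True)
ranker.vote("rust tutorial", "b.example", good=False)
print(ranker.rank("rust tutorial", ["b.example", "a.example", "c.example"]))
```

The smoothing keeps a page with no votes in the middle, so one downvote doesn't bury a result below pages nobody has rated yet.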

The index is going to be big, yes. But if we want a p2p system with split storage and computation, something between Freenet and Ceph, it may be doable.
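One way to picture the split storage, as a sketch only: assign each index term to a few peers with rendezvous (highest-random-weight) hashing. This is neither Freenet's nor Ceph's actual placement scheme, and the peer names, the `replicas` parameter, and the function names are assumptions for illustration.

```python
import hashlib


def _weight(peer, term):
    """Deterministic pseudo-random weight for a (peer, term) pair."""
    digest = hashlib.sha256(f"{peer}|{term}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def peers_for_term(term, peers, replicas=2):
    """Pick the peers that should store the posting list for this term."""
    ranked = sorted(peers, key=lambda p: _weight(p, term), reverse=True)
    return ranked[:replicas]


# Hypothetical peer IDs.
peers = ["peer-a", "peer-b", "peer-c", "peer-d", "peer-e"]
for term in ["linux", "search", "index"]:
    print(term, "->", peers_for_term(term, peers))
```

Any node can compute where a term lives without a central directory, and when a peer that doesn't hold a term leaves, that term's placement doesn't change.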

"A hosted search is also a lot more environmentally friendly - that gigantic search index and all the energy poured into the work is something that can be shared by everyone. If everyone did that themselves at home, you'd spend that same amount of energy again for every single household."

With some such p2p system I can imagine the overhead being like 10 or maybe 100 times Google's. But not what you said.