Effective search engines are slowly running out. It seems inevitable that an open-source search engine project, independent of big tech, will be needed.

Some of my own tricks are:

  • Use the blacklist plugin to block sites from search.
  • Search for forum sites and communities instead of specific queries. (Wikipedia has a list of forums that might be useful)
  • For technical questions, favor Q&A sites like Stack Exchange.
  • YouTube videos often contain better information than search-engine results. (Use a search engine to find them rather than YouTube's own search.)
  • Look for blogs and journals that specialize in the topic you’re searching for.
  • Use boolean search operators when possible: quotes for exact phrases, OR between alternatives, a minus sign to exclude terms, and site: to restrict results to one domain.
  • Self-host and customize your own metadata search engine. Create a graph network linking websites by subject/topic. You may not be able to query specific questions, but you can discover sites that traditional search would never surface. This is a great way to find hidden gems! (Example: https://internet-map.net/)
  • (Difficult) Self-host and scrape sites across the web to create your own query-able database. This would be the most effective way to search the internet and would be completely independent from potential enshittification and censorship. The cost, however, is quite high both in terms of hardware and time. Kiwix offers a way to download websites for offline use (e.g. Wikipedia, Stack Exchange), which is a good starting point for building your own custom search engine.
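The site-graph idea above can be sketched in a few lines of Python. The site names and edges below are made-up placeholders, not a real dataset; the point is just to show how a small topic graph lets you walk outward from a site you already know:

```python
from collections import defaultdict, deque

# Hypothetical topic graph: an edge connects two sites that cover a
# related subject. These names are placeholders, not real websites.
edges = [
    ("siteA.example", "siteB.example"),
    ("siteB.example", "siteC.example"),
    ("siteA.example", "siteD.example"),
]

graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)  # treat topic links as undirected

def discover(start, max_hops=2):
    """Return sites reachable within max_hops of a known site (BFS)."""
    seen = {start}
    frontier = deque([(start, 0)])
    found = []
    while frontier:
        site, hops = frontier.popleft()
        if hops == max_hops:
            continue  # don't expand past the hop limit
        for nxt in graph[site]:
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                frontier.append((nxt, hops + 1))
    return found

print(discover("siteA.example"))
```

Starting from one known-good site, a two-hop walk surfaces every related site in this toy graph, which is the "discover hidden gems" effect in miniature.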
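For the scraped query-able database, one lightweight option is SQLite's built-in FTS5 full-text extension, which is bundled with most Python builds. This is a minimal sketch, not a full crawler; the two "scraped" documents are placeholders for whatever your scraper would actually store:

```python
import sqlite3

# Minimal personal search index using SQLite's FTS5 full-text extension.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE pages USING fts5(url, title, body)")

# Placeholder documents standing in for real scraped pages.
docs = [
    ("https://example.org/a", "Intro to sourdough", "hydration levels and starter care"),
    ("https://example.org/b", "Bike maintenance", "chain lube and derailleur adjustment"),
]
conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", docs)

# MATCH supports boolean queries; ORDER BY rank sorts by FTS5's
# built-in bm25 relevance score (best match first).
rows = conn.execute(
    "SELECT url, title FROM pages WHERE pages MATCH ? ORDER BY rank",
    ("sourdough OR starter",),
).fetchall()
print(rows)
```

The same schema scales to a disk-backed database of thousands of pages, and you get boolean search for free via FTS5's query syntax.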

I would love to hear the tips and tricks you use! I hope this post helps others in more efficiently finding information on the internet!

  • SmokeyDope@lemmy.world · 1 day ago

    A fun weekend project was to set up a local model to tool-call OpenWeather and Wolfram Alpha through their APIs for factual data retrieval and local weather info.
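    The core of that setup is a small dispatch loop: the model emits a JSON tool call, and local code runs the matching function. Here is a hedged sketch of that loop; `get_weather` is a stub standing in for the real OpenWeather request, and the call format is an assumption (different local-model runtimes emit slightly different shapes):

    ```python
    import json

    # Stub tool: in the real setup this would call the OpenWeather API
    # with your API key instead of returning canned data.
    def get_weather(city: str) -> dict:
        return {"city": city, "temp_c": 21, "conditions": "clear"}

    # Registry mapping tool names (as the model sees them) to functions.
    TOOLS = {"get_weather": get_weather}

    def dispatch(tool_call_json: str) -> str:
        """Run the tool named in a model-emitted JSON call, return JSON result."""
        call = json.loads(tool_call_json)
        result = TOOLS[call["name"]](**call["arguments"])
        return json.dumps(result)

    # Example call, shaped the way a local model might emit it:
    print(dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}'))
    ```

    Swapping the stub for a real HTTP request (and adding a Wolfram Alpha entry to the registry) gives the weekend project described above.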

    Someone in our community showed off tool-calling articles from a local instance of Wikipedia through a Kiwix server and a ZIM file, and that seems like a really cool project too.

    I would like to scrape preprints from arXiv and do basic RAG with them. I'd also try to find a way to run a local version of OEIS, or see if there's an API to scrape.
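    The retrieval step of that basic RAG idea can be sketched without any libraries: rank text chunks by cosine similarity of bag-of-words vectors and feed the top hits to the model. The three "abstracts" below are invented placeholders for scraped preprint text:

    ```python
    import math
    from collections import Counter

    # Placeholder chunks standing in for scraped preprint abstracts.
    chunks = [
        "graph neural networks for molecule property prediction",
        "a survey of retrieval augmented generation for language models",
        "integer sequences and combinatorial identities",
    ]

    def vectorize(text):
        """Bag-of-words term counts for a text."""
        return Counter(text.lower().split())

    def cosine(a, b):
        """Cosine similarity between two Counter vectors."""
        dot = sum(a[t] * b[t] for t in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def retrieve(query, k=1):
        """Return the k chunks most similar to the query."""
        qv = vectorize(query)
        ranked = sorted(chunks, key=lambda c: cosine(qv, vectorize(c)), reverse=True)
        return ranked[:k]

    print(retrieve("retrieval augmented generation"))
    ```

    A real pipeline would swap the word counts for embeddings, but the retrieve-then-generate shape stays the same.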

    So I guess my solution is to use automation tools to retrieve data from wikis and databases directly: RSS, direct APIs, scrapers, and tool calling.