this post was submitted on 15 Jun 2023

Lemmy Support

Support / questions about Lemmy.

I have grown used to searching the internet with the site:reddit.com search operator. Now I'm looking for a more convenient way to search across all popular Lemmy instances without manually copying and pasting the following query every time: (site:lemmy.world OR site:lemmy.ml OR site:beehaw.org OR site:feddit.de OR site:sh.itjust.works OR site:lemmy.one OR site:lemmy.ca). Is there a more user-friendly way to run these searches? Alternatively, is there a comprehensive instance list that I can copy to my clipboard and feed to a script that produces the formatted search filter? Any suggestions or assistance would be greatly appreciated.

top 3 comments
[–] Veritas@lemmy.ml 0 points 1 year ago* (last edited 1 year ago) (1 children)

To create the stats.json file, you can follow these steps:

  1. Clone the lemmy-stats-crawler repository from GitHub by running the following command in your terminal:
     git clone https://github.com/LemmyNet/lemmy-stats-crawler
  2. Change your current directory to the cloned repository:
     cd lemmy-stats-crawler
  3. Build and run the crawler using Cargo:
     cargo run -- --json > stats.json

This command builds and runs the crawler with Cargo and redirects its output to a file named stats.json. Everything after the standalone -- is passed to the crawler rather than to Cargo, and the --json flag instructs the crawler to output the JSON data.
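Before going further, it may be worth checking that stats.json really contains valid JSON, since a failed or interrupted run can leave the file empty or truncated. A minimal sketch, assuming the top-level "instance_details" key that the script further down relies on; the helper name count_instances is made up for illustration and is not part of the crawler:

```python
import json


def count_instances(path="stats.json"):
    """Return the number of entries under "instance_details" in the crawler output.

    Assumes the top-level "instance_details" key. Raises json.JSONDecodeError
    if the file is not valid JSON and KeyError if that key is missing.
    """
    with open(path, "r", encoding="utf-8") as file:
        data = json.load(file)
    return len(data["instance_details"])
```

Calling count_instances() in the directory that holds stats.json should return the number of crawled instances; an exception here suggests the crawl did not finish cleanly.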

To build a string like (site:lemmy.world OR site:lemmy.ml OR site:beehaw.org OR site:feddit.de OR site:sh.itjust.works OR site:lemmy.one OR site:lemmy.ca) from the instances listed in stats.json, you can follow these steps using Python:

  1. Import the json module to work with JSON data[1].
  2. Read the JSON file and load the data into a Python object[1].
  3. Extract the list of instances from the Python object.
  4. Create a string using the list of instances with the desired format.

Here's a Python script that demonstrates these steps:

import json

def sort_instances_by_monthly_users(data, limit=20):
    """Return the `limit` instances with the most monthly active users."""
    return sorted(
        data["instance_details"],
        key=lambda x: x["site_info"]["site_view"]["counts"]["users_active_month"],
        reverse=True,
    )[:limit]

# Read the JSON file and load the data into a Python object
with open("stats.json", "r") as file:
    data = json.load(file)

# Sort the instances by monthly active users and keep the top 20
sorted_instances = sort_instances_by_monthly_users(data)

# Create a string using the list of instances with the desired format
formatted_string = " OR ".join(f"site:{instance['domain']}" for instance in sorted_instances)
formatted_string = f"({formatted_string})"

print(formatted_string)

This script will output a string in the desired format:

(site:lemmy.world OR site:lemmy.ml OR site:beehaw.org OR site:feddit.de OR site:sh.itjust.works OR site:lemmy.one OR site:lemmy.ca OR site:lemmy.blahaj.zone OR site:lemmygrad.ml OR site:lemmy.fmhy.ml OR site:sopuli.xyz OR site:lemmynsfw.com OR site:lemm.ee OR site:discuss.tchncs.de OR site:midwest.social OR site:lemmy.sdf.org OR site:lemmy.dbzer0.com OR site:aussie.zone OR site:feddit.uk OR site:feddit.it)
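To avoid pasting the string by hand, the filter can be combined with the actual search terms and turned into a URL. A rough sketch, assuming DuckDuckGo's https://duckduckgo.com/?q= query URL; whether a given engine honors several site: filters joined by OR is something to verify for yourself. The names build_search_url, search, and SEARCH_URL are illustrative and not part of any Lemmy tool:

```python
import webbrowser
from urllib.parse import quote_plus

# Assumed search URL prefix; swap in your preferred engine's query URL.
SEARCH_URL = "https://duckduckgo.com/?q="


def build_search_url(terms, site_filter, base_url=SEARCH_URL):
    """Combine free-text search terms with a "(site:a OR site:b)" filter into a URL."""
    query = f"{terms} {site_filter}"
    # quote_plus percent-encodes the query and turns spaces into "+"
    return base_url + quote_plus(query)


def search(terms, site_filter):
    """Open the combined search in the default browser."""
    webbrowser.open(build_search_url(terms, site_filter))
```

For example, build_search_url("self hosting", "(site:lemmy.world OR site:lemmy.ml)") returns https://duckduckgo.com/?q=self+hosting+%28site%3Alemmy.world+OR+site%3Alemmy.ml%29, and search(...) with the same arguments opens that URL in the browser.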

Citations: [1] https://www.geeksforgeeks.org/read-json-file-using-python/

[–] cosmicsploogedrizzle@lemmy.ml 0 points 1 year ago (1 children)

Could something like this be developed into a browser extension? You search in the extension and it does this for you in your preferred search engine?

[–] Veritas@lemmy.ml 12 points 1 year ago

There are many different search engines, and building an extension that works with all of them would probably be very complex, so I doubt it.