this post was submitted on 05 Jul 2023
196 points (99.5% liked)

Technology


OpenAI has disabled the Browse with Bing feature in ChatGPT to prevent users from bypassing paywalls and accessing website content without a subscription.

[–] Frellwit@lemmy.world 34 points 1 year ago (4 children)

I'm wondering why websites keep using fake paywalls when they can use a real one where the content isn't available until user verification.

[–] UESPA_Sputnik@lemmy.world 37 points 1 year ago (1 children)

They do that so search engines can index their articles. They switch on the paywall an hour or so later, but still get a lot of traffic (which is good for advertising) when people click the link on Google etc.
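For illustration, a delayed paywall like this could be sketched as a simple time check. This is only a guess at the logic, not any site's actual code; `GRACE_PERIOD` and `paywall_active` are made-up names, and the one-hour window just mirrors the "hour or so" above.

```python
from datetime import datetime, timedelta

# Assumed window during which a new article stays open for indexing.
GRACE_PERIOD = timedelta(hours=1)

def paywall_active(published_at: datetime, now: datetime) -> bool:
    """Return True once the article is older than the grace period."""
    return now - published_at >= GRACE_PERIOD
```

Anything requested inside the window gets the full text, which is presumably when the crawlers pick it up.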

[–] 15liam20@lemmy.world 12 points 1 year ago (1 children)

The crawler identifies itself as a "robot", which lets it past the paywall. When you browse with Chrome, the site behaves differently. That's why it's so easy to get past by pasting the link into archive.ph.
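Roughly, the server-side check probably looks something like the sketch below. The token list, function names and teaser length are all assumptions for illustration; real sites will differ.

```python
# Assumed substrings that mark a request as coming from a search crawler.
CRAWLER_TOKENS = ("googlebot", "bingbot")

def looks_like_crawler(user_agent: str) -> bool:
    """Trust whatever the client claims in its User-Agent header."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)

def render(article: str, user_agent: str, teaser_chars: int = 40) -> str:
    """Serve the full text to self-declared crawlers, a teaser to everyone else."""
    if looks_like_crawler(user_agent):
        return article
    return article[:teaser_chars] + "... [subscribe to read more]"
```

The weak point is that the header is self-reported, so the check only keeps out clients that are honest about what they are.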

[–] lemmyvore@feddit.nl 5 points 1 year ago* (last edited 1 year ago)

Or by using a browser addon that changes the way the browser identifies itself to pretend it's a search engine crawler.

[–] lemmyvore@feddit.nl 15 points 1 year ago* (last edited 1 year ago)

They'd like to allow search engines and block (non-paying) visitors, but they took a lazy approach to it.

The correct approach would indeed be to properly identify paying visitors (user+password) and search engines (secret key). Then they could reliably shut out everyone else.
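As a minimal sketch of that idea, assuming the site keeps a set of logged-in subscriber sessions and has agreed a shared secret with the search engine (both hypothetical here, as are all the names):

```python
import hmac
from typing import Optional

# Hypothetical data: sessions created after a user+password login,
# and a secret key handed to the search engine out of band.
SUBSCRIBER_SESSIONS = {"session-abc"}
CRAWLER_KEY = b"shared-secret"

def may_read_full_article(session_token: Optional[str],
                          crawler_key: Optional[str]) -> bool:
    """Allow logged-in subscribers and crawlers that present the secret key."""
    if session_token is not None and session_token in SUBSCRIBER_SESSIONS:
        return True
    if crawler_key is not None:
        # Constant-time comparison so the key can't be guessed via timing.
        return hmac.compare_digest(crawler_key.encode(), CRAWLER_KEY)
    return False
```

Unlike a User-Agent check, nothing here depends on what the client says about itself, only on credentials it can actually prove.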

But that would require Google to cooperate and I suspect they don't want to set a precedent where they let a website dictate how they get content. They like to deal from an all or nothing position.

Of course there are other methods, such as making public just enough about the article to be relevant in searches, but I don't know why they don't do that. Probably lowers their SEO effectiveness if I were to guess.
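The "just enough" method could be as simple as publishing only the first few words. A rough sketch, with a made-up function name and an arbitrary word limit:

```python
def public_excerpt(article: str, max_words: int = 50) -> str:
    """Return the first max_words words as the publicly visible part."""
    words = article.split()
    if len(words) <= max_words:
        return article
    return " ".join(words[:max_words]) + " ..."
```

Crawlers and non-paying visitors would both see the same excerpt, so there is nothing to bypass.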

[–] Ferris 3 points 1 year ago

America's Test Kitchen used to substitute their article text with a bunch of Lorem Ipsum, but I can't tell whether they are still doing it without a laptop in front of me.