this post was submitted on 27 Aug 2023
110 points (91.7% liked)

No Stupid Questions

Edit: To clarify:

Is it even possible, financially speaking, to keep adding storage? I mean, advertisements don't even make a lot of money, so is the indefinite growth of server storage even sustainable?

Or will they do what Twitch does with old content and just delete them?

[–] LetMeEatCake@lemm.ee 96 points 1 year ago* (last edited 1 year ago) (3 children)

Storage is cheap, especially at the corporate scale.

Make two simplifying assumptions: pretend that Google is paying consumer prices for storage, and pretend that Google doesn't need to worry about data redundancy. In truth Google will pay a lot less than consumer prices, but they'll also need more than 1 byte of storage for each byte of data they have, so for the sake of envelope math we can just pretend they cancel out.

Western Digital sells a 22TB HDD for $400. Seagate has a 20TB HDD for $310. I don't like Seagate but I do like round numbers, so for simplicity we'll call it $300 for 20TB. This works out to $15/TB. According to Wikipedia, YouTube had just under $29b of revenue in 2021. If YouTube spent just $100m of that (0.34%), they'd be able to buy 333,333 of those hard drives. In a single year. That's 333,333 x 20TB = 6,666,660 TB of storage, also known as 6.67^note^ ^1^ exabytes.

That's a lot of storage. A quick search tells me that YouTube's compression for 4k/25fps is 45Mbps, which is about 5.6 megabytes/s. That's roughly 37,600 years of 4k video content. All paid for with 0.34% of YouTube's annual revenue.

Note 1: Note that I am using SI units here. If you want to use 1024^n^ for data names, then the SI prefixes aren't correct. It'd be about 5.8 exbibytes instead.

EDIT: I initially did the price wrong, fixed now.
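
For anyone who wants to redo the envelope math with their own numbers, here is a minimal Python sketch. It assumes only the inputs quoted above ($100m budget, $300 per 20TB drive, 45Mbps for 4k) and a 365-day year; the function and variable names are made up for illustration.

```python
# Back-of-envelope storage estimate using the figures quoted in the comment.
BUDGET_USD = 100_000_000          # $100m, about 0.34% of ~$29b revenue
DRIVE_PRICE_USD = 300             # rounded price of one 20TB HDD
DRIVE_CAPACITY_TB = 20
BITRATE_BITS_PER_S = 45_000_000   # quoted 4k bitrate (45Mbps)
SECONDS_PER_YEAR = 365 * 24 * 3600


def storage_estimate(budget_usd, drive_price_usd, drive_capacity_tb):
    """Return (number of drives, total bytes) the budget buys, in SI units."""
    drives = budget_usd // drive_price_usd
    total_bytes = drives * drive_capacity_tb * 10**12
    return drives, total_bytes


def years_of_video(total_bytes, bitrate_bits_per_s):
    """Return how many years of continuous video fit in total_bytes."""
    bytes_per_second = bitrate_bits_per_s / 8
    return total_bytes / bytes_per_second / SECONDS_PER_YEAR


drives, total_bytes = storage_estimate(BUDGET_USD, DRIVE_PRICE_USD, DRIVE_CAPACITY_TB)
exabytes = total_bytes / 10**18   # SI prefix: 1 EB = 10^18 bytes
exbibytes = total_bytes / 2**60   # binary prefix: 1 EiB = 2^60 bytes
years = years_of_video(total_bytes, BITRATE_BITS_PER_S)

print(f"{drives:,} drives, {exabytes:.2f} EB ({exbibytes:.2f} EiB), {years:,.0f} years of 4k")
# prints "333,333 drives, 6.67 EB (5.78 EiB), 37,582 years of 4k"
```

Changing any of the constants at the top (a volume discount on the drive price, a different bitrate) shows how sensitive the result is to each assumption.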

[–] AceBonobo@lemmy.world 28 points 1 year ago (3 children)

I wouldn't assume Google pays less for storage. They need to pay for land use in many countries, power usage, redundancy and the staff that manages all of it.

They also need powerful servers with fast caching storage and a lot of RAM. They also need to pay for the bandwidth.

As far as I know, they save multiple copies of each video in all resolutions they serve. So an 8K video will also have 4K + 1440p + 1080p + 720p + 480p + 240p + 144p, possibly also 60Hz and 30Hz for some of them, and also HDR versions.

You have to add all that to the cost per TB. Finally, there is the question of how much additional storage they need per year: 100 PB/yr? Presumably also increasing yearly?

[–] LetMeEatCake@lemm.ee 30 points 1 year ago

I wasn't calculating server costs, just raw storage. Google is not buying hard drives at retail prices. I wouldn't be surprised if they're paying as little as 50% of the retail price to buy at volume.

All of what you say is true but the purpose was to get a back of the envelope estimation to show that the cost of storage is not a truly limiting factor for a company like youtube. My point was to answer the question.

With the level of compression YouTube uses, the storage costs of everything below 4k are substantially lower than 4k by itself: for back of envelope purposes we can just ignore those resolutions.

[–] sarchar@programming.dev 6 points 1 year ago

Do you absolutely know they're storing those qualities individually? It's perfectly plausible that they do on the fly transcoding.

[–] 6mementomori@lemmy.world 3 points 1 year ago (1 children)

what is that ^note^ notation?

[–] LetMeEatCake@lemm.ee 3 points 1 year ago (2 children)

It's a superscript. You can see it in the comment editor options. It's: ^text^ which looks like ^text^

You can also check a comment's source by clicking on the icon that looks like a dog eared piece of paper at the bottom of it.

[–] 6mementomori@lemmy.world 10 points 1 year ago (1 children)

ah it must be my client not visualizing

[–] glibg10b@lemmy.ml 3 points 1 year ago (1 children)
[–] 6mementomori@lemmy.world 2 points 1 year ago (1 children)
[–] glibg10b@lemmy.ml 3 points 1 year ago* (last edited 1 year ago)

Ah, the Home Instance button for lemmy.world comments is broken. Try lemmy.ml instead: https://lemmy.ml/comment/3143665

[–] ChaoticNeutralCzech@feddit.de 3 points 1 year ago

You can use footnotes now^Note1.

They are neat and don't look too bad if unsupported by the interpreter.

[–] You999@sh.itjust.works 1 points 1 year ago (1 children)

I know you are saying Google doesn't have to worry about redundancy to simplify the math but I think that makes it completely useless.

Redundancy is not just about having another copy in case of data loss; more importantly for enterprises, redundancy allows for more throughput. If each video was on a single hard drive the site would not be able to function as even the fastest multi actuator hard drive can only do 524 MB/s in a perfect vacuum.
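
To put a rough number on the throughput point, here is a small Python sketch. It assumes the best-case 524 MB/s drive figure above and the 45Mbps 4k bitrate quoted earlier in the thread, and it ignores seek time and caching entirely, so treat it as an upper bound; the names are made up for illustration.

```python
DRIVE_THROUGHPUT_MB_PER_S = 524   # best-case sequential figure from the comment
STREAM_BITRATE_MBPS = 45          # 4k bitrate quoted earlier in the thread


def max_streams_per_drive(drive_mb_per_s, stream_mbps):
    """Upper bound on simultaneous streams one drive could feed."""
    stream_mb_per_s = stream_mbps / 8   # megabits/s -> megabytes/s
    return int(drive_mb_per_s // stream_mb_per_s)


streams = max_streams_per_drive(DRIVE_THROUGHPUT_MB_PER_S, STREAM_BITRATE_MBPS)
print(f"At most {streams} simultaneous 4k streams from one drive")
# prints "At most 93 simultaneous 4k streams from one drive"
```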

[–] LetMeEatCake@lemm.ee 4 points 1 year ago* (last edited 1 year ago)

It's useless for answering a question that wasn't asked, sure. But I didn't pretend to answer that question. What it is useful for is answering the topic question. You know, the whole damn point?

How much of a factor off do you think the estimate is? You think they need three drives of redundancy each? Ten? Chances are they're paying half (or less) for storage drives compared to retail pricing. The estimate of what they could get with $100m was also multiple exabytes, a mind-boggling amount of storage. I wouldn't be surprised if they're using up on the order of 1 EB/year in needed storage. There's also a lot more room in their budget than 0.34%.

The point is to get a quick and simple estimate to show that there really will not be a problem in Google acquiring sufficient storage. If you want a very accurate estimate of their costs you'll need data that we do not have. I was not aiming to get a highly accurate estimate of their costs. I made this clear, right from the beginning.
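
A quick Python sketch of how the estimate moves under those what-ifs. It assumes the $15/TB retail figure and $100m budget from the original comment; the 50% discount and the 3x and 10x replication factors are the guesses floated here, not known Google numbers, and the function name is made up.

```python
BUDGET_USD = 100_000_000
RETAIL_PRICE_PER_TB_USD = 15   # $300 per 20TB drive


def usable_exabytes(budget_usd, price_per_tb_usd, discount, replication):
    """Usable SI exabytes after a volume discount and N-way replication."""
    raw_tb = budget_usd / (price_per_tb_usd * (1 - discount))
    return raw_tb / replication / 10**6   # 1 EB = 10^6 TB


for replication in (1, 3, 10):
    eb = usable_exabytes(BUDGET_USD, RETAIL_PRICE_PER_TB_USD, 0.5, replication)
    print(f"50% discount, {replication}x replication: {eb:.2f} EB usable")
```

Even the 10x case leaves more than 1 EB of usable space for the same $100m.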

> If each video was on a single hard drive the site would not be able to function as even the fastest multi actuator hard drive can only do 524 MB/s in a perfect vacuum.

The most popular videos are all going to be kept in RAM, they don't read them all off disk with every single view request. If you wanted a comment going over the finer details of server architecture, you shouldn't have looked at the one saying it was doing back of the envelope math on storage costs only, eh?