this post was submitted on 30 Apr 2024
Casual Conversation
My day consisted of users complaining about speed. Everything on analytics looked fine, so I checked some random high-demand applications, and they indicated that they were waiting on network IO pretty consistently. So I go to check the file server where the data for those apps is centrally located, with no redundancy, and I managed to... turn off the network interface on the file server.
🤦‍♂️
Don't ask me how, I'm still confused about it myself. My manager calls me not 5 minutes later after he got a call from the client. I'm sitting there absolutely shitting myself trying to figure out how to turn the network card back on, and every method I'm trying to use to connect to the system is failing.
Even my manager had some serious trouble trying to figure it out.
Took about 45 minutes until the system was back on the network. Right at the end of the day, on one of the busiest days of the year for that specific customer.
I feel really stupid.
Ooof. I've accidentally done that (though on a less important system), but thankfully was able to get in via iLO console to reset it.
I... didn't have iLO or any IPMI access to the system. It was a cloud VM on Azure. Looking at their management tools, they're all IP-based, which means you need a valid IP connection to the system to control it.
The setting that I think did it was in the advanced network interface settings. I'm not sure which one, but long story short, whatever it was messed up the network interface and basically everything was useless. My manager was doing something and I pushed an Azure CLI network interface reset; whatever he was doing, and/or what I did, eventually brought it back online. Luckily, throughout all this, the settings I changed were reverted and the system was power cycled, so all evidence was destroyed... unless it logged something to the system logs. IDK.
The programs are very file-heavy and most of that weight is on IO, not throughput, so I was tinkering with the buffers, but I can't be sure that the buffer settings were to blame. I'm sure I clicked on more than just buffer settings while I was in the network adapter settings.
I'm still pretty upset about it. Azure really doesn't make it easy to connect to the stupid console. Then again, Hyper-V is mostly the same way, so...
I prefer running another hypervisor technology, but we're pretty heavily invested in Azure at my workplace, so I'm not sure that even suggesting it would get any traction.
The part that annoys me is that everything is built glass cannon style. Get big, fast systems in Azure, throw everything into those few systems and run it. It only takes one person running Prime95 for fun and profit(?) to wreck the ability for anyone to do meaningful work.
Glass cannon. Azure isn't a golden ticket that can handle all the workloads with a minimal number of virtual machines, but we're committed to a server-less architecture, even when the client would be better off with local, or colo, given their relative size. Like a quarter rack in a nearby colo, and a couple of hyperconverged systems and they would be very well served. It wouldn't be very different from what they're doing right now, either functionally or physically (or even cost-wise), but they would get a lot more out of it.
Suggesting it would be a lot like talking to a wall.