Servers down (for now - work being done on fixing it)

The latest announcements from your Hala Team

Moderators: Arkon, Top Team

demonlady
Knight of the Holy Church of Annoyance
Posts: 111
Joined: Sat Jul 03, 2004 3:32 am

Servers down (for now - work being done on fixing it)

Post by demonlady »

*More or less copied from Themicles' talking on Discord*

Near as Themicles can tell, issues with Odin (the VM host for Tairis and other stuff he does) appear to have caused Phoenix (bare-hardware Linux, no VMs) to have a major segfault and lock up the system.

Attempting to get Phoenix back up showed that it is dead.
Even if he can get it to boot again, he is looking at multiple bulging capacitors on its motherboard.
Currently he is working his ass off to get Odin back up so he can then get the database running on Odin.

Also working on what to do about a replacement.
We were given a dual Xeon system to move everything to last year, but we cannot run it without doing custom cooling work.
The thing sounds like a jet engine taking off 24/7 as-is.
Passive heatsinks on dual high TDP Xeons, cooled by banks of 40mm fans doubled up.... yeah....

Plan is: Remove hard drive from Phoenix.
Install it with Odin in Odin's new case. Get Odin running.
Get Odin to run Phoenix's OS as a VM.
Call it good for now.

Apologies, we have been trying to catch up to our budget to get all this stuff done before failures, and as usual, something blows up before we can.

(In other news, his wife is redoing the bathroom floor today because a leaking water supply line ruined it.
So yeah, fix one computer, deal with a failed computer, and deal with a ruined floor all in one weekend! Fun stuff!)

All the lovely mythological names are those of servers of course.
"If I had been a cat, curiosity would have killed me a long time ago."
Akai
Wearer of the Holy Pants
Posts: 1710
Joined: Fri Nov 03, 2006 9:09 pm
Location: Martinsburg, WV

Re: Servers down (for now - work being done on fixing it)

Post by Akai »

Ack! Thank you for the update DL! And many thanks to Themi for everything!
demonlady
Knight of the Holy Church of Annoyance
Posts: 111
Joined: Sat Jul 03, 2004 3:32 am

Re: Servers down (for now - work being done on fixing it)

Post by demonlady »

Fixing this will take several hours.

If anyone feels wealthy enough to financially support Themicles and his wife during these trying times, please do.
viewtopic.php?f=9&t=10138

I wish I could myself and would if I could.
demonlady
Knight of the Holy Church of Annoyance
Posts: 111
Joined: Sat Jul 03, 2004 3:32 am

Re: Servers down (for now - work being done on fixing it)

Post by demonlady »

Themicles wrote: Here's what I know so far: Odin's radiator fan failed, seized, and caused Odin to go into thermal shutdown. Odin hosts Tairis, and Tairis connects to the DB. In troubleshooting, Odin was turned on three times, each time going into thermal shutdown within minutes. My best guess as to why this affected Phoenix is that some horribly malformed packets caused the segfault I saw when I plugged a monitor into Phoenix. But really, this was just the proverbial last straw. Phoenix would not reboot. So I pulled it from the rack and started investigating. Once I looked at the board, that feeling of doom set in. There were bulged capacitors everywhere.

Phoenix hosts the database, so that is why the servers are down. Plan is to run the database on a VM on Odin (preferably boot Phoenix's OS as a VM if possible), and then work on the plan to have a case built for the dual Xeon system so that we can properly cool it without listening to the worst noise imaginable.
I cannot give a time estimate on the return of the game servers. But I am aiming to have it done tonight.
demonlady
Knight of the Holy Church of Annoyance
Posts: 111
Joined: Sat Jul 03, 2004 3:32 am

Re: Servers down (for now - work being done on fixing it)

Post by demonlady »

Copied from Discord, posted over the course of about a day and a half.
Themicles wrote: I have Odin up and running now, and temps are looking great at a mere 45 C! Still working on getting the databases up. There was a lot of physical moving of stuff about that needed doing to accomplish a lot of this. Odin is in a new case, since the ancient (circa 2005?) case it was in was not flexible enough for the cooler that was installed. We ran out to MicroCenter today and bought the new case knowing it would be the quickest fix for Odin's heat issue. We were thankfully right. Could've been worse, like a dead pump.
Themicles wrote: I was hoping to be done by now, and I am very close. The trouble is updating to a newer MySQL server version along the way. I'd have installed the same version that was on Phoenix if I had a way to check what version that was, but I can't figure that out.
You may see the servers pop on if you happen to check in the next few minutes. Please do not connect. I am booting Poseidon and getting ready to get things back up and running.
Themicles wrote: Hala and Ysgard should be up and running and appear to be connected to their database! Please @ me if you encounter any problems.
Now to get Tairis back up
Themicles wrote: Phoenix is now a VM and much more portable. It should be far easier to transition everything to the dual Xeon server when that is running... I'll be spending the rest of the week redoing a decade's worth of little hacks I had on Phoenix to get certain networking tricks working for some things.

And that should be Tairis back up too.
Just in time for lunch
*puts up an "Out to Lunch" sign*
Akai
Wearer of the Holy Pants
Posts: 1710
Joined: Fri Nov 03, 2006 9:09 pm
Location: Martinsburg, WV

Re: Servers down (for now - work being done on fixing it)

Post by Akai »

Hurray for Themi!