Posted by val
on Wednesday, June 04
Instead of relying on capistrano to clean up old releases, we have a cron job that keeps only the releases from the last two days (but always at least the three latest).
#!/usr/bin/env ruby
require 'fileutils'

KEEP_RELEASES = 3
KEEP_DAYS = 2
EXCLUDE_APPS = %W(uploadr)

# Release directories are named by timestamp (YYYYMMDDHHMMSS), so the cutoff
# can be compared with the directory name as an integer.
cut_time = (Time.now.utc - KEEP_DAYS*24*60*60).strftime("%Y%m%d%H%M%S").to_i

Dir['/u/apps/*'].each do |app|
  next if EXCLUDE_APPS.include?(File.basename(app))
  dirs = Dir["#{ app }/releases/*"]
  # Releases newer than the cutoff
  fresh = dirs.select { |dir| (dir.split('/').last).to_i > cut_time }
  # The most recent releases, kept regardless of age
  latest = dirs.sort.last(KEEP_RELEASES)
  (dirs - fresh - latest).each do |dir|
    FileUtils.rm_rf dir
  end
end
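The selection rule can be tried out without deleting anything. The following is a minimal sketch, not part of the cron job itself: `stale_releases` is a hypothetical helper, and the `demo` application, its paths and the reference date are made up for illustration.

```ruby
# Hypothetical helper: returns the release directories that the cleanup
# rule would delete, given a list of paths and a reference time.
def stale_releases(dirs, now, keep_releases: 3, keep_days: 2)
  cut_time = (now.utc - keep_days * 24 * 60 * 60).strftime('%Y%m%d%H%M%S').to_i
  fresh  = dirs.select { |dir| File.basename(dir).to_i > cut_time }
  latest = dirs.sort.last(keep_releases)
  dirs - fresh - latest
end

# Made-up sample data: five releases of an imaginary "demo" app.
now = Time.utc(2008, 6, 4, 12, 0, 0)
dirs = %w[
  /u/apps/demo/releases/20080520101500
  /u/apps/demo/releases/20080528093000
  /u/apps/demo/releases/20080601120000
  /u/apps/demo/releases/20080603180000
  /u/apps/demo/releases/20080604090000
]

# Prints the two May releases: they are older than two days and are not
# among the three latest.
puts stale_releases(dirs, now)
```

Keeping the rule in a function that takes the time as an argument makes it easy to check edge cases, such as an application with fewer than three releases, before pointing the script at real directories.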
Posted by val
on Sunday, August 19
The challenge with hosting multiple Rails-based Facebook applications is that the number of users grows quickly. To address this problem we are using EC2 nodes that we can expand/shrink as the demand changes. The price/performance ratio isn’t quite what we first expected, so we are moving toward having a few dedicated boxes instead. Another problem is that we add at least a couple of applications a week. On each box that hosts them, we need to reconfigure monit, haproxy, nginx, logrotate and nagios.
To mitigate both issues on dedicated boxes, we decided to keep a central configuration definition in svn, with individual box configurations keyed on the local host name. When run on a box, a ruby script regenerates all of the aforementioned configuration files from ERB-processed templates and bounces the services. A sample config looks like:
dedicated-1:
  description: "The dedicated box #1"
  ip: 64.233.167.99
  failover: dedicated-2
  apps:
    bookshelf:
      port: 5000
      instances: 20
      response: Book
    ljconnect:
      port: 6000
      instances: 7
      virtual: ljconnect.hungrymachine.com
      response: Journal
That definition would generate a monit config with 20 instances of the bookshelf application and 7 instances of the ljconnect application, plus all the other configurations (including nagios health checks expecting the response value). This is all possible because we adopted a fixed application deployment file structure and port numbering conventions (via offsets) for all servers.
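The generation step could be sketched roughly as follows. This is not the actual script: the template text, the pid file path and the rule that instance `i` listens on `port + i` are all assumptions made for illustration, and the config is trimmed to a single app with three instances.

```ruby
require 'yaml'
require 'erb'

# Inline stand-in for the definition kept in svn (trimmed to one app).
CONFIG = YAML.load(<<~YML)
  dedicated-1:
    apps:
      bookshelf:
        port: 5000
        instances: 3
YML

# Hypothetical monit template; the pid file layout is an assumption.
MONIT_TEMPLATE = ERB.new(<<~TPL, trim_mode: '-')
  <%- apps.each do |name, app| -%>
  <%- app['instances'].times do |i| -%>
  check process <%= name %>_<%= app['port'] + i %> with pidfile /u/apps/<%= name %>/shared/pids/mongrel.<%= app['port'] + i %>.pid
  <%- end -%>
  <%- end -%>
TPL

# Renders one "check process" line per instance for the given box,
# assuming instance i listens on base port + i.
def render_monit(config, host)
  apps = config.fetch(host).fetch('apps')
  MONIT_TEMPLATE.result(binding)
end

puts render_monit(CONFIG, 'dedicated-1')
```

On a real box the host key would presumably come from something like `Socket.gethostname` rather than a literal, and the same data would feed separate templates for haproxy, nginx, logrotate and nagios.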
Posted by val
on Thursday, August 16
We found that sometimes monit fails to restart all mongrel instances after deployment and some of them end up running with the pid file gone. Since there is no pid file, monit believes the instance is not running, so it tries to start a new one on the same port and, of course, fails. This leaves stale mongrel instances running old code. We’re investigating a long-term solution, but in the meantime we have wrapped the mongrel_rails start script with a replacement that finds and kills the stale mongrel instances before starting a new one.
#!/usr/bin/env ruby
# Wrapper around mongrel_rails: kills stale mongrels before starting a new one.
class MongrelController
  def self.run_mongrel(args)
    pid = extract_pid(args)
    kill_stale_process(pid) if pid
    system "/bin/mongrel_rails #{ args.join(' ') }"
  end

  # Returns the pid file path given after -P, but only for the 'start' command.
  def self.extract_pid(args)
    (args[0] == 'start') && (i = args.index('-P')) && args[i + 1]
  end

  def self.kill_stale_process(pid)
    mongrel_processes(pid).each { |p| process_running?(p) && Process.kill(9, p) }
  end

  # Finds the process ids of mongrels that were started with the given pid file.
  def self.mongrel_processes(pid)
    `ps axww -o 'pid command'`.split(/\n/).inject([]) do |mongrels, process|
      mongrels << process[/^\s*(\d+)/, 1].to_i if process.match(%r{/bin/mongrel_rails\s.*\s-P\s#{ Regexp.escape(pid) }\b})
      mongrels
    end
  end

  # `ps -p` prints a header plus one line when the process exists.
  def self.process_running?(pid)
    pid && (`ps -p #{ pid }`.split(/\n/).size == 2)
  end
end

MongrelController.run_mongrel(ARGV)
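Before trusting a wrapper like this with `kill -9`, it can help to check the matching against canned `ps` output. The sketch below is a hypothetical helper, not part of the wrapper: `stale_mongrel_pids` takes the `ps` output as a string, escapes the pid file path before interpolating it into the pattern, and the sample process list is made up.

```ruby
# Hypothetical helper mirroring the matching in mongrel_processes, but
# taking the ps output as a string so it can be tried on sample data.
def stale_mongrel_pids(ps_output, pid_file)
  pattern = %r{/bin/mongrel_rails\s.*\s-P\s#{Regexp.escape(pid_file)}\b}
  ps_output.lines
           .select { |line| line.match?(pattern) }
           .map { |line| line[/^\s*(\d+)/, 1].to_i }
end

# Made-up process list in the shape of `ps axww -o 'pid command'`.
SAMPLE_PS = <<~PS
    PID COMMAND
   4211 ruby /bin/mongrel_rails start -d -e production -p 5000 -P log/mongrel.5000.pid
   4215 ruby /bin/mongrel_rails start -d -e production -p 5001 -P log/mongrel.5001.pid
   9001 /usr/sbin/nginx -c /etc/nginx/nginx.conf
PS

# Only the mongrel started with this exact pid file is selected.
p stale_mongrel_pids(SAMPLE_PS, 'log/mongrel.5000.pid')
```

Because the pattern is keyed on the pid file argument rather than on the port, a mongrel whose pid file has disappeared from disk is still found as long as it was started with the same `-P` value.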
Posted by val
on Tuesday, August 14
If you use nagios to monitor your rails instances, you might want to receive notifications not only by email or SMS but also on AIM when you are online. The script (libexec/aim_notifier.rb) uses the Net::TOC gem to send out notifications:
#!/usr/bin/env ruby
require 'rubygems'
require 'net/toc'
user = 'your_bot_name'
password = 'bot_password'
msg = ARGV[0].to_s.gsub('\n', "\n")
client = Net::TOC.new(user, password)
client.connect
sleep 3
buddies = []
client.buddy_list.each_group { |g, b| buddies = b if g == 'Friends' }
buddies.each do |b|
  b.send_im(msg) if b.available?
end
sleep 3
client.disconnect
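The `gsub('\n', "\n")` call in the script is easy to misread. As far as I can tell, the message reaches the script with literal backslash-n sequences rather than real line breaks, and the call converts them. A small illustration with a made-up message:

```ruby
# Made-up message containing literal backslash-n sequences
# (single quotes keep the backslash and the n as two characters).
raw = 'Service: HTTP\nState: CRITICAL'

# '\n' in single quotes is backslash + n; "\n" in double quotes is a newline.
msg = raw.gsub('\n', "\n")

puts msg
```

After the substitution the message is sent as two lines instead of one line with a visible `\n` in the middle.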
You need to add every account you want notified to the bot’s friends (either by logging in to AIM with the bot account or by using Net::TOC’s ability to add friends).
The last piece is to add a new notifier in etc/objects/commands.cfg as:
define command{
    command_name notify-service-by-aim
    command_line $USER1$/aim_notifier.rb $ARG1$ $ARG2$ "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$"
}
and to append it to the list of notifiers defined for a contact template in etc/objects/commands.cfg:
service_notification_commands notify-service-by-email,notify-service-by-aim
Repeat the configuration if you want to use the AIM notification for hosts as well.