Archives

Converting pbxadmin to Database Authentication

Apache Configuration Changes

Access the container via the command line, navigate to /etc/httpd/conf and edit the httpd.conf file. Search for <Directory "/var/www/webgui/pbxadmin/"> and either comment out or delete all lines within the directive except for the following:

Options Indexes FollowSymLinks
order allow,deny
allow from all

Save and exit the file.


File Changes Within the pbxadmin Directory

/var/www/webgui/pbxadmin/index.html

Search for

src="/pbxadmin/phpsysinfo/"

Change it to

src="/pbxadmin/admin/"

/var/www/webgui/pbxadmin/ringfree.php

Delete the following section:

<?php
 if (!isset($_SERVER['PHP_AUTH_USER'])) {
    header('WWW-Authenticate: Basic realm="My Realm"');
    header('HTTP/1.0 401 Unauthorized');
    echo 'ERROR CCT-001: Authentication failed.';
    exit;
 } else {
    $orchiduser=$_SERVER['PHP_AUTH_USER'];
    $orchidpass=$_SERVER['PHP_AUTH_PW'];
}
?>

Then remove the only instance of the following string:

<a href="phpsysinfo/" target="body"><img border="0" src="images/classiccitytelco_system.png"></a>

/var/www/webgui/pbxadmin/phpsysinfo

Completely remove the directory.


/var/www/webgui/pbxadmin/editor

Completely remove the directory.


Update the Authentication Type in the Asterisk Database

Access the Asterisk database for the PBX in question and run the following:

UPDATE freepbx_settings SET value='database' WHERE keyword='AUTHTYPE';

Update the Authentication Type in the Papal Database

Access the Papal database and run the following:

UPDATE pbx SET authtype='database' WHERE ctid=<<<CTID>>>;

Note that “<<<CTID>>>” should be replaced with the actual container ID for the PBX.
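As a hedged illustration of the substitution described above, the sketch below builds the statement from a container ID and refuses anything that is not made up of digits only. The function name `papal_authtype_sql` is hypothetical and not part of any Ringfree tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the Papal UPDATE statement for one container ID.
# Rejects anything that is not digits only, so a typo or stray character
# never reaches the database.
papal_authtype_sql() {
    ctid="$1"
    case "$ctid" in
        ''|*[!0-9]*)
            echo "usage: papal_authtype_sql <numeric CTID>" >&2
            return 1
            ;;
    esac
    printf "UPDATE pbx SET authtype='database' WHERE ctid=%s;\n" "$ctid"
}
```

Printing the statement first, for example with `papal_authtype_sql 1234`, lets you review it before pasting it into the database client.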


Restart Apache in the Container

Back in the PBX container run the following:

service httpd restart

Everything should be good to go at this point.

New HA Filesystem

Overview of new storage system

 
The goal of GlusterFS in the Ringfree infrastructure is to provide geo-redundant PBX service.  For this to occur, portions of the PBX container filesystems will be replicated across all datacenters, while other portions (such as log files) need to stay site specific.
A previous project still in development, RORW, provides Ringfree with a way to have granular control over the mounting process for an OpenVZ container using disparate sources for the mount data.  Use of GlusterFS will make heavy use of these tools.
Old PBX Container storage was stored solely in a dc-reachable NFS mount, targeting /rf-images/pbx for the container file system, /rf-images/pbxroot for a container mount point, and /rf-images/vzconf for container configurations.  They were mounted on host nodes as /vz/nfsprivate, /vz/nfsroot and /etc/vz/conf respectively.
(On NFS server) /rf-images/pbx     -> (On PBX Node) /vz/nfsprivate
(On NFS server) /rf-images/pbxroot -> (On PBX Node) /vz/nfsroot
(On NFS server) /rf-images/vzconf  -> (On PBX Node) /etc/vz/conf
The new clustered storage differs drastically, and the RORW mounting hooks will handle building the container mount before the container starts.
(On GlusterFS server, Fuse) /rf-images/nimbus     -> (On PBX Node) /vz/nimbus
(On GlusterFS server,  NFS) /rf-images/dc-atl     -> (On PBX Node) /vz/dc-atl
(On GlusterFS server,  NFS) /rf-images/dc-dal     -> (On PBX Node) /vz/dc-dal
(On GlusterFS server,  NFS) /rf-images/containers -> (On PBX Node) /vz/containers
 
Configurations for containers will need to be altered in transition from old NFS storage to new GlusterFS storage.  The mounting process is outlined here:
 
1) Private filesystem for container is found in /vz/containers/$CTID and mounted on /vz/root
2) Site specific filesystem is detected from node environment variable and mounted on top in /vz/root.
 
If DAL detected
    Mount /vz/dc-dal/$ctid/ /vz/root/$ctid
If ATL detected
    Mount /vz/dc-atl/$ctid/ /vz/root/$ctid
 
3) Nimbus storage is then mounted from a GlusterFS fuse mount (all others are GlusterFS-NFS for increased speed)
 
Mount /vz/nimbus/$ctid /vz/root/$ctid
 
4) At this point, a complete filesystem is mounted and the container can be started.  File writes are replicated regardless of NFS or Fuse mount, but a primary GlusterFS-NFS mount failure will require all nodes targeting it as storage to unmount, remount to alt target and restart containers.
 
Note:  $ctid.mount and $ctid.umount scripts are added during the transitioning phase to enact the additional mount options for building a complete container filesystem from the disparate sources.
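The mount order above can be sketched as a dry run that only prints each stage as source and target. This is a minimal sketch under stated assumptions: the function name `plan_container_mounts` is hypothetical, the container root is assumed to be /vz/root/$ctid for every stage, and the real mount types and options live in the $ctid.mount scripts, which are not shown here.

```shell
#!/bin/sh
# Dry-run sketch: print the three mount stages for one container, in order.
# Assumption (not confirmed by the source): every stage targets
# /vz/root/$ctid. Nothing is actually mounted.
plan_container_mounts() {
    ctid="$1"
    site="$2"    # "atl" or "dal", as detected from the node environment
    case "$site" in
        atl|dal) ;;
        *) echo "unknown site: $site" >&2; return 1 ;;
    esac
    root="/vz/root/$ctid"
    # Stage 1: private container filesystem
    echo "stage 1: /vz/containers/$ctid -> $root"
    # Stage 2: site-specific filesystem layered on top
    echo "stage 2: /vz/dc-$site/$ctid -> $root"
    # Stage 3: Nimbus storage (GlusterFS Fuse mount on the node)
    echo "stage 3: /vz/nimbus/$ctid -> $root"
}
```

For example, `plan_container_mounts 101 dal` lists the Dallas variant of the three stages for container 101.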
 

Server Configuration Notes

 
Because in Atlanta we are migrating away from vanilla NFS to a mixture of GlusterFS NFS and GlusterFS Fuse, we run into a small problem where we cannot live transition the primary NFS mount while setting up GlusterFS in production.  This is fine because we will start the GlusterFS initialization on the secondary NFS mount and when a successful container start is made from it mounted on one of the nodes, we can stop vanilla NFS on primary and start the cluster filesystem import from existing storage.
 
(SERVER) Installation
# yum install glusterfs-server
# chkconfig --level 235 glusterd on
# service glusterd start
(SERVER) Create Bricks for volume
# mkdir -p /rf-images/nimbus
# mkdir -p /rf-images/dc-atl
# mkdir -p /rf-images/dc-dal
# mkdir -p /rf-images/containers 
Note: there should NEVER be any direct writes to these folders as it will corrupt the glusterfs volume.
(SERVER) Create local Volume mounts
# mkdir -p /rf-images/nimbus-localmount
# mkdir -p /rf-images/dc-atl-localmount
# mkdir -p /rf-images/dc-dal-localmount
# mkdir -p /rf-images/containers-localmount
(SERVER) Make Brick nodes peer with each other
# gluster peer probe $otherbricknodes
(SERVER) Create Volume from bricks
# gluster volume create nimbus replica $noofpeers transport tcp $peer1hostname:/rf-images/nimbus $peer2hostname:/rf-images/nimbus force
 
# gluster volume create dc-atl replica $noofpeers transport tcp $peer1hostname:/rf-images/dc-atl $peer2hostname:/rf-images/dc-atl force
 
# gluster volume create dc-dal replica $noofpeers transport tcp $peer1hostname:/rf-images/dc-dal $peer2hostname:/rf-images/dc-dal force
 
# gluster volume create containers replica $noofpeers transport tcp $peer1hostname:/rf-images/containers $peer2hostname:/rf-images/containers force
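Since the four create commands differ only in the volume name, they can be generated in a loop. The sketch below prints the commands rather than running them; the function name `gluster_create_cmds` and the hostnames in the example are placeholders, not values from the Ringfree environment.

```shell
#!/bin/sh
# Dry-run sketch: print the "gluster volume create" command for each volume
# instead of running it. Arguments: replica count, then one hostname per peer.
gluster_create_cmds() {
    replicas="$1"
    shift
    for vol in nimbus dc-atl dc-dal containers; do
        bricks=""
        for peer in "$@"; do
            # Each peer contributes one brick at the same path
            bricks="$bricks $peer:/rf-images/$vol"
        done
        echo "gluster volume create $vol replica $replicas transport tcp$bricks force"
    done
}
```

Running `gluster_create_cmds 2 gfs1 gfs2` prints four commands that can be reviewed and then pasted. The notes above do not list a start step; in stock GlusterFS a newly created volume has to be started with `gluster volume start <volume>` before clients can mount it.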
 

Client Configuration Notes

 
Note:  Client (node) configuration will be taken care of 99% of the time via the ringfree-cloudstor utility.  However, the commands are listed here for documentation.  This information is also available by running ringfree-cloudstor --listrawcmds
 
(CLIENT) mount GlusterFS volumes via Fuse #TODO
 
# mount -t glusterfs -o backupvolfile-server=$altbrickserver,use-readdirp=no,log-level=WARNING $brickserver:nimbus /rf-images/nimbus-localmount
 
(CLIENT) mount GlusterFS volumes via GlusterFS-NFS. #TODO
# mount -t nfs

Detailed description of new storage:


(Stage 1 Mount) CONTAINER STORAGE
 
TYPE     STORAGE LOC                NODE LOC             CONTAINER LOC
NFS      /rf-images/containers-atl  /vz/containers       /
NFS      /rf-images/containers-dal  /vz/containers       /
 
This data is present only in a single datacenter and is not required to sync site to site.  This provides a single target to eventually move to RORW storage in a future update. When provisioning a container for high availability, the container must be provisioned in each datacenter.
 
/rf-images/containers-atl/CTID/
                              /                # Root Container FS

 


 
(Stage 2 Mount) DC STORAGE
 
TYPE      STORAGE LOC         NODE LOC    CONTAINER LOC
NFS       /rf-images/dc-atl   /vz/dc-atl  /ringfree-mnt/dc
NFS       /rf-images/dc-dal   /vz/dc-dal  /ringfree-mnt/dc
 
This data is present only in a single datacenter, is not required to sync, and has site specific storage.  When provisioning a container for high availability, the container must be provisioned in each datacenter.  This storage, while similar to generic container storage and offering the same features and speed, is segregated in order to provide HA features.
 
/rf-images/dc-atl/CTID/
                      /dev                     # POSIX Device FS
                      /lib-udev-devices        # UDEV
                      /lib-udev-devices-pseudo # REQUIRED FOR DAHDI
                      /proc                    # LINUX PROC FS
                      /srv                     # LINUX srv dir
                      /sys                     # LINUX sys dir
                      /tmp                     # tmp 777
                      /var-lib-php-session     # PHP session authentications
                      /var-local               # used by apps
                      /var-lock                # used by apps
                      /var-log                 # Application Logs
                      /var-run                 # Service run files

 
(Stage 3 Mount) NIMBUS 
TYPE               STORAGE LOC        NODE LOC    CONTAINER LOC
GlusterFS (Fuse)   /rf-images/nimbus  /vz/nimbus  /ringfree-mnt/nimbus
 
This data is ALWAYS in sync across data centers.  Nimbus storage only needs to be provisioned once, regardless of high availability or not.
 
/rf-images/nimbus/CTID/
                      /etc                    # PBX Configuration
                      /opt                    # For Custom Apps
                      /var-lib-asterisk       # ASTDB, sounds, framework scripts
                      /var-spool-asterisk     # Recordings, Voicemail
                      /var-www                # framework web modules
 


Updating Timezone for PBX

****This has now been scripted****

From the container, run ringfree-tz-change {eastern|central|mountain|pacific} to alter all necessary files and restart services.

********************************

Very occasionally, the timezone for a PBX will be incorrect. By default, the template uses Eastern time (GMT -5). If the customer is using any time conditions and is not in that timezone, then the timezone will need to be updated so that the PBX knows their local time and can trigger at the right time per those time conditions. All of these commands will need to be run in the container in question.

Step 1: Set up a symbolic link from /etc/localtime to the correct timezone

The command looks something like this:

ln -sf /usr/share/zoneinfo/Country/City /etc/localtime

For instance, if you wanted to set a PBX to Pacific time, you would use the following command:

ln -sf /usr/share/zoneinfo/America/Los_Angeles /etc/localtime

If you are not sure which city to use, you can use ls or ll in the /usr/share/zoneinfo/America directory to find all the possible options.

Step 2: Update the php.ini file

The php.ini file lives in the /etc directory. So open /etc/php.ini with your preferred cli text editor, and find the timezone section. You can search for ‘timezone’ or ‘date.timezone’ as both will take you to the appropriate section. You will then need to update the date.timezone setting to reflect the new timezone. For instance, if you were updating a PBX to Pacific time, you would change

date.timezone = US/Eastern

to

date.timezone = US/Pacific

Remember to save your changes, and close the file.

Step 3: Restart httpd

Once you’ve updated these files, run the following command:

service httpd restart

Step 4: Confirm new timezone

From the cli, run the date command and confirm that the timezone in the output is correct. Then check the Applications -> Time Groups section on the PBX to confirm that the time indicated is in the right timezone.
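The manual steps above use two different names for the same timezone: a zoneinfo path for /etc/localtime and a US/* value for php.ini. The sketch below maps a short region name to both. It is not the real ringfree-tz-change script, whose contents are not shown here; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not the real ringfree-tz-change script): map a short
# region name to the zoneinfo file used in Step 1 and to the date.timezone
# value used in Step 2.
tz_zoneinfo_for() {
    case "$1" in
        eastern)  echo "America/New_York" ;;
        central)  echo "America/Chicago" ;;
        mountain) echo "America/Denver" ;;
        pacific)  echo "America/Los_Angeles" ;;
        *) echo "unknown region: $1" >&2; return 1 ;;
    esac
}

tz_php_for() {
    case "$1" in
        eastern)  echo "US/Eastern" ;;
        central)  echo "US/Central" ;;
        mountain) echo "US/Mountain" ;;
        pacific)  echo "US/Pacific" ;;
        *) echo "unknown region: $1" >&2; return 1 ;;
    esac
}

# Example (run inside the container as root):
#   zone="$(tz_zoneinfo_for pacific)" &&
#       [ -f "/usr/share/zoneinfo/$zone" ] &&
#       ln -sf "/usr/share/zoneinfo/$zone" /etc/localtime
```

Checking that the zoneinfo file exists before linking avoids leaving /etc/localtime pointing at a missing file.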

Troubleshooting and Adding Missing SBC Routes

As more and more phone numbers are allocated within the North American Numbering Plan, or NANP, more routes must be added to Ringfree’s Session Border Controllers to accommodate the new numbering prefixes. An example of a prefix would be 1828555; once added as a route in the SBC, it serves as the means of routing calls to destinations that begin with that prefix. Formerly, calls to missing routes would be caught by a catch-all route and associated with the US48+Ext calling plan. However, as of November 2016, calls to US48+Ext numbers require routes as well.

Identifying a Missing Route

From time to time you will have a customer who can not dial a particular number. To check and see if this is a missing SBC route, the simplest method is to dial the number from an extension on the Ringfree phone system. Given that the Ringfree phone system has extended NANP and international dialing enabled, successfully dialing the number is a solid indicator that the issue may very well be a missing SBC route for that number’s prefix.

To confirm the missing SBC route log into the SBC, click on the “Routes” tab across the top, and using the provided search/filtering tools, enter the following:

  • Route Table: All
  • Search For: The first 7 digits of the number (1NPANXX format)
  • In Column: DigitMatch

If there are no results, then the route is indeed missing. As a sanity check, you can always check a known route to make sure your search data has been properly input.
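To get the search value from a number as the customer reports it, the 1NPANXX prefix can be derived mechanically. The helper below is a hypothetical convenience sketch, not an SBC feature; it assumes the input is a 10-digit NANP number or an 11-digit one with a leading 1.

```shell
#!/bin/sh
# Hypothetical helper: reduce a dialed NANP number to its 1NPANXX prefix.
# Accepts 10 digits (NPANXXXXXX) or 11 digits with a leading 1; punctuation
# and spaces are ignored.
nanp_prefix() {
    digits="$(printf '%s' "$1" | tr -cd '0-9')"
    case "${#digits}" in
        10) digits="1$digits" ;;
        11) ;;
        *) echo "expected a 10- or 11-digit NANP number" >&2; return 1 ;;
    esac
    case "$digits" in
        1*) printf '%s\n' "$digits" | cut -c1-7 ;;
        *) echo "11-digit numbers must start with 1" >&2; return 1 ;;
    esac
}
```

For instance, `nanp_prefix "(828) 555-0123"` yields the value to enter in the Search For field.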

Adding a Missing Route

First identify whether the route is for the US48 or the US48+Ext calling plan. To do this, search for another route with the same area code and note the value of the Tbl column on the far left. A value of 65533 indicates US48 and a value of 65534 indicates US48+Ext.

Once you’ve identified the calling plan, click the Add button located toward the top of the Routes page and enter the following (default values indicated):

  • Alias: The route in 1NPANXX format (e.g. “1828692”)
  • Digit Match: Select “Digits” and enter the route in 1NPANXX format as above
  • Extension: 1 (default value)
  • Route Table: Select either US48:65533 or US48+Ext:65534
  • Time of day routing: weekly (default value)
  • Start time of day: Monday 00:00:01 (default value)
  • Stop time of day: Sunday 24:00:00 (default value)
  • Minimum length match: 6
  • Maximum length match: 40 (default value)
  • Route Group: None (default value)
  • Policy: top_down (default value)

At this point you’ll need to add the trunks. For US48 routes, set the following:

  1. Level3:99996
  2. Thinq:99998
  3. VoipInnovations:99997

Leave the Load Percentage blank and leave Continuous Routing set to yes.

For US48+Ext routes, simply set the first (and optionally more) trunks to Thinq:99998. Once completed, click the Submit button and the new route will be live.

haproxy pem file creation

The .pem cert bundle must be built in a specific order to be properly read by haproxy.  After obtaining a new cert package from Comodo, the file names will have STAR_ prefixes.  First remove those.

Next, combine the following files into a new one called main_ringfree_biz.pem:

# cd /etc/ringfree/certs/
# cat ringfree_biz.key > main_ringfree_biz.pem
# cat ringfree_biz.crt >> main_ringfree_biz.pem
# cat ringfree_biz_bundle.pem >> main_ringfree_biz.pem

Afterwards, reload the haproxy service so it picks up the new bundle:

# service haproxy reload
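The concatenation order (key, certificate, CA bundle) is the part that is easy to get wrong, so it can be wrapped in a small function that also refuses to run when an input file is missing. This is a sketch only; the function name `build_haproxy_pem` is hypothetical, and the file naming simply follows the example above.

```shell
#!/bin/sh
# Sketch: build the haproxy bundle in the required order (key, certificate,
# CA bundle). Arguments: certificate directory and the base name without
# extension. Nothing is written if any input file is missing.
build_haproxy_pem() {
    dir="$1"
    base="$2"
    for f in "$dir/$base.key" "$dir/$base.crt" "$dir/${base}_bundle.pem"; do
        [ -r "$f" ] || { echo "missing: $f" >&2; return 1; }
    done
    # A newly created output file gets owner-only permissions,
    # since it contains the private key.
    ( umask 077 &&
      cat "$dir/$base.key" "$dir/$base.crt" "$dir/${base}_bundle.pem" \
          > "$dir/main_$base.pem" )
}
```

With the files from this section, the call would be `build_haproxy_pem /etc/ringfree/certs ringfree_biz`. Before reloading, `haproxy -c -f /etc/haproxy/haproxy.cfg` checks the configuration without applying it; that config path is the usual default and is an assumption here.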

Shinken Monitoring

Shinken is an open source monitoring application written in Python that is a complete binary replacement for Nagios. Ringfree has an instance running on a DigitalOcean virtual private server that monitors the nodes, the Sansay SBCs, and the VoIPmonitor server. The nodes and voipmon are primarily monitored using a service called the “Nagios Remote Plugin Executor” or NRPE which is installed and running on each server.

NRPE is used over other monitoring methods (such as SNMP) as it allows for more granular control over what is monitored and how it’s being monitored. A Nagios plugin can be written in practically any language to monitor practically anything so this gives us the ability to write one-off plugins and install them on any given server for any given purpose.

The Sansay SBCs are monitored primarily using a Sansay provided Nagios plugin that makes use of the SOAP API client. There is an additional plugin written in-house at Ringfree to monitor the SBC state (active, standby, etc) that also makes use of the SOAP API client.


Ringfree’s Shinken instance can be accessed at http://shinken.ringfree.biz:7767. If you need access credentials or a password reset, please consult with John or Kendall. The server can be accessed via ssh using either shinken.ringfree.biz or via IP address at 159.203.76.166.

On the various servers being monitored, NRPE runs as a service but is NOT configured to start automatically on boot. This is on purpose and makes it easy to spot an incident where a server was rebooted.

In the event of a warning or error, Shinken communicates with PagerDuty by sending an email. This email triggers whichever escalation policy is in place within PagerDuty to alert whoever is on call.


Shinken Configuration

Configuration for the monitoring of Ringfree’s infrastructure can be found on the server in the /etc/shinken/objects directory. Specific configuration for each box being monitored can be found within the hosts subdirectory. If you need to monitor a new box, begin by creating a definition here using an existing file as a template.

Most of the specifics associated with what is being monitored can be found in the commands and services subdirectories, the former containing the individual commands (most of which run check_nrpe), the latter containing definitions for which hosts/templates should use which commands.

The Shinken server contains an installation of Monitoring Plugins which contains the check_nrpe command along with many additional Nagios plugins.

There is nothing especially notable about the configuration as it’s all very elementary Nagios. With any prior Nagios/Shinken experience, you should have no problem navigating everything.


Server Configuration

In addition to NRPE, each of the servers also contains an installation of Monitoring Plugins. In the case of the nodes, Monitoring Plugins has been packaged and is available in the Ringfree repository. In other cases (such as voipmon) it has been downloaded and compiled from source.

The NRPE service runs the various commands using the username nrpe. The specific commands along with their arguments are defined in the /etc/nagios/nrpe.cfg file. The commands are aliased with whatever Shinken will use to call that particular command on the server in question. An example of a command and alias definition is:

command[check_openvz]=/usr/lib64/nagios/plugins/check_openvz

In this definition, command describes that this is, in fact, a command definition, [check_openvz] describes that the command can be run using check_openvz from the Shinken instance, and /usr/lib64/nagios/plugins/check_openvz is the command on the server that will be run when Shinken issues a check_openvz signal to NRPE.
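When checking which plugin a given alias actually runs on a server, the definition format above is simple enough to query with sed. The helper below is a hypothetical sketch, assuming alias names contain only letters, digits, and underscores.

```shell
#!/bin/sh
# Hypothetical helper: print the command line that an NRPE alias maps to.
# Arguments: alias name (e.g. check_openvz) and the path to nrpe.cfg.
# Assumes the alias name contains only letters, digits, and underscores.
nrpe_command_for() {
    alias_name="$1"
    cfg="$2"
    # Match "command[<alias>]=<command line>" and print what follows "="
    result="$(sed -n "s/^command\[$alias_name]=//p" "$cfg")"
    [ -n "$result" ] || { echo "no such alias: $alias_name" >&2; return 1; }
    printf '%s\n' "$result"
}
```

For example, `nrpe_command_for check_openvz /etc/nagios/nrpe.cfg` prints the plugin path that NRPE will execute for that alias.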

Some commands require root privileges in order to run correctly. In order to do this without exposing a gaping security hole, some configuration has been added to the /etc/sudoers file on the affected servers and the commands in nrpe.cfg are prefixed with /usr/bin/sudo. For example on node001, the following can be found within the nrpe.cfg file:

command[check_haproxy]=/usr/bin/sudo /usr/lib64/nagios/plugins/check_haproxy

In /etc/sudoers the following has been added in order to securely allow nrpe to execute the command:

Defaults: nrpe !requiretty
nrpe ALL=(root) NOPASSWD: /usr/lib64/nagios/plugins/check_haproxy ""

The first line indicates that nrpe does not require an active terminal session in order to execute commands. The second line gives explicit access for nrpe to run the check_haproxy command (and no other commands) with root privileges given that there are no arguments passed. Limiting access on a “per command” basis and limiting arguments are necessary to ensure proper security.


Custom Plugins

Several custom plugins have been developed in-house at Ringfree for monitoring various specifics for which there was no preexisting plugin available. Included in these are check_haproxy, check_openvz, and check_voipmon. Most of these are written in Python with very obvious syntax and very obvious goals.

The most notable of the custom plugins is check_sansay_ha which is located on the Shinken server in the /var/lib/shinken/libexec directory. This plugin is used to communicate with the Sansay SBCs and pull down “HA State” information. Any change of state will trigger an alert.

Platform and Deployment for papal.ringfree.biz

papal.ringfree.biz is a complete rewrite of the previous papalmainframe.ringfree.biz PBX management interface. The current instance is deployed on a DigitalOcean virtual private server running Ubuntu 16.04. In this article I’m going to go over the existing Papal platform and then follow with deployment instructions.

Papal Platform

As mentioned, Papal is on a DigitalOcean VPS running Ubuntu 16.04. This is a “long term support” release with security updates guaranteed through at least April of 2021. The server is running Apache 2.4.18, PHP 7.0.4 and MariaDB 10.0.24.

Papal was built to target PHP version 7 so while it may run on some recent versions of PHP 5, it will not optimally do so. It should work with any reasonably recent version of MariaDB, MySQL, or Percona Server. For the sake of guaranteeing maximum compatibility across multiple data centers, please deploy additional instances using MariaDB 10.0 unless directed otherwise.

SSL for papal.ringfree.biz is provided by Let’s Encrypt with automatic renewal checking enabled once weekly. Papal requires SSL in order to function normally. While configuring the Let’s Encrypt client on new Papal instances, please select the “Secure” option to push all traffic through SSL.

Deployment

Install Packages

On a fresh server, begin by installing Apache, PHP7, MariaDB, and a few other necessary packages to complete the LAMP configuration as we’ll need it. In Ubuntu 16.04, please run the following:

sudo apt install apache2 libapache2-mod-php7.0 php7.0 mariadb-server php7.0-xml php7.0-curl php7.0-zip php7.0-mcrypt php7.0-mysql git openjdk-8-jre-headless

Any additional packages we need should be pulled in automatically as dependencies.


Configure the Apache User Environment

Various Linux distributions use different usernames for the Apache user. Red Hat and derivative distributions use the username apache whereas Debian and derivative distributions use the username www-data. Regardless of which, we’ll need to configure a few things for everything to work normally.

Begin by giving ownership of the user home directory to the user. In the case of Ubuntu 16.04, this is /var/www.

sudo chown -R www-data:www-data /var/www

Next, set up a log file for the Papal instance and give ownership of it to Apache:

sudo touch /var/log/papal.log && sudo chown www-data:www-data /var/log/papal.log

We’ll also be running a few commands as the Apache user. I find that it’s easiest to allow the Apache user to login rather than running each command with sudo, but your mileage may vary. To allow the Apache user to login, edit the file /etc/passwd as root and change the Apache user’s shell from /usr/sbin/nologin or /sbin/nologin to /bin/bash.


Setup Let’s Encrypt

DigitalOcean has a nice write up on setting up Let’s Encrypt on an Ubuntu server. You can find that write up here. Follow the steps exactly. When prompted, use the email address kendall@ringfree.com and select the Secure option to route all traffic through SSL. The configuration process should generate a file for an appropriate Apache VirtualHost. We’ll edit it later.


Retrieve Papal

As the Apache user (use either sudo -u www-data -i to login, or prefix your commands with sudo -u www-data) navigate to the /var/www/html directory and run the following command to pull down the source code:

git clone https://github.com/ringfreejohn/papalmainframe.git

This will install Papal into the /var/www/html/papalmainframe directory.


Set up the Database

MariaDB in most Linux distributions will prompt you to create a password for the root user during installation. This is not the case in Ubuntu 16.04. To set up a root password, please follow the instructions outlined in this article, beginning with the line:

Important Note: In Ubuntu 16.04/15.10/15.04, MariaDB won’t ask you to set root user password during installation.

Once your root password has been set, you’ll need to determine if this instance will use a remote database, a replicated database, or a new database. For a remote or replicated database, please consult with Kendall to set this up. For a new database, login as the root user and create a database and user specifically for our Papal instance:

CREATE DATABASE papal;
GRANT ALL ON papal.* TO papal@localhost IDENTIFIED BY '<password>';

Note: Please replace <password> with a randomly generated password.
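One way to produce such a password is to read random bytes from /dev/urandom and print them as hex, which avoids characters that would need quoting inside the SQL string. This is a minimal sketch; the function name `gen_password` is hypothetical and any comparable generator works just as well.

```shell
#!/bin/sh
# Sketch: print a random hex password read from /dev/urandom.
# Argument: number of random bytes (default 24, giving 48 hex characters).
gen_password() {
    nbytes="${1:-24}"
    # od prints the bytes as hex pairs; tr strips the spaces and newlines
    head -c "$nbytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}
```

Calling `gen_password` with no argument prints a 48-character value to paste in place of the password placeholder and into settings.php later.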

The Papal source code includes a base.sql file that will generate all the necessary tables and insert some preliminary data. You may import this file to the new database or get a database dump from another Papal instance to populate the new instance.


Configure Cron

Papal manages sessions and IP blocking from within the database and uses a Cron job for garbage collection rather than handling it internally. As the Apache user, add the following with crontab -e:

* * * * * /usr/bin/php -f /var/www/html/papalmainframe/cron.php > /dev/null 2>&1

Copy Files and Delete the Install Directory

Within Papal’s install directory located at /var/www/html/papalmainframe/install there are a few XML files necessary for proper communication with the Sansay SBCs. Copy these files to a system directory somewhere. No specific location is explicitly required for these files, however Papal will look for them in /etc/ringfree by default:

sudo mkdir /etc/ringfree
sudo cp /var/www/html/papalmainframe/install/*.xml /etc/ringfree/

A copy of the Sansay SOAP client will also need to be copied somewhere, preferably to the same directory, and made executable. The SOAP client is not included with the Papal source, however it can be retrieved with wget:

cd /etc/ringfree
sudo wget https://www.dropbox.com/s/7t4948s3o440l9h/VSXi_WS_Client.jar
sudo chmod +x VSXi_WS_Client.jar

Once done, please delete the /var/www/html/papalmainframe/install directory as leaving it would expose the SQL and XML files on the public internet. While there’s practically no chance that they would be identified and downloaded, much less used as an attack vector, the information there is proprietary to Ringfree and we don’t want it exposed.


Configure the Apache VirtualHost

Let’s Encrypt will generate and enable a new site file during setup. If originally using the default site on Ubuntu 16.04, the new file will be at /etc/apache2/sites-available/000-default-le-ssl.conf. Open the file as root, change the ServerAdmin to kendall@ringfree.com, change the DocumentRoot to /var/www/html/papalmainframe, and insert the following within the VirtualHost tags:

 <Directory />
     Options FollowSymLinks
     AllowOverride All
 </Directory>

 <Directory /var/www/html/papalmainframe>
     Options Indexes FollowSymLinks MultiViews
     AllowOverride All
     Order allow,deny
     allow from all
 </Directory>

Functionally this will allow .htaccess rewrites to work, which is necessary for Papal to work properly.


Adjust the Settings and Start Papal

As the Apache user, edit the Papal settings file at /var/www/html/papalmainframe/settings.php and adjust any necessary fields per your above configuration. This would include the database username, password, name, and host, the location of the log file, the location of the SBC SOAP client and template files, etc. The settings.php file is well commented so you shouldn’t have any issues identifying fields that require attention.

Unless I’ve missed something (and let me know if I have), Papal should now be ready to go. Ensure that mod_rewrite is enabled and restart Apache:

sudo a2enmod rewrite
sudo service apache2 restart

 

Ringfree Datacenter Design

OVERVIEW

It’s often beneficial to understand how the Ringfree datacenter servers are deployed to troubleshoot various issues (such as an upstream carrier outage) or just generally understand the service you’re offering to customers.

Below is an outline of service availability: which service handles which type of traffic, and how that is reconciled with the concept of a hosted cloud PBX.

To access any customer portion of the infrastructure, one must first pass through one of two proxy services (each of which has multiple instances for failover purposes).

REVNAT

The first class is known inhouse as REVNAT (Reverse NAT proxy). This service proxies access to pbxuser and pbxadmin interfaces, FOP2 websocket traffic, and Asterisk AMI access.

Any non-voice protocol will be handled by REVNAT. All REVNAT traffic is routed based on hostname to the various proxy services (Apache, haproxy, Prosody, etc).

SBC

An SBC (Session Border Controller) exists in all Ringfree datacenters. It handles any traffic related to voice. Ringfree uses the open standard of SIP to provide voice service to its customers as well as trunk voice traffic back and forth between inbound and outbound carriers.

All SBC SIP registration and endpoint traffic is routed based on hostname rendered to the Sansay SBC. Call termination traffic (outbound) is first passed through a calling plan which exposes a set of prefixes allowed to be called. The calling plans (US48, US48 + Ext, US48 + Ext + International) route calls based on prefixes through our termination provider (Thinq). Call origination traffic (inbound) is routed based on DNIS and routed to the appropriate PBX by hostname per configuration.

Keeping our traffic organized provides us with not only better performance, but better security as well. It also lets us run the majority of our services behind our own private networks and expose only the required services in a controlled manner.

Examples

So you want to register a phone and make a call.  To do this you would first create the extension by accessing the PBX’s web GUI.  This traffic hits REVNAT, then is passed by hostname through the node to the PBX you are registering the phone to.  After the PBX has been configured with that extension, you register a phone to it using the SBC as the phone’s proxy.  The phone hits the proxy and, using the hostname the phone provides for the PBX, the SBC acts as a registrar validating the credentials for the phone.  When you make an outbound call from that phone, the call goes through the SBC to the PBX, and then back out through the SBC to the upstream carrier.  This is handled by the SIP trunk that the PBX is configured for.

Likewise, for an inbound call to a customer’s DID, the dialed number hits the upstream carrier, which routes the call to our SBC.  There the DID is matched against a route and then against the PBX hostname it was paired with when the DID was entered into the SBC.  When the PBX hostname is matched, the call hits the PBX, where its call flow can be configured in any number of ways, including an IVR (Interactive Voice Response), a direct route to an extension, or ring groups that ring several phones at once.

How to set up a Ringfree node+pbx sandbox

note: as of ringfree-pbx-utils-3.0-41, ringfree-provision-pbx is now ringfree-provision-pbx.local.  A new tool has been added, ringfree-provision-pbx.nfs and instructions for setting up a sandbox with NFS storage will be available shortly.  In a future update, both will be removed in favor of a unified tool.
This document explains how to set up a node sandbox in which you can run PBX instances outside of the main Ringfree datacenter infrastructure.  While this does not let you test connections through our Sansay SBC or through our proxy services, it does allow you to quickly create development environments to work on new features and bug fixes and to test staging packages.  The best way to create a sandbox is to use a desktop virtual machine application.  VMWare Workstation and Oracle Virtualbox will suffice.
1) Download installation media to your local workstation/laptop/computer:
http://mirror1.ringfree.biz/rel/6.3.1/isos/ringfree-os-6.3.1-dvd-update1.iso
2) When creating a VM, you are commonly asked what type of Linux distribution you are virtualizing.  REL/Ringfree OS is forked from Red Hat Enterprise Linux 6, so choose either that or CentOS 6.  Be sure to set your virtual network adapter to bridged mode so it grabs an IP in the same subnet as your local network.  This will allow you to register a phone to it.
3) During install, let the installer use all disk space.  If Anaconda pops up a warning that you’re using an unsupported CPU, ignore it and continue, as an update will pull in an updated kernel.  Any suitable CPU microcode will be applied after installation.
4) After installation, the default root password is “testpass”.  Log in.
5) Modify /etc/sysconfig/network-scripts/ifcfg-eth0 and set ONBOOT to yes.  Restart networking.  You will now have network access.  Then change the IP address from DHCP to static and choose an IP outside your DHCP range.  When restarting networking, you may see that system-config-network has segfaulted when trying to configure the OpenVZ network bridge (venet0).  This is a known bug, but it does not affect the operation of the node or the PBX.
6) Create an A record for your VM’s IP address.  This is used during provisioning (the PBX uses that record to talk to the DB server).  Pointing DNS A records to private IP address ranges, although seemingly bizarre, is perfectly valid.
7) Update all packages:
# yum -y update

8) Run ‘ringfree-license’, copy the output and send it to john@ringfree.com.  Here is a sample output from a sandbox Kendall created and emailed me:

[root@sandbox ~]# ringfree-license
 
  **********************************************************
  * Ringfree License Tool                                  *
  **********************************************************
 
    Machine Code:        eedf23309653bbca1df760eb38da9f62
    Mac Address:         08:00:27:CD:E3:9A
    Interface found:     eth0
 
    Send the above information to Ringfree so you can
    complete your node registration!
 
    Email: support@ringfree.biz
    Hours: 9am-6pm EST

9) After John has whitelisted your MAC/machinecode, run ‘ringfree-node-setup’ and enter your A record when asked for the FQDN.  For licensee, enter Ringfree.  For license, enter anything; it is not actually used and will be removed later.  Here is an example of me running this utility on a new sandbox:

[root@localhost ~]# ringfree-node-setup

**************************************
 Ringfree Node Setup tool
**************************************

Enter node FQDN/hostname:
sandbox1.office.ringfree.biz

Enter company name:
Ringfree

Enter Ringfree license:
blah


Ringfree Node configuration complete!
After John inserts the machine id and licensing info on the blessed list, you’ll then be able to provision PBXs.  To provision a pbx you would run the following:
ringfree-provision-pbx.local $CTID $HOSTNAME $IP $ROOTPW $ADMINPW $SUBDOMAIN $VENDORID
$CTID:  The container ID.  I usually set sandboxes up in the XXX range as it’s easier to type later on (vzctl enter 100).
$HOSTNAME:  This does not need to be correct in node-centric sandboxes without proxies.  But if you wanted to access the PBX at the address sandbox1.awesomesauce.com, you would enter awesomesauce.com here.
$IP:  Enter another IP address on your private network, something outside of your DHCP range.
$ROOTPW:  This sets the root password inside the container.  It’s not used for much, as we don’t have ssh turned on inside PBXs.
$ADMINPW:  This is the password for the pbxadmin user “admin”.
$SUBDOMAIN:  This does not need to be correct in node-centric sandboxes without proxies.  But if you wanted to access the PBX at the address sandbox1.awesomesauce.com, you would enter sandbox1 here.
$VENDORID:  always use a 0
Let the PBX provision.  Afterwards you can enter it at http://$IP/pbxadmin/
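
As a minimal sketch, the argument order above can be sanity-checked before the real tool is called.  The wrapper name provision_sandbox_pbx, its validation rules and the DRY_RUN switch are illustrative assumptions and are not part of ringfree-pbx-utils.  By default it only prints the command it would run (including the passwords, so use it on sandboxes only).

```shell
#!/bin/sh
# Hypothetical wrapper around ringfree-provision-pbx.local.
# Argument order is taken from the list above; the checks are assumptions.
# DRY_RUN=1 (the default) only prints the command; DRY_RUN=0 runs it.
provision_sandbox_pbx() {
    if [ "$#" -ne 7 ]; then
        echo "usage: provision_sandbox_pbx CTID HOSTNAME IP ROOTPW ADMINPW SUBDOMAIN VENDORID" >&2
        return 2
    fi
    ctid=$1
    ip=$3
    # The container ID must be purely numeric.
    case $ctid in
        ''|*[!0-9]*) echo "CTID must be numeric: $ctid" >&2; return 2 ;;
    esac
    # Loose IPv4 shape check (four dot-separated numeric fields), not a full validation.
    case $ip in
        *[!0-9.]*|*..*|.*|*.) echo "IP does not look like IPv4: $ip" >&2; return 2 ;;
        *.*.*.*.*) echo "IP does not look like IPv4: $ip" >&2; return 2 ;;
        *.*.*.*) ;;
        *) echo "IP does not look like IPv4: $ip" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "ringfree-provision-pbx.local $*"
    else
        ringfree-provision-pbx.local "$@"
    fi
}
```

For example, "provision_sandbox_pbx 100 awesomesauce.com 192.168.1.50 rootpw adminpw sandbox1 0" prints the full command line for review; all of those values are placeholders.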
Next, let’s run ringfree-updatepbxscripts.  (This tool will not run unless the node is registered)
# ringfree-updatepbxscripts
Since it’s not in the datacenter, you’ll need to update packages inside the container manually with a “yum -y update”
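
For the network change in step 5 above, a static configuration might look like the sketch below.  ONBOOT, BOOTPROTO, IPADDR, NETMASK and GATEWAY are the standard RHEL 6 ifcfg keys; the function name and every address are placeholders to be replaced with values for your own LAN.  Note that the function writes a whole new file, so any existing lines you want to keep (such as HWADDR) would have to be merged back in by hand.

```shell
#!/bin/sh
# Sketch: write a static eth0 config in RHEL 6 ifcfg format.
# All addresses passed in are placeholders; pick ones valid for your network.
write_static_ifcfg() {
    # $1 = output file, $2 = IP address, $3 = netmask, $4 = gateway
    cat > "$1" <<EOF
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=$2
NETMASK=$3
GATEWAY=$4
EOF
}

# Written to a scratch file here.  On the sandbox the target would be
# /etc/sysconfig/network-scripts/ifcfg-eth0, followed by a networking restart.
example="${TMPDIR:-/tmp}/ifcfg-eth0.example"
write_static_ifcfg "$example" 192.168.1.50 255.255.255.0 192.168.1.1
cat "$example"
```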

02/26 Tier 2 Update (game plan)

NOTE: This is a Tier 2 update game plan for 02/26/16, service interruptions are expected and customers have been notified.

After a one week delay, I’ve come up with a better and non-destructive way to migrate nodes to NFS storage while having a rock solid rollback plan.

First, all OS package updates promoted from staging this week should be deployed:

# onallcts "yum makecache"
# onallcts "yum -y update"

This is going to take a couple of hours to complete.  After it is complete on node001, move on to the other nodes during downtime (waiting on container mounts, etc).

Next, all monitoring tools (nagios and shinken) will be turned off.

Next, all containers on node001 need to be stopped:

# ringfree-stopallcts

Next, the vz config file needs to be adjusted.  Open /etc/vz/vz.conf in nano and adjust the following:

VE_ROOT=/vz/root/$VEID
VE_PRIVATE=/vz/private/$VEID

to:

VE_ROOT=/vz/nfsroot/$VEID
VE_PRIVATE=/vz/nfsprivate/$VEID

Next, another rsync is required to catch any last changes in the last two days:

# for i in $(cat /etc/ringfree/manifest.tmp) ; do rsync -avz -e "ssh" --delete /vz/private/$i/ root@atl.san1.ringfree.biz:/rf-images/pbx/$i ; done

Next, all container vz conf files will need to be adjusted. In /etc/sysconfig/vz-scripts/ which is symlinked to /etc/vz/conf/, all containers in /etc/ringfree/manifest.tmp need to have their private and root directory locations changed just like we did in /etc/vz/vz.conf.
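
The same path substitution applies to /etc/vz/vz.conf and to each container conf file, so it can be scripted.  The following is only a sketch: the function names are made up, it assumes each per-container file is named CTID.conf, and it keeps a .bak copy next to every file it edits.

```shell
#!/bin/sh
# Sketch: point VE_ROOT / VE_PRIVATE at the NFS directories.
# Only lines starting with VE_ROOT= or VE_PRIVATE= are touched,
# and the original file is kept as FILE.bak.
retarget_vz_conf() {
    # $1 = path to vz.conf or to a per-container CTID.conf
    cp -p "$1" "$1.bak" || return 1
    sed -e '/^VE_ROOT=/s#/vz/root/#/vz/nfsroot/#' \
        -e '/^VE_PRIVATE=/s#/vz/private/#/vz/nfsprivate/#' \
        "$1.bak" > "$1"
}

# Apply the change to the global file and to every container in the manifest.
retarget_all() {
    # $1 = manifest of CTIDs
    # $2 = directory holding vz.conf
    # $3 = directory holding the CTID.conf files
    retarget_vz_conf "$2/vz.conf" || return 1
    for ctid in $(cat "$1"); do
        retarget_vz_conf "$3/$ctid.conf" || return 1
    done
}
```

On a node this would be called as "retarget_all /etc/ringfree/manifest.tmp /etc/vz /etc/sysconfig/vz-scripts".  Running it a second time would overwrite the .bak files with the already modified versions, so it should be run once.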

Next, the mount directories must be created:

# mkdir /vz/nfsprivate
# mkdir /vz/nfsroot

Next, the mounting variables must be created in the following files:

/etc/ringfree/nfs (The current NFS share server, this will be set to primary)
/etc/ringfree/nfsip.1 (The primary NFS share server)
/etc/ringfree/nfsip.2 (The secondary NFS share server)
/etc/ringfree/nfsmounted (Will be seeded with value of "0")
/etc/ringfree/nfsvzconf (Will be seeded to "/rf-images/vzconf")
/etc/ringfree/nfsvzroot (Will be seeded to "/rf-images/pbxroot")
/etc/ringfree/nfsvzstor (Will be seeded to "/rf-images/pbx")
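
Seeding these files could be sketched as follows.  This assumes each file holds a single line containing just the value, which is a guess at what ringfree-cloudstor expects.  The two server addresses are passed in as arguments because their real values are not given here.

```shell
#!/bin/sh
# Sketch: seed the NFS mount variable files listed above.
# Assumption: each file contains exactly one line with the value.
seed_nfs_vars() {
    # $1 = target directory (/etc/ringfree on a node)
    # $2 = primary NFS server, $3 = secondary NFS server
    dir=$1
    mkdir -p "$dir" || return 1
    printf '%s\n' "$2" > "$dir/nfsip.1"
    printf '%s\n' "$3" > "$dir/nfsip.2"
    # The current server starts out as the primary.
    printf '%s\n' "$2" > "$dir/nfs"
    printf '%s\n' 0 > "$dir/nfsmounted"
    printf '%s\n' /rf-images/vzconf > "$dir/nfsvzconf"
    printf '%s\n' /rf-images/pbxroot > "$dir/nfsvzroot"
    printf '%s\n' /rf-images/pbx > "$dir/nfsvzstor"
}
```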

The OpenVZ service needs to be restarted to load in the new private and root variables.  We don’t normally have it on by default, but vzquota must be turned off in /etc/vz/vz.conf before continuing as well.
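
Assuming “vzquota” here refers to the DISK_QUOTA parameter of the OpenVZ global config file, its current value can be checked before the restart with something like the sketch below (the function name is made up):

```shell
#!/bin/sh
# Sketch: report the DISK_QUOTA setting in an OpenVZ global config file.
# Commented-out lines are ignored; if the key appears twice the last one wins.
disk_quota_state() {
    # $1 = path to vz.conf; prints the value, or "unset" if the key is absent
    val=$(sed -n 's/^DISK_QUOTA=//p' "$1" | tail -n 1 | tr -d '"')
    echo "${val:-unset}"
}
```

Before continuing, "disk_quota_state /etc/vz/vz.conf" should print "no".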

Next, ringfree-cloudstor will mount the NFS storage:

# ringfree-cloudstor -m

The tool ringfree-cloudstor has been updated to mount NFS storage in directories other than /vz/private and /vz/root.  This allows us to roll back to local storage in case it appears something might be wrong or the migration takes longer than the maintenance window allows.  /etc/sysconfig/vz-scripts however cannot be mounted in a different directory.  As such, I have taken a snapshot of the directory to /etc/sysconfig/vz-scripts.tar.gz.

The new version of ringfree-cloudstor will mount /etc/sysconfig/vz-scripts, /vz/nfsprivate and /vz/nfsroot from the NFS shares advertised as:

  • $NFS:/rf-images/vzconf (mounted as /etc/sysconfig/vz-scripts, symlinked to /etc/vz/conf)
  • $NFS:/rf-images/pbxroot (mounted as /vz/nfsroot)
  • $NFS:/rf-images/pbx (mounted as /vz/nfsprivate)

Therefore, in case of emergency, all NFS mounts can be safely unmounted, then /etc/sysconfig/vz-scripts.tar.gz can restore all previous configs, and all local container storage will still be in /vz/private and /vz/root as it was originally.
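
Whether the three shares are actually mounted can be checked against the kernel’s mount table.  This is a sketch that does not rely on ringfree-cloudstor; the function name is made up.  /proc/mounts lists resolved paths, so the config directory may show up under whichever of /etc/sysconfig/vz-scripts and /etc/vz/conf is the real directory.

```shell
#!/bin/sh
# Sketch: verify that each expected directory is an NFS mount.
check_nfs_mounts() {
    # $1 = mounts table (normally /proc/mounts)
    # remaining arguments = mount points to verify
    table=$1
    shift
    missing=0
    for mp in "$@"; do
        # /proc/mounts fields: device, mount point, fs type, options, ...
        if awk -v mp="$mp" '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' "$table"; then
            echo "ok      $mp"
        else
            echo "MISSING $mp"
            missing=1
        fi
    done
    return "$missing"
}
```

For example: "check_nfs_mounts /proc/mounts /etc/sysconfig/vz-scripts /vz/nfsroot /vz/nfsprivate".  A non-zero return means at least one share is not mounted over NFS.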

It is already known that the initial container mounts will take some time.  Four Seasons hospice, our PBX system and Epsilon will be restored first.  After Asterisk, Apache and Postfix are confirmed to be running, the rest of the containers will be mounted using a list of currently mounted containers:

# for i in $(cat /etc/ringfree/manifest.tmp); do vzctl start $i ; done
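
To start the priority containers first and the remainder afterwards without starting anything twice, the manifest can be split as in this sketch.  The priority list file is hypothetical (the CTIDs of the priority customers are not listed here), and both files are assumed to hold one CTID per line.  By default the commands are only printed.

```shell
#!/bin/sh
# Sketch: staged container start.
# VZCTL defaults to a dry run that prints the command; set VZCTL=vzctl on a node.
start_cts() {
    # arguments = CTIDs to start, in order
    for ctid in "$@"; do
        ${VZCTL:-echo vzctl} start "$ctid"
    done
}

# Print the manifest CTIDs that are not in the priority list.
remaining_cts() {
    # $1 = priority list, $2 = full manifest (one CTID per line in both)
    grep -vxFf "$1" "$2"
}
```

Usage would be "start_cts $(cat priority.txt)", then confirming Asterisk, Apache and Postfix on those containers, then "start_cts $(remaining_cts priority.txt /etc/ringfree/manifest.tmp)".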

Luckily, the only files needed over NFS to mount a container are Asterisk, Apache, Postfix and all libraries required by them to run.  In the future we can reduce our mount times over NFS simply by placing a gigabit switch in between the nodes, NFS servers and the router.  The nodes will ARP directly to the NFS servers over the gigabit switch, allowing gigabit network transfer.

Once all containers are confirmed to be fully mounted and load avg has returned to normal levels, nagios and shinken will be turned back on.

To ensure Asterisk is running on all CTs (except revnat and revnat beta on node001), we need to make sure the asterisk process is running.  We can do this by using onallcts in conjunction with the asterisk binary to give us the version number.  If asterisk is running, the asterisk binary will connect and display a version name:

# onallcts "asterisk -rx core\ show\ version"
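
If a plain list of failing containers is easier to act on than the combined onallcts output, a per-container loop along these lines may help.  It is a sketch that uses vzctl exec2 (which passes back the exit status of the command run inside the container) rather than onallcts, and it assumes the asterisk binary exits non-zero when it cannot connect to a running daemon.  The function name is made up.

```shell
#!/bin/sh
# Sketch: list containers where the Asterisk CLI cannot reach a running daemon.
asterisk_down_cts() {
    # $1 = manifest of CTIDs; prints each CTID whose check fails
    for ctid in $(cat "$1"); do
        if ! vzctl exec2 "$ctid" 'asterisk -rx "core show version"' >/dev/null 2>&1; then
            echo "$ctid"
        fi
    done
}
```

Empty output from "asterisk_down_cts /etc/ringfree/manifest.tmp" means every listed container answered.  Containers that do not run Asterisk (such as revnat) will be reported if they are in the manifest.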

At this point, node001 will be completely migrated.  Local storage will be left in place until a decision is made to remove it.

Reversion Plan

In case there are issues mounting containers which will impact call service outside of the maintenance window, existing containers mounted over NFS should be unmounted and the local storage should be used.  After the containers over NFS are unmounted, /etc/vz/vz.conf should be restored from the /vz/nfsroot and /vz/nfsprivate values back to /vz/root and /vz/private.  The vz service should then be restarted, and then the NFS shares themselves should be unmounted.  After they are unmounted, /etc/sysconfig/vz-scripts.tar.gz should be untarred to restore the previous container configs.  Containers can at this point be mounted from local storage.
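
The reversion steps can be laid out as an ordered command list for review during the maintenance window.  The sketch below only prints commands and runs nothing.  Several details are assumptions to confirm first: that ringfree-stopallcts is the right way to take down the NFS-backed containers, that plain umount is used for the shares, and the directory the snapshot is extracted into, which depends on how the tarball was created and is therefore passed in as an argument.

```shell
#!/bin/sh
# Sketch: print the reversion steps in order without executing anything.
print_reversion_plan() {
    # $1 = directory to extract the vz-scripts snapshot into (assumption,
    #      depends on how /etc/sysconfig/vz-scripts.tar.gz was created)
    extract_dir=$1
    cat <<EOF
ringfree-stopallcts
sed -i.nfs -e '/^VE_ROOT=/s#/vz/nfsroot/#/vz/root/#' -e '/^VE_PRIVATE=/s#/vz/nfsprivate/#/vz/private/#' /etc/vz/vz.conf
service vz restart
umount /vz/nfsprivate /vz/nfsroot /etc/sysconfig/vz-scripts
tar -xzf /etc/sysconfig/vz-scripts.tar.gz -C $extract_dir
for i in \$(cat /etc/ringfree/manifest.tmp); do vzctl start \$i; done
EOF
}
```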