
7 How Apache or LiteSpeed integrate with InterWorx

7.1 Basics of Webserver Integration

InterWorx distributes its own RPM packages for Apache rather than using the CentOS-provided packages. InterWorx makes some modifications to the build process so that the integration is smoother, which also allows InterWorx to keep the server’s version up to date without relying on the operating system or a third-party maintainer. Using Apache packages from another source is not recommended. Much like the other services that InterWorx integrates with, the daemon itself is controlled via /etc/init.d scripts (or through the service command, which essentially uses the /etc/init.d scripts). The server’s behavior is controlled by adding, modifying, and deleting config files in /etc/httpd/. InterWorx uses the base /etc/httpd/conf/httpd.conf, which remains essentially unchanged. To update the configuration, InterWorx adds or modifies Apache *.conf files in /etc/httpd/conf.d and issues a server restart or reload.

7.2 SiteWorx Websites and Domains

When adding a new website, InterWorx adds a new vhost config file in /etc/httpd/conf.d/ called vhost_[domain name].conf and uses a template to generate a <VirtualHost> configuration on the fly for that website based on the options you select when creating the site, and the IP address the site lives on. When a sub-domain or pointer domain is added to a SiteWorx domain, InterWorx will update the configuration file for that domain and add a ServerAlias. Secondary domains on the other hand get their own vhost config file since they require a different docroot.
This is also how InterWorx can tell you, in the IP Management screen in NodeWorx under System ▷ IP Management ▷ System IP’s, which sites are on which IPs. InterWorx does not base this page on an internal database. Instead, it scans the current webserver vhost configuration to figure out which domains are mapped to which IPs.
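The scan described above can be sketched in Python. This is only an illustration of the idea, not InterWorx's actual code: the function name `sites_by_ip` is hypothetical, the parser handles only IPv4 `<VirtualHost ip:port>` lines and `ServerName` directives, and it takes the configuration text as a string rather than reading /etc/httpd/conf.d/ itself.

```python
import re
from collections import defaultdict

# Patterns for the few Apache directives this sketch cares about.
VHOST_OPEN = re.compile(r"<VirtualHost\s+([^>]+)>", re.IGNORECASE)
VHOST_CLOSE = re.compile(r"</VirtualHost>", re.IGNORECASE)
SERVER_NAME = re.compile(r"^\s*ServerName\s+(\S+)", re.IGNORECASE)


def sites_by_ip(config_text):
    """Map each IP to the sorted list of ServerName values bound to it.

    Hypothetical helper: IPv4 only, ignores ServerAlias and includes.
    """
    mapping = defaultdict(set)
    current_ips = []
    for line in config_text.splitlines():
        opened = VHOST_OPEN.search(line)
        if opened:
            # "192.0.2.10:80" -> "192.0.2.10"
            current_ips = [addr.rsplit(":", 1)[0] for addr in opened.group(1).split()]
            continue
        if VHOST_CLOSE.search(line):
            current_ips = []
            continue
        named = SERVER_NAME.match(line)
        if named and current_ips:
            for ip in current_ips:
                mapping[ip].add(named.group(1))
    return {ip: sorted(domains) for ip, domains in mapping.items()}
```

In practice the input would be the concatenated contents of the vhost_*.conf files. Treating the webserver configuration as the source of truth means the page cannot drift out of sync with what the webserver actually serves.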

7.3 HTTPS (Secure HTTP)

Figure 7.1: Inspecting SSL Cert on Google Chrome
Transport Layer Security (TLS), and its predecessor Secure Sockets Layer (SSL), allows for a secure channel of communication between the client’s browser and the webserver by sending data in an encrypted stream as opposed to plain text. It also allows the client’s browser to validate that the domain name being used by the client is pointing at the correct IP and hasn’t been hijacked. Nowadays, TLS has taken over the role that the SSL protocol used to fill. While they function similarly and use the same concepts to encrypt data, the newer TLS protocols address security concerns found in the older SSL protocols. You can investigate this yourself by visiting an HTTPS-secured site with Google Chrome and inspecting the encryption certificate in the browser to see what protocol is being used (this is shown in Figure 7.1). For the most part, the web hosting community and certificate authorities still refer to the certificates as SSL certificates. From now on we will refer to the secure channel as “SSL” and the certificates as “SSL Certificates” to avoid confusion and maintain congruence with the rest of the SSL certificate industry.

7.3.1 Limitation of SSL

Unfortunately, the secure handshake occurs below the application layer, in the transport layer. This means that before any HTTP protocol messages are exchanged (e.g. the client’s browser issuing its GET /index.html HTTP/1.1 request), the handshake has to complete and the communication channel has to be established with the server. Since the VirtualHost system relies on the HTTP request to figure out which website on a given IP is being requested, you are limited to one SSL certificate per IP address. Thus, in order for SSL to function correctly on a website, it must be the only website on a given IP address, since its SSL certificate has to be the one used to make the secure connection. To enforce this limitation, InterWorx will only allow a single SiteWorx account on an IP address that is serving a website with SSL enabled. There are new extensions which will eventually allow the client to indicate to the server, during the SSL handshake phase, which domain it is trying to visit, but these extensions are not yet widespread. When they become widely adopted in both client and server software, InterWorx will adapt the system to allow multiple domains with SSL on the same IP.
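The one-account-per-SSL-IP rule can be expressed as a simple check. The sketch below is illustrative only: the `(account, ip, ssl_enabled)` tuple shape and the function name `ssl_ip_conflicts` are assumptions made for this example and do not reflect how InterWorx stores or validates this data internally.

```python
from collections import defaultdict


def ssl_ip_conflicts(sites):
    """Return IPs that serve an SSL site but are shared by several accounts.

    `sites` is an iterable of (account, ip, ssl_enabled) tuples; this data
    shape is hypothetical and exists only for the example.
    """
    accounts_on_ip = defaultdict(set)
    ssl_ips = set()
    for account, ip, ssl_enabled in sites:
        accounts_on_ip[ip].add(account)
        if ssl_enabled:
            ssl_ips.add(ip)
    # An SSL-enabled IP is only valid if exactly one account lives on it.
    return sorted(ip for ip in ssl_ips if len(accounts_on_ip[ip]) > 1)
```

An empty result means every SSL-enabled IP is dedicated to a single account, which is the state the limitation above requires.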

7.3.2 How SSL is added to the Webserver Configuration

When SSL is added to a SiteWorx domain, an additional <VirtualHost> section is added to the domain’s vhost configuration file. This allows the webserver to serve the correct data when someone connects over SSL on TCP port 443. If there are multiple SiteWorx domains in the SiteWorx account, only one is permitted to have SSL enabled and installed on it. The other domains will be available for HTTP requests on TCP port 80 only.
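As a rough illustration of how a per-domain vhost file might gain a second section once SSL is enabled, here is a minimal Python sketch. The function name `render_vhost_file`, the docroot, and the certificate path are all hypothetical, and the directives emitted (ServerName, ServerAlias, DocumentRoot, SSLEngine, SSLCertificateFile) are a small subset chosen for the example. The real InterWorx template is not shown in this document and certainly contains more.

```python
def render_vhost_file(domain, ip, docroot, aliases=(), ssl_cert=None):
    """Render the text of a vhost config file for one domain.

    Always emits a port 80 section; adds a port 443 section when an SSL
    certificate path is given. Hypothetical helper for illustration only.
    """

    def section(port, extra=()):
        lines = [f"<VirtualHost {ip}:{port}>", f"    ServerName {domain}"]
        lines += [f"    ServerAlias {alias}" for alias in aliases]
        lines.append(f"    DocumentRoot {docroot}")
        lines += [f"    {directive}" for directive in extra]
        lines.append("</VirtualHost>")
        return "\n".join(lines)

    sections = [section(80)]
    if ssl_cert is not None:
        # The additional section that serves the same site over SSL.
        sections.append(section(443, ["SSLEngine on", f"SSLCertificateFile {ssl_cert}"]))
    return "\n\n".join(sections) + "\n"
```

Calling it with `ssl_cert=None` yields a single port 80 section. Passing a certificate path yields the same file plus a port 443 section that repeats the ServerName and ServerAlias lines.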

7.4 Webmail and /nodeworx /siteworx redirects

Every SiteWorx domain has the capability of accessing the InterWorx-provided webmail clients at http://<domain>/webmail. Furthermore, specific webmail clients can be reached at http://<domain>/horde, /roundcube, and /squirrelmail. To give clients an easy-to-remember URL for their control panel, http://<domain>/siteworx acts as a redirect to the SiteWorx control panel login for their domain. This is all done with redirects and proxy rules defined in /etc/httpd/conf.d/iworx.conf. The base server config is set to:
  • Redirect (i.e. rewrite the URL) any URL that contains only /siteworx or /siteworx/ after the domain and port portion of the URL to https://<domain>:2443/siteworx/?domain=<domain>. This is because the control panel web front-end is not served from the standard port 80/443 web server instance. Instead, it is served from a different instance of Apache that is running on port 2080/2443 for HTTP/HTTPS, respectively. This rule will also trigger if the SiteWorx domain has SSL on it and the
  • Similarly, /nodeworx and /nodeworx/ will redirect to https://<domain>:2443/nodeworx, which is the NodeWorx login page. NodeWorx is not domain-specific, so no ?domain=<domain> needs to be added.
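The two redirect rules can be modelled as a small Python function. This is a sketch of the mapping only, not the contents of iworx.conf: the function name `control_panel_redirect` is hypothetical, and the NodeWorx target is assumed to use HTTPS because port 2443 is described above as the HTTPS port.

```python
def control_panel_redirect(domain, path):
    """Return the control panel URL a request path redirects to, or None.

    Only the exact paths /siteworx, /siteworx/, /nodeworx and /nodeworx/
    match; anything longer is left to the normal website.
    """
    if path in ("/siteworx", "/siteworx/"):
        # SiteWorx is domain-specific, so the domain is passed along.
        return f"https://{domain}:2443/siteworx/?domain={domain}"
    if path in ("/nodeworx", "/nodeworx/"):
        # NodeWorx is server-wide; no domain parameter is needed.
        return f"https://{domain}:2443/nodeworx"
    return None
```

For example, a request for /siteworx/ on example.com maps to the SiteWorx login on port 2443, while /siteworx/login returns None and is handled by the site itself.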
Webmail functions a bit differently from the control panel redirects, though. Instead of redirecting, we proxy from the “main” Apache instance on port 80/443 to the InterWorx Control Panel instance on port 2080. This is because the webmail clients are actually served from the InterWorx Control Panel Apache instance, as they are integrated with the Panel’s software distribution. The benefits are:
  • SiteWorx users can access their webmail clients transparently on http://<domain>/webmail without really having to think about ports/etc.
  • SiteWorx accounts with SSL certificates can make use of their certificates to encrypt and authenticate the connection to the front-end Apache server on port 443, which will then proxy the connection to port 2080. Thus the connection remains encrypted over the wire and the user is unaware that they are in fact connecting to a different Apache instance.
  • As the InterWorx control panel instance is running on the local server and the proxy connection occurs entirely on the loopback address (127.0.0.1), the proxy hop adds negligible latency, and the data cannot be snooped from the internet unless someone has unauthorized access to the server’s root user.

7.5 Bandwidth Monitoring

In order to keep users within their bandwidth constraints and also to provide users with near-real-time data on the amount of traffic they are receiving, InterWorx interfaces with both Apache and LiteSpeed to gather bandwidth data per SiteWorx domain (i.e. VirtualHost).

7.5.1 Apache

In order to collect bandwidth data from the Apache web server, InterWorx distributes the Apache webserver with a module called mod_watch, which silently records the bandwidth used by each VirtualHost and IP address on the system. To make this data externally available to the control panel, a file containing whitespace-delimited data is created in /var/lib/mod_watch/ for each VirtualHost and IP address. The webserver updates these files in real time so that the control panel has access to the most up-to-date usage stats when bandwidth limits are being checked. The data is collected every 5 minutes by the iworx.pex fively cron job, which also updates the SiteWorx account’s real-time bandwidth tracking RRD graphs and the NodeWorx webserver graphs.
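A five-minute collection pass over whitespace-delimited counters could look roughly like the sketch below. The field layout is an assumption made purely for illustration (first field cumulative bytes in, second field cumulative bytes out); the actual mod_watch file layout is not described in this document. The function names `parse_counters` and `usage_between` are likewise hypothetical.

```python
def parse_counters(line):
    """Parse one whitespace-delimited counter line into (bytes_in, bytes_out).

    ASSUMPTION: the first two fields are cumulative byte counters. Any
    further fields are ignored by this sketch.
    """
    fields = line.split()
    return int(fields[0]), int(fields[1])


def usage_between(previous, current):
    """Bytes transferred between two samples of cumulative counters.

    If a counter went backwards, assume it was reset (for example by a
    webserver restart) and count from zero.
    """
    deltas = []
    for before, after in zip(previous, current):
        deltas.append(after - before if after >= before else after)
    return tuple(deltas)
```

Each cron run would compare the current sample with the one stored from the previous run and add the difference to the domain's running total.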

7.5.2 LiteSpeed

The LiteSpeed team made a plugin for their webserver that allows InterWorx to gather the number of bytes in and out of the server for each specific virtual host. The plugin puts a file on the filesystem at /var/log/httpd/interworx_traffic_log which InterWorx then scrapes for data on bandwidth usage. Unlike mod_watch for Apache, though, InterWorx can’t query the number of total requests, the number of documents from the docroot that have been served, the number of current active connections, or the bytes/second rate that data is being sent out because this data is not made available to InterWorx.

7.6 Statistics

One of the most important features to users is the ability to see statistics about how their website is performing on the internet, what their average traffic patterns are like, and what content is most often visited on their site. With tools like Google Analytics rapidly becoming more popular, most serious users trying to improve their search engine rankings will default to Google for that sort of data. Yet with many users also running script and ad blockers in their browsers, it is often desirable to also check the statistics reported by the webserver as they tend to give you a more complete picture of what data is being requested from your server.

7.6.1 The 3 Statistics Programs

Integrated into the SiteWorx control panel, InterWorx offers statistics pages generated by the 3 different stats programs bundled with the Panel: Analog, Webalizer, and AWStats. All 3 parse the logs generated by your webserver and deposited in the SiteWorx domain’s /home/<linux user>/var/<domain>/logs/ directory. Since LiteSpeed is compatible with Apache’s configuration format, it is able to properly generate Apache-compatible logs in that directory. InterWorx has the data generated by the stats software placed in /home/<linux user>/var/<domain>/stats/. Stats are generated once a day, with weekly and monthly stats aggregating the daily and weekly data so that users are able to see slices of their site’s past performance on a weekly or monthly basis. Because stats are generated by the iworx.pex daily cron job, no updates will be visible to a user until that cron job has run.
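The roll-up from daily to weekly figures can be illustrated with a few lines of Python. This is a generic sketch, not what Analog, Webalizer, or AWStats do internally: the function name `weekly_totals` is hypothetical, and grouping by ISO week number is an assumption chosen for the example.

```python
from collections import defaultdict
from datetime import date


def weekly_totals(daily_hits):
    """Sum per-day hit counts into per-week totals.

    `daily_hits` maps datetime.date -> hits. Weeks are keyed by
    (ISO year, ISO week number); that choice is an assumption.
    """
    totals = defaultdict(int)
    for day, hits in daily_hits.items():
        iso = day.isocalendar()
        totals[(iso[0], iso[1])] += hits
    return dict(totals)
```

The same pattern, keyed by (year, month) instead, would produce the monthly slices.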

7.6.2 Dealing with High-Traffic Sites

For sites with extremely heavy traffic and extremely large logs (often gigabytes in size), InterWorx will try to split the file up to reduce the load on the server before running the stats programs. In addition, in order to lose as little data as possible, InterWorx will take a snapshot of the log and move it elsewhere (unless it’s too large, in which case slicing occurs) before running the stats, so that all SiteWorx accounts have stats generated from roughly the same point in time in the log file. If this wasn’t done, a very busy site with a large log would delay other SiteWorx accounts from having their stats processed. As a result, the data shown for the delayed sites would cover a period longer than 24 hours instead of exactly 24 hours. Since processing time isn’t guaranteed to be constant, this could lead to unreliable data.
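One way to picture the slicing step is a generator that groups log lines into chunks below a size limit, so that each chunk can be processed separately. This is a minimal sketch under assumed behaviour: the function name `slice_log` and the byte-based limit are hypothetical, and the document does not describe how InterWorx actually chooses slice boundaries.

```python
def slice_log(lines, max_bytes):
    """Yield lists of log lines, each list at most max_bytes in size.

    Lines are never split; a single line larger than max_bytes is
    yielded as a chunk of its own.
    """
    chunk, size = [], 0
    for line in lines:
        line_size = len(line.encode("utf-8"))
        if chunk and size + line_size > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += line_size
    if chunk:
        yield chunk
```

Because it is a generator over an iterable of lines, it can be fed an open file object and never needs to hold the whole log in memory.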

(C) 2017 by InterWorx LLC