8770317dcbe078cee050f378792aa3e077d02c69
wiki/info/landscape/amazon-ec2.md
| ... | ... | @@ -30,9 +30,18 @@ Further ALBs may exist in addition to the default ALB and the NLB for ``sapsaili |
| 30 | 30 | |
| 31 | 31 | ### Apache httpd Webserver and Reverse Proxy |
| 32 | 32 | |
| 33 | -The web server currently exists only as one instance but could now be replicated to other availability zones (AZ)s, entering those other IPs into the ``HTTP-to-sapsailing-dot-com`` target group (and, as will be described further below, to the ``CentralWebServerHTTP*`` (for the "dynamic" ALB in eu-west-1) or ``{ALB-name}-HTTP`` (for all DNS-mapped ALBs) target group of each application load balancer (ALB) in the region). For all of sapsailing.com it does not (no longer) care about SSL and does not need to have an SSL certificate (anymore). In particular, it offers the following services: |
|
| 33 | +The web server currently exists only as one "central" reverse proxy, but work is under way to duplicate the essential services

| 34 | +to improve availability. Only the current central reverse proxy will be non-disposable, hosting the wiki, releases, Git and Bugzilla.
|
| 35 | +Other services, such as jobs, static and p2, remain to be decided. Any traffic to the Hudson build server subdomain is directed by Route 53 to a `DDNSMapped` load balancer (all of which route any port 80 traffic to 443), which has a rule pointing to a target group that contains only the build server.
|
| 36 | + |
|
| 37 | +The IPs for these servers will automatically be added to the `CentralWebServerHTTP-Dyn` target group (in the dynamic ALB in eu-west-1) |
|
| 38 | +and to the `DDNSMapped-x-HTTP` target groups (in all the DDNSMapped servers). These are the target groups for the default rules, which ensures availability, especially of the ARCHIVE.
|
| 39 | +Currently, the new approach tags instances with `disposableProxy` to indicate that they host no vital services. `ReverseProxy` also identifies any reverse proxy. The health check for the target groups would change to trigger a script which returns different status codes: healthy/200 if the instance is in the same AZ as the archive (or if the failover archive is in use), and unhealthy/503 if it is in a different AZ. This will reduce cross-AZ archive traffic costs while maintaining availability and load balancing.
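
The health check rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name `health_status` and its parameters are hypothetical, and how the real script discovers the AZs or the failover state is not covered here.

```python
def health_status(proxy_az: str, archive_az: str, failover_in_use: bool) -> int:
    """Return the HTTP status a reverse proxy would report to its target group.

    Hypothetical helper: names and parameters are assumptions for illustration.
    """
    # Healthy if the failover archive is in use, or if this proxy
    # sits in the same availability zone as the archive.
    if failover_in_use or proxy_az == archive_az:
        return 200
    # Otherwise report unhealthy so the load balancer prefers
    # proxies that avoid cross-AZ archive traffic.
    return 503
```

For example, a proxy in `eu-west-1b` with the archive in `eu-west-1a` would report 503 unless the failover archive is in use.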
|
| 40 | + |
|
| 41 | +We also hope to deploy httpd on existing instances that have free resources and carry a tag permitting this
|
| 42 | +co-deployment. |
|
| 43 | +For all of sapsailing.com, it no longer cares about SSL and no longer needs an SSL certificate. The central reverse proxy offers the following services:
|
| 34 | 44 | |
| 35 | -* hudson.sapsailing.com - a Hudson installation on dev.internal.sapsailing.com |
|
| 36 | 45 | * bugzilla.sapsailing.com - a Bugzilla installation under /usr/lib/bugzilla |
| 37 | 46 | * wiki.sapsailing.com - a Gollum-based Wiki served off our git, see /home/wiki |
| 38 | 47 | * static.sapsailing.com - static content hosted under /home/trac/static |
| ... | ... | @@ -1048,6 +1057,19 @@ Follow these steps to upgrade the AMI: |
| 1048 | 1057 | |
| 1049 | 1058 | ## Terminating AWS Sailing Instances |
| 1050 | 1059 | |
| 1060 | +### Automated approach |
|
| 1061 | + |
|
| 1062 | +Much of the procedure below has been automated, and you can archive from the admin console's landscape panel. The automation covers most of the steps,

| 1063 | +including the creation of an httpd `.conf` file in the `conf.d` folder on the reverse proxies, via JSch/SSH. The file produced is named
|
| 1064 | +after the domain for the event and it contains |
|
| 1065 | +``` |
|
| 1066 | +Use Event-ARCHIVE 49erEuros2022.sapsailing.com bee070d1-605c-4fff-9d71-7688452abe63 # last part is event uuid. |
|
| 1067 | +``` |
|
| 1068 | +which utilises an in-house macro called Event-ARCHIVE, which creates a proxy pass pointing to the archive. Once the file has been added to the central

| 1069 | +reverse proxy, the changes are pushed to the main branch of a specialised repo (it must be main for the script to work). When the push completes, a git `post-receive` hook (found in `httpdHookScript.sh`) is triggered, which connects to all reverse proxy instances and runs

| 1070 | +`configuration/sync-repo-and-execute-cmd.sh`. This script fetches the changes and merges them, while trying to preserve any local changes.

| 1071 | +This is done because live changes can occur to some files, such as `000-macros.conf` (see the cloud orchestrator page for more details).
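
As a rough illustration of the file the automation produces, the sketch below builds the file name and the single macro line from an event domain and UUID. The helper name `event_archive_conf` is hypothetical, and the exact file name (here assumed to be the domain plus `.conf`) is an assumption; the real creation happens in the admin console's code via SSH.

```python
def event_archive_conf(domain: str, event_uuid: str) -> tuple[str, str]:
    """Return (file name, file content) for an event archive macro file.

    Hypothetical helper for illustration only.
    """
    # Assumption: the file is named after the event domain with a .conf suffix.
    filename = f"{domain}.conf"
    # One line invoking the in-house Event-ARCHIVE macro: domain, then event UUID.
    content = f"Use Event-ARCHIVE {domain} {event_uuid}\n"
    return filename, content
```

Calling it with `49erEuros2022.sapsailing.com` and the UUID from the example above yields the `Use Event-ARCHIVE ...` line shown there, without the trailing explanatory comment.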
|
| 1072 | + |
|
| 1051 | 1073 | ### ELB Setup with replication server(s) |
| 1052 | 1074 | - Remove all Replica's from the ELB and wait at least 2 minutes until no request reaches their Apache webservers anymore. You can check this with looking at `apachetop` on the respective instances. Let only the Master server live inside the ELB. |
| 1053 | 1075 | - Login to each server instance as `root`-user and stop the java instance with `/home/sailing/servers/server/stop;` |
wiki/projects/cloud-orchestrator.md
| ... | ... | @@ -92,13 +92,11 @@ Loading the time-index fixes for a race from a single MongoDB collection quickly |
| 92 | 92 | |
| 93 | 93 | We should also consider alternatives to MongoDB, at least for the storage of the sensor fixes. [Cassandra](http://cassandra.apache.org/) seems an interesting approach that promises high availability and virtually unlimited scalability. |
| 94 | 94 | |
| 95 | -#### No automatic fail-over for archive server |
|
| 95 | +#### Automatic fail-over for archive server |
|
| 96 | 96 | |
| 97 | -When the archive server fails, a few people get an SMS/text message notification. Manually switching the central reverse proxy configuration in /etc/httpd/conf.d/000-macros.conf is then necessary, followed by a ``service httpd reload`` command to switch to the failover archive server. This process needs automation. A special configuration of "availability" checks between production and failover archive server will be required. We have to figure out where best to put this failover feature: is it something the ALB / target group set-up can do for us? How would the central reverse proxy/proxies route the requests then? |
|
| 97 | +We now automate the failover of the archive server. The approach switches a `PRODUCTION_IP` variable within the macros file to point to either the `ARCHIVE_IP` or the `ARCHIVE_FAILOVER_IP`, depending on the status of the primary (checked via multiple curl requests). If a change is made, operators are notified and the config is reloaded. This only happens when the status actually changes: while the primary remains unhealthy, no further notification or reload occurs.
|
| 98 | 98 | |
| 99 | -Alternatively, we could look at other mechanisms for implementing the fail-over functionality. For example, Apache can be configured in "balancer" mode where failover rules can be specified explicitly. |
|
| 100 | - |
|
| 101 | -Alternatively, consider looking at Elastic Beanstalk. |
|
| 99 | +Note also that this approach is coupled to the archiving process that creates new event-archive macros, because that process causes an auto-pull. However, the script prioritises local state so as to maintain the archive failover function.
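
The switching logic described in this section can be sketched as follows. This is a minimal illustration, not the actual script: the function names are hypothetical, and treating the primary as healthy when at least one probe succeeds is an assumption, since the text only says that multiple curl requests are made.

```python
from typing import Iterable, Tuple


def select_production_ip(probe_results: Iterable[bool],
                         archive_ip: str,
                         failover_ip: str) -> str:
    """Pick the IP that PRODUCTION_IP should point to.

    Assumption: the primary counts as healthy if any probe succeeded.
    """
    return archive_ip if any(probe_results) else failover_ip


def failover_step(current_ip: str,
                  probe_results: Iterable[bool],
                  archive_ip: str,
                  failover_ip: str) -> Tuple[str, bool]:
    """Return the new PRODUCTION_IP and whether it differs from the current one.

    Notification and config reload would only be triggered when the
    second element is True, i.e. when the status actually changed.
    """
    new_ip = select_production_ip(probe_results, archive_ip, failover_ip)
    return new_ip, new_ip != current_ip
```

While the primary stays unhealthy, repeated calls return the failover IP with `changed` set to `False`, which matches the rule that no further notification or reload occurs.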
|
| 102 | 100 | |
| 103 | 101 | #### No good approach for dynamic scale-up |
| 104 | 102 |