8f24056a6c62d353165892438431f65fcef3f8cd
configuration/on-site-scripts/clone-security-service-db-safe-exit
| ... | ... | @@ -8,7 +8,7 @@ ssh ec2-user@tokyo-ssh.sapsailing.com "set -e; cd /tmp; rm -rf /tmp/dump; mongod |
| 8 | 8 | echo 'use security_service_bak |
| 9 | 9 | db.dropDatabase() |
| 10 | 10 | db.copyDatabase("security_service", "security_service_bak") |
| 11 | -quit()' | mongo "mongodb://localhost/security_service_bak?replicaSet=security_service&retryWrites=true&readPreference=nearest" && logger -t sailing "Succesfull, continuing..." || ( logger -t sailing "SEVERE: mongo finished with $?"; exit 1 ) |
|
| 11 | +quit()' | mongo "mongodb://localhost/security_service_bak?replicaSet=security_service&retryWrites=true&readPreference=nearest" && logger -t sailing "Successful, continuing..." || ( logger -t sailing "SEVERE: mongo finished with $?"; exit 1 ) |
|
| 12 | 12 | mongorestore --drop --host security_service/localhost && logger -t sailing "mongorestore finished with $?. Done cloning security_service DB from eu-west-1 live replica set to local tokyo2020 replica set." || ( logger -t sailing "SEVERE: mongorestore finished with $?. Aborting..."; echo 'use security_service |
| 13 | 13 | db.dropDatabase() |
| 14 | 14 | db.copyDatabase("security_service_bak", "security_service") |
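For reference, a minimal end-to-end sketch of the clone flow this script implements is shown below. The hunk header above is truncated, so the exact remote dump command, the dump transfer step, and option details are assumptions; only the backup and rollback snippets are taken from the hunk itself.

```
#!/bin/bash
# Sketch only; hostnames are from the hunk above, dump/transfer details are assumed.

# 1. Dump the live security_service DB on the remote side via the jump host.
ssh ec2-user@tokyo-ssh.sapsailing.com \
  "set -e; cd /tmp; rm -rf /tmp/dump; mongodump --db security_service --out /tmp/dump" \
  || { logger -t sailing "SEVERE: remote mongodump failed"; exit 1; }

# 2. Fetch the dump so mongorestore later finds it under ./dump (paths assumed).
cd /tmp && rm -rf dump && scp -r ec2-user@tokyo-ssh.sapsailing.com:/tmp/dump dump

# 3. Back up the local DB so it can be rolled back if the restore fails.
echo 'use security_service_bak
db.dropDatabase()
db.copyDatabase("security_service", "security_service_bak")
quit()' | mongo "mongodb://localhost/security_service_bak?replicaSet=security_service&retryWrites=true&readPreference=nearest" \
  && logger -t sailing "Successful, continuing..." \
  || { logger -t sailing "SEVERE: mongo finished with $?"; exit 1; }

# 4. Restore the dump; on failure, restore the local DB from the backup.
mongorestore --drop --host security_service/localhost \
  || echo 'use security_service
db.dropDatabase()
db.copyDatabase("security_service_bak", "security_service")
quit()' | mongo "mongodb://localhost/security_service?replicaSet=security_service"
```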
wiki/info/landscape/olympic-plan-for-paris-marseille-2024.md
| ... | ... | @@ -8,7 +8,7 @@ Many of these scripts and configuration files contain an explicit reference to t |
| 8 | 8 | |
| 9 | 9 | ## VPCs and VPC Peering |
| 10 | 10 | |
| 11 | -From Tokyo2020 we still have the VPCs around in five regions (``eu-west-3``, ``us-west-1``, ``us-east-1``, ``ap-northeast-1``, and ``ap-southeast-2``). But they are named ``Tokyo2020`` and our scripts currently depend on this. But VPCs can easily be renamed, and with that we may save a lot of work regarding re-peering those VPCs. We will, though need routes to the new "primary" VPC ``eu-west-3`` from everywhere because the ``paris2024-ssh.sapsailing.com`` jump host will be based there. Note the inconsistency in capitalization: for the VPC name and as part of instance names such as ``SL Tokyo2020 (Upgrade Replica)`` we use ``Tokyo2020``, for basically everything else it's ``tokyo2020`` (lowercase). When switching to a parameterized approach we should probably harmonize this and use the lowercase name consistently throughout. |
|
| 11 | +From Tokyo2020 we still have the VPCs around in five regions (``eu-west-3``, ``us-west-1``, ``us-east-1``, ``ap-northeast-1``, and ``ap-southeast-2``). They were named ``Tokyo2020`` and our scripts currently depend on this. But VPCs can easily be renamed, and with that we may save a lot of work re-peering those VPCs. We will, though, need routes to the new "primary" VPC ``eu-west-3`` from everywhere because the ``paris-ssh.sapsailing.com`` jump host will be based there. Note the inconsistency in capitalization: for the VPC name and as part of instance names such as ``SL Tokyo2020 (Upgrade Replica)`` we use ``Tokyo2020``; for basically everything else it's ``tokyo2020`` (lowercase). When switching to a parameterized approach we should probably harmonize this and use the lowercase name consistently throughout. |
|
| 12 | 12 | |
| 13 | 13 | I've started with re-naming the VPCs and their routing tables from ``Tokyo2020`` to ``Paris2024``. I've also added VPC peering between Paris (``eu-west-3``) and California (``us-west-1``), Virginia (``us-east-1``), and Sydney (``ap-southeast-2``). The peering between Paris and Tokyo (``ap-northeast-1``) already existed because for Tokyo 2020, Paris hosted replicas that needed to access the jump host in the Tokyo region. |
| 14 | 14 | |
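If we want to script these steps, the AWS CLI covers both the re-tagging and the peering; a rough sketch follows, with placeholder VPC, route table, peering connection, and CIDR values (none of these IDs are given in this document):

```
# Rename a Tokyo2020 VPC (and its routing table) to Paris2024; renaming is just
# re-tagging, so existing peerings stay intact. IDs are placeholders.
aws ec2 create-tags --region eu-west-3 \
  --resources vpc-0123456789abcdef0 rtb-0123456789abcdef0 \
  --tags Key=Name,Value=Paris2024

# Peer the "primary" eu-west-3 VPC with, e.g., us-west-1 and add a route so
# instances there can reach the jump host in eu-west-3.
aws ec2 create-vpc-peering-connection --region eu-west-3 \
  --vpc-id vpc-0123456789abcdef0 \
  --peer-vpc-id vpc-0fedcba9876543210 --peer-region us-west-1
aws ec2 accept-vpc-peering-connection --region us-west-1 \
  --vpc-peering-connection-id pcx-0123456789abcdef0
aws ec2 create-route --region us-west-1 \
  --route-table-id rtb-0fedcba9876543210 \
  --destination-cidr-block 10.0.0.0/16 \
  --vpc-peering-connection-id pcx-0123456789abcdef0
```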
| ... | ... | @@ -20,12 +20,43 @@ We will use one laptop as production master, the other as "shadow master." The r |
| 20 | 20 | |
| 21 | 21 | Both laptops shall run their local RabbitMQ instance. Each of the two master processes can optionally write into its local RabbitMQ through an SSH tunnel which may instead redirect to the cloud-based RabbitMQ for an active Internet/Cloud connection. |
| 22 | 22 | |
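A minimal sketch of such a tunnel, assuming the master process always connects to a fixed local port (5673 here, an arbitrary choice) while the tunnel target decides whether writes end up in the laptop's own broker or in the cloud broker (standard AMQP port 5672 assumed):

```
# Offline variant: forward the fixed local port to the laptop's own RabbitMQ.
ssh -N -L 5673:localhost:5672 localhost &

# Online variant: redirect the same local port to the cloud RabbitMQ via the jump host.
ssh -N -L 5673:rabbit-eu-west-3.sapsailing.com:5672 ec2-user@paris-ssh.sapsailing.com &
```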
| 23 | -This will require to set up two MongoDB databases (not separate processes, just different DB names). |
|
| 23 | +This will require setting up two MongoDB databases (not separate processes, just different DB names), e.g., "paris2024" and "paris2024-shadow". Note that for the shadow master this means the DB name does not follow the typical naming convention where the ``SERVER_NAME`` property ("paris2024" for both the primary and the shadow master) is also used as the default MongoDB database name. |
|
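Expressed as hypothetical configuration fragments for the two laptops, assuming the database is selected via the database name in ``MONGODB_URI`` while ``SERVER_NAME`` stays identical; URI options and replica set naming for the local MongoDB installations are left open here:

```
# Production master laptop: DB name follows the usual SERVER_NAME convention.
SERVER_NAME=paris2024
MONGODB_URI="mongodb://localhost/paris2024?retryWrites=true&readPreference=nearest"

# Shadow master laptop: same SERVER_NAME, but deviating DB name "paris2024-shadow".
SERVER_NAME=paris2024
MONGODB_URI="mongodb://localhost/paris2024-shadow?retryWrites=true&readPreference=nearest"
```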
| 24 | 24 | |
| 25 | 25 | Note: The shadow master must have at least one registered replica because otherwise it would not send any operations into the RabbitMQ replication channel. This can be a challenge for a shadow master that has never seen any replica. We could, for example, simulate a replica registration when the shadow master is still basically empty, using, e.g., a CURL request and then ignoring and later deleting the initial load queue on the local RabbitMQ. |
| 26 | 26 | |
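Inspecting and removing that initial load queue can be done with standard RabbitMQ tooling on the laptop; the queue name below is purely illustrative, the real name depends on what the replication code creates:

```
# Find the initial-load queue created by the simulated replica registration.
rabbitmqctl list_queues name messages

# Purge and delete it once it is no longer needed (queue name is hypothetical).
rabbitmqctl purge_queue paris2024-shadow-initial-load
rabbitmqctl delete_queue paris2024-shadow-initial-load
```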
| 27 | 27 | Furthermore, the shadow master must not send into the production RabbitMQ replication channel that is used by the production master instance while it is not in production itself, because it would duplicate the operations sent. Instead, the shadow master shall use a local RabbitMQ instance to which an SSH tunnel forwards. |
| 28 | 28 | |
| 29 | +## Cloud RabbitMQ |
|
| 30 | + |
|
| 31 | +Instead of ``rabbit-ap-northeast-1.sapsailing.com`` we will use ``rabbit-eu-west-3.sapsailing.com``, pointing to the internal IP address of the RabbitMQ installation in ``eu-west-3``, which will be used as the default for the on-site master processes as well as for all cloud replicas. |
|
| 32 | + |
|
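The DNS side of this is a single record; a sketch with the Route 53 CLI, using a placeholder hosted zone ID and a placeholder internal IP address:

```
# Point rabbit-eu-west-3.sapsailing.com at the internal IP of the eu-west-3
# RabbitMQ host. Zone ID and IP address are placeholders.
aws route53 change-resource-record-sets --hosted-zone-id Z0123456789EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "rabbit-eu-west-3.sapsailing.com",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [{"Value": "10.0.0.42"}]
      }
    }]
  }'
```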
| 33 | +## ALB and Target Group Set-Up |
|
| 34 | + |
|
| 35 | +As for Tokyo2020, a separate ALB for the Paris2024 event will be set up in each of the supported regions. They will all be registered with the Global Accelerator to whose anycast IP addresses the DNS alias record for ``paris2024.sapsailing.com`` will point. Different from Tokyo2020, where we used a static "404 - Not Found" rule as the default rule for all of these ALBs, we can and should use an IP-based target group for the default rule's forwarding and should register the ``eu-west-1`` "Webserver" (Central Reverse Proxy)'s internal IP address in these target groups. This way, when archiving the event, cached DNS records can still resolve to the Global Accelerator, from there to the ALB(s), and from there, via these default rules, back to the central reverse proxy, which then should know where to find the ``paris2024.sapsailing.com`` content in the archive. |
|
| 36 | + |
|
| 37 | +Target group naming conventions have changed slightly since Tokyo2020: instead of ``S-ded-tokyo2020`` we will use only ``S-paris2024`` for the public target group containing all the cloud replicas. |
|
| 38 | + |
|
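A sketch of creating and filling such an IP-based default target group with the AWS CLI; the target group name, VPC ID, target group ARN, and the central reverse proxy's internal IP are placeholders (IP targets that live outside the ALB's own VPC have to be registered with ``AvailabilityZone=all``):

```
# Create the IP-based target group used by each regional ALB's default rule.
aws elbv2 create-target-group --region eu-west-3 \
  --name paris2024-default-central-proxy \
  --protocol HTTP --port 80 --target-type ip \
  --vpc-id vpc-0123456789abcdef0

# Register the eu-west-1 central reverse proxy's internal IP; IPs outside the
# ALB's own VPC must be registered with AvailabilityZone=all.
aws elbv2 register-targets --region eu-west-3 \
  --target-group-arn arn:aws:elasticloadbalancing:eu-west-3:123456789012:targetgroup/paris2024-default-central-proxy/abc123 \
  --targets Id=10.1.2.3,AvailabilityZone=all
```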
| 39 | +## Cloud Replica Set-Up |
|
| 40 | + |
|
| 41 | +Based on the cloud replica set-up for Tokyo2020 we can derive the following user data for Paris2024 cloud replicas: |
|
| 42 | + |
|
| 43 | +``` |
|
| 44 | +INSTALL_FROM_RELEASE=build-............. |
|
| 45 | +SERVER_NAME=paris2024 |
|
| 46 | +MONGODB_URI="mongodb://localhost/paris2024-replica?replicaSet=replica&retryWrites=true&readPreference=nearest" |
|
| 47 | +USE_ENVIRONMENT=live-replica-server |
|
| 48 | +REPLICATION_CHANNEL=paris2024-replica |
|
| 49 | +REPLICATION_HOST=rabbit-eu-west-3.sapsailing.com |
|
| 50 | +REPLICATE_MASTER_SERVLET_HOST=paris-ssh.internal.sapsailing.com |
|
| 51 | +REPLICATE_MASTER_SERVLET_PORT=8888 |
|
| 52 | +REPLICATE_MASTER_EXCHANGE_NAME=paris2024 |
|
| 53 | +REPLICATE_MASTER_QUEUE_HOST=rabbit-eu-west-3.sapsailing.com |
|
| 54 | +REPLICATE_MASTER_BEARER_TOKEN="***" |
|
| 55 | +ADDITIONAL_JAVA_ARGS="${ADDITIONAL_JAVA_ARGS} -Dcom.sap.sse.debranding=true" |
|
| 56 | +``` |
|
| 57 | + |
|
| 58 | +Make sure to align the ``INSTALL_FROM_RELEASE`` parameter with the release used on site. |
|
| 59 | + |
|
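When launching such a replica by hand rather than through existing tooling, the block above can be passed as EC2 user data; a sketch with placeholder AMI, instance type, subnet, and security group IDs:

```
# userdata.txt contains the parameter block above, with INSTALL_FROM_RELEASE
# set to the release actually used on site.
aws ec2 run-instances --region eu-west-3 \
  --image-id ami-0123456789abcdef0 \
  --instance-type m5.2xlarge \
  --subnet-id subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0 \
  --user-data file://userdata.txt \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=SL-Paris2024-Replica}]'
```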
| 29 | 60 | ## Switching |
| 30 | 61 | |
| 31 | 62 | ### Production Master Failure |
| ... | ... | @@ -59,3 +90,9 @@ This can be caused by a deadlock, VM crash, Full GC phase, massive performance d |
| 59 | 90 | ### Test Primary Master Failures with no Internet Connection |
| 60 | 91 | |
| 61 | 92 | Combine the above scenarios: a failing production master (hardware or VM-only) will require different tunnel re-configurations, especially regarding the then local security-service.sapsailing.com environment which may need to move to the shadow laptop. |
| 93 | + |
|
| 94 | +## TODO Before / During On-Site Set-Up (Both Test Event and OSG2024) |
|
| 95 | + |
|
| 96 | +* Add SSH public keys for the password-less private keys of ``sap-p1-1`` and ``sap-p1-2`` to ``ec2-user@paris-ssh.sapsailing.com:.ssh/authorized_keys.org`` so that when the authorized_keys file is updated automatically, the on-site keys are still preserved (see the sketch after this list). |
|
| 97 | +* Create LetsEncrypt certificates for the NGINX installations for paris2024.sapsailing.com and security-service.sapsailing.com and install them to the two on-site laptops' NGINX environments (see the sketch after this list) |
|
| 98 | +* Ensure the MongoDB installations on both laptops use |
|
| ... | ... | \ No newline at end of file |
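A rough sketch for the first two TODO items above; the public key file names and the certbot challenge/plugin choice are assumptions, not prescribed anywhere in this document:

```
# Append the laptops' password-less public keys to the jump host's
# authorized_keys.org template so the automatic update keeps them
# (key file names sap-p1-1.pub / sap-p1-2.pub are assumed).
cat sap-p1-1.pub sap-p1-2.pub | \
  ssh ec2-user@paris-ssh.sapsailing.com 'cat >> ~/.ssh/authorized_keys.org'

# Request one LetsEncrypt certificate covering both on-site host names
# (DNS-01 via the Route 53 plugin is an assumption; any supported challenge works).
certbot certonly --dns-route53 \
  -d paris2024.sapsailing.com \
  -d security-service.sapsailing.com
# Copy the resulting fullchain.pem/privkey.pem from /etc/letsencrypt/live/...
# into both laptops' NGINX configurations and reload NGINX.
```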
wiki/info/landscape/olympic-setup.md
| ... | ... | @@ -552,7 +552,7 @@ REPLICATE_MASTER_BEARER_TOKEN="***" |
| 552 | 552 | ADDITIONAL_JAVA_ARGS="${ADDITIONAL_JAVA_ARGS} -Dcom.sap.sse.debranding=true" |
| 553 | 553 | ``` |
| 554 | 554 | |
| 555 | -(Adjust the release accordingly, of course). |
|
| 555 | +(Adjust the release accordingly, of course.) (NOTE: During the first production days of the event we noticed that it was really a BAD IDEA to have all replicas use the same DB set-up, all writing to the MongoDB PRIMARY of the "live" replica set in eu-west-1. With tens of replicas running concurrently, this led to a massive backlog because MongoDB could not write fast enough. This gave rise to a new application server AMI which now includes a MongoDB set-up, using "replica" as the MongoDB replica set name. Each replica can now write into its own MongoDB instance, isolated from all others and scaling linearly.) |
|
| 556 | 556 | |
| 557 | 557 | In other regions, instead an instance-local MongoDB shall be used for each replica, not interfering with each other or with other databases: |
| 558 | 558 |