Recover Discourse from a backup, adjust domain name

Roughly based on the “Move your Discourse instance to a different server” post, but without using the web UI, because sysadmins prefer the terminal.

IMPORTANT: Make SURE the backup dump was generated from the same Discourse version as the one you’ll import it into.

Copy the backup from the host into the shared folder. Assume you uploaded it via SSH to your home directory on the Docker host.

cp ~/snapshot.tar.gz /srv/webplatform/shared/backups/

Note that the folder /srv/webplatform/discuss/shared/standalone/backups/ on the Docker host ends up being mounted at /shared/backups/ inside the container.

Enter the app container, then enable restores through the discourse CLI utility:

./launcher enter app
discourse enable_restore

Find the mounted backup file from within the container

ll /shared/backups/default/

Make sure /shared/backups/default/foo.tar.gz is readable, run the restore, then disable restores again:

chmod o+r /shared/backups/default/foo.tar.gz
discourse restore foo.tar.gz
discourse disable_restore

Remap domain name

discourse remap discourse.specifiction.org discuss.webplatform.org

Then, clean up user uploads and regenerate assets. That’ll make sure orphaned uploads are removed and every post gets rebaked so it references the new domain name:

rake uploads:clean_up
rake posts:rebake

Add OpenStack instance meta-data info in your salt grains

During a work session on my salt-states for WebPlatform.org, I wanted to be able to query the OpenStack cluster meta-data so that I could adjust my Salt configuration more efficiently.

What are grains? Grains are structured data that describe what a minion has, such as which version of GNU/Linux it’s running, what its network adapters are, etc.

The following is a Python script that adds data to Salt Stack’s internal data store, called grains.

I have to confess that I didn’t write the script, but adapted it to work within an OpenStack cluster, more precisely on DreamHost’s DreamCompute cluster. The original script came from saltstack/salt-contrib; the original file was ec2_info.py, meant to read data from EC2.

The original script wasn’t getting any data on this cluster, most likely due to API changes and because the EC2 API exposes dynamic meta-data that the DreamCompute/OpenStack cluster doesn’t.

In the end, I edited the file to make it work on DreamCompute and also dropped some data that the grains subsystem already provides.

My original objective was to get the list of security groups the VM was assigned to. Unfortunately the API doesn’t give out that information yet. Hopefully I’ll find a way to get it some day.

Get OpenStack instance detail using Salt

Locally

salt-call grains.get dreamcompute:uuid
local:
    10a4f390-7c55-4dd3-0000-a00000000000

Or for another machine

salt app1 grains.get dreamcompute:uuid
app1:
    510f5f24-217b-4fd2-0000-f00000000000

What size (flavor) did we create a particular VM with?

salt app1 grains.get dreamcompute:instance_type
app1:
    lightspeed

What data you can get

Here is a sample of the grain data that will be added to every salt minion you manage.

You might notice that some data is repeated, such as the hostname, but the rest can be very useful if you want to use the data within your configuration management.

dreamcompute:
    ----------
    availability_zone:
        iad-1
    block_device_mapping:
        ----------
        ami:
            vda
        ebs0:
            /dev/vdb
        ebs1:
            vda
        root:
            /dev/vda
    hostname:
        salt.novalocal
    instance_action:
        none
    instance_id:
        i-00000000
    instance_type:
        lightspeed
    launch_index:
        0
    local_ipv4:
        10.10.10.11
    name:
        salt
    network_config:
        ----------
        content_path:
            /content/0000
        name:
            network_config
    placement:
        ----------
        availability_zone:
            iad-1
    public_ipv4:
        203.0.113.11
    public_keys:
        ----------
        someuser:
            ssh-rsa ...an rsa public key... [email protected]
    ramdisk_id:
        None
    reservation_id:
        r-33333333
    security-groups:
        None
    uuid:
        10a4f390-7c55-4dd3-0000-a00000000000

What does the script do?

The script basically scrapes the OpenStack meta-data service and serializes the data it gets into the Salt Stack grains system.

OpenStack’s meta-data service is similar to what you’d get from AWS, but doesn’t expose exactly the same data. This is why I had to adapt the original script.

To get data from an instance, you simply (really!) need to make an HTTP call to an internal IP address that OpenStack nova answers on.

For example, from an AWS/OpenStack VM, you can get the instance hostname by doing:

curl http://169.254.169.254/latest/meta-data/hostname
salt.novalocal
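
The grain script does essentially the same thing from Python. Roughly, the call looks like the following sketch, written with Python 3’s http.client (the actual script uses Python 2’s httplib, whose API is nearly identical):

import http.client

# OpenStack nova (like EC2) answers on this link-local address from inside the VM.
conn = http.client.HTTPConnection("169.254.169.254", 80, timeout=1)
conn.request("GET", "/latest/meta-data/hostname")
response = conn.getresponse()
print(response.status, response.read().decode("utf-8"))  # e.g. 200 salt.novalocal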

To see which endpoints the script calls, you can add a log line in the _call_aws(url) method, like so:

diff --git a/_grains/dreamcompute.py b/_grains/dreamcompute.py
index 682235d..c3af659 100644
--- a/_grains/dreamcompute.py
+++ b/_grains/dreamcompute.py
@@ -25,6 +25,7 @@ def _call_aws(url):

     """
     conn = httplib.HTTPConnection("169.254.169.254", 80, timeout=1)
+    LOG.info('API call to ' + url )
     conn.request('GET', url)
     return conn.getresponse()

When you run saltutil.sync_all (i.e. refresh grains and other data), the log will tell you which endpoints were queried.

In my case they were:

[INFO    ] API call to /openstack/2012-08-10/meta_data.json
[INFO    ] API call to /latest/meta-data/
[INFO    ] API call to /latest/meta-data/block-device-mapping/
[INFO    ] API call to /latest/meta-data/block-device-mapping/ami
[INFO    ] API call to /latest/meta-data/block-device-mapping/ebs0
[INFO    ] API call to /latest/meta-data/block-device-mapping/ebs1
[INFO    ] API call to /latest/meta-data/block-device-mapping/root
[INFO    ] API call to /latest/meta-data/hostname
[INFO    ] API call to /latest/meta-data/instance-action
[INFO    ] API call to /latest/meta-data/instance-id
[INFO    ] API call to /latest/meta-data/instance-type
[INFO    ] API call to /latest/meta-data/local-ipv4
[INFO    ] API call to /latest/meta-data/placement/
[INFO    ] API call to /latest/meta-data/placement/availability-zone
[INFO    ] API call to /latest/meta-data/public-ipv4
[INFO    ] API call to /latest/meta-data/ramdisk-id
[INFO    ] API call to /latest/meta-data/reservation-id
[INFO    ] API call to /latest/meta-data/security-groups
[INFO    ] API call to /openstack/2012-08-10/meta_data.json
[INFO    ] API call to /latest/meta-data/
[INFO    ] API call to /latest/meta-data/block-device-mapping/
[INFO    ] API call to /latest/meta-data/block-device-mapping/ami
[INFO    ] API call to /latest/meta-data/block-device-mapping/ebs0
[INFO    ] API call to /latest/meta-data/block-device-mapping/ebs1
[INFO    ] API call to /latest/meta-data/block-device-mapping/root
[INFO    ] API call to /latest/meta-data/hostname
[INFO    ] API call to /latest/meta-data/instance-action
[INFO    ] API call to /latest/meta-data/instance-id
[INFO    ] API call to /latest/meta-data/instance-type
[INFO    ] API call to /latest/meta-data/local-ipv4
[INFO    ] API call to /latest/meta-data/placement/
[INFO    ] API call to /latest/meta-data/placement/availability-zone
[INFO    ] API call to /latest/meta-data/public-ipv4
[INFO    ] API call to /latest/meta-data/ramdisk-id
[INFO    ] API call to /latest/meta-data/reservation-id
[INFO    ] API call to /latest/meta-data/security-groups

It’s quite heavy.

Hopefully the script respects HTTP caching headers and doesn’t ignore 304 Not Modified responses; otherwise it’ll add load to nova. Maybe I should check that (note to self).
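
I haven’t verified that yet, but here is a small standalone check one could run from a VM to see whether the meta-data service even offers cache validators. This is a Python 3 sketch of my own, not part of the grain script, and the hostname endpoint is just an arbitrary example:

import http.client

HOST = "169.254.169.254"
PATH = "/latest/meta-data/hostname"

# First request: see whether the service sends any cache validators at all.
conn = http.client.HTTPConnection(HOST, 80, timeout=2)
conn.request("GET", PATH)
first = conn.getresponse()
first.read()
etag = first.getheader("ETag")
last_modified = first.getheader("Last-Modified")
print("ETag:", etag, "Last-Modified:", last_modified)

# Second request: if validators exist, a conditional GET should answer 304.
conditional = {}
if etag:
    conditional["If-None-Match"] = etag
if last_modified:
    conditional["If-Modified-Since"] = last_modified

if conditional:
    conn = http.client.HTTPConnection(HOST, 80, timeout=2)
    conn.request("GET", PATH, headers=conditional)
    print("Conditional request status:", conn.getresponse().status)
else:
    print("No cache validators offered; every sync re-downloads the data.")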

Install

You can add this feature by dropping a file into the _grains/ folder of your salt states repository. The file can have any name ending in .py.

You can grab the grain python code in this gist.
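
If you just want the shape of such a module before grabbing the full file, here is a minimal sketch of my own (written against Python 3’s standard library, whereas the real script targets the Python 2 httplib of the time). A custom grain is simply a Python module in _grains/ whose top-level functions return a dict, and Salt merges that dict into the minion’s grains. The file name matches the diff shown earlier, the dreamcompute() function name is my choice, and the three keys are just a subset of what the full script collects:

# _grains/dreamcompute.py -- minimal sketch, not the full script
import json
import logging
import urllib.request

LOG = logging.getLogger(__name__)

# Same OpenStack endpoint the full script ends up querying.
METADATA_URL = "http://169.254.169.254/openstack/2012-08-10/meta_data.json"


def dreamcompute():
    """Expose a few OpenStack instance meta-data fields under the dreamcompute: key."""
    try:
        with urllib.request.urlopen(METADATA_URL, timeout=2) as response:
            data = json.loads(response.read().decode("utf-8"))
    except Exception as err:  # a meta-data hiccup should never break grain loading
        LOG.warning("Could not reach the meta-data service: %s", err)
        return {}

    return {"dreamcompute": {key: data.get(key)
                             for key in ("uuid", "name", "availability_zone")}}

Once the file is in place, run salt '*' saltutil.sync_grains (or the saltutil.sync_all mentioned earlier) so every minion picks it up.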

enjoy!

Run a NodeJS process through forever from within a Docker container

One of the components I had to manage, Publican, has many moving parts. The end product of that component is basically static HTML documents that end up on specs.webplatform.org.

Since we need many packages installed in very specific versions, and automating the installation wouldn’t bring any more benefit than being self-contained, I thought it would be best to go through the steps of converting it into a Docker container.

The following is a procedure I wrote to teach my colleague, Robin Berjon, how to run his system called Publican from within a Docker container. Publican is basically a GitHub hook listener that generates specs written to be parsed by ReSpec or Bikeshed.

Run publican inside Docker

What this’ll do is basically build a VM that’ll run a Docker container. The container will write its files outside of itself.

You’ll quickly notice that the paths look the same inside and outside the container; it’s confusing, sorry about that. Fortunately for us, the paths in the procedure are the ones that will be mounted through a Docker volume (the -v option when you call docker) and will, in the end, be the same files.

Once you have a Docker container running on a VM, it’ll replicate how a production VM will run the tasks. Since we know where the container will write its files, we’ll have our frontend servers forward requests to Publican and serve the files it generated.

Doing all this removes the need for any rsync. NGINX on the VM that runs Docker will take care of serving static files, and the frontend server will expose them to the public.

Steps

  1. Have Vagrant and VirtualBox installed

  2. Follow what’s in renoirb/salt-basesystem README.md

  • Make sure you follow the Vagrant Sandbox utilities part

        vagrant ssh
        sudo salt-call state.highstate
        sudo salt-call state.sls vagrantsandbox.docker
        exit
    
  • Reboot the VM by doing vagrant reload

        vagrant reload
    
  3. No need to follow what’s in the webplatform/publican DOCKER.md file. Those are notes showing how to build the container. For this time, we’ll use a container I already built and pushed to Docker Hub!

  4. Set up what’s required to run the container

        vagrant ssh
    
  • Switch to the webapps user;

        sudo -s
        su webapps
        id
    
  • You should see

        uid=990(webapps) gid=990(webapps) groups=990(webapps),33(www-data),998(docker)
    
  • Prepare the folders

        cd /srv/webapps
        mkdir -p publican/data
        cd publican
    
    • If all went well so far, you should be able to run docker ps as the webapps user. Otherwise, reboot and/or run salt-call again with both the state.highstate and state.sls vagrantsandbox.docker states; there should be nothing left to do.

      docker ps
      CONTAINER ID        IMAGE                   COMMAND                CREATED...
      
  • Pull the publican Docker image I built (it’ll take about 10 minutes)

        docker pull webspecs/publican:wip
    
  5. Copy the other files in this Gist to the local computer where you cloned the salt-basesystem repository. From that folder you can move them inside the Vagrant VM where you need them.
  • Copy publican config

        cp /vagrant/config.json data/
    
  • Download the Bikeshed data (I haven’t figured out yet which parts are important to keep) and extract it into /srv/webapps/publican/spec-data/

        wget https://renoirboulanger.com/spec-data.tar.bz2
        tar xfj spec-data.tar.bz2
    
  • You can open up another terminal session and connect to the Vagrant VM with vagrant ssh (e.g. if you don’t use tmux or screen)

        mkdir -p spec-data/readonly/
        mkdir -p data/{gits,logs,publish,queue,temp}
    
  6. Run the container

        docker run -it --rm -v "$(pwd)/data":/srv/webapps/publican/data \
                   -v "$(pwd)/spec-data":/opt/bikeshed/bikeshed/spec-data \
                   -p 7002:7002 webspecs/publican:wip
    
  • If you see the following, you’re in the Docker container!!

        webapps@<container id>:~$
    
  • Initialize the empty shell we just created (it’ll create stuff in the data/ folder, outside of the container)

        publican.js init
    
  • It should look like this

    (screenshot: publican-init)

  • Once done, exit the container. Notice that by doing this, you lose the state of the container and anything that was written inside it. But since we use volumes (notice the -v /host/path:/container/path), we actually wrote outside of the container.

  • We can exit the container

        exit
    
  • At this stage, we had Publican and Bikeshed generate their files (we may call this a “cache warmup” of sorts). Now, let’s prepare the Vagrant VM to serve the static content. Notice that the next commands are there only for the purpose of a local workspace; in production this step will also be managed automatically.

  • Let’s get back to the root user and set up a quick web server;

        exit
        apt-get -yqq install nginx
        mv /vagrant/default.conf /etc/nginx/sites-available/default
        service nginx restart
    
  • Let’s switch back to the webapps user and launch the runner

        su webapps
        cd /srv/webapps/publican/
    
  • Launch the container; this will also be managed automatically in production.

        docker run -it --rm -v "$(pwd)/data":/srv/webapps/publican/data \
                       -v "$(pwd)/spec-data":/opt/bikeshed/bikeshed/spec-data \
                       -p 7002:7002 webspecs/publican:wip bin/run.sh
    

    It should look like this

    (screenshot: publican-run-hook)

  • Get your Vagrant VM IP address

        ifconfig
    
  • It should start with 172… or 192…; point a browser at that address

Gists

Here are the files mentioned in this post

config.json

Publican expects this file as data/config.json.

{
    "bikeshed":     "/opt/bikeshed/bikeshed.py"
,   "rsyncPath":    "/srv/webapps/publican/"
,   "python":       "python2"
,   "logFile":      "logs/all.log"
,   "email":        {
        "to":       "[email protected]"
    ,   "from":     "[email protected]"
    ,   "host":     "localhost"
    ,   "level":    "error"
    ,   "handleExceptions": true
    }
,   "purgeAllURL":  "https://api.fastly.com/service/fooo/purge_all"
,   "purgeAllKey":  "baar"
}

default.conf

A minimal NGINX server block serving the static content that Publican generates.

# file: /etc/nginx/sites-enabled/default

server {
  listen 80 default_server;
  root /srv/webapps/publican/data/publish;
  index index.html index.htm;
  server_name localhost;
  location / { try_files $uri $uri/ =404; }
}
Dockerfile

Here is the project’s Dockerfile as I created it. I think it should be smaller, but Publican works with the following Dockerfile.

Each step in a Dockerfile creates a layer (a “commit”); make sure you have as few of them as possible, and also make sure that you clean up after yourself. Remember that a Docker container is meant to be re-deployable, and the smaller the image, the better!

Notice a few details;

  • ENV DEBIAN_FRONTEND=noninteractive helps with dialogs
  • USER webapps tells from which point on the rest of the script will run commands as a different user than root. Make sure whatever requires root is done before that line!
  • COPY ... this is basically how you import content into the container (i.e. it makes the image heavier)
#
# Publican Docker runner
#
# See also:
#   * https://github.com/nodesource/docker-node/blob/master/ubuntu/trusty/node/0.10.36/Dockerfile

FROM nodesource/trusty:0.10.36

MAINTAINER Renoir Boulanger <[email protected]>

ENV DEBIAN_FRONTEND=noninteractive

# Dependencies: Bikeshed, PhantomJS, Bikeshed’s lxml
RUN apt-get update && apt-get -y upgrade && \
    apt-get install -yqq git python2.7 python-dev python-pip libxslt1-dev libxml2-dev zlib1g-dev && \
    apt-get install -yqq libfontconfig1 libfreetype6 curl && \
    apt-get autoremove -yqq --purge && \
    pip install --upgrade lxml

# Copy everything we have locally into the container
# REMINDER: Make sure you run `make clone-bikeshed`, we prefer to keep a copy locally outside
# of the data volume. Otherwise it would make problems saying that bikeshed clone is not in the
# same filesystem.
COPY . /srv/webapps/publican/

# Make sure we have a "non root" user and
# delete any local workbench data/ directory
RUN /usr/sbin/groupadd --system --gid 990 webapps && \
    /usr/sbin/useradd --system --gid 990 --uid 990 -G sudo --home-dir /srv/webapps --shell /bin/bash webapps && \
    sed -i '/^%sudo/d' /etc/sudoers && \
    echo '%sudo ALL=NOPASSWD: ALL' >> /etc/sudoers && \
    mv /srv/webapps/publican/bikeshed /opt && \
    rm -rf data && \
    mkdir -p data/temp && \
    rm -rf Dockerfile Makefile .git .gitignore DOCKER.md && \
    chown -R webapps:webapps /srv/webapps/publican && \
    chown -R webapps:webapps /opt/bikeshed

# Switch from root to webapps system user
# It **HAS to be** the SAME uid/gid as the owner on the host of the directory we’ll use as a volume
USER webapps

# Where the session will start from
WORKDIR /srv/webapps/publican

# Environment variables
ENV PATH /srv/webapps/publican/node_modules/.bin:/srv/webapps/publican/bin:/srv/webapps/publican/.local/bin:$PATH
ENV HOME /srv/webapps/publican
ENV TMPDIR /srv/webapps/publican/data/temp
ENV NODE_ENV production
ENV GIT_DISCOVERY_ACROSS_FILESYSTEM true

# Run what `make deps` would do
RUN pip install --upgrade --user --editable /opt/bikeshed && \
    mkdir -p node_modules && npm install

# Declare which port we expect to expose
EXPOSE 7002

# Allow cli entry for debug, but make sure docker-compose.yml uses "command: bin/run.sh"
ENTRYPOINT ["/bin/bash"]

# Note leftover: Ideally, it should exclusively run
#ENTRYPOINT ["/bin/bash", "/srv/webapps/publican/bin/run.sh"]

# Note leftover: What it ends up doing
#CMD ["node_modules/forever/bin/forever", "--fifo", "logs", "0"]

Forever start script

As you may notice in the Docker run command, I call a file, bin/run.sh; here it is.

docker run -it --rm -p 7002:7002 \
           webspecs/publican:latest bin/run.sh

Publican runs its process using Forever. The objective of Forever is to keep a process running at all times.

While this isn’t ideal for NodeJS services in general, in the present use case of a Docker container whose only purpose is to run that process, Forever is apt for the job!

#!/bin/bash
# Start Publican under Forever, then keep streaming its logs so the
# container keeps a foreground process and never exits.

export RUNDIR="/srv/webapps/publican"

cd $RUNDIR

# Start the server as a daemonized Forever process...
node_modules/forever/bin/forever start $RUNDIR/bin/server.js

# ...then tail the logs of process index 0 in the foreground.
node_modules/forever/bin/forever --fifo logs 0

More to come

I have more notes to put up, but not enough time to give more context. Come back later for more!

Project idea: Creating a home-made OpenStack cluster for development purposes

Think about it. How about using spare computers to create a home-made OpenStack cluster for development?

We could do that with our cloud provider, create a separate project, or even use Wikimedia’s OpenStack infrastructure allowance for the project.

With such a setup, one could work locally on their Salt Stack (or Puppet, or Ansible) deployment schemes: try them, trash the VMs, rebuild.

The beauty of it is that it could be done in a fashion that would not even modify the computer running the VMs. The cluster member running the OpenStack hypervisor would be seeded through net boot. Not booting from the network would revert the computer back to as if it had never been used.

Here is what I think it would require to make this happen.

Limitations

  • Do not use the computer/laptop’s local hard drive
  • Rely only on net boot

Material

  • 1..n Computers/laptop supporting netboot
  • 1 Storage device supporting one or more storage protocol (nfs, samba, sshfs)

Hardware requirements

  • 1 VM providing tftp, dhcp, and dns to serve as the net boot server, which should run outside of the cluster (“Networking node”)
  • 1 VM image of OpenStack controller (“OpS controller”)
  • 1 LiveCD+persistent image with OpenStack preinstalled, configured to use the storage device (using its credentials) as its root filesystem (“OpS Hypervisor”)

Distribution choice factors

  • The networking node could be the smallest Linux possible, on a Raspberry Pi, a modified router, or a network-attached storage device?
  • The OpS hypervisor should be among the supported OpenStack distributions (I think Ubuntu Precise 12.04 LTS, or perhaps a variant such as Puppy Linux, might work too)

To be continued…

I will keep you posted whenever possible on the outcome of my research.

Did you ever do this in your infra? Leave a comment.

Finally! I redid my website

After several months of having part of the project done and then left sitting there, I finally put it online.

It seems to be the season for personal site redesigns; a lot of people have redone theirs.

It’s not perfect. It’s not final. But it’s usable and in better shape than the previous version.

And yet, I had already had something ready for quite a while. But I wanted to go bigger!

As I was saying, the front-end integration was already done with my Yeoman (Grunt) workspace, the RoughDraft.js patterns and Twitter Bootstrap; everything was ready in static form (see the styleguide).

Before publishing, I also wanted more;

Then at some point I told myself:

I just want to publish!

Two evenings of work later, everything was integrated into WordPress!

It’s a life lesson: learn by using new techniques, but it’s also important to ship.

Even for personal projects.

Features

  • Writing in Markdown format
  • Served over SSL only
  • Content caching with Memcached
  • Posts marked up with the RDFa micro-format

Tools used

  • A child theme of Roots, adapted
  • Twitter Bootstrap, with my own theme using LESS

I also created a static mockup specifically for the HTML markup, all with the goal of concentrating on the web-integration aspect of the project, apart from the blog, database, or server parts.

For those interested, I started using Yeoman and RoughDraft.js intensively.

See the static styleguide of my website

Coming up

Now that the site is online, I have already planned other things to improve it.

Notably:

  • Migrating the web server to NGINX with the SPDY module (instead of Apache)
  • HTTP caching only (Varnish or Memcached, to be determined)
  • A few progressively loaded JavaScript effects
  • Rereading and correcting all the content (this post hasn’t been revised, I wrote it quickly)
  • Adding new views

Thanks

I wanted to give a big thank-you to Gabriela Viana, who is responsible for the design of my site.

I have decided to stop participating on the W3Qc board of directors

I am announcing my resignation from the W3Québec board of directors and from my related duties.

The reason for my resignation is to allow me to concentrate on professional and personal matters. Besides the fact that I have started university courses, I have also decided to lighten my commitments, because I haven’t been able to keep up with all my past engagements and I want peace of mind about the level of my performance by reducing them.

I would like to thank all the members, because I really enjoyed being part of the W3Qc board of directors and I keep fond memories of it.