Building a Simple Web App Part 4 of N : Nginx/Apache

Now we get to the servers that the browser actually connects to. Like a lot of guides online, I’ve chosen to put Django behind nginx, so that nginx serves all the static files and gunicorn serves the Django app.

The twist is that I’m also proxying nginx behind Apache, because the box I’m running all this on already runs an Apache server with virtual hosts for multiple domains.

This isn’t really a recommended setup, but it was convenient for me and I learned a few things.

Apache

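Apache can be set up as a reverse proxy by putting the following in your VirtualHost block. It needs mod_proxy and mod_proxy_http enabled for the ProxyPass lines, and mod_headers for the RequestHeader lines.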

ProxyPreserveHost On
ProxyPass / http://127.0.0.1:8000/
ProxyPassReverse / http://127.0.0.1:8000/

# forward info about the request upstream
RequestHeader set "X-Forwarded-Proto" expr=%{REQUEST_SCHEME}
RequestHeader set "X-Forwarded-SSL" expr=%{HTTPS}

This config passes requests for the virtual host on to the upstream server at http://127.0.0.1:8000/. Since I’m not currently running the upstream nginx server with SSL, plain http works fine here.
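The forwarded headers only matter if something upstream reads them. Here is a minimal sketch of how the Django settings might consume them, assuming Django is the thing that needs to know whether the original request was https. The domain name is a placeholder, and none of this is taken from my actual settings file.

```python
# settings.py (fragment): a sketch under stated assumptions

# Apache's "RequestHeader set X-Forwarded-Proto ..." overwrites whatever the
# client sent, so Django can trust this header to decide if a request is secure.
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")

# "ProxyPreserveHost On" keeps the original Host header, so Django sees the
# public domain rather than 127.0.0.1. "example.com" is a hypothetical name.
ALLOWED_HOSTS = ["example.com"]
```

Only set SECURE_PROXY_SSL_HEADER when every request really does pass through a proxy that sets or strips that header, otherwise a client could fake it.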

Nginx

Now for the nginx server. Here’s what I am working with right now for the nginx.conf. It’s not fully optimized or anything.

# this is the system user created by the Dockerfile
user django django;

# A good default here is the number of CPU cores.
worker_processes 1;

# all the errors go into the same log
error_log /tmp/nginx.error.log;

events {
    # Sets the maximum number of simultaneous connections that can be opened by a worker process
    worker_connections 1024;
}

http {
    include mime.types;

    # just picking the default type for when there is no extension specified
    # text/plain is the default but we'd rather force a download for anything unknown
    default_type application/octet-stream;

    # Update the log format to include the request id
    log_format req_id_log '$remote_addr - $remote_user [$time_local] $request_id "$request" '
                 '$status $body_bytes_sent "$http_referer" "$http_user_agent" '
                 '"$http_x_forwarded_for"';

    # Update the access log to use the new format
    access_log /tmp/nginx.access.log req_id_log;

    # By default, NGINX handles file transmission itself and copies the file
    # into the buffer before sending it. Enabling the sendfile directive eliminates
    # the step of copying the data into the buffer and enables direct copying of
    # data from one file descriptor to another.
    sendfile on;

    # Define where the upstream server(s) are for the proxy
    upstream app_server {
        # Set up the link to the django server.
        # lots of possible options here for controlling the server
        # including: weight, max_fails, fail_timeout
        server 0.0.0.0:9000 fail_timeout=0;
    }

    # Definitions for the main nginx server
    server {
        # We are listening at 0.0.0.0 because currently we have Apache in front

        listen  0.0.0.0:8000  default;

        # setting this to a large number to allow larger file uploads.
        # TODO: decide if this is just too big/unsafe
        client_max_body_size 4G;

        # maybe we don't even need this since we only have one server?
        server_name 127.0.0.1;

        keepalive_timeout 5;


        # these are settings for the static files from django
        location /static/ {
           autoindex on;
           alias /code/static_root/;
        }

        # these are settings for the media (uploaded) files from django
        location /media/ {
            autoindex on;
            alias /code/media/;
        }

        # this is the main location for the app

        location / {
            # checks for static file, if not found proxy to app
            try_files $uri @proxy_to_app;
        }

        location @proxy_to_app {
            # We don't set the X-Forwarded-* headers here because we are currently
            # running behind an apache server that already sets them
            # proxy_set_header X-Forwarded-Host $host:$server_port;
            # proxy_set_header X-Forwarded-Server $host;
            # proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $http_host;

            # Set the request ID in a header so we can trace it later
            # Note: On the django side we fetch this with request.headers.get('X-Request-Id', None)
            proxy_set_header X-Request-ID $request_id;

            # We wanted to get realtime responses from the LLM without buffering, so we
            # first tried the directive  proxy_buffering off;
            # We later dropped it in favor of setting response['X-Accel-Buffering'] = 'no'
            # in the response headers on the django side

            proxy_pass   http://app_server;
        }
    }
}
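To give a feel for what the req_id_log format in the config produces, here is a small Python sketch that parses one access log line and looks entries up by request id. The function names and the sample line are made up for illustration. The regex just mirrors the log_format above and relies on $request_id being 32 hex characters.

```python
import re

# Mirrors the req_id_log format from nginx.conf, field by field.
LOG_LINE = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'(?P<request_id>[0-9a-f]{32}) "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)" "(?P<forwarded_for>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one access log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def find_request(lines, request_id):
    """Return the parsed entries whose request id equals request_id."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["request_id"] == request_id]


# A made-up line in that format, for illustration only.
sample = (
    '127.0.0.1 - - [10/Oct/2024:13:55:36 +0000] '
    '0123456789abcdef0123456789abcdef "GET /static/app.css HTTP/1.1" '
    '200 512 "-" "curl/8.0" "203.0.113.7"'
)
entry = parse_line(sample)
print(entry["request_id"], entry["status"], entry["forwarded_for"])
```

With Apache in front, $remote_addr is the proxy's address, so the last field (X-Forwarded-For) is where the real client address would show up.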

Most of this is explained in the comments, but there are a few points to make:

  • log format – I modified this so I can see what information gets passed all the way through all the proxies. I’d like to be able to use the request id later for debugging etc.
  • Proxy settings – I had to do a lot of futzing around here to make the double proxy setup work. All I can really say is that the above works in my setup.
  • proxy_buffering off – For one of my apps I was using Server-Sent Events and wanted them to stream. I got this working first with this setting, but then changed it to use the X-Accel-Buffering response header since I only wanted it for certain responses. Anyway, with all these servers it’s important to think about what kind of buffering is going on.
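For the last point, here is a rough sketch of what the Django side of a streaming response could look like. To keep it runnable without Django installed, the helpers below only build the event text and the headers. The function names are hypothetical, and the way they plug into a view is shown in comments only.

```python
import json


def sse_event(data, event=None):
    """Format one Server-Sent Events frame: optional event name, one data line, blank line.

    json.dumps keeps the payload on a single line, which the "data:" field needs.
    """
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def sse_headers():
    """Response headers for a stream that nginx should pass through unbuffered."""
    return {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
        # nginx reads this response header and turns off proxy buffering
        # for this one response only.
        "X-Accel-Buffering": "no",
    }


# In a Django view this might be wired up roughly like so (not run here):
#
#     from django.http import StreamingHttpResponse
#
#     def stream(request):
#         chunks = (sse_event({"token": t}) for t in ["Hel", "lo"])
#         response = StreamingHttpResponse(chunks, content_type="text/event-stream")
#         response["X-Accel-Buffering"] = "no"
#         return response

print(sse_event({"token": "hi"}), end="")
```

Setting the header per response means ordinary pages still get nginx's normal buffering, and only the streaming endpoints opt out.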
