# Sitemap Generator

A Python-based utility that automatically generates an XML sitemap for the blog website. The sitemap helps search engines discover and index the blog's content more efficiently.

## Features

- Automatically generates a sitemap.xml file following the [Sitemap Protocol](https://www.sitemaps.org/protocol.html)
- Includes both static pages and dynamic blog post entries
- Regularly updates the sitemap on a scheduled basis using cron jobs
- Containerized for easy deployment

## Requirements

- Python 3.x
- Docker (for containerized deployment)
- Dependencies:
  - pydantic
  - requests

## Configuration

The sitemap generator uses the following environment variables:

| Variable       | Description                                         | Default    |
| -------------- | --------------------------------------------------- | ---------- |
| `API_BASE_URL` | Base URL of the blog's API                          | (required) |
| `FRONTEND_URL` | Base URL of the frontend website                    | (required) |
| `STORAGE_PATH` | Path where the generated sitemap.xml will be stored | `./static` |

## Usage

### Local Execution

1. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Set the environment variables:

   ```bash
   export API_BASE_URL=http://api.example.com
   export FRONTEND_URL=http://www.example.com
   ```

3. Run the generator:

   ```bash
   python gen_sitemap.py
   ```

### Docker Deployment

1. Build the Docker image:

   ```bash
   docker build -t blog-sitemap-generator .
   ```

2. Create a directory on the host to store the generated sitemap:

   ```bash
   mkdir -p /path/to/host/dir
   touch /path/to/host/dir/sitemap.xml
   ```

3. Run the container:

   ```bash
   docker run -d \
     -e API_BASE_URL=http://api.example.com \
     -e FRONTEND_URL=http://www.example.com \
     -v /path/to/host/dir/sitemap.xml:/app/static/sitemap.xml \
     blog-sitemap-generator
   ```

## Scheduled Execution

The sitemap is automatically generated according to the schedule defined in the `crontab` file:

- Every day at 00:00, 08:00, and 16:00 (UTC)

## File Structure

- `gen_sitemap.py`: Main script that generates the sitemap
- `requirements.txt`: Python dependencies
- `crontab`: Cron job schedule configuration
- `Dockerfile`: Container configuration for deployment
- `README.md`: Documentation (this file)

## Output

The generator produces a standard XML sitemap at the specified `STORAGE_PATH` containing:

- The homepage URL
- The posts listing page URL
- Individual post URLs with their last modification dates
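
To make the output shape concrete, here is a minimal sketch of how such a sitemap could be assembled with the standard library. It is not the code in `gen_sitemap.py`: the function name `build_sitemap`, the `/posts` and `/posts/<slug>` URL layout, and the `(slug, lastmod)` input format are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace required by the Sitemap Protocol for the <urlset> root element.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(frontend_url, posts):
    """Return sitemap XML as a string (hypothetical helper).

    `posts` is a list of (slug, lastmod) pairs, where lastmod is a
    W3C date string such as "2024-01-31".
    """
    base = frontend_url.rstrip("/")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)

    # Static pages: homepage and posts listing (the "/posts" path is assumed).
    for path in ("/", "/posts"):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base + path

    # One entry per post, including its last modification date.
    for slug, lastmod in posts:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base}/posts/{slug}"
        ET.SubElement(url, "lastmod").text = lastmod

    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Example input; in the real generator the posts would come from the blog API.
    print(build_sitemap("http://www.example.com", [("hello-world", "2024-01-31")]))
```

Each `<url>` element carries a required `<loc>` and, for posts, an optional `<lastmod>`, which matches the three kinds of entries listed above.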