diff --git a/sitemap/README.md b/sitemap/README.md
index e69de29..f0f59b2 100644
--- a/sitemap/README.md
+++ b/sitemap/README.md
@@ -0,0 +1,98 @@
+# Sitemap Generator
+
+A Python-based utility that automatically generates an XML sitemap for the blog website. The sitemap helps search engines discover and index the blog's content more efficiently.
+
+## Features
+
+- Automatically generates a sitemap.xml file following the [Sitemap Protocol](https://www.sitemaps.org/protocol.html)
+- Includes both static pages and dynamic blog post entries
+- Regularly updates the sitemap on a scheduled basis using cron jobs
+- Containerized for easy deployment
+
+## Requirements
+
+- Python 3.x
+- Docker (for containerized deployment)
+- Dependencies:
+  - pydantic
+  - requests
+
+## Configuration
+
+The sitemap generator uses the following environment variables:
+
+| Variable       | Description                                         | Default    |
+| -------------- | --------------------------------------------------- | ---------- |
+| `API_BASE_URL` | Base URL of the blog's API                          | (required) |
+| `FRONTEND_URL` | Base URL of the frontend website                    | (required) |
+| `STORAGE_PATH` | Path where the generated sitemap.xml will be stored | `./static` |
+
+## Usage
+
+### Local Execution
+
+1. Install the required dependencies:
+
+   ```bash
+   pip install -r requirements.txt
+   ```
+
+2. Set the environment variables:
+
+   ```bash
+   export API_BASE_URL=http://api.example.com
+   export FRONTEND_URL=http://www.example.com
+   ```
+
+3. Run the generator:
+
+   ```bash
+   python gen_sitemap.py
+   ```
+
+### Docker Deployment
+
+1. Build the Docker image:
+
+   ```bash
+   docker build -t blog-sitemap-generator .
+   ```
+
+2. Create a directory on the host to store the generated sitemap:
+
+   ```bash
+   mkdir -p /path/to/host/dir
+   touch /path/to/host/dir/sitemap.xml
+   ```
+
+3. Run the container:
+
+   ```bash
+   docker run -d \
+     -e API_BASE_URL=http://api.example.com \
+     -e FRONTEND_URL=http://www.example.com \
+     -v /path/to/host/dir/sitemap.xml:/app/static/sitemap.xml \
+     blog-sitemap-generator
+   ```
+
+## Scheduled Execution
+
+The sitemap is automatically generated according to the schedule defined in the `crontab` file:
+
+- Every day at 00:00, 08:00, and 16:00 (UTC)
+
+## File Structure
+
+- `gen_sitemap.py`: Main script that generates the sitemap
+- `requirements.txt`: Python dependencies
+- `crontab`: Cron job schedule configuration
+- `Dockerfile`: Container configuration for deployment
+- `README.md`: Documentation (this file)
+
+## Output
+
+The generator produces a standard XML sitemap at the specified `STORAGE_PATH` containing:
+
+- The homepage URL
+- The posts listing page URL
+- Individual post URLs with their last modification dates
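
Note on the Output section: the three kinds of entries listed there could be assembled roughly as in the sketch below, using only the standard library. This is not the contents of `gen_sitemap.py`, which the diff does not show. The function name `build_sitemap`, the `/posts` and `/posts/<slug>` URL paths, and the `(slug, lastmod)` input shape are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace required by the Sitemap Protocol for the <urlset> root element.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(frontend_url, posts):
    """Return a UTF-8 encoded sitemap document as bytes.

    `posts` is a hypothetical list of (slug, lastmod) string pairs; the real
    API response shape is not described in the README.
    """
    base = frontend_url.rstrip("/")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)

    def add_url(loc, lastmod=None):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod is not None:
            ET.SubElement(url, "lastmod").text = lastmod

    add_url(base + "/")        # homepage
    add_url(base + "/posts")   # posts listing page (assumed path)
    for slug, lastmod in posts:
        # Assumed URL pattern for an individual post.
        add_url(f"{base}/posts/{slug}", lastmod)

    # The xml_declaration argument needs Python 3.8 or newer.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    sample = build_sitemap(
        "http://www.example.com",
        [("hello-world", "2024-01-15")],  # made-up sample data
    )
    print(sample.decode("utf-8"))
```

Writing the returned bytes to `sitemap.xml` under `STORAGE_PATH` would then match the output location the README describes; how the real script fetches posts from `API_BASE_URL` is left out because the diff does not document it.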