Compare commits
5 Commits
bf2ca1056b...c1caa10f2c

| SHA1 |
|---|
| c1caa10f2c |
| 68a175b704 |
| 05112e4df2 |
| 53ac90e474 |
| b9188a026d |
@@ -43,3 +43,13 @@ jobs:
           tags: |
             ${{ vars.REGISTRY }}/${{ vars.IMAGE_REPO_BACKEND }}:latest
             ${{ vars.REGISTRY }}/${{ vars.IMAGE_REPO_BACKEND }}:${{ gitea.event.release.tag_name }}
+
+      - name: Build and push (Sitemap Generator)
+        uses: docker/build-push-action@v6
+        with:
+          push: true
+          provenance: false
+          context: ./sitemap
+          tags: |
+            ${{ vars.REGISTRY }}/${{ vars.IMAGE_REPO_SITEMAP }}:latest
+            ${{ vars.REGISTRY }}/${{ vars.IMAGE_REPO_SITEMAP }}:${{ gitea.event.release.tag_name }}
1
sitemap/.gitignore
vendored
Normal file
@@ -0,0 +1 @@
.venv
17
sitemap/Dockerfile
Normal file
@@ -0,0 +1,17 @@
FROM python:3.13-alpine AS base
ENV PYTHONUNBUFFERED=1
WORKDIR /app
COPY crontab /etc/crontabs/root
COPY requirements.txt ./
RUN apk add --no-cache gcc musl-dev libffi-dev cronie && \
    pip install --no-cache-dir -r requirements.txt

FROM base AS runner
WORKDIR /app
COPY . .
ENV API_BASE_URL=
ENV FRONTEND_URL=
ENV STORAGE_PATH=/app/static
RUN touch /var/log/cron.log && chmod 0644 /var/log/cron.log
VOLUME [ "/app/static/sitemap.xml" ]
CMD ["/bin/sh", "-c", "python /app/gen_sitemap.py && crond -f && tail -f /var/log/cron.log"]
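
Note on the `ENV API_BASE_URL=` and `ENV FRONTEND_URL=` lines: they define both variables as empty strings, so the variables exist in the container even when no `-e` flag is passed. The `if not ...` checks in `gen_sitemap.py` still reject them, because an empty string is falsy in Python. A minimal sketch of that behaviour follows; `require_env` is a hypothetical helper used only for illustration and is not part of this commit.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, rejecting unset and empty values."""
    value = os.environ.get(name)
    if not value:  # None (unset) and "" (set but empty) are both falsy
        raise ValueError(f"{name} environment variable is not set.")
    return value


if __name__ == "__main__":
    # Simulate the Dockerfile default: the variable exists but is empty.
    os.environ["API_BASE_URL"] = ""
    try:
        require_env("API_BASE_URL")
    except ValueError as exc:
        print(exc)  # API_BASE_URL environment variable is not set.
```

In other words, the empty defaults in the image behave the same as leaving the variables out, and the generator fails fast on startup.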
98
sitemap/README.md
Normal file
@@ -0,0 +1,98 @@
# Sitemap Generator

A Python-based utility that automatically generates an XML sitemap for the blog website. The sitemap helps search engines discover and index the blog's content more efficiently.

## Features

- Automatically generates a sitemap.xml file following the [Sitemap Protocol](https://www.sitemaps.org/protocol.html)
- Includes both static pages and dynamic blog post entries
- Regularly updates the sitemap on a scheduled basis using cron jobs
- Containerized for easy deployment

## Requirements

- Python 3.x
- Docker (for containerized deployment)
- Dependencies:
  - pydantic
  - requests

## Configuration

The sitemap generator uses the following environment variables:

| Variable       | Description                                         | Default    |
| -------------- | --------------------------------------------------- | ---------- |
| `API_BASE_URL` | Base URL of the blog's API                          | (required) |
| `FRONTEND_URL` | Base URL of the frontend website                    | (required) |
| `STORAGE_PATH` | Path where the generated sitemap.xml will be stored | `./static` |

## Usage

### Local Execution

1. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Set the environment variables:

   ```bash
   export API_BASE_URL=http://api.example.com
   export FRONTEND_URL=http://www.example.com
   ```

3. Run the generator:

   ```bash
   python gen_sitemap.py
   ```

### Docker Deployment

1. Build the Docker image:

   ```bash
   docker build -t blog-sitemap-generator .
   ```

2. Create a directory on the host to store the generated sitemap:

   ```bash
   mkdir -p /path/to/host/dir
   touch /path/to/host/dir/sitemap.xml
   ```

3. Run the container:

   ```bash
   docker run -d \
     -e API_BASE_URL=http://api.example.com \
     -e FRONTEND_URL=http://www.example.com \
     -v /path/to/host/dir/sitemap.xml:/app/static/sitemap.xml \
     blog-sitemap-generator
   ```

## Scheduled Execution

The sitemap is automatically generated according to the schedule defined in the `crontab` file:

- Every day at 00:00, 08:00, and 16:00 (UTC)

## File Structure

- `gen_sitemap.py`: Main script that generates the sitemap
- `requirements.txt`: Python dependencies
- `crontab`: Cron job schedule configuration
- `Dockerfile`: Container configuration for deployment
- `README.md`: Documentation (this file)

## Output

The generator produces a standard XML sitemap at the specified `STORAGE_PATH` containing:

- The homepage URL
- The posts listing page URL
- Individual post URLs with their last modification dates
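
Note on the README's Output section: the shape of the generated file can be illustrated with a short, self-contained sketch. It mirrors the ElementTree approach used in `gen_sitemap.py`, but `build_sitemap` and `read_locs` are hypothetical names, and the URLs and date are placeholder values rather than real blog data.

```python
from typing import Optional
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries: list[tuple[str, Optional[str]]]) -> str:
    """Serialize (loc, lastmod) pairs into a sitemap XML string."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:  # static pages carry no modification date
            ET.SubElement(url, "lastmod").text = lastmod
    header = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return header + ET.tostring(root, encoding="unicode")


def read_locs(xml_text: str) -> list[str]:
    """Parse a sitemap string and return its <loc> values in document order."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    # Parsed tags are namespace-qualified: {namespace}loc
    return [el.text for el in root.iter(f"{{{SITEMAP_NS}}}loc")]


if __name__ == "__main__":
    xml_text = build_sitemap([
        ("http://www.example.com/", None),                 # homepage
        ("http://www.example.com/post", None),             # posts listing
        ("http://www.example.com/post/1", "2024-01-01"),  # placeholder post
    ])
    print(xml_text)
    print(read_locs(xml_text))
```

Parsing the output back is a cheap way to confirm that the file is well-formed and that every expected URL made it into the sitemap.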
3
sitemap/crontab
Normal file
@@ -0,0 +1,3 @@
0 0 * * * /usr/local/bin/python /app/gen_sitemap.py
0 8 * * * /usr/local/bin/python /app/gen_sitemap.py
0 16 * * * /usr/local/bin/python /app/gen_sitemap.py
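
Note on the schedule: the three entries fire at minute 0 of hours 0, 8, and 16, in the container's local time. The sketch below computes the next run for a given moment, assuming the container clock is UTC as the README states; `next_run` and `RUN_HOURS` are hypothetical names introduced for illustration.

```python
from datetime import datetime, timedelta, timezone

RUN_HOURS = (0, 8, 16)  # hour fields of the three crontab entries


def next_run(now: datetime) -> datetime:
    """Return the first scheduled run strictly after `now` (assumed UTC)."""
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    for hour in RUN_HOURS:
        candidate = today + timedelta(hours=hour)
        if candidate > now:
            return candidate
    # Past 16:00: the next run is midnight of the following day.
    return today + timedelta(days=1)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 7, 30, tzinfo=timezone.utc)
    print(next_run(now))  # 2024-01-01 08:00:00+00:00
```

With this schedule a newly published post can wait up to eight hours before it appears in the sitemap.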
67
sitemap/gen_sitemap.py
Normal file
@@ -0,0 +1,67 @@
import os
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

import requests
from pydantic import BaseModel


API_BASE_URL = os.environ.get("API_BASE_URL")
FRONTEND_URL = os.environ.get("FRONTEND_URL")
STORAGE_PATH = os.environ.get("STORAGE_PATH", "./static")


class SitemapItem(BaseModel):
    loc: str
    lastmod: str = None


def get_posts():
    url = urljoin(API_BASE_URL, "post")
    response = requests.get(url)
    response.raise_for_status()

    return response.json()


def map_post_to_sitemap_item(post: dict) -> SitemapItem:
    loc = urljoin(FRONTEND_URL, f"post/{post['id']}")
    lastmod = post["published_time"][:10]
    return SitemapItem(loc=loc, lastmod=lastmod)


def generate_sitemap(items: list[SitemapItem]) -> str:
    header = '<?xml version="1.0" encoding="UTF-8"?>\n'
    root = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")

    for item in items:
        url_element = ET.SubElement(root, "url")
        ET.SubElement(url_element, "loc").text = item.loc
        if item.lastmod:
            ET.SubElement(url_element, "lastmod").text = item.lastmod

    return header + ET.tostring(root, encoding="unicode")


def main():
    if not API_BASE_URL:
        raise ValueError("API_BASE_URL environment variable is not set.")
    if not FRONTEND_URL:
        raise ValueError("FRONTEND_URL environment variable is not set.")

    posts = get_posts()
    static_pages = [
        SitemapItem(loc=FRONTEND_URL),
        SitemapItem(loc=urljoin(FRONTEND_URL, "post")),
    ]
    sitemap_items = [*static_pages, *map(map_post_to_sitemap_item, posts)]

    sitemap = generate_sitemap(sitemap_items)
    sitemap_path = os.path.join(STORAGE_PATH, "sitemap.xml")
    os.makedirs(STORAGE_PATH, exist_ok=True)
    with open(sitemap_path, "w", encoding="utf-8") as f:
        f.write(sitemap)


if __name__ == "__main__":
    main()
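
Note on the `urljoin` calls: `urljoin` resolves the second argument relative to the base, so the last path segment of the base is replaced unless the base ends with a slash. This is harmless for the bare-host URLs in the README examples, but it would matter if `API_BASE_URL` carried a path prefix. The sketch below shows the difference; the `/v1` prefix and the `join_path` helper are hypothetical and not part of this commit.

```python
from urllib.parse import urljoin


def join_path(base: str, path: str) -> str:
    """Append `path` to `base`, keeping every path segment of `base`."""
    # A trailing slash on the base makes urljoin treat it as a directory.
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))


if __name__ == "__main__":
    # Bare host: both approaches agree.
    print(urljoin("http://api.example.com", "post"))       # http://api.example.com/post
    # Base with a path prefix and no trailing slash: "v1" is dropped.
    print(urljoin("http://api.example.com/v1", "post"))    # http://api.example.com/post
    # The helper preserves the prefix.
    print(join_path("http://api.example.com/v1", "post"))  # http://api.example.com/v1/post
```

If the deployment ever serves the API under a prefix, normalizing the base this way (or requiring a trailing slash in the variable) would keep the generated URLs correct.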
2
sitemap/requirements.txt
Normal file
@@ -0,0 +1,2 @@
pydantic
requests