Compare commits

...

13 Commits

Author SHA1 Message Date
b27d613143 Merge pull request 'fix: remove mcp-auth from yml-file' (#20) from fix/remove-mcp-auth-compose into main
Reviewed-on: #20
2026-05-08 09:33:17 +00:00
Anton
a1ab1c0319 fix: remove mcp-auth from yml-file 2026-05-08 12:32:40 +03:00
0b4e04544d Merge pull request 'fix: remove MCP application-level authorization' (#19) from fix/remove-mcp-auth into main
Reviewed-on: #19
2026-05-08 09:15:18 +00:00
Anton
7593a460c7 fix: remove MCP application-level authorization 2026-05-08 12:14:19 +03:00
a4e7388bcf Merge pull request 'fix: use direct onclick handlers for run rows' (#18) from fix/direct-run-row-click-handler into main
Reviewed-on: #18
2026-05-07 15:25:26 +00:00
Anton
ac319b3ee5 fix: use direct onclick handlers for run rows 2026-05-07 18:23:14 +03:00
8e004c46ef Merge pull request 'fix: move run navigation from id link to table row' (#17) from fix/run-row-link-target into main
Reviewed-on: #17
2026-05-07 14:04:07 +00:00
Anton
7fa28e8e47 fix: move run navigation from id link to table row 2026-05-07 17:03:36 +03:00
1c4ad0bd9d Merge pull request 'fix: make run rows clickable and limit dashboard runs' (#16) from fix/dashboard-run-row-clicks into main
Reviewed-on: #16
2026-05-07 13:24:25 +00:00
Anton
52c5cc1af1 fix: make run rows clickable and limit dashboard runs 2026-05-07 16:23:39 +03:00
c97ced52b4 Merge pull request 'feat: make dashboard metrics and run rows clickable' (#15) from feature/dashboard-clickable-metrics into main
Reviewed-on: #15
2026-05-07 06:36:27 +00:00
Anton
deaecd8d3b feat: make dashboard metrics and run rows clickable 2026-05-07 09:35:44 +03:00
e4d4271e32 Merge pull request 'feat: track crawl run employee changes and verify dismissals' (#14) from feature/crawl-run-change-details into main
Reviewed-on: #14
2026-05-06 12:14:51 +00:00
20 changed files with 130 additions and 404 deletions

View File

@@ -14,13 +14,5 @@ PARSER_USE_PLAYWRIGHT=false
ADMIN_USERNAME=admin ADMIN_USERNAME=admin
ADMIN_PASSWORD=change-me ADMIN_PASSWORD=change-me
SESSION_SECRET=change-me-session-secret SESSION_SECRET=change-me-session-secret
MCP_TOKEN=change-me-mcp-token
MCP_AUTH_MODE=oauth
MCP_RESOURCE_URL=http://localhost:8001/mcp
MCP_OAUTH_ISSUER=
MCP_OAUTH_AUDIENCE=
MCP_OAUTH_JWKS_URL=
MCP_OAUTH_REQUIRED_SCOPE=mcp:tools
API_PORT=8000 API_PORT=8000
MCP_PORT=8001 MCP_PORT=8001

View File

@@ -6,7 +6,7 @@
- `api`: FastAPI, REST API, HTML-админка, healthcheck. - `api`: FastAPI, REST API, HTML-админка, healthcheck.
- `worker`: weekly scheduler, который запускает парсинг по `CRAWL_CRON`. - `worker`: weekly scheduler, который запускает парсинг по `CRAWL_CRON`.
- `mcp`: HTTP MCP endpoint с OAuth/OIDC access token для внешних агентов или legacy static token для локального режима. - `mcp`: открытый HTTP MCP endpoint для ИИ-агентов.
- `postgres`: основная БД. - `postgres`: основная БД.
Парсер использует фиксированный источник сотрудников, по умолчанию `https://miem.hse.ru/persons`. Для каждой карточки сохраняются ФИО, должности, год начала работы, контакты, идентификаторы, вкладки профиля, секции, публикации, курсы, ВКР, JSON-снапшот и сжатый HTML-снапшот. Ссылки обходятся только из меню профиля самого сотрудника (`person-menu`), например `#sci`, `#teaching`, `#main`. Парсер использует фиксированный источник сотрудников, по умолчанию `https://miem.hse.ru/persons`. Для каждой карточки сохраняются ФИО, должности, год начала работы, контакты, идентификаторы, вкладки профиля, секции, публикации, курсы, ВКР, JSON-снапшот и сжатый HTML-снапшот. Ссылки обходятся только из меню профиля самого сотрудника (`person-menu`), например `#sci`, `#teaching`, `#main`.
@@ -27,13 +27,6 @@ cp .env.example .env
- `CRAWL_LIMIT`: опциональный лимит профилей для тестового запуска. - `CRAWL_LIMIT`: опциональный лимит профилей для тестового запуска.
- `ADMIN_USERNAME`, `ADMIN_PASSWORD`: логин и пароль админки. - `ADMIN_USERNAME`, `ADMIN_PASSWORD`: логин и пароль админки.
- `SESSION_SECRET`: секрет подписи cookie. - `SESSION_SECRET`: секрет подписи cookie.
- `MCP_TOKEN`: статический bearer token для legacy/local режима `MCP_AUTH_MODE=token`.
- `MCP_AUTH_MODE`: режим авторизации MCP: `oauth` для внешних агентов или `token` для локальной отладки.
- `MCP_RESOURCE_URL`: публичный URL MCP endpoint, например `https://example.com/mcp`.
- `MCP_OAUTH_ISSUER`: issuer внешнего OIDC-провайдера.
- `MCP_OAUTH_AUDIENCE`: ожидаемый `aud` в OAuth access token.
- `MCP_OAUTH_JWKS_URL`: JWKS endpoint; если не задан, используется `<issuer>/.well-known/jwks.json`.
- `MCP_OAUTH_REQUIRED_SCOPE`: scope для доступа к MCP tools, по умолчанию `mcp:tools`.
- `PARSER_USE_PLAYWRIGHT`: включение Playwright-рендера динамических вкладок. - `PARSER_USE_PLAYWRIGHT`: включение Playwright-рендера динамических вкладок.
## Локальный запуск ## Локальный запуск
@@ -88,9 +81,7 @@ curl -X POST http://localhost:8000/api/crawl-runs --cookie "miem_admin_session=.
## MCP ## MCP
Endpoint: `POST /mcp`, авторизация `Authorization: Bearer <token>`. Endpoint: `POST /mcp`, без авторизации на уровне приложения.
Для внешних ИИ-агентов используйте `MCP_AUTH_MODE=oauth`. В этом режиме статический `MCP_TOKEN` не принимается: клиент должен передать OAuth/OIDC access token с нужным scope.
Поддерживаемые tools: Поддерживаемые tools:
@@ -104,23 +95,11 @@ Endpoint: `POST /mcp`, авторизация `Authorization: Bearer <token>`.
```bash ```bash
curl http://localhost:8001/mcp \ curl http://localhost:8001/mcp \
-H "Authorization: Bearer change-me-mcp-token" \
-H "Content-Type: application/json" \ -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
``` ```
Для production OAuth/OIDC настройте внешний authorization server и включите режим `oauth`: Если MCP нужно ограничить, делайте это на сетевом уровне: localhost binding, VPN, firewall, reverse proxy или другой внешний контур доступа.
```env
MCP_AUTH_MODE=oauth
MCP_RESOURCE_URL=https://example.com/mcp
MCP_OAUTH_ISSUER=https://auth.example.com
MCP_OAUTH_AUDIENCE=miem-mcp
MCP_OAUTH_JWKS_URL=https://auth.example.com/.well-known/jwks.json
MCP_OAUTH_REQUIRED_SCOPE=mcp:tools
```
MCP server работает как OAuth protected resource: он не выдает токены, а проверяет JWT access token по JWKS, `issuer`, `audience`, сроку действия и scope. Metadata для MCP-клиентов доступна по `GET /.well-known/oauth-protected-resource`.
## Обслуживание ## Обслуживание
@@ -131,4 +110,4 @@ docker compose exec postgres pg_dump -U miem miem_workers > backup.sql
docker compose down docker compose down
``` ```
Версия сервиса: `0.4.0`. Админка всегда показывает версии backend и frontend в footer. Версия сервиса: `0.4.5`. Админка всегда показывает версии backend и frontend в footer.

View File

@@ -29,7 +29,7 @@ def dashboard(request: Request, db: Session = Depends(get_db), settings: Setting
counts = stats_payload(db) counts = stats_payload(db)
counts["runs"] = db.scalar(select(func.count()).select_from(CrawlRun)) or 0 counts["runs"] = db.scalar(select(func.count()).select_from(CrawlRun)) or 0
counts["errors"] = db.scalar(select(func.count()).select_from(CrawlError)) or 0 counts["errors"] = db.scalar(select(func.count()).select_from(CrawlError)) or 0
run_models = db.scalars(select(CrawlRun).order_by(desc(CrawlRun.started_at)).limit(10)).all() run_models = db.scalars(select(CrawlRun).order_by(desc(CrawlRun.started_at)).limit(5)).all()
runs = [run_payload(run) for run in run_models] runs = [run_payload(run) for run in run_models]
return _render(request, "dashboard.html", {"counts": counts, "runs": runs, "latest_run": runs[0] if runs else None}) return _render(request, "dashboard.html", {"counts": counts, "runs": runs, "latest_run": runs[0] if runs else None})

View File

@@ -1,6 +1,4 @@
from functools import lru_cache from functools import lru_cache
from typing import Literal
from pydantic import Field, field_validator from pydantic import Field, field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict from pydantic_settings import BaseSettings, SettingsConfigDict
@@ -19,13 +17,6 @@ class Settings(BaseSettings):
admin_username: str = "admin" admin_username: str = "admin"
admin_password: str = "admin" admin_password: str = "admin"
session_secret: str = Field(default="dev-session-secret", min_length=8) session_secret: str = Field(default="dev-session-secret", min_length=8)
mcp_token: str = "dev-mcp-token"
mcp_auth_mode: Literal["token", "oauth"] = "oauth"
mcp_resource_url: str = "http://localhost:8001/mcp"
mcp_oauth_issuer: str = ""
mcp_oauth_audience: str = ""
mcp_oauth_jwks_url: str = ""
mcp_oauth_required_scope: str = "mcp:tools"
@field_validator("crawl_limit", mode="before") @field_validator("crawl_limit", mode="before")
@classmethod @classmethod
@@ -34,15 +25,6 @@ class Settings(BaseSettings):
return None return None
return value return value
def oauth_jwks_url(self) -> str:
if self.mcp_oauth_jwks_url:
return self.mcp_oauth_jwks_url
issuer = self.mcp_oauth_issuer.rstrip("/")
if not issuer:
return ""
return f"{issuer}/.well-known/jwks.json"
@lru_cache @lru_cache
def get_settings() -> Settings: def get_settings() -> Settings:
return Settings() return Settings()

View File

@@ -4,7 +4,6 @@ from fastapi.staticfiles import StaticFiles
from app.admin import router as admin_router from app.admin import router as admin_router
from app.api import router as api_router from app.api import router as api_router
from app.db import init_db from app.db import init_db
from app.mcp import metadata_router as mcp_metadata_router
from app.mcp import router as mcp_router from app.mcp import router as mcp_router
from app.version import BACKEND_VERSION from app.version import BACKEND_VERSION
@@ -13,7 +12,6 @@ app.mount("/static", StaticFiles(directory="app/static"), name="static")
app.include_router(api_router) app.include_router(api_router)
app.include_router(admin_router) app.include_router(admin_router)
app.include_router(mcp_router) app.include_router(mcp_router)
app.include_router(mcp_metadata_router)
@app.on_event("startup") @app.on_event("startup")

View File

@@ -4,15 +4,12 @@ from fastapi import APIRouter, Depends, Request
from sqlalchemy import desc, or_, select from sqlalchemy import desc, or_, select
from sqlalchemy.orm import Session from sqlalchemy.orm import Session
from app.config import Settings, get_settings
from app.db import get_db from app.db import get_db
from app.models import CrawlRun, Employee from app.models import CrawlRun, Employee
from app.security import mcp_protected_resource_metadata, require_mcp_auth
from app.services.admin_data import run_detail_payload from app.services.admin_data import run_detail_payload
from app.version import BACKEND_VERSION from app.version import BACKEND_VERSION
router = APIRouter(prefix="/mcp") router = APIRouter(prefix="/mcp")
metadata_router = APIRouter()
TOOLS = [ TOOLS = [
@@ -65,9 +62,7 @@ TOOLS = [
async def mcp_http( async def mcp_http(
request: Request, request: Request,
db: Session = Depends(get_db), db: Session = Depends(get_db),
settings: Settings = Depends(get_settings),
) -> dict: ) -> dict:
require_mcp_auth(request, settings)
payload = await request.json() payload = await request.json()
method = payload.get("method") method = payload.get("method")
request_id = payload.get("id") request_id = payload.get("id")
@@ -183,8 +178,3 @@ def _run_payload(run: CrawlRun) -> dict:
def _tool_response(data: object) -> dict: def _tool_response(data: object) -> dict:
return {"content": [{"type": "text", "text": json.dumps(data, ensure_ascii=False, default=str)}]} return {"content": [{"type": "text", "text": json.dumps(data, ensure_ascii=False, default=str)}]}
@metadata_router.get("/.well-known/oauth-protected-resource")
def oauth_protected_resource(settings: Settings = Depends(get_settings)) -> dict:
return mcp_protected_resource_metadata(settings)

View File

@@ -3,10 +3,7 @@ import hashlib
import hmac import hmac
import json import json
import time import time
from functools import lru_cache
import jwt
from jwt import PyJWKClient, PyJWTError
from fastapi import HTTPException, Request, status from fastapi import HTTPException, Request, status
from app.config import Settings from app.config import Settings
@@ -47,93 +44,3 @@ def require_admin(request: Request, settings: Settings) -> str:
if not username: if not username:
raise HTTPException(status_code=status.HTTP_303_SEE_OTHER, headers={"Location": "/admin/login"}) raise HTTPException(status_code=status.HTTP_303_SEE_OTHER, headers={"Location": "/admin/login"})
return username return username
def require_mcp_auth(request: Request, settings: Settings) -> None:
auth = request.headers.get("authorization", "")
if not auth.startswith("Bearer "):
raise _mcp_unauthorized(settings, "Missing bearer token")
token = auth.removeprefix("Bearer ").strip()
if _mcp_static_token_allowed(settings) and hmac.compare_digest(token, settings.mcp_token):
return
if _mcp_oauth_allowed(settings):
_validate_mcp_oauth_token(token, settings)
return
raise _mcp_unauthorized(settings, "Invalid MCP token")
def require_mcp_token(request: Request, settings: Settings) -> None:
require_mcp_auth(request, settings)
def mcp_protected_resource_metadata(settings: Settings) -> dict:
authorization_servers = [settings.mcp_oauth_issuer.rstrip("/")] if settings.mcp_oauth_issuer else []
return {
"resource": settings.mcp_resource_url,
"authorization_servers": authorization_servers,
"bearer_methods_supported": ["header"],
"scopes_supported": [settings.mcp_oauth_required_scope],
"resource_documentation": settings.mcp_resource_url,
}
def _mcp_static_token_allowed(settings: Settings) -> bool:
return settings.mcp_auth_mode == "token"
def _mcp_oauth_allowed(settings: Settings) -> bool:
return settings.mcp_auth_mode == "oauth"
def _validate_mcp_oauth_token(token: str, settings: Settings) -> None:
if not settings.mcp_oauth_issuer or not settings.mcp_oauth_audience or not settings.oauth_jwks_url():
raise _mcp_unauthorized(settings, "MCP OAuth is not configured")
try:
signing_key = _get_mcp_oauth_signing_key(token, settings).key
claims = jwt.decode(
token,
signing_key,
algorithms=["RS256", "RS384", "RS512", "ES256", "ES384", "ES512"],
audience=settings.mcp_oauth_audience,
issuer=settings.mcp_oauth_issuer.rstrip("/"),
)
except PyJWTError as exc:
raise _mcp_unauthorized(settings, "Invalid OAuth access token") from exc
if not _claims_have_scope(claims, settings.mcp_oauth_required_scope):
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Missing required MCP OAuth scope")
def _claims_have_scope(claims: dict, required_scope: str) -> bool:
scopes: set[str] = set()
scope = claims.get("scope")
if isinstance(scope, str):
scopes.update(scope.split())
scp = claims.get("scp")
if isinstance(scp, str):
scopes.update(scp.split())
elif isinstance(scp, list):
scopes.update(str(item) for item in scp)
return required_scope in scopes
@lru_cache(maxsize=16)
def _get_jwk_client(jwks_url: str) -> PyJWKClient:
return PyJWKClient(jwks_url)
def _get_mcp_oauth_signing_key(token: str, settings: Settings):
return _get_jwk_client(settings.oauth_jwks_url()).get_signing_key_from_jwt(token)
def _mcp_unauthorized(settings: Settings, detail: str) -> HTTPException:
headers = {}
if _mcp_oauth_allowed(settings):
headers["WWW-Authenticate"] = f'Bearer resource_metadata="{_mcp_metadata_url(settings)}"'
return HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail=detail, headers=headers)
def _mcp_metadata_url(settings: Settings) -> str:
resource_url = settings.mcp_resource_url.rstrip("/")
base_url = resource_url[: -len("/mcp")] if resource_url.endswith("/mcp") else resource_url
return f"{base_url}/.well-known/oauth-protected-resource"

View File

@@ -1,6 +1,8 @@
.admin { .admin {
margin: 0; margin: 0;
min-height: 100vh; min-height: 100vh;
display: flex;
flex-direction: column;
color: #1f2937; color: #1f2937;
background: #f6f7f9; background: #f6f7f9;
font-family: Arial, sans-serif; font-family: Arial, sans-serif;
@@ -21,6 +23,11 @@
font-size: 20px; font-size: 20px;
} }
.admin__brand-link {
color: inherit;
text-decoration: none;
}
.admin__nav { .admin__nav {
display: flex; display: flex;
align-items: center; align-items: center;
@@ -34,6 +41,7 @@
} }
.admin__main { .admin__main {
flex: 1;
width: min(1180px, calc(100% - 32px)); width: min(1180px, calc(100% - 32px));
margin: 28px auto; margin: 28px auto;
} }
@@ -52,18 +60,30 @@
} }
.metric { .metric {
display: block;
padding: 18px; padding: 18px;
background: #ffffff; background: #ffffff;
border: 1px solid #d9dee7; border: 1px solid #d9dee7;
border-radius: 8px; border-radius: 8px;
} }
.metric--link {
color: inherit;
text-decoration: none;
}
.metric--link:hover {
border-color: #0f766e;
}
.metric__label { .metric__label {
display: block;
color: #6b7280; color: #6b7280;
font-size: 13px; font-size: 13px;
} }
.metric__value { .metric__value {
display: block;
margin-top: 8px; margin-top: 8px;
font-size: 28px; font-size: 28px;
font-weight: 700; font-weight: 700;
@@ -87,6 +107,14 @@
border-collapse: collapse; border-collapse: collapse;
} }
.table__row {
cursor: pointer;
}
.table__row:hover {
background: #f0fdfa;
}
.table__cell, .table__cell,
.table__head { .table__head {
padding: 10px 8px; padding: 10px 8px;
@@ -331,12 +359,22 @@
} }
.stats-strip__item { .stats-strip__item {
display: block;
padding: 14px 16px; padding: 14px 16px;
background: #ffffff; background: #ffffff;
border: 1px solid #d9dee7; border: 1px solid #d9dee7;
border-radius: 8px; border-radius: 8px;
} }
.stats-strip__item--link {
color: inherit;
text-decoration: none;
}
.stats-strip__item--link:hover {
border-color: #0f766e;
}
.stats-strip__label { .stats-strip__label {
display: block; display: block;
color: #6b7280; color: #6b7280;

View File

@@ -59,10 +59,23 @@
applyColumns(columns); applyColumns(columns);
}); });
}); });
}
function setupClickableRows() {
const openRow = (row) => {
window.location.href = row.dataset.rowHref;
};
document.querySelectorAll("[data-row-href]").forEach((row) => { document.querySelectorAll("[data-row-href]").forEach((row) => {
row.addEventListener("click", (event) => { row.addEventListener("click", (event) => {
if (event.target.closest("a, button, input, select, label")) return; if (event.target.closest("a, button, input, select, label")) return;
window.location.href = row.dataset.rowHref; openRow(row);
});
row.addEventListener("keydown", (event) => {
if (!["Enter", " "].includes(event.key)) return;
if (event.target.closest("a, button, input, select, label")) return;
event.preventDefault();
openRow(row);
}); });
}); });
} }
@@ -107,5 +120,6 @@
} }
setupColumns(); setupColumns();
setupClickableRows();
setupProgress(); setupProgress();
})(); })();

View File

@@ -8,7 +8,7 @@
</head> </head>
<body class="admin"> <body class="admin">
<header class="admin__header"> <header class="admin__header">
<h1 class="admin__brand">MIEM Employees</h1> <h1 class="admin__brand"><a class="admin__brand-link" href="/admin">MIEM Employees</a></h1>
<nav class="admin__nav"> <nav class="admin__nav">
<a class="admin__link" href="/admin">Обзор</a> <a class="admin__link" href="/admin">Обзор</a>
<a class="admin__link" href="/admin/directory">Сотрудники</a> <a class="admin__link" href="/admin/directory">Сотрудники</a>

View File

@@ -2,10 +2,10 @@
{% block title %}Обзор · MIEM Employees{% endblock %} {% block title %}Обзор · MIEM Employees{% endblock %}
{% block content %} {% block content %}
<section class="admin__grid"> <section class="admin__grid">
<div class="metric"><div class="metric__label">Всего в базе</div><div class="metric__value">{{ counts.total }}</div></div> <a class="metric metric--link" href="/admin/directory"><span class="metric__label">Всего в базе</span><span class="metric__value">{{ counts.total }}</span></a>
<div class="metric"><div class="metric__label">Работают</div><div class="metric__value">{{ counts.active }}</div></div> <a class="metric metric--link" href="/admin/directory?status=active"><span class="metric__label">Работают</span><span class="metric__value">{{ counts.active }}</span></a>
<div class="metric"><div class="metric__label">Новые за запуск</div><div class="metric__value">{{ counts.new_in_last_run }}</div></div> <a class="metric metric--link" href="{% if latest_run %}/admin/runs/{{ latest_run.id }}#new-employees{% else %}/admin/runs{% endif %}"><span class="metric__label">Новые за запуск</span><span class="metric__value">{{ counts.new_in_last_run }}</span></a>
<div class="metric"><div class="metric__label">Уволены</div><div class="metric__value">{{ counts.dismissed }}</div></div> <a class="metric metric--link" href="/admin/directory?status=dismissed"><span class="metric__label">Уволены</span><span class="metric__value">{{ counts.dismissed }}</span></a>
</section> </section>
<section class="stats-strip"> <section class="stats-strip">
<div class="stats-strip__item"> <div class="stats-strip__item">
@@ -16,10 +16,10 @@
<span class="stats-strip__value">Сотрудников пока нет</span> <span class="stats-strip__value">Сотрудников пока нет</span>
{% endif %} {% endif %}
</div> </div>
<div class="stats-strip__item"> <a class="stats-strip__item stats-strip__item--link" href="/admin/runs">
<span class="stats-strip__label">Запуски</span> <span class="stats-strip__label">Запуски</span>
<span class="stats-strip__value">{{ counts.runs }}</span> <span class="stats-strip__value">{{ counts.runs }}</span>
</div> </a>
<div class="stats-strip__item"> <div class="stats-strip__item">
<span class="stats-strip__label">Ошибки</span> <span class="stats-strip__label">Ошибки</span>
<span class="stats-strip__value">{{ counts.errors }}</span> <span class="stats-strip__value">{{ counts.errors }}</span>
@@ -51,7 +51,7 @@
<thead><tr><th class="table__head">ID</th><th class="table__head">Статус</th><th class="table__head">Обработано</th><th class="table__head">Ошибки</th><th class="table__head">Старт</th></tr></thead> <thead><tr><th class="table__head">ID</th><th class="table__head">Статус</th><th class="table__head">Обработано</th><th class="table__head">Ошибки</th><th class="table__head">Старт</th></tr></thead>
<tbody> <tbody>
{% for run in runs %} {% for run in runs %}
<tr><td class="table__cell">{{ run.id }}</td><td class="table__cell">{{ run.status_display }}</td><td class="table__cell">{{ run.parsed_count }}</td><td class="table__cell">{{ run.error_count }}</td><td class="table__cell">{{ run.started_display }}</td></tr> <tr class="table__row" onclick="window.location.href='/admin/runs/{{ run.id }}'" onkeydown="if (event.key === 'Enter' || event.key === ' ') { event.preventDefault(); window.location.href='/admin/runs/{{ run.id }}'; }" role="link" tabindex="0"><td class="table__cell">{{ run.id }}</td><td class="table__cell">{{ run.status_display }}</td><td class="table__cell">{{ run.parsed_count }}</td><td class="table__cell">{{ run.error_count }}</td><td class="table__cell">{{ run.started_display }}</td></tr>
{% endfor %} {% endfor %}
</tbody> </tbody>
</table> </table>

View File

@@ -23,7 +23,7 @@
</section> </section>
{% for group, title in [("new", "Новые сотрудники"), ("missing_from_source", "Потеряшки"), ("dismissed", "Уволенные")] %} {% for group, title in [("new", "Новые сотрудники"), ("missing_from_source", "Потеряшки"), ("dismissed", "Уволенные")] %}
<section class="panel"> <section class="panel"{% if group == "new" %} id="new-employees"{% endif %}>
<h2 class="panel__title">{{ title }}</h2> <h2 class="panel__title">{{ title }}</h2>
{% set items = run.changes[group] %} {% set items = run.changes[group] %}
{% if items %} {% if items %}

View File

@@ -38,7 +38,7 @@
<thead><tr><th class="table__head">ID</th><th class="table__head">Статус</th><th class="table__head">Найдено</th><th class="table__head">Обработано</th><th class="table__head">Новые</th><th class="table__head">Ошибки</th><th class="table__head">Уволены</th><th class="table__head">Старт</th></tr></thead> <thead><tr><th class="table__head">ID</th><th class="table__head">Статус</th><th class="table__head">Найдено</th><th class="table__head">Обработано</th><th class="table__head">Новые</th><th class="table__head">Ошибки</th><th class="table__head">Уволены</th><th class="table__head">Старт</th></tr></thead>
<tbody> <tbody>
{% for run in runs %} {% for run in runs %}
<tr><td class="table__cell"><a class="admin__link" href="/admin/runs/{{ run.id }}">{{ run.id }}</a></td><td class="table__cell">{{ run.status_display }}</td><td class="table__cell">{{ run.found_count }}</td><td class="table__cell">{{ run.parsed_count }}</td><td class="table__cell">{{ run.new_count }}</td><td class="table__cell">{{ run.error_count }}</td><td class="table__cell">{{ run.dismissed_count }}</td><td class="table__cell">{{ run.started_display }}</td></tr> <tr class="table__row" onclick="window.location.href='/admin/runs/{{ run.id }}'" onkeydown="if (event.key === 'Enter' || event.key === ' ') { event.preventDefault(); window.location.href='/admin/runs/{{ run.id }}'; }" role="link" tabindex="0"><td class="table__cell">{{ run.id }}</td><td class="table__cell">{{ run.status_display }}</td><td class="table__cell">{{ run.found_count }}</td><td class="table__cell">{{ run.parsed_count }}</td><td class="table__cell">{{ run.new_count }}</td><td class="table__cell">{{ run.error_count }}</td><td class="table__cell">{{ run.dismissed_count }}</td><td class="table__cell">{{ run.started_display }}</td></tr>
{% endfor %} {% endfor %}
</tbody> </tbody>
</table> </table>

View File

@@ -1,3 +1,3 @@
APP_VERSION = "0.4.0" APP_VERSION = "0.4.5"
FRONTEND_VERSION = "0.4.0" FRONTEND_VERSION = "0.4.5"
BACKEND_VERSION = "0.4.0" BACKEND_VERSION = "0.4.5"

View File

@@ -20,7 +20,7 @@ services:
environment: environment:
DATABASE_URL: postgresql+psycopg://${POSTGRES_USER:-miem}:${POSTGRES_PASSWORD:-miem_password}@postgres:5432/${POSTGRES_DB:-miem_workers} DATABASE_URL: postgresql+psycopg://${POSTGRES_USER:-miem}:${POSTGRES_PASSWORD:-miem_password}@postgres:5432/${POSTGRES_DB:-miem_workers}
ports: ports:
- "127.0.0.1:8000:8000" - "127.0.0.1:${API_PORT:-8000}:8000"
depends_on: depends_on:
postgres: postgres:
condition: service_healthy condition: service_healthy
@@ -42,33 +42,7 @@ services:
environment: environment:
DATABASE_URL: postgresql+psycopg://${POSTGRES_USER:-miem}:${POSTGRES_PASSWORD:-miem_password}@postgres:5432/${POSTGRES_DB:-miem_workers} DATABASE_URL: postgresql+psycopg://${POSTGRES_USER:-miem}:${POSTGRES_PASSWORD:-miem_password}@postgres:5432/${POSTGRES_DB:-miem_workers}
ports: ports:
- "127.0.0.1:8001:8000" - "127.0.0.1:${MCP_PORT:-8001}:8000"
depends_on:
postgres:
condition: service_healthy
keycloak:
image: quay.io/keycloak/keycloak:latest
container_name: keycloak
restart: unless-stopped
environment:
KC_DB: postgres
KC_DB_URL: jdbc:postgresql://postgres:5432/${KEYCLOAK_DB_NAME}
KC_DB_USERNAME: ${KEYCLOAK_DB_USER}
KC_DB_PASSWORD: ${KEYCLOAK_DB_PASSWORD}
KEYCLOAK_ADMIN: ${KEYCLOAK_ADMIN}
KEYCLOAK_ADMIN_PASSWORD: ${KEYCLOAK_ADMIN_PASSWORD}
KC_HTTP_ENABLED: true
KC_PROXY_HEADERS: xforwarded
KC_HOSTNAME: ${KEYCLOAK_HOSTNAME}
KC_HEALTH_ENABLED: true
KC_METRICS_ENABLED: true
command: start
ports:
- "127.0.0.1:8080:8080"
depends_on: depends_on:
postgres: postgres:
condition: service_healthy condition: service_healthy

View File

@@ -1,6 +1,6 @@
[project] [project]
name = "miem-workers" name = "miem-workers"
version = "0.4.0" version = "0.4.5"
description = "MIEM employees parser, admin API, and MCP server" description = "MIEM employees parser, admin API, and MCP server"
requires-python = ">=3.11" requires-python = ">=3.11"
dependencies = [ dependencies = [
@@ -12,7 +12,6 @@ dependencies = [
"lxml>=5.2.0", "lxml>=5.2.0",
"psycopg[binary]>=3.2.0", "psycopg[binary]>=3.2.0",
"pydantic-settings>=2.4.0", "pydantic-settings>=2.4.0",
"PyJWT[crypto]>=2.9.0",
"python-multipart>=0.0.9", "python-multipart>=0.0.9",
"requests>=2.32.0", "requests>=2.32.0",
"sqlalchemy>=2.0.32", "sqlalchemy>=2.0.32",

View File

@@ -6,7 +6,6 @@ jinja2>=3.1.4
lxml>=5.2.0 lxml>=5.2.0
psycopg[binary]>=3.2.0 psycopg[binary]>=3.2.0
pydantic-settings>=2.4.0 pydantic-settings>=2.4.0
PyJWT[crypto]>=2.9.0
python-multipart>=0.0.9 python-multipart>=0.0.9
requests>=2.32.0 requests>=2.32.0
sqlalchemy>=2.0.32 sqlalchemy>=2.0.32

View File

@@ -8,6 +8,7 @@ def test_base_navigation_is_russian_and_has_no_legacy_employees_link():
assert "Сотрудники" in template assert "Сотрудники" in template
assert "Запуски" in template assert "Запуски" in template
assert "Выйти" in template assert "Выйти" in template
assert '<a class="admin__brand-link" href="/admin">MIEM Employees</a>' in template
assert ">Employees<" not in template assert ">Employees<" not in template
assert "/admin/employees" not in template assert "/admin/employees" not in template
@@ -34,17 +35,59 @@ def test_admin_employees_route_redirects_to_directory():
assert 'RedirectResponse("/admin/directory", status_code=303)' in source assert 'RedirectResponse("/admin/directory", status_code=303)' in source
def test_dashboard_limits_latest_runs_to_five():
source = Path("app/admin.py").read_text(encoding="utf-8")
assert "order_by(desc(CrawlRun.started_at)).limit(5)" in source
assert "order_by(desc(CrawlRun.started_at)).limit(10)" not in source
def test_runs_template_links_to_run_detail(): def test_runs_template_links_to_run_detail():
template = Path("app/templates/runs.html").read_text(encoding="utf-8") template = Path("app/templates/runs.html").read_text(encoding="utf-8")
assert 'href="/admin/runs/{{ run.id }}"' in template assert 'onclick="window.location.href=\'/admin/runs/{{ run.id }}\'"' in template
assert "onkeydown=\"if (event.key === 'Enter' || event.key === ' ')" in template
assert 'role="link"' in template
assert 'tabindex="0"' in template
assert 'data-row-href="/admin/runs/{{ run.id }}"' not in template
assert '<a class="admin__link" href="/admin/runs/{{ run.id }}">' not in template
def test_run_detail_template_extends_base_and_shows_change_groups(): def test_run_detail_template_extends_base_and_shows_change_groups():
template = Path("app/templates/run_detail.html").read_text(encoding="utf-8") template = Path("app/templates/run_detail.html").read_text(encoding="utf-8")
assert '{% extends "base.html" %}' in template assert '{% extends "base.html" %}' in template
assert 'id="new-employees"' in template
assert "Новые сотрудники" in template assert "Новые сотрудники" in template
assert "Потеряшки" in template assert "Потеряшки" in template
assert "Уволенные" in template assert "Уволенные" in template
assert "Детализация сотрудников для этого запуска недоступна" in template assert "Детализация сотрудников для этого запуска недоступна" in template
def test_dashboard_metric_cards_link_to_admin_targets():
template = Path("app/templates/dashboard.html").read_text(encoding="utf-8")
assert 'href="/admin/directory"' in template
assert 'href="/admin/directory?status=active"' in template
assert '/admin/runs/{{ latest_run.id }}#new-employees' in template
assert 'href="/admin/directory?status=dismissed"' in template
assert 'href="/admin/runs"' in template
def test_dashboard_latest_run_rows_link_to_run_detail():
template = Path("app/templates/dashboard.html").read_text(encoding="utf-8")
assert 'onclick="window.location.href=\'/admin/runs/{{ run.id }}\'"' in template
assert "onkeydown=\"if (event.key === 'Enter' || event.key === ' ')" in template
assert 'role="link"' in template
assert 'tabindex="0"' in template
assert 'data-row-href="/admin/runs/{{ run.id }}"' not in template
assert '<a class="admin__link" href="/admin/runs/{{ run.id }}">' not in template
def test_admin_js_supports_keyboard_activation_for_clickable_rows():
source = Path("app/static/admin.js").read_text(encoding="utf-8")
assert 'addEventListener("keydown"' in source
assert '"Enter"' in source
assert '" "' in source

View File

@@ -1,15 +1,10 @@
import time
from datetime import datetime, timezone from datetime import datetime, timezone
from types import SimpleNamespace
import jwt
from fastapi.testclient import TestClient from fastapi.testclient import TestClient
from cryptography.hazmat.primitives.asymmetric import rsa
from sqlalchemy import create_engine from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import StaticPool from sqlalchemy.pool import StaticPool
import app.security as security
from app.config import Settings, get_settings from app.config import Settings, get_settings
from app.db import Base, get_db from app.db import Base, get_db
from app.main import app from app.main import app
@@ -23,10 +18,10 @@ def test_health_returns_versions():
response = client.get("/api/health") response = client.get("/api/health")
assert response.status_code == 200 assert response.status_code == 200
assert response.json()["backend_version"] == "0.4.0" assert response.json()["backend_version"] == "0.4.5"
def test_mcp_requires_token_and_lists_tools(): def test_mcp_lists_tools_without_auth_and_ignores_auth_header():
engine = create_engine( engine = create_engine(
"sqlite:///:memory:", "sqlite:///:memory:",
connect_args={"check_same_thread": False}, connect_args={"check_same_thread": False},
@@ -43,22 +38,20 @@ def test_mcp_requires_token_and_lists_tools():
session.close() session.close()
app.dependency_overrides[get_db] = override_db app.dependency_overrides[get_db] = override_db
app.dependency_overrides[get_settings] = lambda: Settings(
mcp_auth_mode="token", mcp_token="secret", session_secret="session-secret"
)
client = TestClient(app) client = TestClient(app)
unauthorized = client.post("/mcp", json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}) without_auth = client.post("/mcp", json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}})
authorized = client.post( with_auth = client.post(
"/mcp", "/mcp",
headers={"Authorization": "Bearer secret"}, headers={"Authorization": "Bearer anything"},
json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}, json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}},
) )
assert unauthorized.status_code == 401 assert without_auth.status_code == 200
assert authorized.status_code == 200 assert with_auth.status_code == 200
assert authorized.json()["result"]["tools"][0]["name"] == "search_employees" assert without_auth.json()["result"]["tools"][0]["name"] == "search_employees"
assert any(tool["name"] == "get_crawl_run_details" for tool in authorized.json()["result"]["tools"]) assert any(tool["name"] == "get_crawl_run_details" for tool in without_auth.json()["result"]["tools"])
assert with_auth.json()["result"]["tools"] == without_auth.json()["result"]["tools"]
app.dependency_overrides.clear() app.dependency_overrides.clear()
@@ -96,14 +89,10 @@ def test_mcp_search_employees_returns_matching_employee():
db.close() db.close()
app.dependency_overrides[get_db] = override_db app.dependency_overrides[get_db] = override_db
app.dependency_overrides[get_settings] = lambda: Settings(
mcp_auth_mode="token", mcp_token="secret", session_secret="session-secret"
)
client = TestClient(app) client = TestClient(app)
response = client.post( response = client.post(
"/mcp", "/mcp",
headers={"Authorization": "Bearer secret"},
json={ json={
"jsonrpc": "2.0", "jsonrpc": "2.0",
"id": 1, "id": 1,
@@ -164,14 +153,10 @@ def test_mcp_get_crawl_run_details_returns_changes():
db.close() db.close()
app.dependency_overrides[get_db] = override_db app.dependency_overrides[get_db] = override_db
app.dependency_overrides[get_settings] = lambda: Settings(
mcp_auth_mode="token", mcp_token="secret", session_secret="session-secret"
)
client = TestClient(app) client = TestClient(app)
response = client.post( response = client.post(
"/mcp", "/mcp",
headers={"Authorization": "Bearer secret"},
json={ json={
"jsonrpc": "2.0", "jsonrpc": "2.0",
"id": 1, "id": 1,
@@ -188,146 +173,12 @@ def test_mcp_get_crawl_run_details_returns_changes():
app.dependency_overrides.clear() app.dependency_overrides.clear()
def test_mcp_oauth_rejects_static_token(): def test_mcp_protected_resource_metadata_route_is_removed():
engine = create_engine(
"sqlite:///:memory:",
connect_args={"check_same_thread": False},
poolclass=StaticPool,
)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
def override_db():
session = Session()
try:
yield session
finally:
session.close()
settings = Settings(
mcp_auth_mode="oauth",
mcp_token="secret",
session_secret="session-secret",
mcp_oauth_issuer="https://auth.example.com",
mcp_oauth_audience="miem-mcp",
mcp_oauth_jwks_url="https://auth.example.com/.well-known/jwks.json",
)
app.dependency_overrides[get_db] = override_db
app.dependency_overrides[get_settings] = lambda: settings
client = TestClient(app)
response = client.post(
"/mcp",
headers={"Authorization": "Bearer secret"},
json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}},
)
assert response.status_code == 401
assert response.headers["www-authenticate"] == (
'Bearer resource_metadata="http://localhost:8001/.well-known/oauth-protected-resource"'
)
app.dependency_overrides.clear()
def test_mcp_oauth_missing_auth_returns_metadata_challenge():
settings = Settings(
mcp_auth_mode="oauth",
mcp_resource_url="https://api.example.com/mcp",
mcp_oauth_issuer="https://auth.example.com",
mcp_oauth_audience="miem-mcp",
mcp_oauth_jwks_url="https://auth.example.com/.well-known/jwks.json",
)
app.dependency_overrides[get_settings] = lambda: settings
client = TestClient(app)
response = client.post("/mcp", json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}})
assert response.status_code == 401
assert response.headers["www-authenticate"] == (
'Bearer resource_metadata="https://api.example.com/.well-known/oauth-protected-resource"'
)
app.dependency_overrides.clear()
def test_mcp_accepts_valid_oauth_jwt(monkeypatch):
public_key, token = _oauth_key_and_token()
monkeypatch.setattr(security, "_get_mcp_oauth_signing_key", lambda _token, _settings: SimpleNamespace(key=public_key))
app.dependency_overrides[get_settings] = lambda: _oauth_settings()
client = TestClient(app)
response = client.post(
"/mcp",
headers={"Authorization": f"Bearer {token}"},
json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}},
)
assert response.status_code == 200
assert response.json()["result"]["tools"][0]["name"] == "search_employees"
app.dependency_overrides.clear()
def test_mcp_rejects_invalid_oauth_jwts(monkeypatch):
public_key, expired_token = _oauth_key_and_token(exp=int(time.time()) - 60)
_, wrong_issuer_token = _oauth_key_and_token(issuer="https://other.example.com")
_, wrong_audience_token = _oauth_key_and_token(audience="other-audience")
_, bad_signature_token = _oauth_key_and_token(public_key=public_key)
monkeypatch.setattr(security, "_get_mcp_oauth_signing_key", lambda _token, _settings: SimpleNamespace(key=public_key))
app.dependency_overrides[get_settings] = lambda: _oauth_settings()
client = TestClient(app)
for token in [expired_token, wrong_issuer_token, wrong_audience_token, bad_signature_token]:
response = client.post(
"/mcp",
headers={"Authorization": f"Bearer {token}"},
json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}},
)
assert response.status_code == 401
app.dependency_overrides.clear()
def test_mcp_rejects_oauth_jwt_without_required_scope(monkeypatch):
public_key, token = _oauth_key_and_token(scope="profile")
monkeypatch.setattr(security, "_get_mcp_oauth_signing_key", lambda _token, _settings: SimpleNamespace(key=public_key))
app.dependency_overrides[get_settings] = lambda: _oauth_settings()
client = TestClient(app)
response = client.post(
"/mcp",
headers={"Authorization": f"Bearer {token}"},
json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}},
)
assert response.status_code == 403
app.dependency_overrides.clear()
def test_mcp_protected_resource_metadata_uses_settings():
settings = Settings(
mcp_resource_url="https://api.example.com/mcp",
mcp_oauth_issuer="https://auth.example.com/",
mcp_oauth_required_scope="mcp:tools",
)
app.dependency_overrides[get_settings] = lambda: settings
client = TestClient(app) client = TestClient(app)
response = client.get("/.well-known/oauth-protected-resource") response = client.get("/.well-known/oauth-protected-resource")
assert response.status_code == 200 assert response.status_code == 404
assert response.json() == {
"resource": "https://api.example.com/mcp",
"authorization_servers": ["https://auth.example.com"],
"bearer_methods_supported": ["header"],
"scopes_supported": ["mcp:tools"],
"resource_documentation": "https://api.example.com/mcp",
}
app.dependency_overrides.clear()
def test_api_employees_and_stats_require_admin_session(): def test_api_employees_and_stats_require_admin_session():
@@ -397,35 +248,3 @@ def test_api_employees_and_stats_require_admin_session():
assert run_details.json()["changes"]["new"][0]["full_name"] == "Alpha Person" assert run_details.json()["changes"]["new"][0]["full_name"] == "Alpha Person"
app.dependency_overrides.clear() app.dependency_overrides.clear()
def _oauth_settings() -> Settings:
return Settings(
mcp_auth_mode="oauth",
mcp_resource_url="https://api.example.com/mcp",
mcp_oauth_issuer="https://auth.example.com",
mcp_oauth_audience="miem-mcp",
mcp_oauth_jwks_url="https://auth.example.com/.well-known/jwks.json",
session_secret="session-secret",
)
def _oauth_key_and_token(
*,
issuer: str = "https://auth.example.com",
audience: str = "miem-mcp",
scope: str = "mcp:tools",
exp: int | None = None,
public_key=None,
):
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
claims = {
"iss": issuer,
"aud": audience,
"scope": scope,
"sub": "mcp-client",
"iat": int(time.time()),
"exp": exp or int(time.time()) + 300,
}
token = jwt.encode(claims, private_key, algorithm="RS256", headers={"kid": "test-key"})
return public_key or private_key.public_key(), token

View File

@@ -1,6 +1,3 @@
import pytest
from pydantic import ValidationError
from app.config import Settings from app.config import Settings
@@ -14,8 +11,3 @@ def test_numeric_crawl_limit_is_parsed():
settings = Settings(crawl_limit="25") settings = Settings(crawl_limit="25")
assert settings.crawl_limit == 25 assert settings.crawl_limit == 25
def test_mcp_auth_mode_rejects_oauth_or_token_fallback():
with pytest.raises(ValidationError):
Settings(mcp_auth_mode="oauth_or_token")