Codex — Setup Guide

Prerequisites

Tool	Version	Purpose
Python	3.10+	API and CLI
Docker + Docker Compose	any recent	Neo4j database
pip	any	Python dependencies

Quick start

1. docker-compose up -d          # start Neo4j
2. cp .env.example .env          # configure credentials
3. pip install -r requirements.txt
4. python api.py                 # start the API
5. python cli.py                 # open the CLI

Step 1 — Start Neo4j

docker-compose up -d

This starts Neo4j 5.18 with the APOC plugin (required for fuzzy matching) and exposes two ports:

Port	Use
7474	Neo4j Browser — http://localhost:7474 (login: `neo4j` / `changeme`)
7687	Bolt — used by the API

Wait ~30 seconds for Neo4j to finish initialising before starting the API. Watch progress with:

docker-compose logs -f neo4j
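
If you would rather script the wait than watch the logs, the sketch below polls an HTTP URL until it answers. It assumes that Neo4j's HTTP endpoint on port 7474 responds to `GET /` with status 200 once the server is up; the function name `wait_for_http` is illustrative and not part of Codex. The logs remain the authoritative signal.

```python
import http.client
import time
import urllib.request


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll `url` until it returns HTTP 200; return False if `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Connection refused, reset, or timed out: the server is not ready yet.
            pass
        time.sleep(interval)
    return False
```

For example, `wait_for_http("http://localhost:7474/")` returns `True` as soon as the endpoint responds, or `False` after 60 seconds.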

Step 2 — Configure credentials

cp .env.example .env

The defaults match docker-compose.yml exactly. Edit .env only if you have changed the Neo4j credentials.

Step 3 — Install Python dependencies

pip install -r requirements.txt

Step 4 — Start the API

python api.py

Or with auto-reload during development:

uvicorn api:app --reload

Swagger UI is available at http://localhost:8000/docs

Step 5 — Verify

python cli.py
codex> health

Expected output:

  API   : ✓ running  (v2.0.0)
  Neo4j : ✓ connected
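
The same check can be run without the CLI by calling `GET /health` directly. This is a minimal sketch: it assumes the API listens on `http://localhost:8000` (the address used for the Swagger UI) and that `/health` returns a JSON object. The response fields are not documented here, so the code only decodes the body and leaves interpretation to you. `get_health` is an illustrative name.

```python
import json
import urllib.request


def get_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> dict:
    """GET {base_url}/health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        # Assumption: the endpoint responds with a JSON object.
        return json.loads(resp.read().decode("utf-8"))
```

Calling `print(get_health())` while the API is running shows whatever status fields the endpoint reports; a connection error means the API is not up yet.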

Step 6 — Load drug data

Upload one CSV file per language/country combination:

codex> csv upload sample_data/drugs_en_US.csv
codex> csv upload sample_data/drugs_es_MX.csv

Verify what was loaded:

codex> languages          # shows: en, es
codex> countries          # shows: MX (Spanish), US (English)
codex> csv list           # full catalogue sorted alphabetically

Step 7 — Translate a drug

codex> translate
  Drug name (generic or brand): Advil
  Source language code (e.g. en, es, fr): en
  Target language code (e.g. es, uk, fr): es
  Country code (e.g. US, MX) or Enter to search all: MX

The command returns every matching brand name in the target language, sorted alphabetically:

  Brand Name      Generic Name    Original Language   Translated Language
  ──────────────────────────────────────────────────────────────────────
  Advil           Ibuprofeno      English             Spanish
  Anadvil         Ibuprofeno      English             Spanish

Both generic names and brand names are valid search inputs:

codex> translate → Ibuprofen / en → es   (same result as above)
codex> translate → Ibuprofeno / es → en  (reverse lookup)

CSV format

All drug data is managed via CSV files, with one row per (drug, brand, country) combination. A drug sold under several brands in the same country takes one row per brand.

Example — English (US):

DrugBank ID,Generic Name,Brand Name,Country,Source Language,Language Code
DB00001,Ibuprofen,Advil,US,English,en
DB00001,Ibuprofen,Motrin,US,English,en
DB00002,Acetaminophen,Tylenol,US,English,en

Example — Spanish (MX):

DrugBank ID,Generic Name,Brand Name,Country,Source Language,Language Code
DB00001,Ibuprofeno,Advil,MX,Spanish,es
DB00001,Ibuprofeno,Anadvil,MX,Spanish,es
DB00002,Paracetamol,Tempra,MX,Spanish,es

The DrugBank ID column is the cross-language key: the API uses it to match Ibuprofen (English/US) with Ibuprofeno (Spanish/MX) when translating. Files without a DrugBank ID column still import correctly, but cross-language translation then works only if the canonical names are identical in both languages.
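
To make the role of the DrugBank ID concrete, here is a small in-memory sketch of the lookup using the two sample files above. It is not the API's implementation (the API queries Neo4j and uses fuzzy matching); it uses exact, case-insensitive matching, and the function names are illustrative.

```python
import csv
import io

EN_US = """\
DrugBank ID,Generic Name,Brand Name,Country,Source Language,Language Code
DB00001,Ibuprofen,Advil,US,English,en
DB00001,Ibuprofen,Motrin,US,English,en
DB00002,Acetaminophen,Tylenol,US,English,en
"""

ES_MX = """\
DrugBank ID,Generic Name,Brand Name,Country,Source Language,Language Code
DB00001,Ibuprofeno,Advil,MX,Spanish,es
DB00001,Ibuprofeno,Anadvil,MX,Spanish,es
DB00002,Paracetamol,Tempra,MX,Spanish,es
"""


def load_rows(*csv_texts):
    """Parse one or more CSV documents into a single list of row dicts."""
    rows = []
    for text in csv_texts:
        rows.extend(csv.DictReader(io.StringIO(text)))
    return rows


def translate(rows, term, source_lang, target_lang, country=None):
    """Resolve term -> DrugBank IDs -> (brand, generic) pairs in the target language."""
    needle = term.casefold()
    # Step 1: IDs whose generic or brand name matches the term in the source language.
    ids = {
        r["DrugBank ID"]
        for r in rows
        if r["Language Code"] == source_lang
        and needle in (r["Generic Name"].casefold(), r["Brand Name"].casefold())
    }
    # Step 2: target-language rows that share one of those IDs.
    hits = {
        (r["Brand Name"], r["Generic Name"])
        for r in rows
        if r["DrugBank ID"] in ids
        and r["Language Code"] == target_lang
        and (country is None or r["Country"] == country)
    }
    return sorted(hits)
```

With `rows = load_rows(EN_US, ES_MX)`, the call `translate(rows, "Advil", "en", "es", "MX")` yields the Advil and Anadvil rows with the generic name Ibuprofeno, because all of them share `DB00001`.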

Adding a new language / country

Create a new CSV file following the format above.

Upload it:

codex> csv upload /path/to/drugs_fr_FR.csv

Verify:

codex> countries    # FR should now appear
codex> languages    # fr should now appear
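
Before uploading a new file, it can help to check it against the column layout shown in the CSV format section. The sketch below is a hypothetical pre-upload check, not something Codex ships. It assumes the five non-ID columns are mandatory and treats a missing DrugBank ID column as a warning, since such files still import.

```python
import csv

# Assumption: these columns are mandatory; DrugBank ID is optional but recommended.
REQUIRED = ["Generic Name", "Brand Name", "Country", "Source Language", "Language Code"]
KEY_COLUMN = "DrugBank ID"


def check_codex_csv(lines):
    """Return a list of problems found in an iterable of CSV lines (empty list = OK)."""
    reader = csv.DictReader(lines)
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED if c not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    columns = list(REQUIRED)
    if KEY_COLUMN in header:
        columns.insert(0, KEY_COLUMN)
    else:
        problems.append(f"warning: no {KEY_COLUMN} column")

    for row in reader:
        for col in columns:
            # Short rows yield None for absent cells, so normalise before stripping.
            if not (row.get(col) or "").strip():
                problems.append(f"line {reader.line_num}: empty {col}")
    return problems
```

To check a file on disk, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `check_codex_csv`.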

JSON envelope format

All API responses that return tabular data use this envelope:

{
  "metadata": {
    "generated_at":  "<ISO-8601 UTC timestamp>",
    "row_count":     10,
    "source":        "<filename | 'neo4j' | 'translate'>",
    "sort_order":    "generic_name_asc"
  },
  "csv": [ ... ]
}
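
A client can unwrap the envelope with a few structural checks. The sketch below assumes that `row_count` equals the number of entries in `csv`, which the envelope suggests but this guide does not state. It makes no assumption about the shape of each row. `rows_from_envelope` is an illustrative name.

```python
METADATA_KEYS = ("generated_at", "row_count", "source", "sort_order")


def rows_from_envelope(payload: dict) -> list:
    """Validate the envelope structure and return the entries under "csv"."""
    meta = payload["metadata"]
    rows = payload["csv"]
    for key in METADATA_KEYS:
        if key not in meta:
            raise ValueError(f"metadata is missing {key!r}")
    # Assumption: row_count mirrors the length of the "csv" array.
    if meta["row_count"] != len(rows):
        raise ValueError(f"row_count is {meta['row_count']} but csv has {len(rows)} entries")
    return rows
```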

API reference

Method	Path	Description
GET	`/health`	Liveness check — API + Neo4j
POST	`/translate`	Translate a drug name between languages
GET	`/audit/{term}`	Quality audit (missing translations / brands)
GET	`/languages`	List language codes loaded in Neo4j
GET	`/countries`	List supported countries and their languages
POST	`/csv/upload`	Upload a Codex CSV → Neo4j
GET	`/csv`	Export full drug catalogue (sorted)

POST /translate — request body

{
  "term":        "Advil",
  "source_lang": "en",
  "target_lang": "es",
  "country":     "MX"
}

term accepts both generic names (Ibuprofen) and brand names (Advil). country is optional — omit it to return results across all countries.
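
As a sketch of calling the endpoint from Python with only the standard library: the base URL `http://localhost:8000` is taken from the Swagger UI address above, and the helper names are illustrative. The `country` field is left out of the body when not supplied, matching the optional behaviour described above.

```python
import json
import urllib.request


def build_translate_request(term, source_lang, target_lang, country=None,
                            base_url="http://localhost:8000"):
    """Build (but do not send) a POST /translate request."""
    body = {"term": term, "source_lang": source_lang, "target_lang": target_lang}
    if country is not None:
        body["country"] = country  # omitted entirely to search all countries
    return urllib.request.Request(
        f"{base_url}/translate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_translate(term, source_lang, target_lang, country=None, timeout=10.0):
    """Send the request and return the decoded JSON response."""
    req = build_translate_request(term, source_lang, target_lang, country)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the API running, `post_translate("Advil", "en", "es", "MX")` sends the request body shown above.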

Stopping Neo4j

docker-compose down          # stop (data is preserved)
docker-compose down -v       # stop and wipe all data
