The JSCPD Server provides a RESTful API for detecting code duplications. It scans a codebase on startup and allows clients to check code snippets for duplications against the scanned codebase.
# Start server in current directory
jscpd-server
# Start server in specific directory
jscpd-server /path/to/project
# Start server on specific port
jscpd-server . --port 8080
# Start server with custom host
jscpd-server . --host localhost --port 3000
# Start server with persistent storage (recommended for production)
jscpd-server . --store leveldb-p, --port [number]- Port to run the server on (Default: 3000)-H, --host [string]- Host to bind the server to (Default: 0.0.0.0)
--store [string]- Persistent store for codebase data (e.g.,leveldbfor disk persistence). Without this option, the server uses in-memory storage which is lost on restart-c, --config [string]- Path to config file (Default is .jscpd.json in )-f, --format [string]- Format or formats separated by comma-i, --ignore [string]- Glob pattern for files to exclude--ignore-pattern [string]- Ignore code blocks matching regexp patterns-l, --min-lines [number]- Min size of duplication in code lines-k, --min-tokens [number]- Min size of duplication in code tokens-x, --max-lines [number]- Max size of source in lines-z, --max-size [string]- Max size of source in bytes-m, --mode [string]- Mode of quality of search (strict, mild, weak)-a, --absolute- Use absolute path in reports-n, --noSymlinks- Don't use symlinks for detection--ignoreCase- Ignore case of symbols in code-g, --gitignore- Ignore all files from .gitignore file--skipLocal- Skip duplicates in local folders
When running locally: http://localhost:3000/api
All API responses return Content-Type: application/json. The server accepts both application/json and application/x-www-form-urlencoded request bodies.
The server accepts request bodies up to 10MB in size.
Currently, no authentication is required. Future versions may add authentication support.
Check a code snippet for duplications against the scanned codebase.
Endpoint: POST /api/check
Request Headers:
Content-Type: application/jsonRequest Body:
{
"code": "string (required)",
"format": "string (required)"
}Parameters:
code(required, string): The code snippet to check for duplications. This field cannot be empty or whitespace-only.format(required, string): Programming language/format of the snippet (e.g., "javascript", "python", "java")
Example Request:
curl -X POST http://localhost:3000/api/check \
-H "Content-Type: application/json" \
-d '{
"code": "function hello() {\n console.log(\"Hello, World!\");\n}",
"format": "javascript"
}'Success Response (200 OK):
{
"duplications": [
{
"snippetLocation": {
"startLine": 1,
"endLine": 5,
"startColumn": 0,
"endColumn": 20
},
"codebaseLocation": {
"file": "src/utils/helper.js",
"startLine": 10,
"endLine": 14,
"startColumn": 0,
"endColumn": 20,
"fragment": "function hello() {\n console.log(\"Hello, World!\");\n}"
},
"linesCount": 4
}
],
"statistics": {
"totalDuplications": 1,
"duplicatedLines": 4,
"totalLines": 5,
"percentageDuplicated": 80.0
}
}"message": "Missing required field: code", "statusCode": 400 }
**400 Bad Request** - Validation error (wrong type):
```json
{
"error": "ValidationError",
"message": "Field \"code\" must be a string",
"statusCode": 400
}
400 Bad Request - Validation error (empty code):
{
"error": "ValidationError",
"message": "Field \"code\" cannot be empty",
"statusCode": 400
}400 Bad Request - Server not initialized:
{
"error": "Error",
"message": "Server not initialized. Please wait for initial scan to complete.",
"statusCode": 400
}400 Bad Request - Processing error:
{
"error": "CheckError",
"message": "Error details...",
"statusCode": 400
}The /api/check endpoint implements request isolation to ensure:
-
No Cross-Request Contamination: Snippet tokens are isolated per request and never stored in the shared project store. Each request compares only against the scanned codebase, not against tokens from previous snippet checks.
-
Automatic Cleanup: Snippet token data is automatically discarded when the request completes (via finally block), preventing unbounded memory growth.
-
Concurrent Request Safety: Multiple concurrent snippet checks are isolated from each other. The same snippet checked simultaneously will produce identical results without interference.
-
Consistent Results: Checking the same snippet multiple times will always produce identical duplication reports, as snippet tokens don't persist between requests.
Implementation Details:
The server uses an ephemeral hybrid store for each snippet check:
- Reads are delegated to the shared project store (to detect duplications against the codebase)
- Writes (snippet tokens) go to a temporary in-memory store
- The temporary store is discarded after the request completes
This architecture ensures snippet detection remains stateless and memory-safe while still detecting duplications against the full project.
Get overall duplication statistics for the scanned codebase.
Endpoint: GET /api/stats
Request Headers: None required
Example Request:
curl http://localhost:3000/api/statsSuccess Response (200 OK):
The response includes the full statistics object from the initial scan. The structure follows the standard jscpd statistics format.
{
"statistics": {
"detectionDate": "2025-11-17T10:30:00.000Z",
"total": {
"lines": 10000,
"tokens": 50000,
"sources": 50,
"duplicatedLines": 500,
"duplicatedTokens": 2500,
"clones": 10,
"percentage": 5.0,
"percentageTokens": 5.0,
"newDuplicatedLines": 0,
"newClones": 0
},
"formats": {
"javascript": {
"total": {
"lines": 5000,
"tokens": 25000,
"sources": 30,
"duplicatedLines": 300,
"duplicatedTokens": 1500,
"clones": 6,
"percentage": 6.0,
"percentageTokens": 6.0,
"newDuplicatedLines": 0,
"newClones": 0
},
"sources": {
"src/file1.js": {
"lines": 100,
"tokens": 500,
"sources": 1,
"duplicatedLines": 10,
"duplicatedTokens": 50,
"clones": 1,
"percentage": 10.0,
"percentageTokens": 10.0,
"newDuplicatedLines": 0,
"newClones": 0
}
}
}
}
},
"timestamp": "2025-11-17T10:30:00.000Z"
}Error Responses:
503 Service Unavailable - Statistics not ready:
{
"error": "NotReady",
"message": "Statistics not available yet. Server is still initializing.",
"statusCode": 503
}Check server health and initialization status.
Endpoint: GET /api/health
Request Headers: None required
Example Request:
curl http://localhost:3000/api/healthSuccess Response (200 OK):
{
"status": "ready",
"workingDirectory": "/path/to/project",
"lastScanTime": "2025-11-17T10:30:00.000Z"
}Status Values:
initializing- Server is scanning the codebaseready- Server is ready to accept requests
Get information about the API and available endpoints.
Endpoint: GET /
Request Headers: None required
Example Request:
curl http://localhost:3000/Success Response (200 OK):
{
"name": "jscpd-server",
"version": "1.0.0",
"endpoints": {
"POST /api/check": "Check code snippet for duplications",
"GET /api/stats": "Get overall project statistics",
"GET /api/health": "Server health check"
},
"documentation": "https://github.com/kucherenko/jscpd"
}The server also supports the Model Context Protocol (MCP), allowing it to fit into LLM-based workflows.
Endpoint: POST /mcp
The server handles MCP requests via the /mcp endpoint using the protocol's streamable HTTP transport.
To use with Claude Desktop, add the following to your claude_desktop_config.json:
{
"mcpServers": {
"jscpd": {
"type": "streamable-http",
"url": "http://localhost:3000/mcp"
}
}
}Ensure jscpd-server is running (e.g., npx jscpd-server . --port 3000) before starting Claude Desktop.
Check code snippet for duplications against the codebase.
- Input:
code(string): Source code snippet.format(string): Format/language of the code.
Get overall project duplication statistics.
- Input: None
Trigger a re-scan of the current working directory for duplications.
- Input: None
{
duplications: Array<{
snippetLocation: {
startLine: number;
endLine: number;
startColumn: number;
endColumn: number;
};
codebaseLocation: {
file: string;
startLine: number;
endLine: number;
startColumn: number;
endColumn: number;
fragment?: string;
};
linesCount: number;
}>;
statistics: {
totalDuplications: number;
duplicatedLines: number;
totalLines: number;
percentageDuplicated: number;
};
}{
error: string;
message: string;
statusCode: number;
}The server supports all languages that jscpd supports. When checking a snippet, specify the language using the format parameter with the language name or file extension.
Common format identifiers:
- JavaScript:
javascript,js - TypeScript:
typescript,ts - Python:
python,py - Java:
java - C/C++:
c,cpp - C#:
csharp,cs - PHP:
php - Ruby:
ruby,rb - Go:
go - Rust:
rust,rs
For a complete list, see the supported formats documentation.
All errors follow a consistent format with an HTTP status code and JSON body:
{
"error": "ErrorType",
"message": "Human-readable error message",
"statusCode": 400
}ValidationError(400) - Invalid request parameters (missing fields, wrong types, empty values)CheckError(400) - Error processing the check requestError(400) - Server not initialized or general errorsNotReady(503) - Statistics not available yet (server still initializing)NotFound(404) - Endpoint not foundStatsError(500) - Error retrieving statisticsInternalServerError(500) - Unexpected server error
Currently, no rate limiting is implemented. Consider implementing rate limiting in production environments.
By default, the server uses in-memory storage (MemoryStore) for the codebase scan results. This means:
- ✅ Fast - All data is in memory
- ❌ Volatile - Data is lost when the server restarts
- ❌ Memory-intensive - Large codebases consume significant RAM
With --store leveldb, the server uses disk-based storage (LevelDB):
- ✅ Persistent - Data survives server restarts
- ✅ Memory-efficient - Data is stored on disk
- ✅ No re-scanning - Server starts immediately with cached data
⚠️ Slightly slower - Disk I/O overhead (minimal impact)
# Install the LevelDB store package
npm install @jscpd/leveldb-store# Start server with LevelDB persistence
jscpd server /path/to/project --store leveldb --port 3000
# On first start, the server will:
# 1. Scan the codebase
# 2. Store results in .jscpd/ directory
# 3. Accept requests
# On subsequent restarts, the server will:
# 1. Load cached data from .jscpd/ directory
# 2. Accept requests immediately (no re-scan needed)LevelDB stores data in the .jscpd/ directory relative to where the server is started. This directory contains:
- Token databases for each file format
- Duplication detection data
Important: Add .jscpd/ to your .gitignore file:
echo ".jscpd/" >> .gitignoreTo clear cached data and force a fresh scan:
# Stop the server, then:
rm -rf .jscpd/const axios = require('axios');
const checkCode = async () => {
try {
const response = await axios.post('http://localhost:3000/api/check', {
code: `
function calculateSum(a, b) {
return a + b;
}
`,
format: 'javascript'
});
console.log('Duplications found:', response.data.duplications.length);
console.log('Percentage duplicated:', response.data.statistics.percentageDuplicated + '%');
} catch (error) {
console.error('Error:', error.response.data);
}
};
checkCode();import requests
def check_code():
url = 'http://localhost:3000/api/check'
payload = {
'code': '''
def hello_world():
print("Hello, World!")
''',
'format': 'python'
}
response = requests.post(url, json=payload)
if response.status_code == 200:
data = response.json()
print(f"Duplications found: {len(data['duplications'])}")
print(f"Percentage duplicated: {data['statistics']['percentageDuplicated']}%")
else:
print(f"Error: {response.json()}")
check_code()#!/bin/bash
# Get statistics
curl -s http://localhost:3000/api/stats | jq '.statistics.total'
# Output:
# {
# "lines": 10000,
# "tokens": 50000,
# "sources": 50,
# "duplicatedLines": 500,
# "duplicatedTokens": 2500,
# "clones": 10,
# "percentage": 5.0,
# ...
# }name: Check Code Duplication
on: [push, pull_request]
jobs:
check-duplication:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Setup Node.js
uses: actions/setup-node@v2
with:
node-version: '18'
- name: Install jscpd
run: npm install -g jscpd
- name: Start jscpd server
run: |
jscpd server . --port 3000 &
sleep 10 # Wait for server to initialize
- name: Check new code
run: |
# Check files changed in this PR
for file in $(git diff --name-only HEAD~1); do
if [[ -f "$file" ]]; then
# Extract file extension to determine format
ext="${file##*.}"
curl -X POST http://localhost:3000/api/check \
-H "Content-Type: application/json" \
-d "{\"code\": \"$(cat $file | jq -Rs .)\", \"format\": \"$ext\"}" \
| jq .
fi
done-
Initialize Once: The server scans the codebase on startup. For large codebases, this may take time. Check the
/api/healthendpoint to verify the server is ready before sending requests. -
Format Parameter: Always provide the
formatparameter with a valid language identifier or file extension. -
Snippet Size: Large snippets take longer to process. Maximum body size is 10MB.
-
Error Handling: Always handle errors appropriately, especially 400 errors when the server is not initialized or validation fails.
-
Production Use: For production use, consider:
- Using persistent storage with
--store leveldbto avoid rescanning on restarts - Adding authentication
- Implementing rate limiting
- Using a reverse proxy (nginx, Apache)
- Monitoring and logging
- Running behind HTTPS
- Using persistent storage with
- Port already in use: Try a different port with
--port - Permission denied: Use a port above 1024 or run with appropriate permissions
- Ensure
formatis correctly specified with a valid language identifier - Check that the codebase was successfully scanned (check server logs)
- Verify minimum thresholds (
--min-lines,--min-tokens)
- Wait for the initial scan to complete
- Check
/api/healthendpoint - status should bereadynotinitializing
- Ensure
codefield is a non-empty string - Ensure
formatfield is provided and is a string - Check that the request body is valid JSON
- Initial Scan Time: Depends on codebase size. Large codebases may take several minutes.
- Memory Usage: The server keeps the scanned codebase in memory by default. Monitor memory usage for large projects. Snippet checks use ephemeral stores that are automatically garbage-collected, preventing unbounded memory growth from snippet tokens.
- Check Response Time: Typically < 1 second for small snippets, longer for larger snippets.
- Concurrent Requests: Snippet checks are isolated and thread-safe. Multiple concurrent requests do not interfere with each other.
- Persistent Storage: Use
--store leveldbfor disk-based persistence. This allows the server to survive restarts without rescanning the codebase. LevelDB is recommended for large repositories. To use LevelDB, ensure@jscpd/leveldb-storeis installed:npm install @jscpd/leveldb-store
For issues, questions, or feature requests, please visit:
- GitHub Issues: https://github.com/kucherenko/jscpd/issues
- Documentation: https://github.com/kucherenko/jscpd