Web Validation

Comprehensive website validation and risk assessment. Analyzes domains for legitimacy, security, content quality, sanctions, adverse media, and policy compliance.

Endpoint

POST /kyc/web-validation

Authentication

Requires web-validation:create permission. Include your Bearer token in the Authorization header.

Description

The web validation endpoint runs the following checks against a website:
  1. URL & Domain Analysis: Validates URL structure, redirects, and domain registration (WHOIS)
  2. SSL/TLS Verification: Checks certificate validity, expiration, and security
  3. Industry Classification: AI-powered industry detection and risk classification
  4. Name/Industry Matching: Compares declared business info against website content
  5. Sanction List Check: Screens domain against global sanctions lists
  6. Adverse Media Analysis: Searches for negative news and fraud reports
  7. Social Media Validation: Verifies social media links are functional
  8. Tranco Rank Check: Evaluates website popularity/reputation
  9. Policy Compliance: Analyzes Terms, Return, Refund, and AML policies

Request Body Parameters

Request Examples

Basic Validation

curl -X POST https://stg.kyc.legaltalent.ai/kyc/web-validation \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com"
  }'

Validation Linked to a Customer

curl -X POST https://stg.kyc.legaltalent.ai/kyc/web-validation \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example-shop.com",
    "client_id": "your-external-id-123"
  }'

When client_id is provided, the response includes a customer_id field. If no customer with that client_id exists, one is created automatically. Use the same client_id across services (web validation, adverse media, sessions) to aggregate all results for one customer.

Full Validation with Business Details

curl -X POST https://stg.kyc.legaltalent.ai/kyc/web-validation \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example-shop.com",
    "client_id": "merchant-456",
    "declared_name": "Example Shop Inc",
    "declared_industry": "E-commerce",
    "include_adverse_media": true,
    "include_sanction_check": true,
    "include_social_links": true,
    "include_tranco_rank": true,
    "include_policies": true,
    "policy_types": ["terms", "return", "refund"]
  }'

With JavaScript Rendering

curl -X POST https://stg.kyc.legaltalent.ai/kyc/web-validation \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://spa-website.com",
    "enable_js_render": true
  }'
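
The request examples above use a small set of body fields. As a rough summary, the sketch below collects them in a typed payload and rejects anything else. The field names come from the examples on this page; the types are inferred from the example values, and `WebValidationRequest` and `build_payload` are hypothetical helper names, not part of the API.

```python
from typing import List, TypedDict, get_type_hints


class WebValidationRequest(TypedDict, total=False):
    """Fields seen in the request examples; types are inferred, not official."""
    url: str
    client_id: str
    declared_name: str
    declared_industry: str
    include_adverse_media: bool
    include_sanction_check: bool
    include_social_links: bool
    include_tranco_rank: bool
    include_policies: bool
    policy_types: List[str]
    enable_js_render: bool


def build_payload(url: str, **options) -> WebValidationRequest:
    """Build a request body, rejecting unknown fields and non-HTTP(S) URLs."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("URL must start with http:// or https://")
    unknown = set(options) - set(get_type_hints(WebValidationRequest))
    if unknown:
        raise ValueError(f"Unknown request fields: {sorted(unknown)}")
    payload = WebValidationRequest(url=url)
    payload.update(options)
    return payload


payload = build_payload("https://spa-website.com", enable_js_render=True)
```

Checking the URL scheme on the client mirrors the 400 error documented for this endpoint and avoids a round trip for an obviously invalid request.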

Response Format

Success Response

{
  "customer_id": "38ff436f-851e-4f9a-89fd-6eb19a02ba75",
  "result": {
    "reliability_score": 78.5,
    "validation_results": {
      "url_checker": true,
      "code_response": 200,
      "last_page_url": "https://example.com/",
      "social_media": false,
      "social_media_allowed": true,
      
      "ssl_checker": {
        "has_ssl": true,
        "is_valid": true,
        "issuer": "Let's Encrypt",
        "expiry_date": "2025-03-15",
        "days_until_expiry": 90
      },
      
      "whois_info": {
        "domain": "example.com",
        "registrar": "GoDaddy",
        "creation_date": "2010-01-15",
        "expiration_date": "2026-01-15",
        "domain_age_days": 5478,
        "registrant_country": "US"
      },
      
      "classifier": {
        "industry_name": "E-commerce",
        "industry_id": 12,
        "allowed_industry": true,
        "flagged_industry": false,
        "criterion": "Online retail store selling electronics"
      },
      
      "kyc_matches": {
        "declared_name_match": true,
        "declared_industry_match": true,
        "name_confidence": 0.92,
        "industry_confidence": 0.88
      },
      
      "page_status_classifier": {
        "page_status": "AVAILABLE",
        "has_content": true
      },
      
      "sanction_check": {
        "has_matches": false,
        "matches": [],
        "checked_lists": ["OFAC", "EU", "UN"]
      },
      
      "adverse_media": {
        "has_adverse_media": false,
        "risk_score": 15,
        "decision": "CLEAR",
        "sources": []
      },
      
      "social_media_links": {
        "social_links_found": true,
        "valid_count": 3,
        "invalid_count": 0,
        "links": [
          {"platform": "facebook", "url": "https://facebook.com/example", "valid": true},
          {"platform": "twitter", "url": "https://twitter.com/example", "valid": true},
          {"platform": "instagram", "url": "https://instagram.com/example", "valid": true}
        ]
      },
      
      "tranco_rank": {
        "rank": 45000,
        "in_top_million": true,
        "band": "top_100k",
        "score": 0.8
      },
      
      "policies": {
        "terms": {
          "status": "policy_analyzed",
          "url": "https://example.com/terms",
          "generic_format": false,
          "includes_is_of_age": true,
          "has_applicable_law": true,
          "dos_clause": true,
          "termination_clause": true,
          "privacy_policy": true
        },
        "return": {
          "status": "policy_analyzed",
          "url": "https://example.com/return-policy",
          "generic_format": false
        },
        "refund": {
          "status": "policy_analyzed",
          "url": "https://example.com/refund-policy",
          "generic_format": false
        },
        "execution_time_ms": 3200
      },
      
      "total_cost": 0.0023,
      "concurrent_time_taken_ms": 4500
    }
  }
}

Response Fields

Top Level

| Field | Type | Description |
| --- | --- | --- |
| customer_id | string | Internal customer UUID. Only present when client_id was provided in the request. |
| result.reliability_score | number | Overall reliability score (0-100) |
| result.validation_results | object | Detailed results from all validators |

URL & Basic Checks

| Field | Type | Description |
| --- | --- | --- |
| url_checker | boolean | Whether URL is valid and accessible |
| code_response | integer | HTTP response code |
| last_page_url | string | Final URL after redirects |
| social_media | boolean | Whether URL is a social media site |
| social_media_allowed | boolean | Whether social media URLs are allowed |

SSL Certificate

| Field | Type | Description |
| --- | --- | --- |
| has_ssl | boolean | Whether site has SSL |
| is_valid | boolean | Whether certificate is valid |
| issuer | string | Certificate issuer |
| expiry_date | string | Certificate expiration date |
| days_until_expiry | integer | Days until certificate expires |

WHOIS Information

| Field | Type | Description |
| --- | --- | --- |
| domain | string | Domain name |
| registrar | string | Domain registrar |
| creation_date | string | Domain creation date |
| expiration_date | string | Domain expiration date |
| domain_age_days | integer | Age of domain in days |
| registrant_country | string | Country of registrant |

Industry Classification

| Field | Type | Description |
| --- | --- | --- |
| industry_name | string | Detected industry name |
| industry_id | integer | Industry identifier |
| allowed_industry | boolean | Whether industry is allowed |
| flagged_industry | boolean | Whether industry is flagged for extra scrutiny (only true when allowed_industry is also true) |
| criterion | string | Classification reasoning |
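
The two industry flags combine into the three-valued Industry criterion listed under Reliability Score Calculation (0 = blocked, 0.5 = flagged, 1 = allowed). A minimal sketch of that mapping, assuming the flags are read exactly as described in the table above; `industry_score` is a hypothetical helper name:

```python
def industry_score(allowed_industry: bool, flagged_industry: bool) -> float:
    """Map the classifier flags to the documented 0 / 0.5 / 1 industry score."""
    if not allowed_industry:
        return 0.0  # blocked industry
    # flagged_industry is only meaningful when the industry is allowed
    return 0.5 if flagged_industry else 1.0


classifier = {"allowed_industry": True, "flagged_industry": False}
score = industry_score(classifier["allowed_industry"], classifier["flagged_industry"])
```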

KYC Matches

| Field | Type | Description |
| --- | --- | --- |
| declared_name_match | boolean | Whether declared name matches |
| declared_industry_match | boolean | Whether declared industry matches |
| name_confidence | number | Name match confidence (0-1) |
| industry_confidence | number | Industry match confidence (0-1) |

Sanction Check

| Field | Type | Description |
| --- | --- | --- |
| has_matches | boolean | Whether sanctions matches found |
| matches | array | List of sanction matches |
| checked_lists | array | Lists that were checked |

Adverse Media

| Field | Type | Description |
| --- | --- | --- |
| has_adverse_media | boolean | Whether adverse media found |
| risk_score | integer | Risk score (0-100) |
| decision | string | CLEAR, LOW_RISK, MEDIUM_RISK, HIGH_RISK |
| sources | array | Adverse media sources found |

Social Media Links

| Field | Type | Description |
| --- | --- | --- |
| social_links_found | boolean | Whether social links found |
| valid_count | integer | Number of valid links |
| invalid_count | integer | Number of broken links |
| links | array | Detailed link information |

Tranco Rank

| Field | Type | Description |
| --- | --- | --- |
| rank | integer | Tranco ranking position |
| in_top_million | boolean | Whether in top 1M sites |
| band | string | Ranking band (top_1k, top_10k, etc.) |
| score | number | Popularity score (0-1) |

Policies

| Field | Type | Description |
| --- | --- | --- |
| terms | object | Terms & Conditions analysis |
| return | object | Return policy analysis |
| refund | object | Refund policy analysis |
| aml | object | AML policy analysis |
| execution_time_ms | integer | Policy analysis time |

Reliability Score Calculation

The reliability score is calculated as a weighted average of individual validation scores:

| Criterion | Description | Score Range |
| --- | --- | --- |
| SSL | Certificate validity | 0-1 |
| Industry | Allowed industry classification (0 = blocked, 0.5 = flagged, 1 = allowed) | 0, 0.5, or 1 |
| Name Match | Declared name verification | 0 or 1 |
| Industry Match | Declared industry verification | 0 or 1 |
| Sanction Check | No sanctions matches | 0 or 1 |
| Adverse Media | Inverse of risk score | 0-1 |
| Social Links | Ratio of valid links | 0-1 |
| Tranco Rank | Popularity score by band | 0.2-1 |
| Terms Policy | Policy quality score | 0-1 |
| Return Policy | Policy presence score | 0-1 |
| Refund Policy | Policy presence score | 0-1 |
| AML Policy | Policy quality score | 0-1 |
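
To make the weighted average concrete, here is a minimal sketch of how per-criterion scores could be combined. The actual weights are not published on this page (they are part of the tenant's risk matrix), so the weights below are purely illustrative. Skipping criteria that were not evaluated and scaling the result to 0-100 are also assumptions, and `reliability_score` is a hypothetical helper name.

```python
from typing import Mapping


def reliability_score(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of criterion scores in [0, 1], scaled to 0-100."""
    # Assumption: criteria that were not run simply drop out of the average.
    used = {name: w for name, w in weights.items() if name in scores and w > 0}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("No weighted criteria to score")
    weighted_sum = sum(scores[name] * w for name, w in used.items())
    return round(100 * weighted_sum / total_weight, 1)


# Illustrative inputs only: equal weights, five criteria evaluated.
scores = {"ssl": 1.0, "industry": 1.0, "sanction_check": 1.0,
          "adverse_media": 0.85, "tranco_rank": 0.8}
weights = {name: 1.0 for name in scores}
example = reliability_score(scores, weights)
```

With equal weights the result is the plain mean of the five scores times 100; raising the weight of one criterion pulls the total toward that criterion's score.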

Tranco Rank Bands

| Band | Rank Range | Score |
| --- | --- | --- |
| top_1k | 1 - 1,000 | 1.0 |
| top_10k | 1,001 - 10,000 | 0.9 |
| top_100k | 10,001 - 100,000 | 0.8 |
| top_500k | 100,001 - 500,000 | 0.6 |
| top_1m | 500,001 - 1,000,000 | 0.4 |
| not_ranked | > 1,000,000 | 0.2 |
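
The band table translates directly into a lookup. The sketch below reproduces it client-side, which can be handy for sanity-checking the band and score fields in a response. Treating a missing rank as not_ranked is an assumption, and `tranco_band` is a hypothetical helper name.

```python
from typing import Optional, Tuple

# (upper bound of rank range, band name, score), taken from the table above
TRANCO_BANDS = [
    (1_000, "top_1k", 1.0),
    (10_000, "top_10k", 0.9),
    (100_000, "top_100k", 0.8),
    (500_000, "top_500k", 0.6),
    (1_000_000, "top_1m", 0.4),
]


def tranco_band(rank: Optional[int]) -> Tuple[str, float]:
    """Return (band, score) for a Tranco rank."""
    # Assumption: a missing or non-positive rank counts as not ranked.
    if rank is None or rank < 1:
        return ("not_ranked", 0.2)
    for upper_bound, band, score in TRANCO_BANDS:
        if rank <= upper_bound:
            return (band, score)
    return ("not_ranked", 0.2)


band, score = tranco_band(45000)  # the rank used in the sample response
```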

Error Responses

400 Bad Request - Invalid URL

{
  "error": "Validation error: URL must start with http:// or https://"
}

400 Bad Request - Missing URL

{
  "error": "Validation error: \"url\" parameter is required"
}

403 Forbidden - Permission Denied

{
  "error": "Permission denied: web-validation:create required"
}

500 Internal Server Error

{
  "error": "Internal server error"
}

Status Codes

| Code | Description |
| --- | --- |
| 200 | Success - Validation completed |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Missing or invalid token |
| 403 | Forbidden - Insufficient permissions |
| 500 | Internal Server Error |
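
A client can turn these status codes and the `error` field of the body into a single exception type. The sketch below is one way to do that, not an official client: `WebValidationError` and `error_from_response` are hypothetical names, and treating only 5xx responses as retryable is an assumption.

```python
from typing import Optional

# Fallback messages from the status code table, used when the body has no "error" field.
DEFAULT_MESSAGES = {
    400: "Bad Request - Invalid parameters",
    401: "Unauthorized - Missing or invalid token",
    403: "Forbidden - Insufficient permissions",
    500: "Internal Server Error",
}


class WebValidationError(Exception):
    """Hypothetical exception carrying the status code and API error message."""

    def __init__(self, status_code: int, message: str, retryable: bool):
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code
        self.message = message
        self.retryable = retryable


def error_from_response(status_code: int, body: object) -> Optional[WebValidationError]:
    """Return None for 200, otherwise an exception describing the failure."""
    if status_code == 200:
        return None
    message = body.get("error") if isinstance(body, dict) else None
    if not message:
        message = DEFAULT_MESSAGES.get(status_code, "Unexpected response")
    # Assumption: only server-side failures (5xx) are worth retrying.
    return WebValidationError(status_code, message, retryable=status_code >= 500)


err = error_from_response(403, {"error": "Permission denied: web-validation:create required"})
```

Keeping the parsing separate from the HTTP call makes it usable with any client library: pass in the status code and the decoded JSON body (or None if decoding failed).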

Usage Examples

Python - Full Validation

import requests

BASE_URL = "https://stg.kyc.legaltalent.ai"
TOKEN = "YOUR_TOKEN"

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json"
}

def validate_website(url, declared_name=None, declared_industry=None, full_check=True):
    """Perform comprehensive website validation."""
    payload = {
        "url": url,
        "include_sanction_check": True,
        "include_social_links": True,
        "include_tranco_rank": True
    }
    
    if declared_name:
        payload["declared_name"] = declared_name
    if declared_industry:
        payload["declared_industry"] = declared_industry
    
    if full_check:
        payload["include_adverse_media"] = True
        payload["include_policies"] = True
        payload["policy_types"] = ["terms", "return", "refund", "aml"]
    
    response = requests.post(
        f"{BASE_URL}/kyc/web-validation",
        headers=headers,
        json=payload,
        timeout=90
    )
    response.raise_for_status()
    return response.json()

def analyze_result(result):
    """Analyze validation results."""
    data = result.get("result", {})
    score = data.get("reliability_score", 0)
    validation = data.get("validation_results", {})
    
    print(f"Reliability Score: {score:.1f}/100")
    
    # SSL Check
    ssl = validation.get("ssl_checker", {})
    print(f"\n🔒 SSL: {'Valid' if ssl.get('is_valid') else 'Invalid'}")
    if ssl.get('days_until_expiry') is not None:  # 0 days left is still worth reporting
        print(f"   Expires in {ssl['days_until_expiry']} days")
    
    # Sanction Check
    sanction = validation.get("sanction_check", {})
    if sanction.get("has_matches"):
        print("\n⚠️ SANCTION MATCHES FOUND!")
    else:
        print("\n✅ No sanction matches")
    
    # Adverse Media
    adverse = validation.get("adverse_media", {})
    decision = adverse.get("decision", "N/A")
    print(f"\n📰 Adverse Media: {decision}")
    if adverse.get("risk_score") is not None:  # a risk score of 0 is a valid result
        print(f"   Risk Score: {adverse['risk_score']}/100")
    
    # Social Links
    social = validation.get("social_media_links", {})
    valid = social.get("valid_count", 0)
    invalid = social.get("invalid_count", 0)
    print(f"\n🔗 Social Links: {valid} valid, {invalid} broken")
    
    # Tranco Rank
    tranco = validation.get("tranco_rank", {})
    rank = tranco.get("rank")
    band = tranco.get("band", "not_ranked")
    print(f"\n📊 Tranco Rank: #{rank} ({band})" if rank else "\n📊 Tranco: Not ranked")
    
    # Policies
    policies = validation.get("policies", {})
    if policies:
        print("\n📋 Policies:")
        for policy_type in ["terms", "return", "refund", "aml"]:
            policy = policies.get(policy_type, {})
            status = policy.get("status", "not_found")
            print(f"   - {policy_type.title()}: {status}")
    
    return score >= 70

# Usage
result = validate_website(
    url="https://example-shop.com",
    declared_name="Example Shop Inc",
    declared_industry="E-commerce",
    full_check=True
)

is_reliable = analyze_result(result)
print(f"\n{'✅ Website appears reliable' if is_reliable else '⚠️ Website requires review'}")

JavaScript - With Risk Assessment

const BASE_URL = 'https://stg.kyc.legaltalent.ai';
const TOKEN = 'YOUR_TOKEN';

const headers = {
  'Authorization': `Bearer ${TOKEN}`,
  'Content-Type': 'application/json'
};

async function validateWebsite(url, options = {}) {
  const payload = {
    url,
    include_sanction_check: true,
    include_social_links: true,
    include_tranco_rank: true,
    ...options
  };

  const response = await fetch(`${BASE_URL}/kyc/web-validation`, {
    method: 'POST',
    headers,
    body: JSON.stringify(payload)
  });

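  if (!response.ok) {
    // Error bodies are expected to be JSON; fall back to an empty object if parsing fails
    const error = await response.json().catch(() => ({}));
    throw new Error(error.error || `Validation failed (HTTP ${response.status})`);
  }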

  return response.json();
}

function assessRisk(result) {
  const data = result.result || {};
  const score = data.reliability_score || 0;
  const validation = data.validation_results || {};
  
  const risks = [];
  
  // Check SSL
  const ssl = validation.ssl_checker || {};
  if (!ssl.is_valid) {
    risks.push({ level: 'high', issue: 'Invalid SSL certificate' });
  } else if (ssl.days_until_expiry < 30) {
    risks.push({ level: 'medium', issue: 'SSL expires soon' });
  }
  
  // Check sanctions
  const sanction = validation.sanction_check || {};
  if (sanction.has_matches) {
    risks.push({ level: 'critical', issue: 'Sanction matches found' });
  }
  
  // Check adverse media
  const adverse = validation.adverse_media || {};
  if (adverse.decision === 'HIGH_RISK') {
    risks.push({ level: 'high', issue: 'High risk adverse media' });
  } else if (adverse.decision === 'MEDIUM_RISK') {
    risks.push({ level: 'medium', issue: 'Medium risk adverse media' });
  }
  
  // Check domain age (via WHOIS)
  const whois = validation.whois_info || {};
  if (whois.domain_age_days < 180) {
    risks.push({ level: 'medium', issue: 'Domain less than 6 months old' });
  }
  
  // Check Tranco rank
  const tranco = validation.tranco_rank || {};
  if (!tranco.in_top_million) {
    risks.push({ level: 'low', issue: 'Low website popularity' });
  }
  
  // Check social links
  const social = validation.social_media_links || {};
  if (social.invalid_count > social.valid_count) {
    risks.push({ level: 'medium', issue: 'Broken social media links' });
  }
  
  // Determine overall risk level
  const hasCritical = risks.some(r => r.level === 'critical');
  const hasHigh = risks.some(r => r.level === 'high');
  const hasMedium = risks.some(r => r.level === 'medium');
  
  let overallRisk = 'low';
  if (hasCritical) overallRisk = 'critical';
  else if (hasHigh) overallRisk = 'high';
  else if (hasMedium) overallRisk = 'medium';
  
  return {
    score,
    overallRisk,
    risks,
    recommendation: score >= 70 && !hasCritical && !hasHigh ? 'approve' : 'review'
  };
}

// Usage
async function main() {
  const result = await validateWebsite('https://example-shop.com', {
    declared_name: 'Example Shop Inc',
    declared_industry: 'E-commerce',
    include_adverse_media: true,
    include_policies: true
  });
  
  const assessment = assessRisk(result);
  
  console.log(`Score: ${assessment.score.toFixed(1)}/100`);
  console.log(`Overall Risk: ${assessment.overallRisk.toUpperCase()}`);
  console.log(`Recommendation: ${assessment.recommendation.toUpperCase()}`);
  
  if (assessment.risks.length > 0) {
    console.log('\nRisk Factors:');
    assessment.risks.forEach(risk => {
      console.log(`  [${risk.level}] ${risk.issue}`);
    });
  }
}

main().catch(console.error);

Best Practices

Performance

  • Basic validation (without policies/adverse media): ~3-5 seconds
  • Full validation (all checks enabled): ~8-15 seconds
  • Use enable_js_render: true only for JavaScript-heavy sites (adds ~2-3 seconds)

Reliability Score Thresholds

| Score Range | Recommendation |
| --- | --- |
| 80-100 | Low risk, safe to proceed |
| 60-79 | Medium risk, review recommended |
| 40-59 | Higher risk, manual verification needed |
| 0-39 | High risk, exercise caution |
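
Because reliability_score is a decimal number (78.5 in the sample response), a client needs a rule for values that fall between the listed ranges, such as 79.5. The sketch below treats each range's lower bound as the cutoff, which is an assumption rather than documented behavior; `recommendation_for` is a hypothetical helper name.

```python
def recommendation_for(score: float) -> str:
    """Map a 0-100 reliability score to the recommendation text in the table above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Assumption: each band starts at its lower bound, so 79.5 is still "medium".
    if score >= 80:
        return "Low risk, safe to proceed"
    if score >= 60:
        return "Medium risk, review recommended"
    if score >= 40:
        return "Higher risk, manual verification needed"
    return "High risk, exercise caution"


label = recommendation_for(78.5)
```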

Configuration Tips

  1. For E-commerce/Merchant Onboarding:
    • Enable all checks, including policies
    • Provide declared_name and declared_industry so they can be matched against the website content
    • Set include_adverse_media: true
  2. For Quick Screening:
    • Use basic validation without policies
    • Focus on SSL, sanction check, and Tranco rank
  3. For High-Risk Industries:
    • Include "aml" in policy_types to enable the AML policy check
    • Use tenant-configured risk matrix weights
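
The three scenarios above can be captured as reusable request presets. This is a sketch under stated assumptions: the preset names and `payload_for` are hypothetical, and the flags are set explicitly to false for quick screening because the default values of the include_* options are not stated on this page.

```python
import copy
from typing import Any, Dict

ALL_CHECKS: Dict[str, Any] = {
    "include_adverse_media": True,
    "include_sanction_check": True,
    "include_social_links": True,
    "include_tranco_rank": True,
    "include_policies": True,
}

PRESETS: Dict[str, Dict[str, Any]] = {
    "merchant_onboarding": {**ALL_CHECKS, "policy_types": ["terms", "return", "refund"]},
    # Assumption: flags are disabled explicitly since their defaults are undocumented here.
    "quick_screening": {
        "include_sanction_check": True,
        "include_tranco_rank": True,
        "include_adverse_media": False,
        "include_policies": False,
    },
    "high_risk_industry": {**ALL_CHECKS, "policy_types": ["terms", "return", "refund", "aml"]},
}


def payload_for(preset: str, url: str, **overrides: Any) -> Dict[str, Any]:
    """Build a request body from a preset; keyword overrides win over preset values."""
    if preset not in PRESETS:
        raise KeyError(f"Unknown preset: {preset}")
    payload: Dict[str, Any] = {"url": url}
    payload.update(copy.deepcopy(PRESETS[preset]))  # copy so callers cannot mutate presets
    payload.update(overrides)
    return payload


payload = payload_for(
    "merchant_onboarding",
    "https://example-shop.com",
    declared_name="Example Shop Inc",
    declared_industry="E-commerce",
)
```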

Tenant Configuration

Settings can be pre-configured per tenant:
  • Default validation options
  • Risk matrix weights
  • Allowed/blocked industries
  • Policy check requirements
Contact support to configure tenant-specific settings.