Web Validation
Comprehensive website validation and risk assessment. Analyzes domains for legitimacy, security, content quality, sanctions, adverse media, and policy compliance.

Endpoint

POST /kyc/web-validation

Authentication

Requires the web-validation:create permission. Send your Bearer token in the Authorization header of every request.
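
As a minimal sketch, the headers can be built once and reused for every call. The helper name build_auth_headers and the environment variable KYC_API_TOKEN are hypothetical choices for this example; this page does not prescribe where the token is stored.

```python
import os


def build_auth_headers(token=None):
    """Return headers carrying the Bearer token for the web validation endpoint."""
    # KYC_API_TOKEN is a hypothetical variable name chosen for this sketch.
    token = token or os.environ.get("KYC_API_TOKEN")
    if not token:
        raise ValueError("No API token provided")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```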

Description

The web validation endpoint performs comprehensive analysis of websites including:
  1. URL & Domain Analysis: Validates URL structure, redirects, and domain registration (WHOIS)
  2. SSL/TLS Verification: Checks certificate validity, expiration, and security
  3. Industry Classification: AI-powered industry detection and risk classification
  4. Name/Industry Matching: Compares declared business info against website content
  5. Sanction List Check: Screens domain against global sanctions lists
  6. Adverse Media Analysis: Searches for negative news and fraud reports
  7. Social Media Validation: Verifies social media links are functional
  8. Tranco Rank Check: Evaluates website popularity/reputation
  9. Policy Compliance: Analyzes Terms, Return, Refund, and AML policies

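Request Body Parameters

The request examples below use the following fields:

| Field | Type | Description |
| --- | --- | --- |
| url | string | Website URL to validate. Required; must start with http:// or https:// |
| declared_name | string | Declared business name, compared against the website content |
| declared_industry | string | Declared industry, compared against the detected industry |
| include_adverse_media | boolean | Whether to run the adverse media analysis |
| include_sanction_check | boolean | Whether to screen the domain against sanctions lists |
| include_social_links | boolean | Whether to validate social media links |
| include_tranco_rank | boolean | Whether to look up the Tranco rank |
| include_policies | boolean | Whether to analyze site policies |
| policy_types | array | Policy types to analyze; the examples use terms, return, refund, and aml |
| enable_js_render | boolean | Whether to render JavaScript before analysis; intended for JavaScript-heavy sites |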

Request Examples

Basic Validation

curl -X POST https://stg.kyc.legaltalent.ai/kyc/web-validation \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com"
  }'

Full Validation with Business Details

curl -X POST https://stg.kyc.legaltalent.ai/kyc/web-validation \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example-shop.com",
    "declared_name": "Example Shop Inc",
    "declared_industry": "E-commerce",
    "include_adverse_media": true,
    "include_sanction_check": true,
    "include_social_links": true,
    "include_tranco_rank": true,
    "include_policies": true,
    "policy_types": ["terms", "return", "refund"]
  }'

With JavaScript Rendering

curl -X POST https://stg.kyc.legaltalent.ai/kyc/web-validation \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://spa-website.com",
    "enable_js_render": true
  }'
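
The request bodies above can also be assembled programmatically. The sketch below is a hedged Python helper (build_payload is a hypothetical name, not part of the API) that uses only the field names shown in the examples. Its URL check mirrors the documented 400 error message; it is a client-side convenience, and the server's actual validation may be stricter.

```python
def build_payload(url, declared_name=None, declared_industry=None,
                  policy_types=None, enable_js_render=False, **include_flags):
    """Assemble a request body using the field names from the examples above."""
    # Client-side check mirroring the documented 400 error for bad URLs.
    if not url or not url.startswith(("http://", "https://")):
        raise ValueError("URL must start with http:// or https://")

    payload = {"url": url}
    if declared_name:
        payload["declared_name"] = declared_name
    if declared_industry:
        payload["declared_industry"] = declared_industry
    if policy_types:
        payload["policy_types"] = list(policy_types)
    if enable_js_render:
        payload["enable_js_render"] = True

    # Pass through only the include_* toggles that appear in the examples.
    allowed = {"include_adverse_media", "include_sanction_check",
               "include_social_links", "include_tranco_rank", "include_policies"}
    for name, value in include_flags.items():
        if name not in allowed:
            raise ValueError(f"Unknown option: {name}")
        payload[name] = bool(value)
    return payload
```

For example, build_payload("https://example.com", include_sanction_check=True) returns a body with only the url and include_sanction_check keys, so unset options are left to whatever the service applies when a field is omitted.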

Response Format

Success Response

{
  "result": {
    "reliability_score": 78.5,
    "validation_results": {
      "url_checker": true,
      "code_response": 200,
      "last_page_url": "https://example.com/",
      "social_media": false,
      "social_media_allowed": true,
      
      "ssl_checker": {
        "has_ssl": true,
        "is_valid": true,
        "issuer": "Let's Encrypt",
        "expiry_date": "2025-03-15",
        "days_until_expiry": 90
      },
      
      "whois_info": {
        "domain": "example.com",
        "registrar": "GoDaddy",
        "creation_date": "2010-01-15",
        "expiration_date": "2026-01-15",
        "domain_age_days": 5478,
        "registrant_country": "US"
      },
      
      "classifier": {
        "industry_name": "E-commerce",
        "industry_id": 12,
        "allowed_industry": true,
        "criterion": "Online retail store selling electronics"
      },
      
      "kyc_matches": {
        "declared_name_match": true,
        "declared_industry_match": true,
        "name_confidence": 0.92,
        "industry_confidence": 0.88
      },
      
      "page_status_classifier": {
        "page_status": "AVAILABLE",
        "has_content": true
      },
      
      "sanction_check": {
        "has_matches": false,
        "matches": [],
        "checked_lists": ["OFAC", "EU", "UN"]
      },
      
      "adverse_media": {
        "has_adverse_media": false,
        "risk_score": 15,
        "decision": "CLEAR",
        "sources": []
      },
      
      "social_media_links": {
        "social_links_found": true,
        "valid_count": 3,
        "invalid_count": 0,
        "links": [
          {"platform": "facebook", "url": "https://facebook.com/example", "valid": true},
          {"platform": "twitter", "url": "https://twitter.com/example", "valid": true},
          {"platform": "instagram", "url": "https://instagram.com/example", "valid": true}
        ]
      },
      
      "tranco_rank": {
        "rank": 45000,
        "in_top_million": true,
        "band": "top_100k",
        "score": 0.8
      },
      
      "policies": {
        "terms": {
          "status": "policy_analyzed",
          "url": "https://example.com/terms",
          "generic_format": false,
          "includes_is_of_age": true,
          "has_applicable_law": true,
          "dos_clause": true,
          "termination_clause": true,
          "privacy_policy": true
        },
        "return": {
          "status": "policy_analyzed",
          "url": "https://example.com/return-policy",
          "generic_format": false
        },
        "refund": {
          "status": "policy_analyzed",
          "url": "https://example.com/refund-policy",
          "generic_format": false
        },
        "execution_time_ms": 3200
      },
      
      "total_cost": 0.0023,
      "concurrent_time_taken_ms": 4500
    }
  }
}

Response Fields

Top Level

| Field | Type | Description |
| --- | --- | --- |
| reliability_score | number | Overall reliability score (0-100) |
| validation_results | object | Detailed results from all validators |

URL & Basic Checks

| Field | Type | Description |
| --- | --- | --- |
| url_checker | boolean | Whether URL is valid and accessible |
| code_response | integer | HTTP response code |
| last_page_url | string | Final URL after redirects |
| social_media | boolean | Whether URL is a social media site |
| social_media_allowed | boolean | Whether social media URLs are allowed |

SSL Certificate

| Field | Type | Description |
| --- | --- | --- |
| has_ssl | boolean | Whether site has SSL |
| is_valid | boolean | Whether certificate is valid |
| issuer | string | Certificate issuer |
| expiry_date | string | Certificate expiration date |
| days_until_expiry | integer | Days until certificate expires |

WHOIS Information

| Field | Type | Description |
| --- | --- | --- |
| domain | string | Domain name |
| registrar | string | Domain registrar |
| creation_date | string | Domain creation date |
| expiration_date | string | Domain expiration date |
| domain_age_days | integer | Age of domain in days |
| registrant_country | string | Country of registrant |

Industry Classification

| Field | Type | Description |
| --- | --- | --- |
| industry_name | string | Detected industry name |
| industry_id | integer | Industry identifier |
| allowed_industry | boolean | Whether industry is allowed |
| criterion | string | Classification reasoning |

KYC Matches

| Field | Type | Description |
| --- | --- | --- |
| declared_name_match | boolean | Whether declared name matches |
| declared_industry_match | boolean | Whether declared industry matches |
| name_confidence | number | Name match confidence (0-1) |
| industry_confidence | number | Industry match confidence (0-1) |

Sanction Check

| Field | Type | Description |
| --- | --- | --- |
| has_matches | boolean | Whether sanctions matches were found |
| matches | array | List of sanction matches |
| checked_lists | array | Lists that were checked |

Adverse Media

| Field | Type | Description |
| --- | --- | --- |
| has_adverse_media | boolean | Whether adverse media was found |
| risk_score | integer | Risk score (0-100) |
| decision | string | One of CLEAR, LOW_RISK, MEDIUM_RISK, HIGH_RISK |
| sources | array | Adverse media sources found |

Social Media Links

| Field | Type | Description |
| --- | --- | --- |
| social_links_found | boolean | Whether social links were found |
| valid_count | integer | Number of valid links |
| invalid_count | integer | Number of broken links |
| links | array | Detailed link information |

Tranco Rank

| Field | Type | Description |
| --- | --- | --- |
| rank | integer | Tranco ranking position |
| in_top_million | boolean | Whether in top 1M sites |
| band | string | Ranking band (top_1k, top_10k, etc.) |
| score | number | Popularity score (0-1) |

Policies

| Field | Type | Description |
| --- | --- | --- |
| terms | object | Terms & Conditions analysis |
| return | object | Return policy analysis |
| refund | object | Refund policy analysis |
| aml | object | AML policy analysis |
| execution_time_ms | integer | Policy analysis time in milliseconds |

Reliability Score Calculation

The reliability score is calculated as a weighted average of individual validation scores:

| Criterion | Description | Score Range |
| --- | --- | --- |
| SSL | Certificate validity | 0-1 |
| Industry | Allowed industry classification | 0 or 1 |
| Name Match | Declared name verification | 0 or 1 |
| Industry Match | Declared industry verification | 0 or 1 |
| Sanction Check | No sanctions matches | 0 or 1 |
| Adverse Media | Inverse of risk score | 0-1 |
| Social Links | Ratio of valid links | 0-1 |
| Tranco Rank | Popularity score by band | 0.2-1 |
| Terms Policy | Policy quality score | 0-1 |
| Return Policy | Policy presence score | 0-1 |
| Refund Policy | Policy presence score | 0-1 |
| AML Policy | Policy quality score | 0-1 |
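
To make the weighted average concrete, here is a rough Python sketch. Several parts are assumptions rather than documented behavior: the weights are not published on this page (equal weights are used here, and real weights may be tenant-configured), the result is assumed to be scaled from 0-1 to 0-100, "inverse of risk score" is read as 1 - risk_score / 100, and checks that were not run are assumed to be skipped. It is not expected to reproduce the 78.5 in the sample response.

```python
def weighted_reliability(scores, weights=None):
    """Weighted average of per-criterion scores (each 0-1), scaled to 0-100.

    Criteria missing from `scores` (checks that were not run) are skipped.
    """
    if weights is None:
        # Assumption: equal weights. Real weights may be tenant-configured.
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights.get(name, 0.0) for name in scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(value * weights.get(name, 0.0) for name, value in scores.items())
    return round(100 * weighted_sum / total_weight, 1)


def adverse_media_score(risk_score):
    """Assumed reading of 'inverse of risk score': 1 - risk_score / 100."""
    return 1 - risk_score / 100


def social_links_score(valid_count, invalid_count):
    """Ratio of valid links; 0.0 when no links were found (an assumption)."""
    total = valid_count + invalid_count
    return valid_count / total if total else 0.0


# Illustrative inputs taken from the sample response; the subset is arbitrary.
scores = {
    "ssl": 1.0,
    "sanction_check": 1.0,
    "adverse_media": adverse_media_score(15),
    "social_links": social_links_score(3, 0),
    "tranco_rank": 0.8,
}
print(weighted_reliability(scores))  # 93.0 under equal weights
```

Passing an explicit weights dictionary (for example, a larger weight for sanction_check) shows how a single failed criterion can pull the score down more than the others.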

Tranco Rank Bands

| Band | Rank Range | Score |
| --- | --- | --- |
| top_1k | 1 - 1,000 | 1.0 |
| top_10k | 1,001 - 10,000 | 0.9 |
| top_100k | 10,001 - 100,000 | 0.8 |
| top_500k | 100,001 - 500,000 | 0.6 |
| top_1m | 500,001 - 1,000,000 | 0.4 |
| not_ranked | > 1,000,000 | 0.2 |
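
The band table translates directly into a lookup. The following sketch is illustrative only (tranco_band is a hypothetical helper, not an API function); it assumes a missing rank is treated the same as a rank above 1,000,000.

```python
# (upper rank bound, band name, score), copied from the table above.
TRANCO_BANDS = [
    (1_000, "top_1k", 1.0),
    (10_000, "top_10k", 0.9),
    (100_000, "top_100k", 0.8),
    (500_000, "top_500k", 0.6),
    (1_000_000, "top_1m", 0.4),
]


def tranco_band(rank):
    """Return (band, score) for a Tranco rank."""
    # Assumption: a missing or non-positive rank counts as not ranked.
    if rank is None or rank < 1:
        return "not_ranked", 0.2
    for upper_bound, band, score in TRANCO_BANDS:
        if rank <= upper_bound:
            return band, score
    return "not_ranked", 0.2
```

With the rank from the sample response, tranco_band(45000) returns ("top_100k", 0.8), which matches the band and score fields shown there.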

Error Responses

400 Bad Request - Invalid URL

{
  "error": "Validation error: URL must start with http:// or https://"
}

400 Bad Request - Missing URL

{
  "error": "Validation error: \"url\" parameter is required"
}

403 Forbidden - Permission Denied

{
  "error": "Permission denied: web-validation:create required"
}

500 Internal Server Error

{
  "error": "Internal server error"
}

Status Codes

| Code | Description |
| --- | --- |
| 200 | Success - Validation completed |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Missing or invalid token |
| 403 | Forbidden - Insufficient permissions |
| 500 | Internal Server Error |
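
One way a client might act on these codes is sketched below. WebValidationError and check_response are hypothetical names introduced for this example, and the rule that only 5xx responses are worth retrying is an assumption, not documented behavior.

```python
class WebValidationError(Exception):
    """Hypothetical exception type for this sketch; not part of the API."""

    def __init__(self, status_code, message, retryable=False):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message
        self.retryable = retryable


def check_response(status_code, body):
    """Return the parsed body on 200, otherwise raise WebValidationError."""
    if status_code == 200:
        return body
    # Error bodies in the examples above carry a single "error" string.
    message = body.get("error", "Unknown error") if isinstance(body, dict) else "Unknown error"
    # Assumption: only server-side failures (5xx) are worth retrying.
    raise WebValidationError(status_code, message, retryable=status_code >= 500)
```

A caller would pass the HTTP status and the decoded JSON body, then retry or surface the error depending on the retryable flag.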

Usage Examples

Python - Full Validation

import requests

BASE_URL = "https://stg.kyc.legaltalent.ai"
TOKEN = "YOUR_TOKEN"

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json"
}

def validate_website(url, declared_name=None, declared_industry=None, full_check=True):
    """Perform comprehensive website validation."""
    payload = {
        "url": url,
        "include_sanction_check": True,
        "include_social_links": True,
        "include_tranco_rank": True
    }
    
    if declared_name:
        payload["declared_name"] = declared_name
    if declared_industry:
        payload["declared_industry"] = declared_industry
    
    if full_check:
        payload["include_adverse_media"] = True
        payload["include_policies"] = True
        payload["policy_types"] = ["terms", "return", "refund", "aml"]
    
    response = requests.post(
        f"{BASE_URL}/kyc/web-validation",
        headers=headers,
        json=payload,
        timeout=90
    )
    response.raise_for_status()
    return response.json()

def analyze_result(result):
    """Analyze validation results."""
    data = result.get("result", {})
    score = data.get("reliability_score", 0)
    validation = data.get("validation_results", {})
    
    print(f"Reliability Score: {score:.1f}/100")
    
    # SSL Check
    ssl = validation.get("ssl_checker", {})
    print(f"\n🔒 SSL: {'Valid' if ssl.get('is_valid') else 'Invalid'}")
    # Compare against None so that a certificate expiring today (0 days) is still reported
    if ssl.get('days_until_expiry') is not None:
        print(f"   Expires in {ssl['days_until_expiry']} days")
    
    # Sanction Check
    sanction = validation.get("sanction_check", {})
    if sanction.get("has_matches"):
        print(f"\n⚠️ SANCTION MATCHES FOUND!")
    else:
        print(f"\n✅ No sanction matches")
    
    # Adverse Media
    adverse = validation.get("adverse_media", {})
    decision = adverse.get("decision", "N/A")
    print(f"\n📰 Adverse Media: {decision}")
    # Compare against None so that a risk score of 0 is still printed
    if adverse.get("risk_score") is not None:
        print(f"   Risk Score: {adverse['risk_score']}/100")
    
    # Social Links
    social = validation.get("social_media_links", {})
    valid = social.get("valid_count", 0)
    invalid = social.get("invalid_count", 0)
    print(f"\n🔗 Social Links: {valid} valid, {invalid} broken")
    
    # Tranco Rank
    tranco = validation.get("tranco_rank", {})
    rank = tranco.get("rank")
    band = tranco.get("band", "not_ranked")
    print(f"\n📊 Tranco Rank: #{rank} ({band})" if rank else "\n📊 Tranco: Not ranked")
    
    # Policies
    policies = validation.get("policies", {})
    if policies:
        print("\n📋 Policies:")
        for policy_type in ["terms", "return", "refund", "aml"]:
            policy = policies.get(policy_type, {})
            status = policy.get("status", "not_found")
            print(f"   - {policy_type.title()}: {status}")
    
    return score >= 70

# Usage
result = validate_website(
    url="https://example-shop.com",
    declared_name="Example Shop Inc",
    declared_industry="E-commerce",
    full_check=True
)

is_reliable = analyze_result(result)
print(f"\n{'✅ Website appears reliable' if is_reliable else '⚠️ Website requires review'}")

JavaScript - With Risk Assessment

const BASE_URL = 'https://stg.kyc.legaltalent.ai';
const TOKEN = 'YOUR_TOKEN';

const headers = {
  'Authorization': `Bearer ${TOKEN}`,
  'Content-Type': 'application/json'
};

async function validateWebsite(url, options = {}) {
  const payload = {
    url,
    include_sanction_check: true,
    include_social_links: true,
    include_tranco_rank: true,
    ...options
  };

  const response = await fetch(`${BASE_URL}/kyc/web-validation`, {
    method: 'POST',
    headers,
    body: JSON.stringify(payload)
  });

  if (!response.ok) {
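    // Error bodies are expected to be JSON; fall back to an empty object if parsing fails
    const error = await response.json().catch(() => ({}));
    throw new Error(error.error || `Validation failed (HTTP ${response.status})`);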
  }

  return response.json();
}

function assessRisk(result) {
  const data = result.result || {};
  const score = data.reliability_score || 0;
  const validation = data.validation_results || {};
  
  const risks = [];
  
  // Check SSL
  const ssl = validation.ssl_checker || {};
  if (!ssl.is_valid) {
    risks.push({ level: 'high', issue: 'Invalid SSL certificate' });
  } else if (ssl.days_until_expiry < 30) {
    risks.push({ level: 'medium', issue: 'SSL expires soon' });
  }
  
  // Check sanctions
  const sanction = validation.sanction_check || {};
  if (sanction.has_matches) {
    risks.push({ level: 'critical', issue: 'Sanction matches found' });
  }
  
  // Check adverse media
  const adverse = validation.adverse_media || {};
  if (adverse.decision === 'HIGH_RISK') {
    risks.push({ level: 'high', issue: 'High risk adverse media' });
  } else if (adverse.decision === 'MEDIUM_RISK') {
    risks.push({ level: 'medium', issue: 'Medium risk adverse media' });
  }
  
  // Check domain age (via WHOIS)
  const whois = validation.whois_info || {};
  if (whois.domain_age_days < 180) {
    risks.push({ level: 'medium', issue: 'Domain less than 6 months old' });
  }
  
  // Check Tranco rank
  const tranco = validation.tranco_rank || {};
  if (!tranco.in_top_million) {
    risks.push({ level: 'low', issue: 'Low website popularity' });
  }
  
  // Check social links
  const social = validation.social_media_links || {};
  if (social.invalid_count > social.valid_count) {
    risks.push({ level: 'medium', issue: 'Broken social media links' });
  }
  
  // Determine overall risk level
  const hasCritical = risks.some(r => r.level === 'critical');
  const hasHigh = risks.some(r => r.level === 'high');
  const hasMedium = risks.some(r => r.level === 'medium');
  
  let overallRisk = 'low';
  if (hasCritical) overallRisk = 'critical';
  else if (hasHigh) overallRisk = 'high';
  else if (hasMedium) overallRisk = 'medium';
  
  return {
    score,
    overallRisk,
    risks,
    recommendation: score >= 70 && !hasCritical && !hasHigh ? 'approve' : 'review'
  };
}

// Usage
async function main() {
  const result = await validateWebsite('https://example-shop.com', {
    declared_name: 'Example Shop Inc',
    declared_industry: 'E-commerce',
    include_adverse_media: true,
    include_policies: true
  });
  
  const assessment = assessRisk(result);
  
  console.log(`Score: ${assessment.score.toFixed(1)}/100`);
  console.log(`Overall Risk: ${assessment.overallRisk.toUpperCase()}`);
  console.log(`Recommendation: ${assessment.recommendation.toUpperCase()}`);
  
  if (assessment.risks.length > 0) {
    console.log('\nRisk Factors:');
    assessment.risks.forEach(risk => {
      console.log(`  [${risk.level}] ${risk.issue}`);
    });
  }
}

// Run the example and surface any request or parsing errors
main().catch(console.error);

Best Practices

Performance

  • Basic validation (without policies/adverse media): ~3-5 seconds
  • Full validation (all checks enabled): ~8-15 seconds
  • Use enable_js_render: true only for JavaScript-heavy sites (adds ~2-3 seconds)

Reliability Score Thresholds

| Score Range | Recommendation |
| --- | --- |
| 80-100 | Low risk, safe to proceed |
| 60-79 | Medium risk, review recommended |
| 40-59 | Higher risk, manual verification needed |
| 0-39 | High risk, exercise caution |
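
These thresholds can be applied in code with a simple lookup. The helper below is a sketch (recommendation_for is a hypothetical name); because the table lists whole-number ranges, it assumes a fractional score such as 79.5 falls into the lower band.

```python
def recommendation_for(score):
    """Map a reliability score (0-100) to the recommendation from the table above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "Low risk, safe to proceed"
    if score >= 60:
        return "Medium risk, review recommended"
    if score >= 40:
        return "Higher risk, manual verification needed"
    return "High risk, exercise caution"
```

For the sample response's score of 78.5, this returns "Medium risk, review recommended".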

Configuration Tips

  1. For E-commerce/Merchant Onboarding:
    • Enable all checks including policies
    • Provide declared_name and declared_industry so they are matched against the website content
    • Set include_adverse_media: true
  2. For Quick Screening:
    • Use basic validation without policies
    • Focus on SSL, sanction check, and Tranco rank
  3. For High-Risk Industries:
    • Enable the AML policy check (include "aml" in policy_types)
    • Use tenant-configured risk matrix weights

Tenant Configuration

Settings can be pre-configured per tenant:
  • Default validation options
  • Risk matrix weights
  • Allowed/blocked industries
  • Policy check requirements
Contact support to configure tenant-specific settings.