Email Verification And Email Cleaning Services Discussion

Discussion in 'Domains, DNS, Email & SSL Certificates' started by eva2000, May 8, 2024.

  1. eva2000

    eva2000 Administrator Staff Member

    51,987
    11,976
    113
    May 24, 2014
    Brisbane, Australia
    Ratings:
    +18,473
    Local Time:
    2:31 AM
    Nginx 1.25.x
    MariaDB 10.x
    A XenForo forum discussion brought up the topic of using email verification (email validation) services to clean up a 530,000 member email list which has a very high 50% email bounce rate! Cleaning bad and bouncing email addresses out of such lists is important so that forum email sending doesn't push up your bounce and/or complaint rates and damage your email sending domain's reputation :)

    I've consulted with paying clients for this and do this myself on this very forum's member email list :D But it's been partly done manually.

    So I thought I'd code a self-hosted script, validate_emails.py, that can do the email syntax, DNS and SMTP verification checks locally on your own servers. I also added API support for 5 paid commercial email verification providers.
    Links to services may be affiliate links ;) The validate_emails.py email validation script was written by me for my paid consulting clients' usage. Info at GitHub - centminmod/validate-emails is public documentation for the script only.
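
    For readers wondering what the local syntax and SMTP stages mentioned above involve, here is a minimal sketch. It is not the validate_emails.py code: the regex is a deliberately simplified syntax check, the function names are hypothetical, and the MX host is assumed to have been resolved separately (for example with a DNS library), since MX lookups are not in the Python standard library.

```python
import re
import smtplib

# Simplified pattern: local part, "@", and a domain with at least one dot.
# Real-world address syntax is broader than this.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")


def syntax_ok(email):
    """Return True if the address passes the simplified syntax check."""
    return EMAIL_RE.match(email) is not None


def smtp_rcpt_check(mx_host, from_addr, email, timeout=10):
    """Ask the recipient's mail server whether it would accept the address.

    Returns the SMTP reply code for RCPT TO (250 usually means accepted,
    550 usually means the mailbox does not exist). No message is sent.
    """
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
        smtp.helo()
        smtp.mail(from_addr)
        code, _message = smtp.rcpt(email)
    return code
```

    Keep in mind that catch-all domains accept every recipient and some servers greylist or block such probes, so a reply code on its own is only a hint.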

    I added XenForo support to my script too. It generates SQL queries that update the XenForo user_state field based on the email validation results, allowing you to clean up your XenForo user database by disabling email sending to those specific bad email addresses. I just cleaned up this forum's member email list as well, so if you're a member with a bad email address, you'll have been moved to the bounced email user state and won't receive forum mailings until you update to a valid email address :)
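
    To illustrate the idea of turning validation results into XenForo SQL, here is a rough sketch. The function name, the choice of statuses that count as bad, and the xf_user table / email_bounce state names are assumptions based on a stock XenForo schema; the SQL the actual script generates may differ.

```python
# Statuses treated as "bad" here are an assumption for illustration.
BOUNCE_STATUSES = {"invalid", "disposable"}


def build_xenforo_updates(results, table="xf_user"):
    """Build one UPDATE statement per bad email address.

    `results` is a list of dicts with "email" and "status" keys,
    like the JSON output shown in this thread.
    """
    statements = []
    for row in results:
        if row.get("status") in BOUNCE_STATUSES:
            # Escape backslashes and single quotes for a MySQL string literal.
            email = row["email"].replace("\\", "\\\\").replace("'", "''")
            statements.append(
                f"UPDATE {table} SET user_state = 'email_bounce' "
                f"WHERE email = '{email}';"
            )
    return statements


if __name__ == "__main__":
    sample = [
        {"email": "xyz@domain1.com", "status": "invalid"},
        {"email": "user@gmail.com", "status": "ok"},
    ]
    for statement in build_xenforo_updates(sample):
        print(statement)
```

    Review generated statements and back up the database before running anything like this against a live forum.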

    I thought I'd share my experiences here as folks might find them useful, and folks can chime in with their own experiences with email cleaning or email verification services.


    My personal experience is with WordPress and vBulletin/XenForo forum communities handling mass emails, so cleaning forum member email lists is an important task. Here's the cost comparison table and demo email verification API comparison results I recently did for the above 5 email verification providers.

    I also added API merge support to my script via the -apimerge argument, which merges EmailListVerify + MillionVerifier API results together for more accurate email verification results. So it queries 2 API services at once :cool:

    Example of merging EmailListVerify + MillionVerifier API results into one JSON output per email verification check for more accuracy :D

    Code (Text):
    time python validate_emails.py -f user@domain1.com -l emaillist.txt -tm all -api emaillistverify -apikey $elvkey -api millionverifier -apikey_mv $mvkey -apimerge
    [
        {
            "email": "user@mailsac.com",
            "elv_status": "disposable",
            "elv_status_code": null,
            "elv_free_email": "yes",
            "elv_disposable_email": "yes",
            "mv_status": "disposable",
            "mv_status_code": null,
            "mv_free_email": "yes",
            "mv_disposable_email": "yes",
            "mv_free_email_api": false,
            "mv_role_api": true
        },
        {
            "email": "xyz@centmil1.com",
            "elv_status": "invalid",
            "elv_status_code": null,
            "elv_free_email": "no",
            "elv_disposable_email": "no",
            "mv_status": "invalid",
            "mv_status_code": null,
            "mv_free_email": "no",
            "mv_disposable_email": "no",
            "mv_free_email_api": false,
            "mv_role_api": false
        },
        {
            "email": "user+to@domain1.com",
            "elv_status": "ok",
            "elv_status_code": null,
            "elv_free_email": "no",
            "elv_disposable_email": "no",
            "mv_status": "ok",
            "mv_status_code": null,
            "mv_free_email": "no",
            "mv_disposable_email": "no",
            "mv_free_email_api": false,
            "mv_role_api": false
        },
        {
            "email": "user@tempr.email",
            "elv_status": "disposable",
            "elv_status_code": null,
            "elv_free_email": "no",
            "elv_disposable_email": "yes",
            "mv_status": "disposable",
            "mv_status_code": null,
            "mv_free_email": "no",
            "mv_disposable_email": "yes",
            "mv_free_email_api": false,
            "mv_role_api": true
        },
        {
            "email": "info@domain2.com",
            "elv_status": "ok",
            "elv_status_code": null,
            "elv_free_email": "no",
            "elv_disposable_email": "no",
            "mv_status": "ok",
            "mv_status_code": null,
            "mv_free_email": "no",
            "mv_disposable_email": "no",
            "mv_free_email_api": false,
            "mv_role_api": true
        },
        {
            "email": "xyz@domain1.com",
            "elv_status": "invalid",
            "elv_status_code": null,
            "elv_free_email": "no",
            "elv_disposable_email": "no",
            "mv_status": "invalid",
            "mv_status_code": null,
            "mv_free_email": "no",
            "mv_disposable_email": "no",
            "mv_free_email_api": false,
            "mv_role_api": false
        },
        {
            "email": "abc@domain1.com",
            "elv_status": "invalid",
            "elv_status_code": null,
            "elv_free_email": "no",
            "elv_disposable_email": "no",
            "mv_status": "invalid",
            "mv_status_code": null,
            "mv_free_email": "no",
            "mv_disposable_email": "no",
            "mv_free_email_api": false,
            "mv_role_api": true
        },
        {
            "email": "123@domain1.com",
            "elv_status": "invalid",
            "elv_status_code": null,
            "elv_free_email": "no",
            "elv_disposable_email": "no",
            "mv_status": "invalid",
            "mv_status_code": null,
            "mv_free_email": "no",
            "mv_disposable_email": "no",
            "mv_free_email_api": false,
            "mv_role_api": false
        },
        {
            "email": "pop@domain1.com",
            "elv_status": "invalid",
            "elv_status_code": null,
            "elv_free_email": "no",
            "elv_disposable_email": "no",
            "mv_status": "invalid",
            "mv_status_code": null,
            "mv_free_email": "no",
            "mv_disposable_email": "no",
            "mv_free_email_api": false,
            "mv_role_api": false
        },
        {
            "email": "pip@domain1.com",
            "elv_status": "invalid",
            "elv_status_code": null,
            "elv_free_email": "no",
            "elv_disposable_email": "no",
            "mv_status": "invalid",
            "mv_status_code": null,
            "mv_free_email": "no",
            "mv_disposable_email": "no",
            "mv_free_email_api": false,
            "mv_role_api": false
        },
        {
            "email": "user@gmail.com",
            "elv_status": "ok",
            "elv_status_code": null,
            "elv_free_email": "yes",
            "elv_disposable_email": "no",
            "mv_status": "ok",
            "mv_status_code": null,
            "mv_free_email": "yes",
            "mv_disposable_email": "no",
            "mv_free_email_api": true,
            "mv_role_api": false
        },
        {
            "email": "op999@gmail.com",
            "elv_status": "invalid",
            "elv_status_code": null,
            "elv_free_email": "yes",
            "elv_disposable_email": "no",
            "mv_status": "invalid",
            "mv_status_code": null,
            "mv_free_email": "yes",
            "mv_disposable_email": "no",
            "mv_free_email_api": true,
            "mv_role_api": false
        },
        {
            "email": "user@yahoo.com",
            "elv_status": "ok",
            "elv_status_code": null,
            "elv_free_email": "yes",
            "elv_disposable_email": "no",
            "mv_status": "ok",
            "mv_status_code": null,
            "mv_free_email": "yes",
            "mv_disposable_email": "no",
            "mv_free_email_api": true,
            "mv_role_api": false
        },
        {
            "email": "user1@outlook.com",
            "elv_status": "ok",
            "elv_status_code": null,
            "elv_free_email": "yes",
            "elv_disposable_email": "no",
            "mv_status": "ok",
            "mv_status_code": null,
            "mv_free_email": "yes",
            "mv_disposable_email": "no",
            "mv_free_email_api": true,
            "mv_role_api": false
        },
        {
            "email": "user2@hotmail.com",
            "elv_status": "ok",
            "elv_status_code": null,
            "elv_free_email": "yes",
            "elv_disposable_email": "no",
            "mv_status": "ok",
            "mv_status_code": null,
            "mv_free_email": "yes",
            "mv_disposable_email": "no",
            "mv_free_email_api": true,
            "mv_role_api": false
        }
    ]
    
    real    0m15.946s
    user    0m1.017s
    sys     0m0.037s
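
    The merged output above combines each provider's fields under an elv_ or mv_ prefix. A minimal sketch of that kind of merge, assuming each provider's results arrive as a list of dicts with the unprefixed keys seen in the single-API output (the function names are hypothetical, and the extra mv_free_email_api / mv_role_api fields are left out):

```python
FIELDS = ("status", "status_code", "free_email", "disposable_email")


def merge_results(elv_rows, mv_rows):
    """Merge two providers' per-email results into one prefixed record each."""
    mv_by_email = {row["email"]: row for row in mv_rows}
    merged = []
    for elv in elv_rows:
        mv = mv_by_email.get(elv["email"], {})
        record = {"email": elv["email"]}
        for field in FIELDS:
            record[f"elv_{field}"] = elv.get(field)
        for field in FIELDS:
            # Missing second-provider results come through as None.
            record[f"mv_{field}"] = mv.get(field)
        merged.append(record)
    return merged


def statuses_agree(record):
    """True when both providers returned the same status for the address."""
    return record["elv_status"] == record["mv_status"]
```

    Records where statuses_agree() is False are the ones worth a second look before acting on them.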
     
  2. eva2000

    eva2000 Administrator Staff Member


    Cloudflare HTTP Forward Proxy Worker Cache



    Updated my validate_emails.py script with Cloudflare HTTP forward proxy cache with KV storage support for the EmailListVerify per email check API routines.

    The Cloudflare HTTP forward proxy Worker cache configuration takes the script's API request and forwards it to EmailListVerify's API endpoint. The Cloudflare Worker script then saves the API result, along with a date timestamp, into Cloudflare KV storage on their edge servers. This can potentially reduce your overall EmailListVerify per email verification costs if you need to run validate_emails.py a few times back to back, as cached results are returned without having to call the EmailListVerify API itself again.
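
    The caching flow described above can be sketched in a few lines. This is a Python stand-in for the logic only, not the actual Worker code: a plain dict plays the role of KV storage, call_api is a hypothetical callable standing in for the EmailListVerify request, and the emaillistverify:<email> key format is taken from the Worker console log further down.

```python
import time


def cached_verify(email, ttl, store, call_api, now=time.time):
    """Return (result, served_from_cache) for one email address.

    A cached entry is reused while it is younger than `ttl` seconds;
    otherwise the upstream API is called and the result is stored
    together with a millisecond timestamp.
    """
    key = f"emaillistverify:{email}"
    now_ms = int(now() * 1000)
    entry = store.get(key)
    if entry is not None and (now_ms - entry["timestamp"]) / 1000 < ttl:
        return entry["result"], True
    result = call_api(email)
    store[key] = {"result": result, "timestamp": now_ms}
    return result, False
```

    Only the second branch costs a verification credit, which is where the savings on repeated runs would come from.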

    Uncached usual run via the script, where the usual result response would be unknown

    Code (Text):
    time python validate_emails.py -f user@domain1.com -e hnyfmw5@canadlan-drugs.com -tm all -api emaillistverify -apikey $elvkey
    [
       {
           "email": "hnyfmw5@canadlan-drugs.com",
           "status": "invalid",
           "status_code": null,
           "free_email": "unknown",
           "disposable_email": "no"
       }
    ]
    
    real    0m2.600s
    user    0m0.279s
    sys     0m0.020s
    


    Via the Cloudflare HTTP forward proxy caching KV Worker with the -apicachettl 120 argument set, the script returns email address status = unknown and the time to return the result drops from 2.6s to 0.397s

    Code (Text):
    time python validate_emails.py -f user@domain1.com -e hnyfmw5@canadlan-drugs.com -tm all -api emaillistverify -apikey $elvkey -apicachettl 120
    [
       {
           "email": "hnyfmw5@canadlan-drugs.com",
           "status": "invalid",
           "status_code": null,
           "free_email": "unknown",
           "disposable_email": "no"
       }
    ]
    
    real    0m0.397s
    user    0m0.294s
    sys     0m0.025s
    


    Log inspection
    Code (Text):
    cat email_verification_log_2024-05-08_15-08-05.log | tail -3
    2024-05-08 15:08:06,816 - INFO - Checking cache for email: hnyfmw5@canadlan-drugs.com
    2024-05-08 15:08:07,047 - INFO - Cache check response status code: 200
    2024-05-08 15:08:07,047 - INFO - Cache result: unknown
    


    Cloudflare HTTP forward proxy caching KV Worker console log

    Code (Text):
    [DEBUG] Incoming request: https://cfcachedomain.com/?email=hnyfmw5@canadlan-drugs.com&cachettl=120
    [DEBUG] Email: hnyfmw5@canadlan-drugs.com
    [DEBUG] Cache Key: emaillistverify:hnyfmw5@canadlan-drugs.com
    [DEBUG] Cache TTL: 120
    [DEBUG] Cache Check: null
    [DEBUG] API URL: https://apps.emaillistverify.com/api/verifyEmail?secret=APIKEY&email=hnyfmw5@canadlan-drugs.com&timeout=15
    [DEBUG] Response from Cloudflare CDN cache: Hit
    [DEBUG] Skipping KV cache update as response is served from Cloudflare CDN cache
    [DEBUG] Returning final response with headers: {"cache-control":"max-age=120","content-type":"text/plain"}
    


    Query the KV storage cache entries count via -apicachecheck count

    Code (Text):
    time python validate_emails.py -f user@domain1.com -e hnyfmw5@canadlan-drugs.com -tm all -api emaillistverify -apikey $elvkey -apicachettl 120 -apicachecheck count
    API cache count: 1
    


    Query the KV storage cache entries listings via -apicachecheck list

    Code (Text):
    time python validate_emails.py -f user@domain1.com -e hnyfmw5@canadlan-drugs.com -tm all -api emaillistverify -apikey $elvkey -apicachettl 120 -apicachecheck list
    API cache list:
    {'email': 'hnyfmw5@canadlan-drugs.com', 'result': 'unknown', 'timestamp': 1715175271549, 'age': 16, 'ttl': 120}
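
    As a small worked example of reading a cache listing entry like the one above, the sketch below assumes timestamp is the store time in milliseconds since the epoch and that age and ttl are in seconds. The helper name is hypothetical.

```python
def cache_entry_status(entry, now_ms):
    """Derive age, remaining lifetime and expiry from one cache entry."""
    age = (now_ms - entry["timestamp"]) // 1000
    remaining = max(entry["ttl"] - age, 0)
    return {"age": age, "remaining": remaining, "expired": age >= entry["ttl"]}


entry = {"email": "hnyfmw5@canadlan-drugs.com", "result": "unknown",
         "timestamp": 1715175271549, "ttl": 120}

# 16 seconds after the entry was stored, matching the 'age': 16 shown above.
print(cache_entry_status(entry, now_ms=1715175271549 + 16_000))
```

    Under those assumptions the entry listed above had 104 seconds left before it would expire.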
    


    S3 Storage Support



    FYI, commercial email verification providers usually only store your uploaded files (file-based or bulk file API uploads) for a defined duration, e.g. 30 days, before they are deleted. And per email check API results are usually not stored at all. So if you need to store your per email check or bulk file API email verification results for longer, my validate_emails.py script now supports saving your results to S3 object storage providers - Cloudflare R2 or Amazon AWS S3 :D

    Example:

    Send validate_emails.py script results to Cloudflare R2 S3 object storage via the -store r2 argument. This uses the EmailListVerify per email check API (-api emaillistverify -apikey $elvkey) with results Cloudflare cached for 120 seconds (-apicache emaillistverify -apicachettl 120).

    Code (Text):
    time python validate_emails.py -f user@domain1.com -e hnyfmw@canadlan-drugs.com,hnyfmw2@canadlan-drugs.com,hnyfmw3@canadlan-drugs.com -api emaillistverify -apikey $elvkey -apicache emaillistverify -apicachettl 120 -tm all -store r2
    
    Output stored successfully in R2: emailapi-emaillistverify-cached/output_20240511051940.json
    [
        {
            "email": "hnyfmw@canadlan-drugs.com",
            "status": "unknown",
            "status_code": null,
            "free_email": "no",
            "disposable_email": "no"
        },
        {
            "email": "hnyfmw2@canadlan-drugs.com",
            "status": "unknown",
            "status_code": null,
            "free_email": "no",
            "disposable_email": "no"
        },
        {
            "email": "hnyfmw3@canadlan-drugs.com",
            "status": "unknown",
            "status_code": null,
            "free_email": "no",
            "disposable_email": "no"
        }
    ]
    
    real    0m1.663s
    user    0m0.391s
    sys     0m0.039s
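
    For anyone curious how saving results to S3-compatible storage can work in general, here is a hedged sketch. It is not the script's implementation: store_results is a hypothetical helper, the timestamped key simply mirrors the output_20240511051940.json name shown above, and treating emailapi-emaillistverify-cached as a key prefix is an assumption. The client is expected to be a boto3 S3 client, whose put_object call accepts Bucket, Key, Body and ContentType.

```python
import json
from datetime import datetime, timezone


def store_results(client, bucket, prefix, results, now=None):
    """Upload results as JSON under a timestamped key and return the key."""
    now = now or datetime.now(timezone.utc)
    key = f"{prefix}/output_{now:%Y%m%d%H%M%S}.json"
    body = json.dumps(results, indent=4).encode("utf-8")
    client.put_object(
        Bucket=bucket, Key=key, Body=body, ContentType="application/json"
    )
    return key


# Possible usage with boto3 (not executed here; credentials and names are
# placeholders). For Cloudflare R2 the client is pointed at the account's
# S3-compatible endpoint:
#
#   import boto3
#   client = boto3.client(
#       "s3",
#       endpoint_url="https://<account_id>.r2.cloudflarestorage.com",
#       aws_access_key_id="...",
#       aws_secret_access_key="...",
#   )
#   store_results(client, "my-bucket", "emailapi-emaillistverify-cached", results)
```

    Passing the client in as an argument keeps the helper usable for both Amazon S3 and R2, since only the client configuration differs.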
     
  3. duderuud

    duderuud Premium Member

    202
    74
    28
    Dec 5, 2020
    The Netherlands
    Ratings:
    +153
    Local Time:
    6:31 PM
    1.25 x
    10.6
    Sounds interesting but also complicated. I have 300k+ mail accounts to check so the paid options are too expensive imo.
    So it is also possible to use this locally? Can you point me in the right direction for that?

    FYI: I use MXRoute for all my mails but also have an Amazon SES account. Never used my local SMTP/mail server.
     
  4. eva2000

    eva2000 Administrator Staff Member

    Yes, it can be expensive, but in light of Google and Yahoo's new restrictive email/spam policies, you don't want to risk having too high an email bounce/complaint rate and damaging your reputation.

    Yes, you can run the script to do local syntax, DNS and SMTP checks on Centmin Mod servers via the native Postfix MTA mail server as is, provided you have properly set up the main hostname and the sending from email domain's SPF/DKIM/DMARC as outlined at https://community.centminmod.com/th...ver-email-doesnt-end-up-in-spam-inboxes.6999/. I set up Amazon SES SMTP as a Postfix level SMTP relay so I can use a from email domain that is already Amazon SES verified. I suppose you could use MXRoute SMTP for the Postfix relay, but MXRoute has a 300 emails/hr rate limit, so for 300K email testing you'd be processing for a while LOL. But you risk damaging the sending domain's reputation this way. Hence why I use 3rd party paid email verification services where possible and why I developed my above validate_emails.py script with commercial email verification provider API support :)
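
    To put a number on "processing for a while", a quick back-of-the-envelope calculation, assuming every address check counts as one email against the 300 emails/hr limit (whether SMTP probes are counted that way is an assumption):

```python
def hours_to_process(total_emails, per_hour):
    """Hours needed to work through a list at a fixed hourly rate limit."""
    return total_emails / per_hour


hours = hours_to_process(300_000, 300)
print(f"{hours:.0f} hours, about {hours / 24:.1f} days")
# prints: 1000 hours, about 41.7 days
```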
     
  5. duderuud

    duderuud Premium Member

    Okay, bit the bullet and bought a package from Proofy.io

    They have a promotion with 35% off, code: R18HE27T35P1
    Now let's find out how to integrate this into Xenforo :)

    Edit: Where can I download the script itself?
     
    Last edited: May 9, 2024
  6. eva2000

    eva2000 Administrator Staff Member

    oh thought folks would read GitHub - centminmod/validate-emails - will update above post :)

     
  7. duderuud

    duderuud Premium Member

    I read that already but I guess I misunderstood :)
     
  8. eva2000

    eva2000 Administrator Staff Member

  9. eva2000

    eva2000 Administrator Staff Member

    Updated my PHP wrapper with single and multiple email support via the validate_emails.py per email verification routines. Also added the Cloudflare cache support from validate_emails.py (enabled for EmailListVerify and Zerobounce) and S3 storage support to store email verification results in either Amazon AWS S3 or Cloudflare R2 object storage buckets.

    Note: Timings reported include time for S3 storage, in this case saving to a Cloudflare R2 bucket

    [Screenshots: validate_email_php_wrapper_multi-style2-cloudflare-cache-s3-02.png, plus variants 02a, 02b, 02c, 02d and 02e]