
Google v Captcha - the AI arms race

Duncan Brigginshaw

Co-Founder and Technical Director, Odin Technology Ltd / Scriptworks.io. Speaker, Technology Enthusiast.

Published: September 12, 2018

So how many of you geeks have tried to break the internet? Googling "Google" won't do it (despite what my kids think). Today I ran my own little experiment: I pitted the old text-based Captcha against Google's Vision API. And guess what? The response from Google is shown above. Look at the answer in Block 2... Google wins, solving the Captcha on the first try.

Fascinating stuff. So what's next? Well, Captcha has moved on: we now have to identify ourselves as human by recognising images in a grid of squares. But Google has some pretty powerful stuff in beta that will solve that too. See the image below, processed by Google's beta API:

It picks out the elements of the picture and labels them. It really is cool stuff. Does anyone else feel we are in an AI arms race? Where will Captcha go next, DNA samples?
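For anyone who wants to try image labelling themselves, here is a minimal sketch. It assumes the Vision API's OBJECT_LOCALIZATION feature; this post does not record which feature produced the image, so treat that choice as a guess. The helper names buildAnnotateRequest and summariseObjects are made up for illustration. The request body has the same shape as the text-detection request used later in this post.

```javascript
// Hypothetical helpers for illustration; they are not part of any Google library.

// Build the JSON body for a Vision API images:annotate call.
// base64Image is the image content as a base64 string.
function buildAnnotateRequest(base64Image, featureType, maxResults = 10) {
    return {
        requests: [{
            image: { content: base64Image },
            features: [{ type: featureType, maxResults }]
        }]
    };
}

// Turn an object-localization response into "name (score%)" strings,
// dropping anything below minScore.
function summariseObjects(resp, minScore = 0.5) {
    return (resp.responses || [])
        .flatMap(r => r.localizedObjectAnnotations || [])
        .filter(o => o.score >= minScore)
        .map(o => `${o.name} (${Math.round(o.score * 100)}%)`);
}
```

Post the result of buildAnnotateRequest(ss, 'OBJECT_LOCALIZATION') to the images:annotate endpoint, then pass the parsed response to summariseObjects to get a quick list of what the API thinks is in the picture.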

Anyway, back to test automation... The reason I post this is that we now have some pretty powerful tools that will solve challenges that seemed unsolvable only a few years ago. OCR (Optical Character Recognition) has been a mainstay of automation (love it or hate it) for years: we rely on sophisticated OCR libraries to read text from an image or screenshot, either to obtain actual values from an SUT (system under test) or to locate text for navigation. Annoyingly, HP's UFT always seemed to outperform the best of the open source and paid alternatives, but even then I'd say it was 70% accurate at most. It was sensitive to fonts, screen resolution and other factors, and not reliable enough to form part of a core automation strategy.

But then along come Google's Vision API and others like it: AI-based OCR that is astonishingly accurate, even beats systems designed to defeat OCR, and is relatively easy to use.

Below is a simple bit of JS code that uses Google's Puppeteer API and the Google Vision API to launch a web page, grab a screenshot, use AI to read the text on the page (an image on a button in this example) and click on it. About 30 lines of code, give or take.

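// Puppeteer drives the browser, request-promise calls the Vision REST API, jsonpath queries the response.
const pup = require('puppeteer');
const rp = require('request-promise');
const jp = require('jsonpath');

const url = 'https://decohere.herokuapp.com/planets';
const texttoclick = 'Calculate';

(async () => {
    const browser = await pup.launch({headless: false});
    try {
        const page = await browser.newPage();
        await page.goto(url);
        // Capture the viewport as a base64 string, ready to send as the image "content".
        const ss = await page.screenshot({encoding: 'base64'});

        // Ask the Vision API to detect text in the screenshot.
        const body = {"requests": [{"image": {"content": ss}, "features": [{"type": "TEXT_DETECTION"}]}]};

        const options = {
            method: 'POST',
            uri: 'https://vision.googleapis.com/v1/images:annotate?key=mykey', // replace mykey with your own API key
            body: body,
            json: true // Automatically stringifies the body to JSON
        };

        const resp = await rp.post(options);
        console.log(JSON.stringify(resp));
        // Pick out the annotation whose text exactly matches the text we want to click.
        const textobj = jp.query(resp, `$..textAnnotations[?(@.description=="${texttoclick}")]`);
        console.log(JSON.stringify(textobj));
        if (textobj.length === 0) throw new Error(`Text "${texttoclick}" not found on the page`);

        // Click just inside the top-left corner of the matched text's bounding box.
        const corner = textobj[0].boundingPoly.vertices[0];
        await page.mouse.click((corner.x || 0) + 5, (corner.y || 0) + 5);
        await new Promise(r => setTimeout(r, 2000));
    } finally {
        await browser.close();
    }
})().catch(err => { console.error(err); process.exitCode = 1; });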
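The script clicks 5 pixels in from the first vertex of the matched text's bounding box, which works here but could miss on small or rotated text. As a rough sketch of a more forgiving approach, the two helpers below match the text case-insensitively and click the centre of the box. The names findText and centreOf are my own, not part of Puppeteer or the Vision API, and treating a missing x or y as 0 is an assumption about how zero-valued coordinates come back in the JSON.

```javascript
// Hypothetical helpers for illustration; not part of Puppeteer or the Vision API.

// Return the centre point of a Vision API boundingPoly.
// A vertex with a missing x or y is treated as 0 (assumption).
function centreOf(boundingPoly) {
    const xs = boundingPoly.vertices.map(v => v.x || 0);
    const ys = boundingPoly.vertices.map(v => v.y || 0);
    return {
        x: (Math.min(...xs) + Math.max(...xs)) / 2,
        y: (Math.min(...ys) + Math.max(...ys)) / 2
    };
}

// Find the first text annotation whose description matches `text`,
// ignoring case and surrounding whitespace. Returns null if nothing matches.
function findText(resp, text) {
    const wanted = text.trim().toLowerCase();
    const annotations = (resp.responses || [])
        .flatMap(r => r.textAnnotations || []);
    return annotations.find(a => (a.description || '').trim().toLowerCase() === wanted) || null;
}

// Usage inside the async block of the script:
//   const hit = findText(resp, texttoclick);
//   if (hit) {
//       const { x, y } = centreOf(hit.boundingPoly);
//       await page.mouse.click(x, y);
//   }
```

Returning null rather than throwing lets the calling test decide whether a missing label is a failure or simply a reason to wait and take another screenshot.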

It's a brave new world out there, and this is one of the few concrete examples of how AI can really enrich the test automation space!

Jason Arbon

CEO founder, testers.ai

6 years ago

Smart and awesome examples!


