登录查看更多内容

AI generates a Test for logical expressions

Gregory Solovey, PhD

Senior Director of Quality Assurance

发布日期: 2024年12月28日

The graph-analytical method provides a practical approach to testing logical expressions, focusing on non-syntax errors that are not typically detected by syntax analyzers, and excluding certain programming errors (e.g., variable-type mismatches) that are usually addressed by unit tests. It specifically targets:

·???????? Boolean variable substitution errors: Variables may be incorrectly replaced with other variables or with constants, aka “stuck-at-zero” or “stuck-at-one” errors.

·???????? Logical operator replacement errors: Operators such as “and” and “or” may be improperly substituted for one another.

Under this method, the logical expression is represented as a directed graph, with a start point (SP) and an end point (EP). Each Boolean variable is associated with an edge, OR operations appear as parallel edges, AND operations appear as sequential edges. For each target edge (i.e., each variable), the algorithm locates a path from SP to EP that includes this edge. The expression’s value is then examined with different truth assignments to the target variable. If altering the variable’s value does not affect the expression’s outcome, it indicates a stuck-at error. In this way, all stuck-at-zero and stuck-at-one errors are identified.

Many years ago, I wrote a Perl script—several thousand lines of code—that automated the graph-analytical method. Recently, I asked ChatGPT and Grok to construct the graph and generate the necessary tests using only English-language prompts. Both AI systems repeatedly requested a more formal specification or sample Python code, but I continued refining my prompts in plain English, building on an article (test_design_methogs), with the goal of creating a fully automated test-generation process. Over the course of about 100 iterations, I addressed various unsatisfactory outcomes—often related to errors in graph construction or test generation—by applying the AI’s suggestions to refine the prompts. Eventually, this iterative approach produced the satisfactory result presented below.

AI recommended using Mermaid syntax to present the graph. Below are several examples of input-output pairs, including the logical expression, the corresponding Mermaid syntax, a preview of the resulting graph (see mermaid.live/edit), and the final test results.

Note: The test set shown is not minimal. In Example 2, the expression contains 12 variables. Therefore:

A complete permutation-based test would require 2^{12} = 4096 tests.
A minimal set, consisting of 4 paths and 6 cuts covering all edges, would need only 10 test cases.
A practical test from the proposed algorithm results in 24 test cases—twice the number of edges in the graph.

Algorithm for Generating Mermaid Graphs from Logical Expression

## Strict Algorithmic Adherence:

? Sole Reference: Follow this algorithm exactly as described - it is your only reference; do not use any external knowledge, common practices, or logic unless explicitly mentioned here.

? Literal Interpretation: Interpret each step literally; do not infer, add, or modify any actions beyond what is directly specified in the algorithm.

? No Deviations: Any deviation requires explicit permission within the algorithm itself; do not execute any action or step unless it is explicitly listed in the algorithm.

## Input: A logical expression involving Boolean variables, logical operators (AND, OR, NOT) and parentheses.

## Output: A Mermaid graph representation showing the expression's structure, with only single-variable edges or negated single-variable edges.

## Definitions:

? "Mermaid Graph Code" (MGC) refers to the Mermaid syntax, which defines the structure and behavior of the graph

? Log includes each step of the decomposition process, so the exact sequence can be reviewed.

? Node Naming Convention:

Start Point: SP((SP))

End Point: EP((EP))

Intermediate Nodes: O1((o)), O2((o)), O3((o)), etc.

## Template for Decomposing OR Operations:

%% If the parent term is: O3((o)) -->| d OR e OR m| O4((o))

%% The replacement will look like:

O3((o)) -->|d| O4((o))

O3((o)) -->|e| O4((o))

O3((o)) -->|m| O4((o))

## Template for Decomposing AND Operations:**

%% If the parent term is: O3((o)) -->|d AND e AND m| O4((o))

%% The replacement will look like:

O3((o)) -->|d| O5((o))

O5((o)) -->|e| O6((o))

O6((o)) -->|m| O4((o))

## Steps:

1. Begin the MGC with line 1 - "graph LR" and line 2 - a comment (%%) containing the original logical term.

2. Simplify Global Negations in LOG

2.1 Apply De Morgan's laws to remove negations of composite terms:

NOT(A AND B) == (NOT A OR NOT B)

NOT(A OR B) == (NOT A AND NOT B)

2.2. Continue until no negations directly apply to parenthesized sub-expressions.

3. Initialize the Graph in LOG

Represent the input term as a single edge from a Start Point (SP) to an End Point (EP) and labeled by this term:

SP((SP)) -->|term| EP((EP))

4. Decompose the term

4.1. Print to LOG the current edge as per template: "Log: step 4.1 decomposing the edge 'O1((o)) -->|term| O2((o))'".

4.2. If the term is covered by the outermost parentheses, then

remove ONLY these outermost parentheses, but keep local parentheses for each sub-expression until it's time to decompose them.

execute step 4 with the inner structure term.

4.3. If the term is a single variable or a negated variable, insert this edge to the MGC and stop recursion for that term.

4.4. Always decompose OR before AND when both operators are present at the same top-level.

4.5. When encountering a top-level OR operation (A OR B OR ...):

4.5.1. DO NOT introduce junction, "temporarily", intermediate nodes for OR operations!

4.5.2. The children edges must share the same start-point and the same end-point as their parent edge.

4.5.3. Identify all sub-terms connected by top-level OR operators. If part of the OR is an "AND term", treat it as a single term until you decompose it further in its context.

4.5.4. Replace the parent edge with multiple parallel edges from the parent's start-point to the parent's end-point, each labeled with a sub-term.

4.5.5. Print to LOG after decomposition all parallel edges per this template: "Log: step 4.5 new edge 'O1((o)) -->|term| O2((o))'".

4.5.6. Execute step 4 for each sub-term within those parallel edges.

4.5.7. Verify that the MGC has not been altered during substeps of the step 4.5.

4.5.8. Verify that all parallel edges have the parent's start-point, the parent's end-point, the label with a sub-term.

4.6. When encountering a top-level AND operation:

4.6.1. DO NOT introduce new end-point - it should be only the parent's end-point!

4.6.2. Identify all sub-terms connected by top-level AND operators.

4.6.3. Decompose the parent term into a sequential chain where each edge represents one sub-term that will be further decomposed.

4.6.4. Replace the parent edge with a chain of intermediate nodes ((o)) linking each sub-term (A,B,...) sequentially;

4.6.5. The chain must start from parent's start-point, continue through a chain of intermediate nodes connected to the parent's end-point.

4.6.6. DO NOT introduce junction nodes more than the number of sub-terms

4.6.7. Assign nodes sequentially in the order of their creation. Nodes for SP always lead to O1. The next level generates O2, and so on.

4.6.8. DO NOT introduce new end-point - it should be parent's end-point!

4.6.9. Print to LOG after decomposition all sequential edges - for example: "Log: step 4.6 new edge 'O1((o)) -->|term| O2((o))'".

4.6.10. Execute step 4 for each sub-term within those sequential edges.

4.5.11. Verify that the MGC has not been altered during substeps of the step 4.6.

4.5.12. Verify that the the chain of all sequential edges starts from the parent's start-point and end in the parent's end-point, and have the label with a sub-term.

6. Verification step

6.1. Are only edges labeled with a single variable or its negation in MGC?

6.2. Are all variable from the input expression and its negations are presented in MGC?

6.3. Is there only one SP node in MGC?

6.4. Is there only one EP node in MGC?

6.5. Do all junction intermediate nodes have both incoming and outgoing edges in MGC?

6.6. Does the reverse engineering expression from the graph equal to the original logical expression?

7. Add to MGC

%% Style definitions

classDef invisibleNodes stroke:none, font-size:6px;

class {comma separated set of intermediate nodes} invisibleNodes;

8. Return complete MGC and LOG containing only lines according to 4.1., 4.5.5., 4.6.9.

领英推荐

??Top ML Papers of the Week

DAIR.AI 6 个月前

Llama 3.2: On-device 1B/3B and Multimodal 11B/90B…

Clarifai 5 个月前

6 Essential Headline Skill Areas for (Gen)AI work: A…

Stack Digital 11 个月前

Test Design Algorithm for Logical Expressions

The algorithm systematically generates test cases to identify errors related to elements of logical expressions. It focuses on non-syntax errors (which are typically caught by syntax analyzers) or specific programming errors, such as misuse of variable types (which are typically caught by unit test).

## Object to Be Tested is a logical expression:

? Boolean constants (1/ 0 – true/ false) and variables are logical terms

? If a and b are terms, then the expression a ∧ b (a AND b), a ∨ b (a OR b) are the terms.

? If a is a term, then its negation ?a (NOT a) is also a term.

? (skip for now) relational expressions (variables and constants, joined together using relational operators such as >, >=, =, <>, <, <=) are logical terms.

## Error Classes to Be Covered by Tests

? Boolean Variable Replacement Errors. Variables may be mistakenly replaced with other variables’ name or with constants (stuck-in-zero or stuck-in-one errors).

? Logical Operator Replacement Errors. Relational operators might be incorrectly replaced by other operators (e.g., “and” replaced with “or”).

? Assumption: Only a single error can exist, and therefore each error can be found on a single path or cut.

## Strict Algorithmic Adherence:

? Sole Reference: Follow this algorithm exactly as described - it is your only reference; do not use any external knowledge, common practices, or logic unless explicitly mentioned here.

? Literal Interpretation: Interpret each step literally; do not infer, add, or modify any actions beyond what is directly specified in the algorithm.

? No Deviations: Any deviation requires explicit permission within the algorithm itself; do not execute any action or step unless it is explicitly listed in the algorithm.

## Input:

Mermaid Graph Code (MGC): Defines the structure and behavior of the graph with single-variable edges.

? Nodes are defined with id(text) or just id .

? Edges between nodes are defined by --> with labels: B -->|label| C

## Output:

? Test Table: Organizes test cases.

? LOG: Records all steps of test generation.

? ERROR LOG: Records all errors occur in test generation.

## Definitions:

Graph Points: Graph has Start Point (SP) and End Point (EP).

Connected Edge: Edge is connected if:

? Variable label = 'TRUE'

? Negated variable label = 'FALSE'

Cut Edge: Edge is cut if:

? Variable label = 'FALSE'

? Negated variable label = 'TRUE'

Graph Path: Sequence of connected edges from SP to EP, others cut; expression 'TRUE'.

Graph Cut: Set of cut edges blocking all paths from SP to EP; expression 'FALSE'.

Practical Test: Two tests per edge: one with the edge belong to path and connected, one with the edge belong to path and cut.

Minimal Test: Minimum paths and cuts covering all edges. currently is not implemented.

## Algorithm Steps:

### Step 1: Input Verification

? Verify that edges labeled only with single variables or their negations.

? Verify that single SP node does not have an incoming edge.

? Verify that single EP node does not have an outgoing edge.

? Verify that All intermediate nodes have both incoming and outgoing edges.

### Step 2: Edge Identification

1. List unique edges by start and end points from MGC.

2. Assign unique identifiers to each edge.

3. Log each identified edge. “edge (e.g., O6((o)) -->|g| EP((EP))) that is identified as (e.g., O6_g_EP)”

4. Verify that edge count in input graph and number of edge identifiers is the same.

### Step 3: Test Generation (for each edge)

1. Identify Target Edge: Define by start node, end node, and label.

2. Build Path, verify path, store path:

Use BFS for backward and forward traversal: Stop the BFS algorithm of path selection when the first path is found.

? Do Backward traversal: From SP to start of target edge.

? Do Forward traversal: From end of target edge to EP.

? Combine Traversals: Merge into a complete path.

? Verify that the path includes the target edge and reaches from SP to EP.

? Store path in a PathStructure object or list for reference.

3. Build Path Test:

Set Variables:

## make all edges on the path “connected edges”

For each edge on the path:

? make variable = 1 (TRUE) if positive label.

? make variable = 0 (FALSE) if negated label.

## make all undefined graph edges “cut edges” to isolate the path

For all other edges:

? make variable = 0 (FALSE) if positive label.

? make variable = 1 (TRUE) if negated label.

Verify All Variables Set: Use a checklist to ensure all variables are accounted for.

Set Expected Result: 1 (TRUE)

Log and Store: Log details, store in test table.

4. Build Cut Test:

Copy from Path Test: Copy all settings except for the target edge:

Set Target edge:

? make variable = 0 (FALSE) if positive label.

? make variable = 1 (TRUE) if negated label.

Set Expected Result: 0 (FALSE)

Log and Store**: Log details, store in test table.

5. Consistency Check:

Verify that only the target edge's state changed between path and cut tests.

Ensure all other edges maintain their state from the path test.

### Step 4: Output Verification

Verify:

1. Each unique edge has both Path and Cut tests.

2. No edge has more than two tests.

3. No tests unrelated to edges.

4. Variable settings for each test case match between the LOG and the Test Table.

### Step 5: Output

Print Compile Test Table: Columns for:

? "Test#" - sequential number of the Test Case

? "Edge": Include both the Edge ID (e.g., E1) and the full edge label (e.g., (SP -a-> O1)).

? a columns list for each variable (a, b, c, ...). Each variable’s assigned value (0 or 1) should be there.

? "Exp" - Expected Result (1/0)

? "Description" (Path or Cut; path labels)(e.g., "Path: a,c" or "E1 cut").

? Save Format LOG: Detailed, including path validations. Print per user request

? Save Error LOG: Any verification failures. Print per user request

要查看或添加评论，请登录

Gregory Solovey, PhD的更多文章

AI-Generated Test for Finite State Machines

2025年2月11日

AI-Generated Test for Finite State Machines

In previous articles I tried to show that AI excels more in coding than in designing tests: My Travails Teaching AI to…
My Travails Teaching AI to Build Tests Like a Pro

2024年12月28日

My Travails Teaching AI to Build Tests Like a Pro

As per a previous post (AI Test Generation - a False Sense of Security ) if we ask AI to build a test without…

4 条评论
AI generates a Test for Functional and Relational expressions

2024年12月28日

AI generates a Test for Functional and Relational expressions

The paper [Test Design TechniquesTest Design Techniques. Professional Tester magazine · Oct 25, 2014] presents methods…
Keyword Driven BDD - Data framework

2023年12月25日

Keyword Driven BDD - Data framework

By Gregory Solovey, Simon Krischer This is the fourth article in the series ‘Keyword Driven BDD’. In this article, we…
Keyword Driven BDD — UI framework

2023年12月1日

Keyword Driven BDD — UI framework

By Dominic Sagayaraj, Gregory Solovey, Muthuvel Balu This is the third article in the series ‘Keyword Driven BDD’. In…
Expose the defect

2021年7月29日

Expose the defect

Dedication I would like to dedicate this article to my teacher, Victor Vasilyevich Danilov. I met him forty years ago…

1 条评论
The Test Framework according to Confucius

2019年12月31日

The Test Framework according to Confucius

In medieval China, in order to make an idea widely acceptable, the author had to declare: "As Confucius said ..
Test automation process: who is guilty and what to do?

2019年12月17日

Test automation process: who is guilty and what to do?

Two main Russian questions are “who is guilty and what to do?” Mid-sized companies are missing out on the benefits of…

See all articles

AI generates a Test for logical expressions

Gregory Solovey, PhD

Senior Director of Quality Assurance

Algorithm for Generating Mermaid Graphs from Logical Expression

领英推荐

Test Design Algorithm for Logical Expressions

Gregory Solovey, PhD的更多文章

社区洞察

其他会员也浏览了

Crafting Cypher Queries from Question Phrases

Mastering the Ingestion Phase of Retriever Augmented Generation (RAG)

Interesting Content in AI, Software, Business, and Tech- 5/31/2023

Building an AI Assistant with DSPy

Building a simple Chatbot using LangChain

Creating a Web App for Inventory Management with Claude Sonnet 3.5 and GPT-4o

The 100% A Strategy - A Data Analysis of SPM 2023 with Python and ChatGPT 4o

?? Boost Data Annotation Efficiency with Python Transformers ??

A Comprehensive Guide : How to Test APIs and Large Language Models (LLMs)

A RAG Prototype

Algorithm for Generating Mermaid Graphs from Logical Expression

领英推荐

Test Design Algorithm for Logical Expressions

Gregory Solovey, PhD的更多文章

AI-Generated Test for Finite State Machines

My Travails Teaching AI to Build Tests Like a Pro

AI generates a Test for Functional and Relational expressions

Keyword Driven BDD - Data framework

Keyword Driven BDD — UI framework

Expose the defect

The Test Framework according to Confucius

Test automation process: who is guilty and what to do?

社区洞察

其他会员也浏览了

Crafting Cypher Queries from Question Phrases

Mastering the Ingestion Phase of Retriever Augmented Generation (RAG)

Interesting Content in AI, Software, Business, and Tech- 5/31/2023

Building an AI Assistant with DSPy

Building a simple Chatbot using LangChain

Creating a Web App for Inventory Management with Claude Sonnet 3.5 and GPT-4o

The 100% A Strategy - A Data Analysis of SPM 2023 with Python and ChatGPT 4o

?? Boost Data Annotation Efficiency with Python Transformers ??

A Comprehensive Guide : How to Test APIs and Large Language Models (LLMs)

A RAG Prototype