Benchmarking AI Code Builders

The complexity of software development keeps increasing, and development is consequently becoming ever more time-consuming. Artificial intelligence tools now let developers automate parts of software development, such as coding, testing, and debugging. In this article, we assess several AI code-generation tools on the market.

Artificial intelligence, and the know-how to deliver it, has been around for a long time. Until recently, however, some ingredients were missing. Cloud computing, big data, and increased processing power have now converged, making artificial intelligence tools more powerful and more capable.

We have run an analysis of a number of well-known and less well-known AI tools for software development, including:

Co Pilot

Chat GPT 4o

Gemini Pro 1.5

Sonar-8x7B

SES

Claude 3



Requirement

"make an apex function to validate a postcode and email then format postcode and insert an account with these details and finally create and attach a file to the Account passed to the function"

SES

Response: see this video link.

SES re-used code from the system, correctly produced an Account and a File and validated both the postcode and email.


Co Pilot

We asked Co Pilot the same question that was put to SES in the video.

Response

The response was not what was asked for and is not a working solution, so Co Pilot failed to produce what was required. Even if Co Pilot had produced something usable, the developer would still have to construct the function, which is not necessary with SES. Secondly, Co Pilot produces new code instead of re-using existing code, compounding the problem of bloated and duplicated code in the system, whereas SES avoids this.

Chat GPT 4o Response

Chat GPT 4o gave a better response to the same question, but produced no code for 3 of the 4 required functions. It also chose to create an Attachment rather than a File, which is a poor choice because Attachment is the legacy mechanism and the requirement asked for a File.

Gemini Pro 1.5 Response

import com.google.cloud.functions.BackgroundFunction;
import com.google.cloud.functions.Context;
import functions.eventpojos.Account;
import functions.eventpojos.EmailAndPostcode;
import functions.eventpojos.OutputFile;
import functions.utils.EmailValidator;
import functions.utils.PostcodeValidator;
import java.util.logging.Logger;

public class ValidateAndInsert implements BackgroundFunction<EmailAndPostcode> {
  private static final Logger logger = Logger.getLogger(ValidateAndInsert.class.getName());

  @Override
  public void accept(EmailAndPostcode event, Context context) {
    // Extract the email and postcode from the event payload.
    String email = event.getEmail();
    String postcode = event.getPostcode();

    // Validate the email and postcode.
    boolean isValidEmail = EmailValidator.isValid(email);
    boolean isValidPostcode = PostcodeValidator.isValid(postcode);

    // If the email and postcode are valid, format the postcode and insert an account with these
    // details.
    if (isValidEmail && isValidPostcode) {
      String formattedPostcode = PostcodeValidator.formatPostcode(postcode);
      Account account = new Account(email, formattedPostcode);
      // Insert the account into the database.

      // Create and attach a file to the Account.
      OutputFile file = new OutputFile("file.txt", "Hello, " + account.getEmail() + "!");
      // Attach the file to the Account.
    } else {
      // Log an error message if the email or postcode is invalid.
      logger.severe("Invalid email or postcode: " + email + ", " + postcode);
    }
  }

Gemini Pro 1.5 created a Java function, which is not what was asked for and could not be adapted for Salesforce.

Sonar-8x7B Response

public class PostcodeEmailValidation {
    public static void validateAndInsertAccount(String postcode, String email) {
        // Validate postcode
        if (!isValidPostcode(postcode)) {
            throw new DmlException('Invalid postcode');
        }

        // Validate email
        if (!isValidEmail(email)) {
            throw new DmlException('Invalid email');
        }

        // Format postcode
        String formattedPostcode = formatPostcode(postcode);

        // Insert account
        Account newAccount = new Account(
            BillingPostalCode = formattedPostcode,
            PersonEmail = email
        );
        insert newAccount;

        // Create and attach a file to the account
        Blob fileBody = Blob.valueOf('This is a sample file content');
        String fileName = 'sampleFile.txt';
        ContentVersion cv = new ContentVersion(
            Title = fileName,
            PathOnClient = '/' + fileName,
            VersionData = fileBody,
            FirstPublishLocationId = newAccount.Id
        );
        insert cv;
    }

    private static Boolean isValidPostcode(String postcode) {
        // Implement your postcode validation logic here
        // For example, in the UK, a valid postcode is a string of 5-7 characters
        // followed by a space and then a string of 1-4 characters
        Pattern pattern = Pattern.compile('[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}');
        Matcher matcher = pattern.matcher(postcode);
        return matcher.matches();
    }

    private static Boolean isValidEmail(String email) {
        // Implement your email validation logic here
        // For example, a valid email is a string that matches the pattern
        // [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
        Pattern pattern = Pattern.compile('[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}');
        Matcher matcher = pattern.matcher(email);
        return matcher.matches();
    }

    private static String formatPostcode(String postcode) {
        // Implement your postcode formatting logic here
        // For example, uppercase the postcode and remove any spaces
        return postcode.toUpperCase().replaceAll(' ', '');
    }
}

try {
    PostcodeEmailValidation.validateAndInsertAccount('SW1A 1AA', '[email protected]');
} catch (DmlException e) {
    System.debug('Error: ' + e.getMessage());
}

Sonar inserts an Account and creates a File correctly. It also validates the postcode and email address and formats the postcode. However, the function does not use a try/catch block or a database savepoint, and the user cannot pass the File content to the function.
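The gaps noted above (no try/catch, no savepoint, no way to pass in the File content) could be closed along the following lines. This is a minimal sketch, not the output of SES or of any tool tested here: the class name AccountFileService, the method name and its parameters are hypothetical, and it assumes the caller has already validated and formatted the postcode and email on the Account.

```apex
public with sharing class AccountFileService {
    // Hypothetical helper: inserts the Account and attaches a File whose
    // name and content are supplied by the caller.
    public static Id insertAccountWithFile(Account acc, String fileName, Blob fileBody) {
        // The savepoint lets us undo the Account insert if the File insert fails.
        Savepoint sp = Database.setSavepoint();
        try {
            insert acc;

            ContentVersion cv = new ContentVersion(
                Title = fileName,
                PathOnClient = fileName,
                VersionData = fileBody,
                // Publishes the new File directly against the Account.
                FirstPublishLocationId = acc.Id
            );
            insert cv;
            return acc.Id;
        } catch (DmlException e) {
            // Roll back so no Account is left behind without its File,
            // then let the caller decide how to report the failure.
            Database.rollback(sp);
            throw e;
        }
    }
}
```

Re-throwing after the rollback, rather than only writing to the debug log, keeps the failure visible to whatever code called the function.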

Claude 3 Response

public static void createAccountWithFile(String postcode, String email) {
    // Validate postcode
    if (postcode == null || postcode.trim().isEmpty()) {
        throw new IllegalArgumentException('Postcode cannot be null or empty.');
    }

    // Validate email
    if (email == null || email.trim().isEmpty()) {
        throw new IllegalArgumentException('Email cannot be null or empty.');
    }

    // Format postcode
    String formattedPostcode = postcode.trim().toUpperCase();

    // Create account
    Account acc = new Account(
        Name = 'New Account',
        BillingPostalCode = formattedPostcode,
        Email__c = email
    );

    try {
        insert acc;

        // Create file
        ContentVersion cv = new ContentVersion(
            Title = 'Sample File',
            PathOnClient = 'sample.txt',
            VersionData = Blob.valueOf('This is a sample file.'),
            IsMajorVersion = true
        );
        insert cv;

        // Get content document ID
        Id conDocId = [SELECT ContentDocumentId FROM ContentVersion WHERE Id =:cv.Id].ContentDocumentId;

        // Create content document link
        ContentDocumentLink cdl = new ContentDocumentLink(
            LinkedEntityId = acc.Id,
            ContentDocumentId = conDocId,
            ShareType = 'V'
        );
        insert cdl;

        System.debug('Account created with ID: ' + acc.Id);
        System.debug('File attached to the account.');

    } catch (DmlException e) {
        System.debug('Error creating account or attaching file: ' + e.getMessage());
    }
}

Claude 3 inserts an Account and creates a File correctly, but it only checks that the postcode and email are not empty; it does not validate the format of either.
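For comparison, the format check that is missing here can be added on top of the empty-value check in a few lines. The sketch below is illustrative only: the class name ContactDetailsValidator is hypothetical, and the two patterns are copied from the Sonar-8x7B response quoted earlier, so they are simplified checks rather than complete postcode or email rules.

```apex
public class ContactDetailsValidator {
    // Patterns taken from the Sonar-8x7B response quoted earlier; they are
    // simplified checks, not complete postcode or email grammars.
    private static final String POSTCODE_REGEX =
        '[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}';
    private static final String EMAIL_REGEX =
        '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}';

    public static Boolean isValidPostcode(String postcode) {
        // Reject null or blank input first, then require the whole
        // trimmed, upper-cased value to match the pattern.
        return String.isNotBlank(postcode)
            && Pattern.matches(POSTCODE_REGEX, postcode.trim().toUpperCase());
    }

    public static Boolean isValidEmail(String email) {
        return String.isNotBlank(email)
            && Pattern.matches(EMAIL_REGEX, email.trim());
    }
}
```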

Sonar probably performed best of the 5 other AI tools tested. However, all of them produced new code, which creates bloated and duplicated code in the system. The customer would eventually have to remove most of the code these tools produced, so all of the above AI tools create technical debt for customers. Only SES was able to complete the requirement while also re-using existing code from the system, thereby minimising the creation of new code. This is significant because a smaller code base, and especially less duplicated code, means fewer defects, faster code and more manageable code.
