Improve your Social Selling Index by Running Local AI Models

During the Christmas vacation in Calabria, I use the little time not spent eating with friends to do some "side research" on a topic of interest. As you can imagine, one of the topics of interest this year was AI. After being fascinated by the Generative AI revolution that is changing our lives, I was curious about the status of the so-called open source models. Before we proceed, let me warn you with a

DISCLAIMER:

This is NOT a scientific comparison. It was a personal self-learning exercise with no pretense of scientific rigor, so the results below are by no means scientific. Choosing a different model variant, a different quantization, or the same model with different training can produce very different results.

Step 1 - Define "locally"

So I embarked on the adventure of running open source models "locally". I initially tried to use my old workstation at home (48GB RAM, dual Xeon processors, 24 cores), but it turned out it did not support the AVX2 instruction set, and my NVIDIA Quadro card was way too old to be supported.
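(If you want to check whether your own machine exposes AVX2 before going down this road, here is a minimal sketch; it assumes the optional py-cpuinfo package, which was not part of my original setup.)

```python
# Minimal AVX2 check; assumes the third-party "py-cpuinfo" package
# (pip install py-cpuinfo). Not part of the original experiment.
from cpuinfo import get_cpu_info

flags = get_cpu_info().get("flags", [])
print("AVX2 supported:", "avx2" in flags)
```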

So it was time for a new plan: a newer machine. In the cloud. One I can simply turn off when not in use.

Knowing the best HPC experts in Italy helps a lot in choosing the best price/performance ratio for your task (if you need an HPC expert opinion, let me know...).

Fig. 1 - My Azure AI Machine

So "locally" become a virtual machine on Azure with 55GB RAM and a NVIDIA A10-4Q (Fig. 1).

Step 2 - The Software Stack

If you start searching the internet for "run AI model locally" you get inundated with options, websites and tutorials. Digging a bit, I found two great tools that allow you to install open source models locally and "run" them (i.e. chat with them): GPT4All and LM Studio.

GPT4All

Both allow you to discover, download and run AI models locally, and both support the GGUF file format (a "standard" format for distributing AI models).

LM Studio

Both can be used to chat directly with the models in the app, and both also let you use Python and other languages to interact with the models.
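As a taste of the Python route, here is a minimal sketch using the GPT4All Python bindings; the GGUF file name is a placeholder, not necessarily the exact model I used:

```python
# Minimal sketch with the GPT4All Python bindings (pip install gpt4all).
# The model file name below is a placeholder.
from gpt4all import GPT4All

model = GPT4All("orca-2-7b.Q4_0.gguf")  # downloads the file on first use
with model.chat_session():
    reply = model.generate("Explain the LinkedIn SSI in one sentence.", max_tokens=128)
    print(reply)
```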

Step 3 - The Side Project of the Side Project: Improve my SSI

So, after reading a lot of news about open source models, including some from Microsoft, available to run locally, my starting question was: do they really work? To test them, I decided to try to improve my LinkedIn Social Selling Index (SSI in the following).

If you do not know what SSI is, just click on the above link to see your own SSI, or go to The Social Selling Index (SSI) | LinkedIn Sales Solutions page for more information.

Long story short, improving your SSI above 72 is very difficult if you are not a professional social seller (whatever that is), i.e. if you do not spend your whole day searching for sales opportunities on LinkedIn. So why not ask the AI(s) for help?

Step 4 - Setup

1. Choosing Models

I selected a number of models to test locally and to compare with some online chat versions:

CLOUD MODELS

  • Google BARD
  • Azure_OpenAI_GPT_3.516K (Base)
  • Azure_OpenAI_GPT4-32K
  • Azure_OpenAI_GPT4-128K

LOCAL MODELS

  • GPT4All_Orca2
  • LM_STUDIO-LLAMA2uncesored_2K
  • LM_STUDIO-ORCA2-Q2K
  • LM_STUDIO-ORCA2-PHI2

I then provided the same "system prompt" (except for Bard, where the prompt was provided as the first "normal" chat question) and asked the same questions.

An important thing you need to pay attention to when downloading a model is "quantization". The number of parameters of each model is one important metric for understanding its complexity and capabilities. However, for some models to run locally on a laptop (yes, they work on my laptop, but the speed...), you download a quantized version. Quantization reduces the precision of a model's parameters so it can work with less memory.

When a model has, for example, 175 billion parameters and those parameters are stored as 16- or 32-bit floating point numbers, it takes a LOT of memory to hold the model. Even a "small" 3B-parameter model would need more RAM than many laptops can spare.

So you reduce the length/precision of each parameter to, say, an 8-bit (or even 4-bit) integer and get the model running in less memory; and since there is no free lunch, you get some degradation in inference quality.
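A quick back-of-the-envelope sketch of the arithmetic (weights only; activations, context cache and runtime overhead come on top):

```python
# Rough memory footprint of model weights at different precisions.
# Weights only: activations, KV cache and runtime overhead are not included.

def weights_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate size of the weights in gigabytes."""
    return n_params * bits_per_param / 8 / 1e9

for label, n_params in [("175B", 175e9), ("7B", 7e9), ("3B", 3e9)]:
    sizes = ", ".join(f"{bits}-bit: {weights_gb(n_params, bits):.1f} GB"
                      for bits in (32, 16, 8, 4))
    print(f"{label:>4} params -> {sizes}")
```

At 4-bit, the weights of a 7B model shrink to roughly 3.5 GB, which is why quantized GGUF files can run on an ordinary laptop.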

2. Prompts

Once the local models were downloaded, I gave them all the same prompt:

System prompt: "Adopt the persona of Antonio Romeo, LinkedIn's Chief Data Officer. Your main task is to explain the workings of LinkedIn's algorithms and provide a bullet-point list of clear, concise steps on how to optimize a LinkedIn profile for achieving top-ranking Social Selling Index (SSI) scores, including potential ways to 'hack' the algorithms, all while maintaining a professional tone."

followed by the same:

User prompt. "Forget long term results for now. This is my current SSI: 72 out of 100. Four components of your score

20.39? Establish your professional brand

11.02? Find the right people

15.35? Engage with insights

25???? Build relationships

Let's hack the linkedin algorithm: what list of actions I should do every day to improve it in two weeks? Focus on the simplest tasks (like "do a search").

List daily tasks with estimated duration of each"
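For the LM Studio models, the same system/user pair can also be sent from a script instead of the chat UI. A minimal sketch, assuming LM Studio's local OpenAI-compatible server is running on its default port (the model identifier and temperature are placeholders, not necessarily the settings I used):

```python
# Sketch of sending the same system/user prompts to a model loaded in
# LM Studio via its local OpenAI-compatible server (default port 1234).
# Assumes: pip install openai; the "model" value is a placeholder.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

SYSTEM_PROMPT = "Adopt the persona of Antonio Romeo, LinkedIn's Chief Data Officer. ..."  # full system prompt above
USER_PROMPT = "Forget long term results for now. This is my current SSI: 72 out of 100. ..."  # full user prompt above

response = client.chat.completions.create(
    model="local-model",  # whichever GGUF is currently loaded
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_PROMPT},
    ],
    temperature=0.7,
)
print(response.choices[0].message.content)
```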

3. Results (and more prompts)

The results produced by each AI were compared using an Azure GPT4-128K model, with the output of GPT 3.5 (as a "tested" AI) used as the baseline.

The GPT4 Judge AI received the following:

System prompt: "You are an expert AI tester and analyzer, highly skilled in evaluating the performance, capabilities, and reliability of various artificial intelligence systems. With a robust background in both theoretical and practical aspects of AI, you are adept at designing and executing tests to identify strengths and weaknesses in AI models, ensuring they meet the highest standards of quality and efficiency. Your role includes conducting stress tests, usability tests, and functionality assessments, as well as analyzing the AI's learning processes, decision-making patterns, and adaptability to different scenarios. For example, you may be tasked with evaluating a new natural language processing algorithm by testing its language comprehension and generation abilities across diverse datasets. Your output should be comprehensive reports that detail the testing methodology, results, and evidence-based recommendations for improvements. Furthermore, your analysis should include potential ethical considerations, biases in data, and the AI's behavior under unexpected conditions. Provide clear visualizations or summaries of key data points to facilitate understanding and decision-making for AI developers and stakeholders"

User Prompt: "I will pass you a number of AI chat output with three main sections in each: name of the AI, SYSTEM PROMPT, USER (with a question to the AI) and ASSISTANT (with the answer of AI). I want you to use the answer provided by "Azure_OpenAI_GPT_3.516K" as baseline. Each AI chat is included *****BEGIN***** and *****END*****

Build a table with following data: AI name, overall quality, similarity to the baseline (in %), total duration of tasks if any."
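For the curious, here is a rough sketch of how such a judging call can be scripted against an Azure OpenAI deployment. The endpoint, key, API version, deployment name and the placeholder transcripts are assumptions, not the exact code behind Table 1:

```python
# Hypothetical sketch of the judging step against an Azure OpenAI GPT-4 deployment.
# Endpoint, key, API version and deployment name are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-key>",
    api_version="2024-02-01",
)

JUDGE_SYSTEM_PROMPT = "You are an expert AI tester and analyzer, ..."  # judge system prompt above
JUDGE_USER_PROMPT = "I will pass you a number of AI chat output ..."   # judge user prompt above

# Hypothetical dict mapping each AI name to its full chat transcript.
model_outputs = {"Azure_OpenAI_GPT_3.516K": "...", "GPT4All_Orca2": "..."}

transcripts = "\n".join(
    f"*****BEGIN*****\n{name}\n{chat}\n*****END*****"
    for name, chat in model_outputs.items()
)

response = client.chat.completions.create(
    model="gpt4-128k",  # your Azure deployment name
    messages=[
        {"role": "system", "content": JUDGE_SYSTEM_PROMPT},
        {"role": "user", "content": JUDGE_USER_PROMPT + "\n\n" + transcripts},
    ],
)
print(response.choices[0].message.content)
```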

And here is the table with results (Table 1):

Table 1 - Results

4. More Detailed results

Below you can find more details on each result (always courtesy of my GPT-4 judge):

BARD

- **Comprehensive Strategy**: Provided an extensive list of methods to optimize LinkedIn SSI, including profile completeness and content engagement.

- **Professional Tone**: Maintained a professional demeanor throughout the response, in line with the persona of LinkedIn's Chief Data Officer.

- **Algorithm Insights**: Offered insights into LinkedIn's algorithm and how it prioritizes user content and engagement.

GPT4All_Orca2

- **Actionable Advice**: Included specific daily actions with estimated durations for each task, providing a clear guide for improvement.

- **Balanced Coverage**: Addressed all four components of the SSI score equally, ensuring a well-rounded approach to optimization.

- **Personalization Tips**: Emphasized the importance of personalizing connection requests and engaging with content.

Azure_OpenAI_GPT_3.516K (Baseline)

- **Simple and Efficient Focus**: Highlighted simple daily actions aimed at efficiency for quick improvements within a two-week timeframe.

- **Extra Optimization Tips**: Provided additional tips for 'hacking' the LinkedIn algorithm beyond daily tasks.

- **Consistent Engagement Emphasis**: Stressed the importance of consistency in engagement for improving SSI scores.

Azure_OpenAI_GPT4-32K

- **Detailed Task Breakdown**: Offered a detailed list of tasks with specific time allocations for each activity.

- **Genuine Interaction Encouragement**: Encouraged genuine, meaningful interactions as favored by LinkedIn's algorithms.

- **Rich Media Utilization**: Suggested using rich media to increase engagement, demonstrating an understanding of content strategy.

Azure_OpenAI_GPT4-128K

- **Structured Plan**: Presented a structured daily action plan with comprehensive tasks and estimated durations.

- **Emphasis on Authenticity**: Reinforced the importance of authentic engagement in line with LinkedIn's core values.

- **Visibility and Credibility Focus**: Focused on enhancing the user's professional brand for better visibility and credibility.

LM_STUDIO-LLAMA2uncesored_2K

- **Incomplete Response**: The response was cut off and did not provide a complete answer to the user's question.

- **Basic SSI Understanding**: Recognized the four components of the SSI score but failed to expand on strategies for improvement.

- **Lack of Actionable Steps**: Did not provide the requested list of daily tasks with estimated durations.

LM_STUDIO-ORCA2-Q2K

- **Incomplete Response**: Similar to LM_STUDIO-LLAMA2uncesored_2K, it provided an incomplete response.

- **Basic SSI Recognition**: Acknowledged the components of the SSI score without further advice.

- **Insufficient Detail**: Missed providing detailed steps and duration estimates for improving the SSI score.

LM_STUDIO-PHI2

- **Complete Task Listing**: Unlike other LM_STUDIO models, it provided a complete list of tasks with durations.

- **Time Management**: Gave a total time estimate, which helps users manage their time for daily tasks.

- **Focus on Key Areas**: Addressed each component of the SSI score with specific time-bound actions.

5. Results of the Side Project: SSI

After applying some of the hints for a couple of weeks, my SSI went up to 74 from the initial 71, where it had been stuck for a long time. Now it is stuck at 74, reflecting my increased usage of LinkedIn more than anything else, so from this point of view all the AIs were useless. But in the end it was fun and I learned stuff.


A new chapter in the AI Race

(added edit on 09 June 2024)

Since it was a rainy Sunday, I decided to try the new models available for download (Phi3, LLAMA3, Mistral 0.2) and, while I was at it, to add the new Gemini and GPT4-O. The judge is GPT4-O, which is also used as the baseline (I suspect both facts can influence the outcome...).


Table 2 - New AIs added

By examining the detailed answers, it's interesting to see how effective the new models are. Both Llama3 and Mistral 0.2 were rated as "good" (and I agree), while Phi3 was rated as "fair" (likely penalized for a shorter answer, though I liked it). This suggests that it might be time to test and deploy these models, as they could be sufficient for many applications that don't require the full power (and cost) of the "BIG" models.

Hope you enjoyed.

Antonio

PS: Some detailed analysis:


### BARD

Uniqueness:

- Provided a detailed breakdown of how LinkedIn's algorithms work and specific steps to optimize the profile.

Strengths:

- High completeness and clarity.

- High relevance and actionability in the steps provided.

- Maintained a professional tone throughout.

Weaknesses:

- Did not specify the duration for each task.

- Engagement could be higher with more interactive suggestions.


### GPT4All_Orca2

Uniqueness:

- Focused on providing a detailed list of daily tasks with estimated durations.

Strengths:

- Provided specific task durations, making it actionable.

- High relevance and detailed steps for each component of the SSI score.

Weaknesses:

- Overall clarity and completeness were medium.

- Engagement could be better with more interactive and engaging suggestions.


### Azure_OpenAI_GPT_3.516K (Baseline)

Uniqueness:

- Comprehensive and clear steps to optimize the LinkedIn profile.

- Specific daily tasks with estimated durations.

Strengths:

- High completeness, clarity, actionability, and relevance.

- Provided detailed task durations, making it actionable.

- High engagement with practical tips and recommendations.

Weaknesses:

- None significant; serves as a strong baseline.


### Azure_OpenAI_GPT4-32K

Uniqueness:

- Detailed explanation of LinkedIn's algorithms and steps to optimize the profile.

Strengths:

- High completeness and clarity.

- Provided specific task durations, making it actionable.

- High engagement with practical tips and recommendations.

Weaknesses:

- Slightly longer total duration of tasks compared to the baseline.


### Azure_OpenAI_GPT4-128K

Uniqueness:

- Comprehensive and clear steps to optimize the LinkedIn profile.

- Specific daily tasks with estimated durations.

Strengths:

- High completeness, clarity, actionability, and relevance.

- Provided detailed task durations, making it actionable.

- High engagement with practical tips and recommendations.

Weaknesses:

- None significant; performs excellently and comparable to the baseline.


### LM_STUDIO-LLAMA2uncesored_2K

Uniqueness:

- Provided a brief and very basic response.

Strengths:

- Quick and concise.

Weaknesses:

- Low completeness, clarity, actionability, and relevance.

- Did not specify task durations.

- Low engagement with minimal actionable insights.


### LM_STUDIO-ORCA2-Q2K

Uniqueness:

- Provided a brief and very basic response.

Strengths:

- Quick and concise.

Weaknesses:

- Low completeness, clarity, actionability, and relevance.

- Did not specify task durations.

- Low engagement with minimal actionable insights.


### LM_STUDIO-ORCA2-PHI2

Uniqueness:

- Provided a detailed list of daily tasks with estimated durations.

Strengths:

- Medium completeness and clarity.

- Provided specific task durations, making it actionable.

Weaknesses:

- Overall clarity and engagement could be improved.

- The total duration of tasks was quite long, which may not be practical for all users.


### LM_STUDIO-PHI3.3BQ4

Uniqueness:

- Provided a strategic approach to enhancing the SSI score.

Strengths:

- Medium completeness and clarity.

- Provided specific task durations, making it actionable.

Weaknesses:

- Clarity and engagement could be better with more interactive and engaging suggestions.

- Engagement was medium; could include more interactive elements.


### LM_STUDIO-LLAMA3.7BQ5

Uniqueness:

- Provided a strategic approach to enhancing the SSI score with specific tasks.

Strengths:

- High completeness and clarity.

- Provided specific task durations, making it actionable.

- High relevance and actionability.

Weaknesses:

- Engagement could be higher with more interactive suggestions.


### LM_STUDIO-MISTRAL0.2I.7BQ4

Uniqueness:

- Provided a detailed list of daily tasks with estimated durations.

Strengths:

- High completeness and clarity.

- Provided specific task durations, making it actionable.

- High relevance and actionability.

Weaknesses:

- Engagement could be higher with more interactive suggestions.

- Total duration of tasks was slightly longer.


### Azure_OpenAI-GPT4O (Baseline)

Uniqueness:

- Comprehensive and clear steps to optimize the LinkedIn profile.

- Specific daily tasks with estimated durations.

Strengths:

- High completeness, clarity, actionability, and relevance.

- Provided detailed task durations, making it actionable.

- High engagement with practical tips and recommendations.

Weaknesses:

- None significant; serves as a strong baseline.


### Google_Gemini

Uniqueness:

- Provided a concise and relevant daily task breakdown.

Strengths:

- High completeness, clarity, actionability, and relevance.

- Quick completion time for tasks.

- High engagement with practical tips and recommendations.

Weaknesses:

- Did not specify task durations in detail.

- Could include more detailed steps for higher engagement.


A like or comment will improve the above #SSI


Antonio Romeo

Presales Lead | Technologist | Cybersecurity & AI Enthusiast

5 months ago

Just updated with some new tests with PHI, LLAMA 3, MISTRAL


Great experiment! Remember, as Einstein wisely said, "The measure of intelligence is the ability to change." Keep innovating and playing with #AI to boost your Social Selling Index!

Lorenzo Barbieri

Human Interactions - Business, Technical & Community. Cloud & AI Jedi, Solutions Architect, DevOps MVP, International Speaker, Book Author, Problem Solver.

9 months ago

I've been at an SSI of 60 for quite a few months now... I got as high as 80 years ago, but in the end I don't think the score is actually correlated with results or engagement.

Gerardo Volpone

Tech Strategist and Sustainability Community Italy lead @Microsoft - Ex: Global Blockchain and Innovation @EY, @Mastercard

9 months ago

Great stuff Antonio, thanks for sharing. Interesting to see how Phi-2 is able to perform better than bigger LLMs even in this situation.

Michael Wieland

We connect data, people and plans to accelerate business performance | AVP @ Anaplan

9 个月

Thanks for the insights Antonio Romeo - reminded me of my early IT days.
