Masking Sensitive Data Locally with Hugging Face's Meta-Llama-3.1-8B-Instruct Model
Introduction
This demo explores the use of Hugging Face's Meta-Llama-3.1-8B-Instruct model for a text processing task focused on desensitizing data. The goal is to create a Python job that runs the model locally to replace sensitive information, such as AWS EC2 instance IDs, with 'XXX' while keeping the rest of the content intact.
Steps
1. Requesting Access to the Model
Request access to the model on Hugging Face's platform.
2. Creating an Access Token
Once access is granted, create an access token from your Hugging Face profile settings. This token is essential for authenticating API calls to Hugging Face services.
3. Setting Up the Python Environment
Locally (in my case, on a Mac), set up a Python virtual environment using the following command:
python -m venv .env
and activate it:
source .env/bin/activate
4. Installing the Necessary Libraries
pip install --upgrade transformers accelerate huggingface_hub
5. Logging into Hugging Face
Use the huggingface_hub library to log into Hugging Face:
from huggingface_hub import login
login()
Run this script; it will prompt for the access token created in step 2.
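If you prefer a non-interactive login, login() also accepts the token directly; reading it from an environment variable keeps it out of the source code (HF_TOKEN below is whatever variable name you choose to set yourself):

import os
from huggingface_hub import login

# Pass the token directly instead of typing it at the interactive prompt.
login(token=os.environ["HF_TOKEN"])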
6. Defining the Prompt
Craft a system message and a user message to instruct the model on the desensitization task. The messages are then tokenized, and an attention mask is created so the input is handled properly (a sketch of this step follows the message definition below).
messages = [
{"role": "system", "content": (
"You are a natural language processor. "
"Please replace the AWS EC2 instance id with X, "
"and output the rest of the information as it is. "
"For example, change [ec2-285dct67i5 is in our cloud] to [ec2-XXX is in our cloud]"
)},
{"role": "user", "content": (
"The down AWS EC2 instance id is ec2-01845dct67i, please page the on-call engineer."
)},
]
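As a minimal sketch of the tokenization step (this assumes the tokenizer loaded in step 7 below), apply_chat_template formats the messages with Llama 3.1's chat template and returns the input IDs; since a single prompt has no padding, the attention mask is simply all ones:

import torch

# Format the chat messages and tokenize them into a [1, seq_len] tensor of IDs.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so the model starts its reply
    return_tensors="pt",
)
attention_mask = torch.ones_like(inputs)  # every input token is real, no padding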
7. Loading the Model and Tokenizer
Load the tokenizer and model using the transformers library by running the Python job. On the first run, the model weights are downloaded and set up with the appropriate configuration.
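A minimal sketch of this step, using the model's repository ID on Hugging Face (the dtype and device placement below are reasonable defaults rather than requirements):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # roughly halves memory versus float32
    device_map="auto",           # let accelerate place the model on GPU/MPS/CPU
)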
8. Generating the Output
Generate the output using the model's generate function, setting parameters like max_new_tokens, temperature, and top_p to control the generation process.
inputs = inputs.to(model.device)
attention_mask = attention_mask.to(model.device)
# Llama 3.1 marks the end of an assistant turn with <|eot_id|>; fall back to it
# if the tokenizer does not define eos_token_id.
eos_token_id = tokenizer.eos_token_id
if eos_token_id is None:
    eos_token_id = tokenizer.convert_tokens_to_ids("<|eot_id|>")
outputs = model.generate(
    inputs,
    attention_mask=attention_mask,
    max_new_tokens=256,        # upper bound on the number of generated tokens
    eos_token_id=eos_token_id,
    do_sample=True,            # sample rather than greedy-decode
    temperature=0.5,           # low temperature keeps the rewrite close to the input
    top_p=0.9,                 # nucleus sampling cutoff
)
9. Extracting and Decoding the Response
Extract the generated response by slicing off the prompt tokens, then decode it to present the final output, showcasing the model's ability to desensitize specific data in text.
response_ids = outputs[0][inputs.shape[-1]:]  # keep only the newly generated tokens
response_text = tokenizer.decode(response_ids, skip_special_tokens=True)
print(response_text)
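With the example prompt above, a successful run should print something along these lines (exact wording can vary because sampling is enabled):

The down AWS EC2 instance id is ec2-XXX, please page the on-call engineer.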
Conclusion
This demo demonstrated the capabilities of the Meta-Llama-3.1-8B-Instruct model in text processing tasks such as data desensitization. The process involved setting up a development environment, gaining access to the model, and implementing a Python script to generate the desired output. The success of this small project paves the way for further exploration and integration of local models in various applications.
Source code can be found at https://github.com/GuilinDev/AIPOC/tree/main/Desensitization_llama3.1_8b.