Google Cloud Vision and Language APIs in Action
Juliano Souza
Director of Information Technology | Technology Mentor for Startups in the EMEA Region.
In my journey of exploration into the kingdom of artificial intelligence (AI), I've discovered a gateway to understanding and harnessing its power through Google Cloud APIs. As I delved into the world of Generative AI and cloud computing, these APIs emerged as invaluable tools, guiding me along the path of AI development with their wealth of features and capabilities.
With a thirst for knowledge and a desire to unlock the mysteries of Generative AI, I embarked on a quest to study and experiment with various AI models and frameworks. However, I soon realized that the true magic lay in the fusion of AI with cloud technologies, enabling me to access vast computational resources and sophisticated AI algorithms at my fingertips.
Enter Google Cloud APIs - the enchanted gateways to a world of AI-driven possibilities. Through these APIs, I discovered the power of Google's AI and machine learning services, offering a seamless integration of cutting-edge AI models and cloud infrastructure. From image recognition to natural language processing, these APIs provided me with the tools to explore the depths of AI development with ease and efficiency.
As I immersed myself in the study of Generative AI, I found that the Google Cloud APIs offered a myriad of features that brought me closer to my goals. Whether it was the Vision API for analyzing images, the Language API for understanding text, or the Speech API for processing audio, each API served as a stepping stone on my journey towards mastering AI development.
In this article, I will share some quick examples and insights gained from using Google Cloud APIs to study Generative AI. Through practical examples and tutorials, we will explore how these APIs can be leveraged to unlock the potential of AI and pave the way for groundbreaking innovations in AI development. Join me as we embark on a fascinating journey into the world of AI-powered creativity and discovery.
Google Cloud Vision API: A Gateway to Visual Understanding
Picture this: You have an image, and you want to unravel its secrets. With Google Cloud Vision API, it's like waving a wand to reveal the hidden treasures within images. Let's explore its enchanting features:
Magical Use Cases:
Google Cloud Language API: Unraveling the Mysteries of Language
Now, let's turn our attention to the Language API - the master of understanding the complexities of human language. It's like having a wise sage by your side, decoding the meanings behind every word:
Mystical Use Cases:
Let's Dive into the Magic: A Tutorial
Now, let's experience the magic firsthand with a captivating tutorial. We'll use the Vision API to perform OCR on an image, extract text, and then analyze its sentiment using the Language API. Ready to witness the magic unfold?
import os
import sys

from google.auth import exceptions
from google.cloud import language_v1
from google.cloud import vision

# Set the path to your service account key JSON file
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = './gcp-ocr-key.json'

try:
    # Set up Google Cloud Vision API client
    client_vision = vision.ImageAnnotatorClient()
    # Initialize Natural Language API client
    client_nlp = language_v1.LanguageServiceClient()
except exceptions.DefaultCredentialsError as e:
    # Nothing below can run without credentials, so stop here
    sys.exit("Error: {}".format(e))

# Read the document image once; it is reused for OCR and label detection
with open('./nlp3.jpg', 'rb') as image_file:
    content = image_file.read()
image = vision.Image(content=content)

# Perform text detection
response = client_vision.text_detection(image=image)
if response.error.message:
    sys.exit("Vision API error: {}".format(response.error.message))
texts = response.text_annotations
for text in texts:
    print('\n"{}"'.format(text.description))

# Extract OCR text (the first annotation holds the full detected text)
ocr_text = texts[0].description if texts else ''

# Pass OCR text to NLP for analysis
document = {
    "content": ocr_text,
    # The Python client names this field "type_" to avoid clashing with the built-in
    "type_": language_v1.Document.Type.PLAIN_TEXT,
    "language": "en"  # Specify the language code of the OCR text (e.g., 'en' for English)
}

# Analyze sentiment
response = client_nlp.analyze_sentiment(request={'document': document})

# Print sentiment score
print('Sentiment Score: ', response.document_sentiment.score)

# Perform label detection using Google Cloud Vision API
response = client_vision.label_detection(image=image)
labels = response.label_annotations

# Print detected labels
print('Labels detected:')
for label in labels:
    print(label.description)
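Here's a breakdown of what the code does:
1. It points GOOGLE_APPLICATION_CREDENTIALS at the service account key file and creates the Vision API client.
2. It reads nlp3.jpg, runs text detection on it, and prints every detected text fragment.
3. It takes the first annotation, which holds the full detected text, as the OCR result.
4. It sends that text to the Natural Language API as an English plain-text document and prints the sentiment score.
5. It runs label detection on the same image and prints the description of each label.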
Running the code
This is the picture I used to explore the Vision API's capabilities:
Running my Python code:
The output of the code reveals the following results:
The sentiment score is a numerical value that represents the overall emotional tone of a piece of text. In the Natural Language API it ranges from -1.0 (negative) to 1.0 (positive), with values near 0.0 representing neutral sentiment.
In the code above, the score comes from the sentiment analysis performed by the Google Cloud Natural Language API. The API examines the text extracted from the image and assigns a score that indicates whether the text expresses a positive, negative, or neutral sentiment.
A sentiment score:
above 0 indicates positive sentiment
below 0 indicates negative sentiment
close to 0 indicates neutral sentiment
In this run, the sentiment score of 0.0 suggests that the text extracted from the image does not convey a strong emotional sentiment and is considered neutral.
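A score near zero can be read more carefully by looking at the magnitude as well, which the API returns next to the score as response.document_sentiment.magnitude and which measures how much emotion the text contains overall. The sketch below is one possible way to combine the two values into a rough label. The function name describe_sentiment and both thresholds are assumptions chosen for illustration, not values defined by the API.

```python
def describe_sentiment(score, magnitude, neutral_band=0.25, mixed_magnitude=1.0):
    """Turn a (score, magnitude) pair into a rough label.

    neutral_band and mixed_magnitude are illustrative thresholds (assumptions),
    not constants from the Natural Language API.
    """
    if score >= neutral_band:
        return "positive"
    if score <= -neutral_band:
        return "negative"
    # Score close to zero: a high magnitude hints that positive and
    # negative passages cancel each other out
    if magnitude >= mixed_magnitude:
        return "mixed"
    return "neutral"


# A score of 0.0 with no emotional content at all reads as neutral
print(describe_sentiment(0.0, 0.0))
```

With a real response, the call would be describe_sentiment(response.document_sentiment.score, response.document_sentiment.magnitude).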
Overall, the code has effectively performed Optical Character Recognition (OCR) on the given image (nlp3.jpg), extracted the text, and then analyzed the sentiment of the extracted text using the Google Cloud Vision and Language APIs. The sentiment score indicates a neutral sentiment in this case.
Labels: The output "Labels detected" represents the objects, concepts, or entities that have been identified within the image through label detection using the Google Cloud Vision API. Let's break down what each label represents:
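Beyond its description, every entry in response.label_annotations also carries a score field with the model's confidence in that label. A minimal sketch of keeping only the confident labels could look like the following. The helper name confident_labels, the 0.8 threshold, and the sample labels are all assumptions made up for illustration; they are not the output of the run described in this article.

```python
from types import SimpleNamespace


def confident_labels(labels, min_score=0.8):
    """Return (description, score) pairs at or above min_score, highest first."""
    kept = [(label.description, label.score)
            for label in labels if label.score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Stand-in objects with made-up values, shaped like label_annotations entries
sample = [
    SimpleNamespace(description="Font", score=0.91),
    SimpleNamespace(description="Paper", score=0.62),
    SimpleNamespace(description="Document", score=0.85),
]

for description, score in confident_labels(sample):
    print("{}: {:.2f}".format(description, score))
```

In the tutorial code, the same helper could be called as confident_labels(labels) right after label detection.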
Conclusion
This simple yet insightful example underscores the remarkable capabilities of Google Cloud APIs, demonstrating their potential to unravel the intricacies of image data through OCR and glean valuable insights from text through sentiment analysis. Through this demonstration, we've only scratched the surface of the vast possibilities that Google Cloud APIs offer, hinting at the boundless opportunities for innovation and exploration in the realm of AI-driven solutions. As we continue to delve deeper into the realms of artificial intelligence and cloud computing, let this example serve as a testament to the transformative power of technology and the endless horizons it presents for those embarking on the journey of AI development.