I'm Feeling Lucky

You don't have to be lucky anymore if you want precise search results.

We are all familiar with keyword-based search. It works well for structured data: if you want to know the actors' names for a particular movie, that information is typically stored against the movie name/id, almost like a table. But nowadays we have more and more data of various types, including unstructured data, and searches have changed accordingly: searching within videos, finding similar shapes, and so on. That's neural search. For example, it lets you search through YouTube videos by describing the scene you remember, which is more flexible and convenient than traditional keyword-based search.

Such neural search is being made available by Jina AI. It is a cloud-native, open-source, cross/multi-modal, data-type-agnostic neural search engine that offers anything-to-anything search, i.e. text-to-text, image-to-image, video-to-video, or whatever else you can feed it.

Concepts

Let's understand some basic concepts/components in Jina AI.

Modality:

  • Single modality: The types of input and output are the same, e.g. finding similar images by providing a reference image.
  • Cross modality: Finding relevant documents of modality A (let's say, image) by querying with documents from modality B (let's say, text).
  • Multi-modality: A query combining different modalities (text + image) to get the output (a short sketch of all three cases follows this list).
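
For intuition, here is a minimal sketch of how the three cases could be expressed as DocArray Documents (DocArray is introduced below); the file name is hypothetical:

from docarray import Document

# Single modality: an image query used to retrieve similar images
image_query = Document(uri='query.jpg')  # 'query.jpg' is a hypothetical local file

# Cross modality: a text query used to retrieve images
text_query = Document(text='a red car parked on a beach')

# Multi-modality: text and image combined as chunks of a single query Document
multi_query = Document(chunks=[
    Document(text='the same car, but at night'),
    Document(uri='query.jpg'),
])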

The Jina AI ecosystem consists of:

  • DocArray - A standard data structure for all kinds of data
  • Jina - Core neural search framework with a plug-and-play search pipeline
  • Jina Hub - A marketplace to share and reuse components (a one-line usage sketch follows this list)
  • Finetuner - Fine-tuning of any deep learning model
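
To illustrate the plug-and-play idea, a published Hub Executor can be dropped into a Flow by its `jinahub` URI. The Executor name below is only an example of the kind of encoder published on the Hub:

from jina import Flow

# Reuse a community Executor from Jina Hub instead of writing your own encoder.
# 'CLIPImageEncoder' is used here as an illustrative Hub Executor name.
f = Flow().add(uses='jinahub://CLIPImageEncoder')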

Jina consists of the following components:

  • Document: The base/primitive data type. Both inputs and outputs are Documents.
  • Executor: Processes Documents, running processes/algorithms such as encoding images into vectors, storing vectors on disk, ranking results, etc. Examples:

  1. Crafter: Pre-processes documents into chunks.
  2. Encoder: Takes the pre-processed chunks of documents from the Crafter and encodes them into embedding vectors.
  3. Indexer: Takes the encoded vectors as input, indexes them, and stores them in a key-value fashion.
  4. Ranker: Runs on the indexed storage and sorts the results based on a ranking criterion.

  • Flow: Streamlines and distributes the Executors. It allows you to chain together DocumentArrays and Executors to build and serve an application. It consists of Pods (segmenting, encoding, ranking, etc.), a context manager, and abstractions of high-level tasks, e.g. indexing, searching, training. Jina typically comprises two main flows, which are the heart and soul of the semantic search engine (a minimal toy sketch of how these pieces fit together follows this list):

  1. Indexing Flow: Makes the whole corpus searchable (e.g. sentence by sentence for text). The input documents are fed in, processed, and stored at the other end as searchable indexes.
  2. Querying Flow: Takes the user query as an input Document (a primitive Jina data type) and returns a list of ranked matches based on the similarity scores between embeddings.
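
Before the full example, here is a minimal toy sketch (the `UpperCaser` Executor is made up purely for illustration) showing how a Document, an Executor, and a Flow fit together:

from docarray import Document, DocumentArray
from jina import Executor, Flow, requests


class UpperCaser(Executor):
    # A toy Executor that upper-cases the text of each Document
    @requests
    def upper(self, docs: DocumentArray, **kwargs):
        for d in docs:
            d.text = d.text.upper()


f = Flow().add(uses=UpperCaser)  # a one-step Flow serving the toy Executor

with f:
    result = f.post('/', DocumentArray([Document(text='hello jina')]))
    print(result[0].text)  # -> 'HELLO JINA'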

We can understand these concepts better with an example.

Example

Let’s implement a simple neural search service that finds similar images based on human perception (Source: [4]). The dataset used is the `Totally-Looks-Like Dataset` (Source: [5]). Here are a couple of pairs out of its 6016 image pairs.

[Image: sample image pairs from the Totally-Looks-Like dataset]

Let's define a few Executors and then compose them into a sequential pipeline called a Flow.

Executor to pre-process input images:

from docarray import Document, DocumentArray
from jina import Executor, Flow, requests


class PreprocImg(Executor):
    @requests
    async def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            (
                d.load_uri_to_image_tensor(200, 200)  # load
                .set_image_tensor_normalization()  # normalize color
                .set_image_tensor_channel_axis(-1, 0)  # switch color axis for the PyTorch model later
            )

Executor to embed preprocessed images via ResNet:

class EmbedImg(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        import torchvision
        self.model = torchvision.models.resnet50(pretrained=True)

    @requests
    async def foo(self, docs: DocumentArray, **kwargs):
        docs.embed(self.model)

Executor to perform matching:

class MatchImg(Executor):
    _da = DocumentArray()

    @requests(on='/index')
    async def index(self, docs: DocumentArray, **kwargs):
        self._da.extend(docs)
        docs.clear()  # clear content to save bandwidth

    @requests(on='/search')
    async def foo(self, docs: DocumentArray, **kwargs):
        docs.match(self._da, limit=9)
        del docs[...][:, ('embedding', 'tensor')]  # save bandwidth, not needed

Use the Flow to connect all the Executors, and use `plot` to visualize it:

f = Flow(port_expose=12345).add(uses=PreprocImg).add(uses=EmbedImg, replicas=3).add(uses=MatchImg)
f.plot('flow.svg')

Get the data: download the image dataset and pass it to the Flow defined above:

index_data = DocumentArray.pull('demo-leftda', show_progress=True)

Using Flow `f`, index the documents:

with f:
    f.post(
        '/index',
        index_data,
        show_progress=True,
        request_size=8,
    )
    f.block()

The searchable data and the corresponding service are now ready. Use a Python client to access the service:

from jina import Client

c = Client(port=12345)  # connect to localhost:12345
print(c.post('/search', index_data[0])['@m'])  # '@m' is the matches selector
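
To query with a new image that is not part of the index, the same client can be reused with a fresh Document; the file name below is hypothetical:

from docarray import Document

query = Document(uri='my_photo.jpg')  # hypothetical local image, not in the index
matches = c.post('/search', query)['@m']

for m in matches:
    print(m.uri, m.scores['cosine'].value)  # 'cosine' is the default metric of DocumentArray.match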

To access the service on a web browser, we can use this URL: `https://0.0.0.0:12345/docs`.

Straightforward, isn't it?

More code-level examples can be seen on GitHub:

  • Video Search (https://github.com/jina-ai/example-video-search)
  • Multi-modal Search (https://github.com/jina-ai/example-multimodal-fashion-search)
  • Meme Search (https://github.com/jina-ai/example-meme-search)

Fascinating!!

Conclusion

Humans think multi-modally. We have descriptions, visuals, sounds (and maybe smells and tastes too) in our minds while looking for something specific. Can AI (Artificial Intelligence) be as effective as humans? As more and more data types get merged, as demonstrated above, even if only for the narrow task of search, we are getting closer to human intelligence, i.e. towards AGI (Artificial General Intelligence), aren't we?

Give Jina AI a try and feel free to share your comments...

References

  1. Getting started with Jina AI (https://kunalkushwaha.com/getting-started-with-jina-ai)
  2. Getting started documentation (https://docs.jina.ai/get-started/create-app/)
  3. Next-Gen Search powered by Jina (https://towardsai.net/p/l/next-gen-search-powered-by-jina)
  4. An Overview of Jina AI (https://www.section.io/engineering-education/an-overview-of-jina-ai/)
  5. Totally-Looks-Like Dataset (https://sites.google.com/view/totally-looks-like-dataset)
  6. Audio to Audio example (https://github.com/jina-ai/examples/blob/master/audio-to-audio-search/.github/demo.png)
  7. Official website (https://jina.ai/)
