I'm Feeling Lucky

You don't have to be lucky anymore if you want precise search results.

We are all familiar with keyword-based search. It works well for structured data: if you want to know the actors' names for a particular movie, that information is typically stored against the movie name/id, almost like a table. But nowadays we have more and more data of various types, including unstructured data, and searches have changed accordingly: searching within videos, finding similar shapes, and so on. That's neural search. For example, it lets you search through YouTube videos by describing the scene you remember, which is more flexible and convenient than traditional keyword-based search.

Such neural search is being made available by Jina AI. It is a cloud-native, open-source, cross/multi-modal, data-type-agnostic neural search engine that offers anything-to-anything search, i.e. text-to-text, image-to-image, video-to-video, or whatever else you can feed it.

Concepts

Let's understand some basic concepts/components in Jina AI.

Modality:

  • Single modality: The types of input and output are the same, e.g. finding similar images by providing a reference image.
  • Cross modality: Finding relevant documents of modality A (let's say, image) by querying with documents from modality B (let's say, text).
  • Multi-modality: A query combining different modalities (text + image) to get the output (a short sketch of all three cases follows this list).
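
For intuition, here is a minimal sketch of how the three cases could be expressed as DocArray Documents (DocArray is introduced below); the file name is hypothetical:

from docarray import Document

# Single modality: an image query used to retrieve similar images
image_query = Document(uri='query.jpg')  # 'query.jpg' is a hypothetical local file

# Cross modality: a text query used to retrieve images
text_query = Document(text='a red car parked on a beach')

# Multi-modality: text and image combined as chunks of a single query Document
multi_query = Document(chunks=[
    Document(text='the same car, but at night'),
    Document(uri='query.jpg'),
])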

The Jina AI ecosystem consists of:

  • DocArray - A standard data structure for all kinds of data
  • Jina - Core neural search framework with a plug-and-play search pipeline
  • Jina Hub - A marketplace to share and reuse components (a one-line usage sketch follows this list)
  • Finetuner - Fine-tuning of any deep learning model
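
To illustrate the plug-and-play idea, a published Hub Executor can be dropped into a Flow by its `jinahub` URI. The Executor name below is only an example of the kind of encoder published on the Hub:

from jina import Flow

# Reuse a community Executor from Jina Hub instead of writing your own encoder.
# 'CLIPImageEncoder' is used here as an illustrative Hub Executor name.
f = Flow().add(uses='jinahub://CLIPImageEncoder')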

Jina consists of the following components:

  • Document: The base/primitive data type. Both inputs and outputs are Documents.
  • Executor: Processes Documents, running processes/algorithms such as encoding images into vectors, storing vectors on disk, ranking results, etc. Examples:

  1. Crafter: Pre-processes documents into chunks.
  2. Encoder: Takes the pre-processed chunks of documents from the Crafter and encodes them into embedding vectors.
  3. Indexer: Takes the encoded vectors as input, indexes them, and stores them in a key-value fashion.
  4. Ranker: Runs on the indexed storage and sorts the results based on a ranking criterion.

  • Flow: Streamlines and distributes the Executors. It allows you to chain together DocumentArrays and Executors to build and serve an application. It consists of Pods (segmenting, encoding, ranking, etc.), a context manager, and abstractions of high-level tasks, e.g. indexing, searching, training. Jina typically comprises two main flows, which are the heart and soul of the semantic search engine (a minimal toy sketch of how these pieces fit together follows this list):

  1. Indexing Flow: Makes the whole corpus searchable (e.g. sentence by sentence for text). The input documents are fed in, processed, and stored at the other end as searchable indexes.
  2. Querying Flow: Takes the user query as an input Document (a primitive Jina data type) and returns a list of ranked matches based on the similarity scores between embeddings.
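
Before the full example, here is a minimal toy sketch (the `UpperCaser` Executor is made up purely for illustration) showing how a Document, an Executor, and a Flow fit together:

from docarray import Document, DocumentArray
from jina import Executor, Flow, requests


class UpperCaser(Executor):
    # A toy Executor that upper-cases the text of each Document
    @requests
    def upper(self, docs: DocumentArray, **kwargs):
        for d in docs:
            d.text = d.text.upper()


f = Flow().add(uses=UpperCaser)  # a one-step Flow serving the toy Executor

with f:
    result = f.post('/', DocumentArray([Document(text='hello jina')]))
    print(result[0].text)  # -> 'HELLO JINA'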

We can understand these concepts better with an example.

Example

Let’s implement a simple neural search service that finds similar images based on human perception (Source: [4]). The dataset used is the `Totally-Looks-Like Dataset` (Source: [5]). Here are a couple of pairs out of its 6016 image pairs.

[Image: sample image pairs from the Totally-Looks-Like dataset]

Let's define a few Executors and then compose them into a sequential pipeline called a Flow.

Executor to pre-process input images:

from docarray import Document, DocumentArray
from jina import Executor, Flow, requests


class PreprocImg(Executor):
    @requests
    async def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            (
                d.load_uri_to_image_tensor(200, 200)  # load
                .set_image_tensor_normalization()  # normalize color
                .set_image_tensor_channel_axis(-1, 0)  # switch color axis for the PyTorch model later
            )

Executor to embed preprocessed images via ResNet:

class EmbedImg(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        import torchvision
        self.model = torchvision.models.resnet50(pretrained=True)

    @requests
    async def foo(self, docs: DocumentArray, **kwargs):
        docs.embed(self.model)

Executor to perform matching:

class MatchImg(Executor):
    _da = DocumentArray()

    @requests(on='/index')
    async def index(self, docs: DocumentArray, **kwargs):
        self._da.extend(docs)
        docs.clear()  # clear content to save bandwidth

    @requests(on='/search')
    async def foo(self, docs: DocumentArray, **kwargs):
        docs.match(self._da, limit=9)
        del docs[...][:, ('embedding', 'tensor')]  # save bandwidth, not needed

Use the Flow to connect all the Executors, and use `plot` to visualize it:

f = Flow(port_expose=12345).add(uses=PreprocImg).add(uses=EmbedImg, replicas=3).add(uses=MatchImg)
f.plot('flow.svg')

Get the data: download the image dataset and pass it to the Flow defined above:

index_data = DocumentArray.pull('demo-leftda', show_progress=True)

Using Flow `f`, index the documents:

with f:
    f.post(
        '/index',
        index_data,
        show_progress=True,
        request_size=8,
    )
    f.block()

The searchable data and the corresponding service are now ready. Use a Python client to access the service:

from jina import Client

c = Client(port=12345)  # connect to localhost:12345
print(c.post('/search', index_data[0])['@m'])  # '@m' is the matches selector
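
To query with a new image that is not part of the index, the same client can be reused with a fresh Document; the file name below is hypothetical:

from docarray import Document

query = Document(uri='my_photo.jpg')  # hypothetical local image, not in the index
matches = c.post('/search', query)['@m']

for m in matches:
    print(m.uri, m.scores['cosine'].value)  # 'cosine' is the default metric of DocumentArray.match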

To access the service on a web browser, we can use this URL: `https://0.0.0.0:12345/docs`.

Straightforward, isn't it?

More code-level examples can be seen on GitHub:

  • Video Search (https://github.com/jina-ai/example-video-search)
  • Multi-modal Search (https://github.com/jina-ai/example-multimodal-fashion-search)
  • Meme Search (https://github.com/jina-ai/example-meme-search)

Fascinating!!

Conclusion

Humans think multi-modally. We have descriptions, visuals, sounds (and maybe smells and tastes too) in our minds while looking for something specific. Can AI (Artificial Intelligence) be as effective as humans? As more and more data types get merged, as demonstrated above, even if only for the narrow task of search, we are getting closer to human intelligence, i.e. towards AGI (Artificial General Intelligence), aren't we?

Give Jina AI a try and feel free to share your comments...

References

  1. Getting started with Jina AI (https://kunalkushwaha.com/getting-started-with-jina-ai)
  2. Getting started documentation (https://docs.jina.ai/get-started/create-app/)
  3. Next-Gen Search powered by Jina (https://towardsai.net/p/l/next-gen-search-powered-by-jina)
  4. An Overview of Jina AI (https://www.section.io/engineering-education/an-overview-of-jina-ai/)
  5. Totally-Looks-Like Dataset (https://sites.google.com/view/totally-looks-like-dataset)
  6. Audio to Audio example (https://github.com/jina-ai/examples/blob/master/audio-to-audio-search/.github/demo.png)
  7. Official website (https://jina.ai/)
