Automate the Google search using Python
Soumyabrata Roy
Generative AI | Machine Learning | AI Engineer @ Deloitte | Ex-Cognizant | YouTuber @ DataDrivenDecision | Writer @ Medium
Whenever we need help, we turn to Google. With 5.4 billion searches every day, Google is the biggest search engine on the planet right now. But running many searches by hand gets tedious. What if we could automate the whole process using Python? In this article, I will show you how. By the end, you will be able to automate Google searches from Python.
For this, we will use a Python library that is installed as google and imported as googlesearch. It is free, simple and pretty straightforward to use. To install the library, go to your command prompt or Anaconda prompt and type:
pip install google
After installing the package, import the search function from the googlesearch module:
from googlesearch import search
search takes your query, runs it on Google, and yields the relevant URLs one at a time (it returns a generator, not a list). To get the results, you just iterate over it and read off the URLs.
# as an example:
query = "iPhone"
for i in search(query, tld='com', lang='en', tbs='0', safe='off',
                num=2, start=0, stop=2, domains=None, pause=2.0,
                tpe='', country='', extra_params=None, user_agent=None):
    print(i)
Here, tld is the top-level domain of the Google site to query (com for google.com, in for google.in), lang is the search language, num is the number of results per page, start is the first result to retrieve, stop is the last result to retrieve, pause is the delay in seconds between requests, and so on. In a Jupyter notebook, press Shift+Tab with the cursor inside the call and click the plus sign to see the full documentation for the function.
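Because the results arrive one at a time, it can be handy to gather them into a list before using them. The following is a minimal sketch, not part of the library: collect_urls is a hypothetical helper that accepts any search-like callable, and fake_search is a made-up stand-in so the idea can be tried offline without hitting Google.

```python
from typing import Callable, Iterable, List


def collect_urls(search_fn: Callable[..., Iterable[str]], query: str,
                 limit: int = 5, **kwargs) -> List[str]:
    """Gather up to `limit` unique URLs from a search-like generator.

    `search_fn` is assumed to behave like googlesearch.search: it takes a
    query string plus keyword arguments and yields URL strings.
    """
    seen = set()
    urls = []
    for url in search_fn(query, **kwargs):
        if url in seen:
            continue  # skip URLs we have already collected
        seen.add(url)
        urls.append(url)
        if len(urls) >= limit:
            break  # stop pulling further results from the generator
    return urls


def fake_search(query, **kwargs):
    """Hypothetical stand-in for search(); yields made-up URLs."""
    for path in ["a", "b", "a", "c", "d"]:
        yield f"https://example.com/{query}/{path}"


print(collect_urls(fake_search, "iPhone", limit=3))
```

With the real library, the call would look something like collect_urls(search, "iPhone", limit=3, stop=3, pause=2.0).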
Let's build a simple search script for stock market company analysis that gives you a quick overall glimpse of a company's current condition.
# importing the module
from googlesearch import search

# stored queries in a list
query_list = ["News", "Share price forecast", "Technical Analysis"]

# save the company name in a variable
company_name = input("Please provide the stock name:")

# iterate through different keywords, search and print
for j in query_list:
    # join with a space so the query reads "Tesla News", not "TeslaNews"
    for i in search(company_name + " " + j, tld='com', lang='en',
                    num=1, start=0, stop=1, pause=2.0):
        print(i)
Here I build each search phrase as "company name + keyword". The loop takes each keyword from the list in turn, appends it to the company name, and searches Google for the latest results. When a search completes, it prints out the URLs.
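One detail worth checking when building the phrase: plain concatenation with + inserts no space, so the two parts need an explicit separator or the query becomes a single word such as "TeslaNews". Here is a small sketch of that step on its own; build_queries is a hypothetical helper name, not something from the library.

```python
from typing import List


def build_queries(company_name: str, keywords: List[str]) -> List[str]:
    """Combine a company name with each keyword into a search phrase."""
    company = company_name.strip()
    # Skip empty keywords and keep exactly one space between the parts
    return [f"{company} {keyword.strip()}"
            for keyword in keywords if keyword.strip()]


queries = build_queries(" Tesla ", ["News", "Share price forecast", "Technical Analysis"])
print(queries)
```

Each string in the returned list can then be passed to search() as the query.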
For simplicity, I take only the first URL from each search. If you like, you can retrieve more URLs by raising num and stop. You can run it in Colab: https://colab.research.google.com/drive/17ZOtFIBoPoJYxantOFEWUCUHtpP3O4ay
The current script only searches in the default mode. If you would like to search by category, such as News, Images or Videos, you can specify it easily with the tpe parameter. Just choose the correct category and you are good to go. Below, I run the same code restricted to Google News results:
# only added tpe="nws" for news only
for j in query_list:
    # join with a space so the query reads "Tesla News", not "TeslaNews"
    for i in search(company_name + " " + j, tld='com', lang='en',
                    num=1, start=0, stop=1, pause=2.0, tpe="nws"):
        print(i)
Similarly, for videos add tpe="vid"; for images, tpe="isch"; for books, tpe="bks"; and so on.
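To avoid remembering the short codes, they can be kept in a small lookup table. This is only a sketch: SEARCH_TYPES and tpe_for are hypothetical names, and the codes are the ones I understand the library's docstring for tpe to list (it gives 'vid' for videos), so check them against your installed version.

```python
# Category codes as listed in the googlesearch docstring for `tpe`
# (assumption: verify against your installed version).
# The empty string selects the default web search.
SEARCH_TYPES = {
    "web": "",
    "news": "nws",
    "videos": "vid",
    "images": "isch",
    "books": "bks",
}


def tpe_for(category: str) -> str:
    """Translate a readable category name into a `tpe` code."""
    key = category.strip().lower()
    if key not in SEARCH_TYPES:
        raise ValueError(f"Unknown category {category!r}; "
                         f"choose from {sorted(SEARCH_TYPES)}")
    return SEARCH_TYPES[key]


print(tpe_for("News"))
```

The result can be passed straight through, for example search(query, tpe=tpe_for("news")).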
If you would like a broader example of what you can do with Google search, here is one.
from googlesearch import search
import requests
from lxml.html import fromstring


# Retrieve the title of a page using requests and fromstring
def Link_title(URL):
    x = requests.get(URL, timeout=10)
    tree = fromstring(x.content)
    # fall back to a placeholder when the page has no <title>
    return tree.findtext('.//title') or "(no title)"


company_name = input("Please provide the company name:")
query = int(input("Please give the appropriate value. 1 for Fundamental Analysis, 2 for News, 3 for Technical Analysis & 4 for Share Price Forecast:"))

if query == 1:
    print(company_name + " " + "Fundamental Analysis:")
    print(" ")
    # restrict the search to a single site with the domains option
    for i in search(company_name, tld='com', lang='en', num=1, start=0, stop=1,
                    domains=['https://www.tickertape.in/'], pause=2.0):
        print("\t" + i)
elif query == 2:
    print(company_name + " " + "News:")
    print(" ")
    for i in search(company_name + ' News', tld='com', lang='en', num=3, start=0, stop=3,
                    pause=2.0, tpe='nws'):
        print("\t" + "#" + " " + Link_title(i))
        print("\t" + i)
        print(" ")
elif query == 3:
    print(company_name + " " + "Technical Analysis:")
    print(" ")
    for i in search(company_name + ' Technical Analysis', tld='com', lang='en', num=3, start=0, stop=3,
                    pause=2.0):
        print("\t" + "#" + " " + Link_title(i))
        print("\t" + i)
        print(" ")
else:
    print(company_name + " " + "Share Price Forecast:")
    print(" ")
    for i in search(company_name + ' share price forecast', tld='com', lang='en', num=3, start=0, stop=3,
                    pause=2.0):
        print("\t" + "#" + " " + Link_title(i))
        print("\t" + i)
        print(" ")
Here I have used the same logic as in the examples above. I have added two things: a function, Link_title(), that shows the title of each link returned by the search, and, for fundamental analysis, a restriction to the relevant site using the domains parameter.
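The title lookup is easy to try in isolation on a piece of HTML. The sketch below swaps lxml for the standard library's html.parser so it runs without extra packages. title_from_html and the placeholder text are hypothetical choices, not part of the original script.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace and newlines into single spaces
        return " ".join("".join(self._parts).split())


def title_from_html(html: str, default: str = "(no title)") -> str:
    """Return the page title, or `default` when the page has none."""
    parser = _TitleParser()
    parser.feed(html)
    return parser.title or default


print(title_from_html("<html><head><title> Tesla \n News </title></head></html>"))
```

To use it on a live page, the HTML would come from something like requests.get(url, timeout=10).text.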
In this example, to analyze a stock, I consider four categories: fundamental analysis, news, technical analysis and share price forecast. An if/elif/else chain selects the search for the chosen category. So whenever you run the code, it asks you for the stock name and the search category, and then shows the matching URLs, with page titles for the news, technical analysis and forecast categories. Please run the code using the Colab notebook below:
https://colab.research.google.com/drive/1w6dXfUlEahI7M7HG2zSlW7QfkL-xA6vO
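As more categories are added, the if/elif chain grows quickly. One possible alternative is a lookup table that maps each menu number to its search settings. This is a sketch only: CATEGORIES and search_plan are hypothetical names, and the settings simply mirror the ones used in the script above.

```python
# Per-category settings, mirroring the four branches of the script
CATEGORIES = {
    1: {"label": "Fundamental Analysis", "suffix": "",
        "kwargs": {"num": 1, "stop": 1, "domains": ["https://www.tickertape.in/"]}},
    2: {"label": "News", "suffix": "News",
        "kwargs": {"num": 3, "stop": 3, "tpe": "nws"}},
    3: {"label": "Technical Analysis", "suffix": "Technical Analysis",
        "kwargs": {"num": 3, "stop": 3}},
    4: {"label": "Share Price Forecast", "suffix": "share price forecast",
        "kwargs": {"num": 3, "stop": 3}},
}


def search_plan(company_name: str, choice: int):
    """Return (heading, query, kwargs) for the chosen category."""
    # Any other number falls back to category 4, like the script's else branch
    entry = CATEGORIES.get(choice, CATEGORIES[4])
    query = f"{company_name} {entry['suffix']}".strip()
    kwargs = {"tld": "com", "lang": "en", "start": 0, "pause": 2.0}
    kwargs.update(entry["kwargs"])
    heading = f"{company_name} {entry['label']}:"
    return heading, query, kwargs


heading, query, kwargs = search_plan("Tesla", 2)
print(heading, query, kwargs)
```

The results could then be fetched with search(query, **kwargs), which keeps the printing loop in one place instead of four.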
Stock analysis is just one of many applications. You could use the same approach for anything you search for on Google. I hope you liked the article. The code is available on GitHub: https://github.com/soumyabrataroy/Google-stock-search-using-Python.git