Google Translate is a free multilingual translation service developed by Google, based on statistical and neural machine translation. It is widely used to translate complete websites or webpages from one language to another.
We will create a Python terminal application that takes a source language, a target language, and a phrase to translate, and returns the translated text. We will use web scraping and unit testing techniques with Selenium in Python. Web scraping means capturing the required data from a website. Selenium is a browser automation library widely used for web scraping and for automated testing of web applications. As a prerequisite, the following tools must be installed on your system.
- Python 3.x: Python 3.0 or above should be installed.
- Selenium library: A Python library required for scraping websites. Run the following command to install Selenium on your system.
Installation: python3 -m pip install selenium
- Webdriver: An instance of a web browser required by Selenium to open webpages. Download the Chrome WebDriver release that matches your installed Chrome version from the link below and save it in the same folder as your main program.
Link: https://chromedriver.chromium.org/downloads
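Before moving on, it can help to confirm the prerequisites from a script. The following is a minimal sketch and not part of the original program: the helper name check_prerequisites is hypothetical, and it only checks that the interpreter is Python 3 and that the selenium package can be found. It does not verify that the WebDriver binary matches your Chrome version.

```python
import importlib.util
import sys


def check_prerequisites():
    """Return a dict describing whether the basic requirements are met."""
    return {
        # The tutorial assumes Python 3.x
        "python_ok": sys.version_info.major >= 3,
        # find_spec returns None when the package is not installed
        "selenium_ok": importlib.util.find_spec("selenium") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(name, "->", "OK" if ok else "MISSING")
```

If selenium_ok is reported as missing, rerun the installation command above with the same python3 interpreter you use to run the check.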
We will divide the code section into three portions:
- Setting up Selenium and the Chrome WebDriver.
- Taking input and testing it for errors.
- Translating using Google Translate.
Part 1: Setting up Selenium and the WebDriver.
python3
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import JavascriptException
from selenium.webdriver.chrome.options import Options as ChromeOptions
from selenium.webdriver.chrome.service import Service

# local variables
chrome_op = ChromeOptions()
chrome_op.add_argument('--headless')

# Selenium 4 takes the driver path through a Service object; on Selenium 3,
# use webdriver.Chrome(executable_path='chromedriver', options=chrome_op).
browser = webdriver.Chrome(service=Service('chromedriver'), options=chrome_op)
- Import the webdriver module to connect to a Chrome browser instance.
- Import the Keys class to send basic keyboard commands to the browser instance.
- Import the exception classes raised by the browser instance.
- Import the browser options and set the '--headless' argument to run the browser instance in the background. Comment out the chrome_op.add_argument('--headless') statement to bring the browser window to the foreground.
Part 2: Taking input and testing it for errors.
python3
def takeInput():
    # Supported languages and their Google Translate codes
    languages = {"English": 'en', "French": 'fr', "Spanish": 'es',
                 "German": 'de', "Italian": 'it'}

    print("Select a source and target language (enter codes)")
    print("Language", "   ", "Code")
    for name in languages:
        print(name, "   ", languages[name])

    # The source code must be one of the supported codes
    src = input("\n\nSource: ")
    if src not in languages.values():
        print("Source code not from the list, Exiting....")
        exit()

    # The target code must be one of the supported codes
    trg = input("Target: ")
    if trg not in languages.values():
        print("Target code not from the list, Exiting....")
        exit()

    if src == trg:
        print("Source and Target cannot be same, Exiting...")
        exit()

    phrase = input("Enter the phrase: ")
    return src, trg, phrase
This is demo code, so the language codes are limited to {English, Spanish, German, Italian, French}. You can add more languages and their codes later.
- Taking input for source language and target language code.
- Checking if the codes entered are supported or not.
- Checking that the source and target language codes are not the same.
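The introduction mentions unit testing, and the checks listed above become easy to test once they are separated from input() and exit(). The sketch below is one possible arrangement, not the article's own code: validate_codes is a hypothetical helper that returns an error message (or None) instead of exiting, and the tests use Python's standard unittest module.

```python
import unittest

# Same language table as in takeInput()
LANGUAGES = {"English": "en", "French": "fr", "Spanish": "es",
             "German": "de", "Italian": "it"}


def validate_codes(src, trg, languages=LANGUAGES):
    """Return an error message for invalid input, or None if it is valid."""
    codes = set(languages.values())
    if src not in codes:
        return "Source code not from the list"
    if trg not in codes:
        return "Target code not from the list"
    if src == trg:
        return "Source and Target cannot be same"
    return None


class ValidateCodesTest(unittest.TestCase):
    def test_valid_pair(self):
        self.assertIsNone(validate_codes("en", "fr"))

    def test_unknown_source(self):
        self.assertEqual(validate_codes("xx", "fr"),
                         "Source code not from the list")

    def test_unknown_target(self):
        self.assertEqual(validate_codes("en", "xx"),
                         "Target code not from the list")

    def test_same_language(self):
        self.assertEqual(validate_codes("de", "de"),
                         "Source and Target cannot be same")
```

Saved in its own file, the tests can be run with python3 -m unittest <filename.py>. Keeping the validation free of input() and exit() is what makes it testable without a terminal session.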
Part 3: Translating using Google Translate:
python3
def makeCall(url, script, default):
    response = default
    try:
        browser.get(url)
        # Keep running the script until it returns a non-default value
        while response == default:
            response = browser.execute_script(script)
    except JavascriptException as err:
        # Print the details of the exception instance that was raised
        print(err.args)
    except NoSuchElementException as err:
        print(err.args)

    if response != default:
        return response
    else:
        return 'Not Available'


def googleTranslate(src, trg, phrase):
    # NOTE: the start of this statement (the assignment and the base URL) is
    # missing from the source text; the prefix below is an assumption.
    url = ('https://translate.google.com/?sl=' +
           src + '&tl=' + trg + '&text=' + phrase)
    script = 'return document.getElementsByClassName("tlid-translation")[0].textContent'
    return makeCall(url, script, None)
- The googleTranslate() function receives three parameters: the source language code, the target language code, and the phrase. It builds the URL that the browser will request.
- script contains a JavaScript statement that searches for an HTML element with class "tlid-translation" and returns its text content.
- The makeCall() function requests the URL, executes the script once the webpage is ready, and returns the fetched text.
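Two details of the steps above are worth hardening. A phrase concatenated directly into the URL can break the request when it contains characters such as & or #, and a loop that retries the script without a limit can run forever. The following is a minimal sketch under stated assumptions: BASE_URL is an assumed address, and build_translate_url and poll_until are hypothetical helpers that do not appear in the original program.

```python
import time
from urllib.parse import urlencode

# Assumed base URL; the article does not show its own URL prefix in full.
BASE_URL = "https://translate.google.com/"


def build_translate_url(src, trg, phrase, base_url=BASE_URL):
    """Build the request URL with the phrase safely percent-encoded."""
    query = urlencode({"sl": src, "tl": trg, "text": phrase})
    return base_url + "?" + query


def poll_until(fetch, default=None, timeout=10.0, interval=0.25):
    """Call fetch() until it returns something other than default.

    Gives up once `timeout` seconds have passed and returns default.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = fetch()
        if response != default:
            return response
        if time.monotonic() >= deadline:
            return default
        time.sleep(interval)


if __name__ == "__main__":
    print(build_translate_url("en", "fr", "good morning"))
    # → https://translate.google.com/?sl=en&tl=fr&text=good+morning
```

Inside makeCall(), the retry loop could then be written as poll_until(lambda: browser.execute_script(script), default). Selenium also ships its own waiting helper, WebDriverWait in selenium.webdriver.support.ui, which may be a better fit when the condition depends on the page state.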
Combining the above three parts.
python3
if __name__ == "__main__":
    src, trg, phrase = takeInput()
    print("\nResult: ", googleTranslate(src, trg, phrase))
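The program above never closes the browser it starts, so a headless Chrome process can be left running after the script ends. One way to guard against that is sketched below. This is an illustration rather than the article's method: managed_browser is a hypothetical helper, and FakeBrowser is a stand-in used only so the pattern can be shown without launching Chrome. The only assumption about the real object is that it exposes quit(), as Selenium WebDriver instances do.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(factory):
    """Create a browser with factory() and always call quit() afterwards."""
    browser = factory()
    try:
        yield browser
    finally:
        # quit() closes the browser windows and ends the driver process
        browser.quit()


class FakeBrowser:
    """Stand-in used only to demonstrate the pattern without Chrome."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


if __name__ == "__main__":
    with managed_browser(FakeBrowser) as demo:
        print("closed inside the block:", demo.closed)
    print("closed after the block:", demo.closed)
```

With the real driver, factory would be a function that builds the Chrome instance from Part 1, and the translation functions would need to receive that browser object instead of relying on the module-level variable.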
Paste all the parts shown above into a single .py file and run it with Python 3.
Execution: python3 <filename.py>
Output:
Input section:
If you have commented out the '--headless' statement, a browser window will open and load the Google Translate page.
The result is then printed in the terminal window.
Note: This is a demo project, so the supported languages are limited. You can extend language support by adding more language codes to the languages dictionary in takeInput().