LinkedIn is a professional networking platform that connects people within an industry with one another, and job seekers with recruiters. Do you need to extract data from several LinkedIn profiles? If so, this article shows how to do it with Selenium.
Stepwise Implementation:
Step 1: First, import the selenium and time libraries. Most importantly, we need a web driver to open the login page and the profile pages; the webdriver_manager package downloads a matching ChromeDriver for us.
Python3
from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
Step 2: Now, declare the web driver and make it headless, i.e., run the browser in the background without a visible window. To do this, create a ChromeOptions object and add the headless argument to it.
Python3
options = webdriver.ChromeOptions()
options.add_argument("headless")  # run Chrome without a visible window

# Download a matching ChromeDriver binary and start the browser with it
exe_path = ChromeDriverManager().install()
service = Service(exe_path)
driver = webdriver.Chrome(service=service, options=options)
Step 3: Then, open the LinkedIn login page and make Python sleep for a few seconds. The sleep pauses the program so that the page has time to load completely before the script interacts with it.
Python3
driver.get("https://www.linkedin.com/login")  # LinkedIn's login page
sleep(5)
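A fixed sleep either wastes time or is too short on a slow connection. One possible alternative is to poll for a condition until it becomes true. The helper below is only a sketch: wait_until and its parameters are hypothetical names that are not part of the original article, and Selenium's own WebDriverWait class covers the same need if you prefer a built-in tool.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or raises TimeoutError once the deadline is hit.
    (Hypothetical helper for illustration; not part of Selenium.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

With a live driver, a call such as wait_until(lambda: driver.find_elements(By.TAG_NAME, "form")) would return as soon as a form is present, because find_elements returns an empty (falsy) list until something matches.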
Now, define the LinkedIn account email and the password for that account.
Python3
# login credentials
linkedin_username = "#Linkedin Mail I'd"
linkedin_password = "#Linkedin Password"
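Hard-coding a password in a script makes it easy to leak by accident. A minimal sketch of one alternative is to read both values from environment variables. The function name load_credentials and the variable names LINKEDIN_USERNAME and LINKEDIN_PASSWORD are assumptions made for this example, not something the article prescribes.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    LINKEDIN_USERNAME and LINKEDIN_PASSWORD are hypothetical names chosen
    for this sketch; use whatever names suit your setup.
    """
    try:
        return env["LINKEDIN_USERNAME"], env["LINKEDIN_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"environment variable {missing} is not set") from None
```

The result can be unpacked straight into the two variables used in the rest of the script: linkedin_username, linkedin_password = load_credentials().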
Step 4: Next, automate entering your LinkedIn email and password, pause for a few seconds, and click the sign-in button. This step is crucial: if the login fails, none of the code after it will work and the desired output will not be shown.
Python3
# Fill in the email and password fields of the login form
driver.find_element(By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[1]/input").send_keys(linkedin_username)
driver.find_element(By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[2]/input").send_keys(linkedin_password)
sleep(3)  # number of seconds to wait before submitting
# Click the sign-in button
driver.find_element(By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[3]/button").click()
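Since everything after this step depends on a successful login, it can help to stop early when the login did not go through. The sketch below checks the URL the browser ended up on. It rests on an assumption that is not confirmed by the article: that after a failed or challenged login the browser remains on a path beginning with /login, /checkpoint or /uas/login. The name looks_logged_in is hypothetical.

```python
from urllib.parse import urlparse

# Assumption for this sketch: after a failed or challenged login the browser
# is still on a path that starts with one of these prefixes.
LOGIN_PATH_PREFIXES = ("/login", "/checkpoint", "/uas/login")


def looks_logged_in(current_url):
    """Return True if `current_url` is not one of the assumed login-related paths."""
    path = urlparse(current_url).path
    return not path.startswith(LOGIN_PATH_PREFIXES)
```

After the click, a check such as "if not looks_logged_in(driver.current_url): raise RuntimeError('login failed')" would halt the script before it tries to open any profile.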
Step 5: Now, create a list containing the URLs of the profiles from which you want to get data.
Python3
profiles = ['#Linkedin URL-1', '#Linkedin URL-2', '#Linkedin URL-3']
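A mistyped entry in this list only shows up later as a confusing element lookup error. As a rough safeguard, each entry can be checked before the browser visits it. The sketch below assumes that profile addresses live on a linkedin.com host under the /in/ path; is_profile_url is a hypothetical helper name.

```python
from urllib.parse import urlparse


def is_profile_url(url):
    """Return True if `url` looks like a LinkedIn profile address (path under /in/)."""
    parts = urlparse(url)
    host = parts.hostname or ""
    on_linkedin = host == "linkedin.com" or host.endswith(".linkedin.com")
    # Everything after "/in/" is the profile slug; it must not be empty.
    slug = parts.path[len("/in/"):].strip("/") if parts.path.startswith("/in/") else ""
    return parts.scheme in ("http", "https") and on_linkedin and bool(slug)
```

Filtering the list with [u for u in profiles if is_profile_url(u)] would drop the placeholder strings above as well as links to company or job pages.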
Step 6: Create a loop that opens each URL and extracts information from it. The driver.get() call loads each profile in turn, so the whole list is processed without any manual intervention.
Python3
for i in profiles:
    driver.get(i)
    sleep(5)
Step 6.1: Next, inside the loop, obtain the title, i.e., the main heading, of each profile.
Python3
    # still inside the for loop from Step 6
    title = driver.find_element(By.XPATH, "//h1[@class='text-heading-xlarge inline t-24 v-align-middle break-words']").text
    print(title)
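The class names in this XPath are tied to the page layout, so a layout change or a restricted profile makes find_element raise an exception and stops the whole loop. One way to soften this, sketched here under the assumption that a missing field should simply be skipped, relies on find_elements, which returns an empty list when nothing matches. The helper name first_text is hypothetical.

```python
def first_text(driver, by, value, default=None):
    """Return the text of the first element matching the locator, or `default`.

    `driver` is any object with Selenium's find_elements(by, value) method.
    """
    matches = driver.find_elements(by, value)
    return matches[0].text if matches else default
```

Inside the loop, title = first_text(driver, By.XPATH, "//h1[...]") would then yield None for a profile without a matching heading instead of aborting the run.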
Step 6.2: Now, obtain the description shown on each profile.
Python3
    # still inside the for loop from Step 6
    description = driver.find_element(By.XPATH, "//div[@class='text-body-medium break-words']").text
    print(description)
    sleep(4)
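Printing is enough for a quick check, but the values are lost once the script ends. A minimal sketch of keeping them is to collect one dictionary per profile and write the list to a CSV file. The function name save_profiles and the column names url, title and description are assumptions made for this example.

```python
import csv


def save_profiles(rows, path):
    """Write a list of {"url", "title", "description"} dicts to a CSV file."""
    fieldnames = ["url", "title", "description"]
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

In the loop, each iteration would append {"url": i, "title": title, "description": description} to a list, and a single save_profiles(rows, "profiles.csv") call after the loop writes everything at once.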
Step 7: Finally, close the browser.
Python3
driver.close()
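If an exception is raised before this line is reached, the headless browser keeps running in the background. A try/finally block guards against that. The wrapper below is a sketch with hypothetical names (run_with_driver, make_driver, task); it calls quit(), which ends the whole browser session, whereas close() only closes the current window.

```python
def run_with_driver(make_driver, task):
    """Create a driver, run `task(driver)`, and always shut the browser down.

    `make_driver` is a zero-argument callable that returns a WebDriver;
    `task` receives that driver and does the actual scraping.
    """
    driver = make_driver()
    try:
        return task(driver)
    finally:
        # Runs whether `task` returned normally or raised an exception
        driver.quit()
```

For the script in this article, make_driver could be lambda: webdriver.Chrome(service=service, options=options), and task a function that contains the login and the profile loop.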
Example:
Python3
from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

# Start a headless Chrome session
options = webdriver.ChromeOptions()
options.add_argument("headless")
exe_path = ChromeDriverManager().install()
service = Service(exe_path)
driver = webdriver.Chrome(service=service, options=options)

# Open LinkedIn's login page and give it time to load
driver.get("https://www.linkedin.com/login")
sleep(6)

# Login credentials
linkedin_username = "#Linkedin Mail I'd"
linkedin_password = "#Linkedin Password"

# Fill in the login form and submit it
driver.find_element(By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[1]/input").send_keys(linkedin_username)
driver.find_element(By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[2]/input").send_keys(linkedin_password)
sleep(3)
driver.find_element(By.XPATH, "/html/body/div/main/div[2]/div[1]/form/div[3]/button").click()

# Profiles to visit
profiles = ['#Linkedin URL-1', '#Linkedin URL-2', '#Linkedin URL-3']

for i in profiles:
    driver.get(i)
    sleep(5)
    title = driver.find_element(By.XPATH, "//h1[@class='text-heading-xlarge inline t-24 v-align-middle break-words']").text
    print(title)
    description = driver.find_element(By.XPATH, "//div[@class='text-body-medium break-words']").text
    print(description)
    sleep(4)

driver.close()
Output: